AMD Carrizo Part 2: A Generational Deep Dive into the Athlon X4 845 at $70
by Ian Cutress on July 14, 2016 9:00 AM EST
In this review, we took the newest member of AMD’s desktop processor line, the Athlon X4 845, and pitted it against similar comparison points dating back to the first Bulldozer based desktop processors for the mainstream segment. This new processor uses AMD’s latest microarchitecture, Excavator, to create Carrizo based cores. The Athlon X4 845 uses two Carrizo modules for four total threads, wrapped into a 65W thermal design power window, and would appear to be the only Carrizo based processor AMD is going to release for the FM2+ socket.
The Athlon X4 845 is actually a dressed up laptop processor, modified for the desktop platform. As a result we only get 2 MB of L2 cache rather than the 4 MB for all of our comparison points, and there are only eight PCIe 3.0 lanes rather than sixteen, which can also have some knock-on effects. In this review we wanted to do a direct performance comparison, clock for clock, between the new processor and the older processors. However, some of the design decisions made above the core logic have had an impact on the results.
Know Your Comparison
Typically in a review like this, we talk about IPC or ‘instructions per clock’. This is a measure of how efficient the processor is at processing instructions – either a fixed set of instructions in a quicker time or more instructions in a fixed time. There are two main components to the core design that play major roles: the front/back end that actually performs the calculations, and the memory sub-system that provides the data for calculations. In order to get the peak IPC for a given test, both of these components need to be running near their limit or be able to compensate if waiting for the other. However, this is often test dependent – some probe the logic more than the memory, and for others the reverse is true. It depends on what you are testing.
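As a rough illustration of what a clock-for-clock comparison measures, the sketch below turns two benchmark results taken at the same fixed clock speed into a percentage change in per-clock performance. The function name and the example numbers are hypothetical and are not taken from our benchmark data; it only assumes that both chips ran the same test at the same frequency.

```python
def per_clock_uplift(old_result, new_result, higher_is_better=True):
    """Percentage change in per-clock performance between two chips.

    Assumes both results come from the same test with both chips locked
    to the same clock speed, so any difference reflects the core logic
    plus the memory sub-system rather than frequency.
    """
    if old_result <= 0 or new_result <= 0:
        raise ValueError("benchmark results must be positive")
    if higher_is_better:
        # Score-based test: more points at the same clock means higher IPC.
        ratio = new_result / old_result
    else:
        # Timed test: the same work finished in less time means higher IPC.
        ratio = old_result / new_result
    return (ratio - 1.0) * 100.0


# Hypothetical numbers, for illustration only.
print(f"{per_clock_uplift(500, 540):.1f}%")                          # score-based test
print(f"{per_clock_uplift(100, 92, higher_is_better=False):.1f}%")   # timed test, seconds
```

Note that the timed case inverts the ratio: finishing in 92 seconds instead of 100 is a gain of about 8.7%, not 8%.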
In most circumstances, generational processor updates have similar or improved memory sub-system arrangements which makes most comparisons in IPC directly related to the logic in the core. When we compare the Excavator design to Steamroller or Piledriver however, the memory sub-system has changed for better and for worse in our benchmark suite. This makes comparisons between the two sets of core logic difficult, as the memory plays a significant part in the performance. This is wholly benchmark dependent as well. A number of professional benchmark tests are designed specifically to either test one or other of the two segments, so it becomes really important to consider what each benchmark is doing in every case. When doing a good analysis, we can determine if the core-logic has improved (either the processing latency, scheduler, prefetch or other), or if the memory subsystem is the main catalyst for improvements.
That being said, users cannot buy one set of core logic with a different memory sub-system. They come in complete packages, and as a result the full top-down result might only be of interest for users wanting to buy today. This requires both the core and the memory to work together to give better performance, so it can be striking if decisions are made to affect that. It also pains both the reviewer and the user if in fact something like the memory sub-system comes in different flavors, depending on how much is spent or if the manufacturer is just trying to sell excess parts.
The March on IPC
Nonetheless, time for the conclusions to this review. Here are the main processors we tested:
AMD CPUs
CPU | µArch / Core | Cores | Base / Turbo (MHz) | TDP | DDR3 | L1 (I) Cache | L1 (D) Cache | L2 Cache |
Athlon X4 845 | Excavator / Carrizo | 4 | 3500 / 3800 | 65 W | 2133 | 192KB 3-way | 128KB 8-way | 2 MB 16-way |
Athlon X4 860K | Steamroller / Kaveri | 4 | 3700 / 4000 | 95 W | 1866 | 192KB 3-way | 64KB 4-way | 4 MB 16-way |
Athlon X4 760K | Piledriver.v2 / Richland | 4 | 3800 / 4100 | 100 W | 1866 | 128KB 2-way | 64KB 4-way | 4 MB 16-way |
Athlon X4 750K | Piledriver / Trinity | 4 | 3400 / 4000 | 100 W | 1866 | 128KB 2-way | 64KB 4-way | 4 MB 16-way |
The main points of comparison are the caches: the AMD Athlon X4 845 has a double-size L1 data cache with an improved prefetch, but a half-size L2 cache, compared to the Kaveri based X4 860K. It is worth noting that we were not able to source 65W parts to match the X4 845. However, one of the most telling results out of the testing is our IPC performance analysis table after the 3 GHz testing. We set all the processors to 3 GHz, with the maximum officially supported memory speed for each, and it went a bit like this:
AMD Average IPC Increases
Benchmark Suite | Richland over Trinity | Kaveri over Richland | Carrizo over Kaveri |
Real World | 0.8% | 8.0% | 8.8% |
Office | -0.1% | 11.1% | 4.1% |
Legacy | 0.1% | 11.8% | 8.5% |
Overall Windows | 0.3% | 10.3% | 7.3% |
Linux | 10.4% | 10.5% | -12.1% |
Gaming | -0.4% | 12.5% | -5.8% |
The AMD Athlon X4 845 is a Janus-like product: powerful, yet two-faced. In practically all of our Windows based CPU benchmarks, it scored increases over the previous generation, for the most part due to the larger L1 data cache but also due to the improved logic.
The benchmarks that required more memory, such as Agisoft or WinRAR, saw minor decreases, which could be predicted before we started.
However, two major segments saw significant decreases in performance. For our Linux tests, most of these were highly memory sensitive. NPD and NAMD are both scientific matrix solvers, requiring lots of memory accesses, and Redis is a key-value load store known to be highly cache size and latency sensitive – this brought these results down.
The gaming side of the equation is a different story, and the results were fairly consistent across all benchmarks and all GPUs: the X4 845 performs worse than the X4 860K clock for clock. There are two ways to attribute this, as mentioned above: PCIe 3.0 x8 and 2MB of L2 cache. Given previous experience with PCIe lane bandwidth requirements resulting in only a tiny difference in performance, it would seem that the latter has more of an effect on gaming (at this level of CPU power) than one might expect. It means a 6% decrease in performance when clock speeds are identical compared to Kaveri, but still ends up 5% over Trinity and Richland.
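To see how the per-generation figures in the IPC table relate to the multi-generation comparison quoted above, the gains can be chained multiplicatively. This is only a sketch: the suite percentages are copied from the table, but the `compound` helper is a hypothetical name, and chaining averages of averages is an approximation rather than how the table itself was produced.

```python
# Per-generation average IPC changes (percent), copied from the table above:
# (Richland over Trinity, Kaveri over Richland, Carrizo over Kaveri)
GAINS = {
    "Overall Windows": (0.3, 10.3, 7.3),
    "Linux": (10.4, 10.5, -12.1),
    "Gaming": (-0.4, 12.5, -5.8),
}


def compound(percent_steps):
    """Chain successive percentage changes into one overall percentage."""
    factor = 1.0
    for pct in percent_steps:
        factor *= 1.0 + pct / 100.0
    return (factor - 1.0) * 100.0


for suite, steps in GAINS.items():
    over_trinity = compound(steps)       # all three generational steps
    over_richland = compound(steps[1:])  # skip the Richland-over-Trinity step
    print(f"{suite}: Carrizo vs Trinity {over_trinity:+.1f}%, "
          f"vs Richland {over_richland:+.1f}%")
```

For gaming this works out to roughly 5.6% over Trinity and 6.0% over Richland, in line with the approximately 5% quoted above. Percentages should be multiplied rather than summed, although for changes this small the two methods differ by well under a percentage point.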
Wanting The Full Package
The AMD Athlon X4 845, as mentioned earlier in the review, is outside the regular efficiency range for the Carrizo core design. It was designed to be operated at 15W for the total chip, or 35W for the high power mode. AMD even noted in their slides that at 35W, the Carrizo and Kaveri designs would be similar for efficiency. So to push it to 65W would suggest that Kaveri might even be ahead, given the wider window that Kaveri was designed for. The result of pushing Carrizo to 65W means that there is no integrated graphics, and the frequencies are near but below the competing Kaveri parts, and overclocking is next to zero. What Carrizo relies on is the microarchitectural advances more than anything else.
Our new Athlon, at a $70 launch price, competes mainly against the Intel Pentium G3258, known as the overclockable Haswell-based dual core Pentium that was launched for $72. Depending on the retailer, the time of day, how the wind is blowing, or what sale is on, these prices can be as low as $50, along with other Athlon and Pentium processors. The typical price/performance metric becomes more focused on just the performance in this case, and the two trade blows.
In single threaded environments, the G3258 wins out hands down, by having a 25-50% performance advantage despite having lower clock speeds.
However, due to having four threads rather than two, the X4 845 wins in any fully multithreaded test, particularly for heavy workloads such as video encoding. The G3258 lacks accelerated AES encryption as well, meaning the X4 845 gets a result 800% higher in that case.
Where the waters are muddied is in variable threaded workloads, or memory dependent workloads. The Pentium has larger and quicker caches, meaning that it can take the lead in some multithreaded workloads. But in some benchmarks, like Google Octane, the difference is minimal.
When it comes to gaming, it depends on which benchmark/configuration you choose, but for GTA and GRID, when the Athlon is paired up with an AMD graphics card, the Athlon wins, but with an NVIDIA graphics card, the Pentium wins. For Shadow of Mordor and Alien Isolation however, the higher IPC for the Intel processor wins out no matter which GPU is used.
Carrizo, 7th Generation and the Future
When we benchmarked a number of laptops using Carrizo processors, and compared them to a Kaveri laptop, we could instantly tell that the Carrizo microarchitecture was a sufficient jump in the mobile space for performance and power, as long as OEMs would actually use dual channel memory. This was bolstered by the fact that any graphics tests relied on the integrated GPU, which saw enhancements with the new design as well. On the desktop side of the equation, the results are less clear cut. Here we have a microarchitecture with good performance characteristics for compute, but it gets let down in discrete gaming. Moreover, the competition provided by the Pentium G3258 is hard to ignore. The fact that the two processors, at stock, performed similarly for web use is an interesting element in our testing for sure.
AMD’s future will be with Bristol Ridge, using an updated Excavator microarchitecture, and the new line of high-end processors using Zen cores. Both of these are slated for the tail end of the year and/or Q1, anything from 4-8 months ahead. Is it really worth investing in a Carrizo (or Pentium) platform now only to find it has been surpassed later in the year? While it’s an interesting question, in my opinion it’s probably the wrong question to ask.
Bristol Ridge, using the updated Excavator core, is likely to perform within a single-digit percentage of Carrizo in raw performance, but it will also have DDR4 and new chipsets to help deal with things like PCIe SSDs, NVMe, upgraded Ethernet and new features (features unknown at this point). For some users, especially those building simple machines that just need base storage and some oomph, that will not matter much. If you are a user that slowly upgrades over time (by buying one big upgrade every now and again rather than a full system replacement), then going in for Carrizo (or Kaveri) now should be par for the course. The interesting element is whether to go for Kaveri (X4 880K) or Carrizo (X4 845), especially if the difference is only $20.
Carrizo comes with AMD’s new 95W near-silent cooler, whereas the X4 880K uses the new 125W solution. If the difference is only $20, pitch for the faster Kaveri every time. What you lose in microarchitecture will be made up by frequency and overclocking ability.
If you want to make that jump from Athlon to Zen, from mid-range to AMD’s high-end, then it might be worth investing a summer to earning more for a future system. Even if Zen doesn’t pan out completely (most users have their fingers crossed that Intel will have some competition at last), a bigger system with more storage or a better graphics card is never a bad thing.
Ultimately, the X4 845’s main letdown, for gaming at least, would seem to be that 2 MB of L2 cache, and the base processor design aiming at 15W. Bristol Ridge is also aimed around 15W, and should come in 65W flavors (with integrated graphics), and it will be interesting to see what level of cache it has compared to the mobile counterparts.
AMD CPU L2 Cache Levels
µArch | Core | Cores | L2 Cache Mobile | L2 Cache Desktop |
Excavator v2 | Bristol Ridge | 4 | 2x1 MB, 16-way | ...? |
Excavator | Carrizo | 4 | 2x1 MB, 16-way | 2x1 MB, 16-way |
Steamroller | Kaveri | 4 | 2x2 MB, 16-way | 2x2 MB, 16-way |
Piledriver v2 | Richland | 4 | 2x2 MB, 16-way | 2x2 MB, 16-way |
Piledriver | Trinity | 4 | 2x2 MB, 16-way | 2x2 MB, 16-way |
131 Comments
artk2219 - Thursday, July 14, 2016
They had too many parts that weren't hitting their mobile TDP's, or they just baked more chips than was needed on the mobile side. Either way, why let them sit in a warehouse or toss them at a loss, when for a very small amount you can just throw them into your standard desktop package and make some extra sales.
TheinsanegamerN - Thursday, July 14, 2016
Carrizo and kaveri did not use hypertransport. They would have to re-engineer their chip to work on AM3+, and to be frank, the AM3+ market is just too small to justify the tiny margins they would get.
That money is better spent on getting zen out of the door.
neblogai - Thursday, July 14, 2016
Why invest into upgrading bad product, when you can sell the same Bulldozer cores till Zen comes? And this Carriso Athlon is just a by-product of a mobile part and can only be sold for desktop. It all makes sense financially. By the way, new Bristol Ridge AMD 15W APUs are really nice and competitive, but laptop manufacturers are failing again- for example, HP Envy x360 comes with FX-9800P APU- again in single channel memory configuration, also with HDD installed and without possibility to use SSD. https://hardforum.com/threads/unboxing-1st-impress...
TheinsanegamerN - Friday, July 15, 2016
AMD doesnt take the mobile market seriously. If they did, they would be partnering up with the likes of MSI or clevo to produce a good laptop line for their APUs, or at the very least make dual channel a strict requirement.
The_Countess - Tuesday, July 19, 2016
AMD unfortunately can't demand much of anything from OEM's currently.
and as intel still has a defacto monopoly no OEM wants to piss of intel by making a better AMD laptop.
nathanddrews - Thursday, July 14, 2016
So... will there ever be a desktop Carrizo w/IGP? Much of the hype around Carrizo was focused on its very low power video playback, including H.265 hardware encode/decode.
stardude82 - Thursday, July 14, 2016
Isn't that what Bristol Ridge is? But on the new AM3 socket.
Arnulf - Thursday, July 14, 2016
AM4.
Pissedoffyouth - Thursday, July 14, 2016
Why not bang 8 of these cores into a 125w TDP and make it for FM2+ or AM3+? Finally an upgrade for Piledriver on AM3
KAlmquist - Friday, July 15, 2016
If you compare the Athlon 845 with the FX-4350 (link below), the Athlon wins on some benchmarks and loses on others. The Athlon has better IPC, but the FX has a faster clock and a 3rd level cache, leaving no clear-cut winner. If we added an L3 cache to the Athlon chip, that would speed it up, but not by a lot. In other words, Excavator is a big improvement over Piledriver in terms of performance per watt, but not much in terms of absolute performance. An Excavator based FX chip (by which I mean a chip with 8 Excavator cores and 8 MB of L3 cache) would probably be a very marginal improvement over the existing FX lineup at stock frequency, and would have less overclocking potential. I can see why AMD decided not to spend the resources to develop such a chip.
http://www.anandtech.com/bench/product/1684?vs=127...