In this review, we took the newest member of AMD’s desktop processor line, the Athlon X4 845, and pitted it against comparable parts dating back to the first Bulldozer-based desktop processors for the mainstream segment. This new processor uses AMD’s latest microarchitecture, Excavator, in Carrizo-based cores. The Athlon X4 845 uses two Carrizo modules for four total threads, wrapped into a 65W thermal design power window, and appears to be the only Carrizo-based processor AMD is going to release for the FM2+ socket.

The Athlon X4 845 is actually a dressed-up laptop processor, modified for the desktop platform. As a result we only get 2 MB of L2 cache rather than the 4 MB found on all of our comparison points, and there are only eight PCIe 3.0 lanes rather than sixteen, which can also have some knock-on effects. In this review we wanted to do a direct performance comparison, clock for clock, between the new processor and the older processors. However, some of the design decisions made above the core logic have had an impact on the results.
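To put the lane count in perspective, here is a minimal sketch of the raw link bandwidth involved. It assumes only the published PCIe 3.0 signaling rate (8 GT/s per lane) and 128b/130b line encoding; the function name is ours, and packet-level protocol overhead is ignored, so real-world throughput will be somewhat lower.

```python
# PCIe 3.0 signals at 8 GT/s per lane with 128b/130b line encoding.
GT_PER_S = 8.0e9
ENCODING_EFFICIENCY = 128 / 130


def pcie3_bandwidth_gb_s(lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s, ignoring packet overhead."""
    bits_per_s = GT_PER_S * ENCODING_EFFICIENCY * lanes
    return bits_per_s / 8 / 1e9


for lanes in (8, 16):
    print(f"PCIe 3.0 x{lanes}: ~{pcie3_bandwidth_gb_s(lanes):.2f} GB/s per direction")
```

Halving the lanes halves the peak link bandwidth, but that only matters when a workload actually saturates the link.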

Know Your Comparison

Typically in a review like this, we talk about IPC or ‘instructions per clock’. This is a measure of how efficient the processor is at processing instructions: either a fixed set of instructions in a quicker time, or more instructions in a fixed time. There are two main components of the core design that play major roles: the front/back end that actually performs the calculations, and the memory sub-system that provides the data for those calculations. In order to reach peak IPC for a given test, both of these components need to be running near their limit, or be able to compensate while waiting for the other. However, this is often test dependent: some tests probe the logic more than the memory, and for others the reverse is true.
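As a rough illustration of the clock-for-clock idea, the sketch below treats IPC as instructions retired divided by clock cycles elapsed, and compares two chips locked to the same frequency. The function names, the instruction count and the run times are made up for illustration; they are not measurements from this review.

```python
def ipc(instructions: float, seconds: float, clock_hz: float) -> float:
    """Instructions per clock: instructions retired divided by cycles elapsed."""
    cycles = seconds * clock_hz
    return instructions / cycles


def relative_gain(new: float, old: float) -> float:
    """Fractional improvement of `new` over `old` (0.10 means 10% better)."""
    return new / old - 1.0


# Hypothetical workload: the same 60 billion instructions on two chips,
# both locked to 3 GHz so that clock speed drops out of the comparison.
CLOCK_HZ = 3.0e9
INSTRUCTIONS = 60e9

old_ipc = ipc(INSTRUCTIONS, seconds=20.0, clock_hz=CLOCK_HZ)  # older core
new_ipc = ipc(INSTRUCTIONS, seconds=18.0, clock_hz=CLOCK_HZ)  # newer core

gain = relative_gain(new_ipc, old_ipc)
print(f"old IPC {old_ipc:.2f}, new IPC {new_ipc:.2f}, gain {gain:.1%}")
```

With the clock fixed, a shorter run time on the same instruction stream translates directly into a higher IPC, which is why the comparisons in this review are made at a common frequency.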

In most circumstances, generational processor updates have similar or improved memory sub-system arrangements, which makes most IPC comparisons directly related to the logic in the core. When we compare the Excavator design to Steamroller or Piledriver, however, the memory sub-system has changed for better and for worse in our benchmark suite. This makes comparisons between the two sets of core logic difficult, as the memory plays a significant part in the performance. This is wholly benchmark dependent as well. A number of professional benchmark tests are designed specifically to test one or the other of the two segments, so it becomes really important to consider what each benchmark is doing in every case. With careful analysis, we can determine whether the core logic has improved (the processing latency, scheduler, prefetch or other), or whether the memory sub-system is the main catalyst for improvements.
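One way to picture the interplay between core logic and memory is the textbook single-level stall model: effective cycles per instruction equal the core's base CPI plus the average memory stall cycles per instruction. The sketch below is only that simplified model; every number in it is an illustrative placeholder, not a measured or published figure for any of these processors.

```python
def effective_cpi(base_cpi, mem_refs_per_instr, miss_rate, miss_penalty_cycles):
    """Textbook single-level model: core CPI plus average memory stall cycles."""
    return base_cpi + mem_refs_per_instr * miss_rate * miss_penalty_cycles


# All figures are illustrative placeholders, not measured values.
# "old": slower core logic, but fewer cache misses.
old_design = effective_cpi(base_cpi=1.00, mem_refs_per_instr=0.3,
                           miss_rate=0.02, miss_penalty_cycles=100)
# "new": faster core logic, but a higher miss rate (e.g. a smaller cache).
new_design = effective_cpi(base_cpi=0.90, mem_refs_per_instr=0.3,
                           miss_rate=0.03, miss_penalty_cycles=100)

for name, cpi in (("old", old_design), ("new", new_design)):
    print(f"{name}: CPI {cpi:.2f}, IPC {1 / cpi:.3f}")
```

In this made-up case the improved core is more than cancelled out by extra memory stalls, which is the kind of effect that makes a memory-heavy benchmark regress even when the core logic itself is better.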

That being said, users cannot buy one set of core logic with a different memory sub-system. They come as complete packages, and as a result the full top-down result might be the only one of interest for users wanting to buy today. Better performance requires both the core and the memory to work together, so it can be striking when design decisions affect that balance. It also pains both the reviewer and the user if something like the memory sub-system comes in different flavors, depending on how much is spent or on whether the manufacturer is just trying to sell excess parts.

The March on IPC

Nonetheless, time for the conclusions to this review. Here are the main processors we tested:

AMD CPUs

| CPU | µArch | Core | Cores | Base (MHz) | Turbo (MHz) | TDP | DDR3 (MT/s) | L1 (I) Cache | L1 (D) Cache | L2 Cache |
|---|---|---|---|---|---|---|---|---|---|---|
| Athlon X4 845 | Excavator | Carrizo | 4 | 3500 | 3800 | 65 W | 2133 | 192 KB, 3-way | 128 KB, 8-way | 2 MB, 16-way |
| Athlon X4 860K | Steamroller | Kaveri | 4 | 3700 | 4000 | 95 W | 1866 | 192 KB, 3-way | 64 KB, 4-way | 4 MB, 16-way |
| Athlon X4 760K | Piledriver.v2 | Richland | 4 | 3800 | 4100 | 100 W | 1866 | 128 KB, 2-way | 64 KB, 4-way | 4 MB, 16-way |
| Athlon X4 750K | Piledriver | Trinity | 4 | 3400 | 4000 | 100 W | 1866 | 128 KB, 2-way | 64 KB, 4-way | 4 MB, 16-way |

The main points of comparison are the caches: compared to the Kaveri-based X4 860K, the AMD Athlon X4 845 has a double-size L1 data cache with improved prefetch, but a half-size L2 cache. It is worth noting that we were not able to source 65W parts to match the X4 845. However, one of the most poignant results from the testing is our IPC performance analysis table from the 3 GHz testing. We set all the processors to 3 GHz, with the maximum officially supported memory speed for each, and it went a bit like this:

AMD Average IPC Increases

| Benchmark Suite | Richland over Trinity | Kaveri over Richland | Carrizo over Kaveri |
|---|---|---|---|
| Real World | 0.8% | 8.0% | 8.8% |
| Office | -0.1% | 11.1% | 4.1% |
| Legacy | 0.1% | 11.8% | 8.5% |
| Overall Windows | 0.3% | 10.3% | 7.3% |
| Linux | 10.4% | 10.5% | -12.1% |
| Gaming | -0.4% | 12.5% | -5.8% |

The AMD Athlon X4 845 is a Janus-like product: powerful, yet two-faced. In practically all of our Windows-based CPU benchmarks, it scored increases over the previous generation, for the most part due to the larger L1 data cache but also due to the improved logic.

The benchmarks that required more memory, such as Agisoft or WinRAR, saw minor decreases, which could be predicted before we started.

However, two major segments saw significant decreases in performance. Most of our Linux tests are highly memory sensitive. NPD and NAMD are both scientific matrix solvers, requiring lots of memory accesses, and Redis is a key-value store known to be highly sensitive to cache size and latency. This brought these results down.

The gaming side of the equation is a different story, and the results were fairly consistent across all benchmarks and all GPUs: the X4 845 performs worse than the X4 860K clock for clock. There are two possible causes, as mentioned above: the PCIe 3.0 x8 link and the 2 MB of L2 cache. Given that previous experience with PCIe lane bandwidth shows only a tiny difference in performance, it would seem that the L2 cache has more of an effect on gaming (at this level of CPU power) than one might expect. It means a 6% decrease in performance compared to Kaveri when clock speeds are identical, but the X4 845 still ends up 5% ahead of Trinity and Richland.
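As a sanity check on how the per-generation gaming figures chain together, the sketch below multiplies the successive changes from the Gaming row of the IPC table. It assumes that suite averages can simply be compounded, which is only an approximation because each figure is itself an average over several benchmarks; the helper name is ours.

```python
# Gaming row of the clock-for-clock table, as fractional changes.
gaming = {
    "Richland over Trinity": -0.004,
    "Kaveri over Richland": 0.125,
    "Carrizo over Kaveri": -0.058,
}


def compound(changes):
    """Chain successive fractional changes into one overall fractional change."""
    total = 1.0
    for change in changes:
        total *= 1.0 + change
    return total - 1.0


carrizo_over_richland = compound([gaming["Kaveri over Richland"],
                                  gaming["Carrizo over Kaveri"]])
carrizo_over_trinity = compound(gaming.values())

print(f"Carrizo over Richland: {carrizo_over_richland:+.1%}")
print(f"Carrizo over Trinity:  {carrizo_over_trinity:+.1%}")
```

Under that assumption both results land between 5% and 6%, broadly in line with the roughly 5% advantage over Trinity and Richland quoted above.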

Wanting The Full Package

The AMD Athlon X4 845, as mentioned earlier in the review, is outside the regular efficiency range for the Carrizo core design. The design was intended to operate at 15W for the total chip, or 35W in its high power mode. AMD even noted in their slides that at 35W, the Carrizo and Kaveri designs would be similar in efficiency. Pushing it to 65W would therefore suggest that Kaveri might even be ahead, given the wider window that Kaveri was designed for. As a result of pushing Carrizo to 65W, there is no integrated graphics, the frequencies are near but below those of the competing Kaveri parts, and overclocking headroom is next to zero. What Carrizo relies on is its microarchitectural advances more than anything else.

Our new Athlon, at a $70 launch price, competes mainly against the Intel Pentium G3258, the overclockable Haswell-based dual core Pentium that launched at $72. Depending on the retailer, the time of day, how the wind is blowing, or what sale is on, these prices can be as low as $50, alongside other Athlon and Pentium processors. The typical price/performance metric becomes more focused on just the performance in this case, and here the two processors trade blows.

In single threaded environments, the G3258 wins out hands down, by having a 25-50% performance advantage despite having lower clock speeds.

[Chart: Cinebench R15 - Single Threaded]

However, due to having four threads rather than two, the X4 845 wins in any fully multithreaded test, particularly for heavy workloads such as video encoding. The G3258 lacks accelerated AES encryption as well, meaning the X4 845 gets a result 800% higher in that case.

[Chart: Hybrid x265, 4K Video]

Where the waters are muddied is in variable threaded workloads, or memory dependent workloads. The Pentium has larger and quicker caches, meaning that it can take the lead in some multithreaded workloads. But taking into account some benchmarks, like Google Octane, the difference is minimal:

[Chart: Google Octane v2]

When it comes to gaming, the result depends on which benchmark and configuration you choose. For GTA and GRID, the Athlon wins when paired with an AMD graphics card, but the Pentium wins with an NVIDIA graphics card. For Shadow of Mordor and Alien Isolation, however, the higher IPC of the Intel processor wins out no matter which GPU is used.

Carrizo, 7th Generation and the Future

When we benchmarked a number of laptops using Carrizo processors, and compared them to a Kaveri laptop, we could instantly tell that the Carrizo microarchitecture was a sufficient jump in the mobile space for performance and power, as long as OEMs would actually use dual channel memory. This was bolstered by the fact that any graphics tests relied on the integrated GPU, which saw enhancements with the new design as well. On the desktop side of the equation, the results are less clear cut. Here we have a microarchitecture with good performance characteristics for compute, but it gets let down in discrete gaming. Moreover, the competition provided by the Pentium G3258 is hard to ignore. The fact that the two processors, at stock, performed similarly for web use is an interesting element in our testing for sure.

AMD’s future will be with Bristol Ridge, using an updated Excavator microarchitecture, and the new line of high-end processors using Zen cores. Both of these are slated for the tail end of the year and/or Q1, anything from 4-8 months ahead. Is it really worth investing in a Carrizo (or Pentium) platform now only to find it has been passed later in the year? While it’s an interesting question, in my opinion it’s probably the wrong question to ask.

Bristol Ridge, using the updated Excavator core, is likely to land within single-digit percentages of Carrizo in raw performance, but it will also have DDR4 and new chipsets to help deal with things like PCIe SSDs, NVMe, upgraded Ethernet and new features (features unknown at this point). For some users, especially those building simple machines that just need base storage and some oomph, that will not matter much. If you are a user who upgrades slowly over time (by buying one big upgrade every now and again rather than a full system replacement), then going in for Carrizo (or Kaveri) now should be par for the course. The interesting element is whether to go for Kaveri (X4 880K) or Carrizo (X4 845), especially if the difference is only $20.

Carrizo comes with AMD’s new 95W near-silent cooler, whereas the X4 880K uses the new 125W solution. If the difference is only $20, pitch for the faster Kaveri every time. What you lose in microarchitecture will be made up by frequency and overclocking ability.

If you want to make that jump from Athlon to Zen, from mid-range to AMD’s high-end, then it might be worth investing a summer in earning more for a future system. Even if Zen doesn’t pan out completely (most users have their fingers crossed that Intel will have some competition at last), a bigger system with more storage or a better graphics card is never a bad thing.

Ultimately, the X4 845’s main letdown, for gaming at least, would seem to be its 2 MB of L2 cache and a base processor design that targets 15W. Bristol Ridge is also aimed at around 15W and should come in 65W flavors (with integrated graphics), so it will be interesting to see how much cache the desktop parts have compared to their mobile counterparts.

AMD CPU L2 Cache Levels

| µArch | Core | Cores | L2 Cache (Mobile) | L2 Cache (Desktop) |
|---|---|---|---|---|
| Excavator v2 | Bristol Ridge | 4 | 2x1 MB, 16-way | ...? |
| Excavator | Carrizo | 4 | 2x1 MB, 16-way | 2x1 MB, 16-way |
| Steamroller | Kaveri | 4 | 2x2 MB, 16-way | 2x2 MB, 16-way |
| Piledriver v2 | Richland | 4 | 2x2 MB, 16-way | 2x2 MB, 16-way |
| Piledriver | Trinity | 4 | 2x2 MB, 16-way | 2x2 MB, 16-way |

131 Comments


  • lefty2 - Thursday, July 14, 2016

    I'm predicting Bristol Ridge will be just as bad a failure as Carrizo. I.e. the few design wins will only have single DIMM memory and be universally unavailable, buried somewhere in a dark corner of the OEM's website. It's a pity, because both SoCs are very good in their own right.
  • nandnandnand - Thursday, July 14, 2016

    If it's not Zen, it can be thrown straight in the garbage.
  • Samus - Friday, July 15, 2016

    I still rock a few Kaveri desktops and they are incredibly powerful for the price. The 860K is half the cost of a comparable Intel chip, which supporting faster memory and a lower cost platform.

    Carizo on the desktop is an anomaly. I'd like to see what it could do with 4MB cache (would require an entirely new die)
  • Lolimaster - Saturday, July 16, 2016

    They were nice in 2014.

    We should have a nice 20nm 768SP APU in 2015 with a full L2 cache Excavator and fully mature 896SP 20nm early this year.

    Remember the A8 3870K? That APU was a damn monster only hold back from being godly cause of their sub 3Ghz cpu speed, what we had after?

    400SP VLIW5 2011 --> 384 VLIW4 2012 --> 384VLIW4 2013 --> 512SP GCN 2015 --> 512SP GCN 2016

    Intel improved way faster (non "e" + edram igp's are near A8 level from being utter trash when the A8 3850 was release).
  • The_Countess - Tuesday, July 19, 2016

    yes being able to thrown in a extra billion transistors compared to AMD (1.7 vs 0.75 billion transistors for a quad core with GPU) because of 14nm really does help intel along a lot.

    but as nobody has been able to make a 20nm class process for anything but flash and ram besides intel, AMD's hands were tied. there is nothing AMD could have done to change that.
  • BlueBlazer - Friday, July 15, 2016

    Formula for failure: FM2 socket (with limited CPU upgradeability), only PCI Express x8 lanes available (which can bottleneck GPUs), and only "4 cores" (which performs more like 2C/4T Core i3 processor).
  • neblogai - Friday, July 15, 2016

    Bristol Ridge is not FM2; PCI-E x8 can not bottleneck midrange GPUs; ultra low power mobile APU also sold as desktop chip is not a failure, just additional revenue
  • BlueBlazer - Friday, July 15, 2016

    The results in the article shows otherwise, where AMD's Bristol Ridge was slower in most gaming tests, despite having better performance in some applications. Both FM2 and FM2+ are still the same (legacy) socket. AMD will be probably selling these chips at a loss. Note that these are the same (large) dies as Carrizo chips, and at 250mm^2 coupled with low prices typically meant razor thin margins or none at all.
  • silverblue - Friday, July 15, 2016

    That L2 cache is probably making more difference than you realise.
  • evolucion8 - Saturday, July 16, 2016

    The PCI-E is busted, even at PCI E 2.0 @ 4X, it barely makes a difference on the Fury X and the GTX 980 Ti.
