Analyzing Generational Updates

Going through the benchmark data for our Carrizo part compared to Kaveri, Richland and Trinity gives two very different sides of the same story. Simply put, it would come across that Carrizo is overall better at CPU tasks when you compare clock for clock, but performs worse when a discrete graphics card is in play for gaming. There are some slight exceptions for both sides of this story, especially when larger memory accesses comes in, but this comes down to the design choices when Carrizo for desktop was made. The fact that we have a laptop CPU in desktop clothing is going to be a main detractor when it comes to gaming, but the CPU compute side of the equation is very promising indeed.

In our generational testing, we compared the following four processors at 3 GHz and running the highest supported JEDEC memory speeds for each:

AMD CPUs
  µArch /
Core
Cores Base
Turbo
TDP DDR3 L1 (I)
Cache
L1 (D)
Cache
L2
Cache
Athlon
X4 845
Excavator
Carrizo
4 3500
3800
65 W 2133 192KB
3-way
128KB
8-way
2 MB
16-way
 
Athlon
X4 860K
Steamroller
Kaveri
4 3700
4000
95 W 1866 192KB
3-way
64KB
4-way
4 MB
16-way
 
Athlon
X4 760K
Piledriver.v2
Richland
4 3800
4100
100 W 1866 128KB
2-way
64KB
4-way
4 MB
16-way
 
Athlon
X4 750K
Piledriver
Trinity
4 3400
4000
100 W 1866 128KB
2-way
64KB
4-way
4 MB
16-way

It is worth noting that for the most part the X4 750K and X4 760K are essentially equal, using a slightly modified Piledriver v2 microarchitecture for the X4 760K that in most cases performs similarly to the other processor at the same frequency. This will come through in almost all of our benchmark comparisons. However, the main battle will be between the top two.

Comparing the Upgrade: 2012 to 2016

Our results are going to be compared in two different ways. Firstly, we are going to look at the absolute improvement of each processor compared to the lowest one in the test: Trinity. This gives a direct analysis of the performance increase per clock total increase for every generation from 2012 to 2016. What follows is a series of graphs for each of our benchmark sections showing the results of each benchmark as a percentage improvement over Trinity. We'll analyze each one in turn.

From our Real World benchmarks, Carrizo gets a good showing in three of the benchmarks, showing a sizeable jump over Kaveri, however WinRAR and WebXPRT are a little lower.

For the office tests, Carrizo takes the biggest gain for CineBench and Handbrake, but sits behind in Photoscan and Hybrid. HandBrake shows a sizable gain in both tests compared to Trinity.

The Linux-Bench tests shows Carrizo behind Kaveri in each instance, and behind Richland for all three Redis tests. As we explained in that section, Redis is very memory dependent and as a result, despite having the larger L1 cache, only having 2 MB of L2 cache is a blow to the Carrizo part.

So here is where it is interesting. If you were only looking at synthetic and legacy tests in isolation, like many other review websites do, then you could be forgiven that it shows Carrizo taking a distinct lead in every benchmark (except 7-zip). In many cases there is a 10-20% gain over Kaveri.

For gaming, as explained in the testing, despite the improvement over Trinity that Carrizo offers, the deficit to Kaveri is consistent across the board.

Comparing IPC

Next, we have the generational updates moving from Trinity to Richland to Kaveri to Carrizo. This is where we typically expect to see single-digit percentage increases moving through the generations, with double digits for large gains or introduction of new IP blocks into the silicon (e.g. encryption or video conversion). Again, we go through each of our five benchmark sections for this.

3DPM v2 takes the biggest gain, a massive 32% over Kaveri, due to better memory management and a larger L1 cache. WinRAR, being memory dependent, loses due to the smaller L2.

The office tests are a mixed bag - we see a regression in Photoscan due to large memory accesses, but it is clear that Kaveri was a bigger jump for a number of things than Carrizo.

Our Linux tests get a poor showing across the board from Carrizo, which we saw in the results. In each case, the IPC for Carrizo is lower than that of Kaveri.

Back with the previous legacy results graph, we saw that Carrizo had a better performance than Kaveri across the board, except 7-zip. Translating this to IPC improvements and we see that in half the cases, moving to Kaveri was better than moving to Carrizo, with CineBench single threaded tests being the exception showing the capability of the core logic in Carrizo.

However, the big result will be for gaming. Clock for Clock, Carrizo gives an average 5.8% decrease in performance to Kaveri.

Conclusions

Wrapping all the numbers together, we get the following average IPC improvements for a Carrizo with 2MB of L2 cache over Kaveri with 4MB of L2 cache for each section:

AMD Average IPC Increases
Benchmark Suite Richland over Trinity Kaveri over Richland Carrizo over Kaveri
Real World 0.8% 8.0% 8.8%
Office -0.1% 11.1% 4.1%
Legacy 0.1% 11.8% 8.5%
Overall
Windows
0.3% 10.3% 7.3%
 
Linux 10.4% 10.5% -12.1%
Gaming -0.4% 12.5% -5.8%

The headline figure, for CPU compute benchmarks (real world, office and legacy), is that Carrizo offers a +7.3% improvement over AMD's previous microarchitecture, Kaveri. It comes with the caveat that Linux and Gaming performance, which in our tests tend to rely more on memory accesses, perform 6-12% worse.

Gaming at 3 GHz: Shadow of Mordor Stock Comparison: Real World
Comments Locked

131 Comments

View All Comments

  • The_Countess - Tuesday, July 19, 2016 - link

    actually bulldozer on 14nm would have been a completely different beast. it would have allowed AMD to use far more transistors per core while still making it way smaller in terms of size. that would have allowed AMD to create a far wider execution core, eliminating most of its bottlenecks.

    the high latency cache would probably still means it wouldn't be great for games but for everything else it would be a far more competitive design.

    it is also 14nm that will allow zen to make such a massive leap in IPC's as it will be a very wide Core, while still being pretty small, something that just can't be done on 28nm.

    bulldozer might not have been the best idea, but being stuck on 32/28nm for so long made all it's issues infinitely worse.
  • nandnandnand - Thursday, July 14, 2016 - link

    "Well better late than never for Andantech,"

    There was no point in Adanantech writing this review, because it is a chip for those people too stupid to wait until Zen. Zen is the only thing that matters.
  • BurntMyBacon - Friday, July 15, 2016 - link

    @nandnandnand: "There was no point in Adanantech writing this review, because it is a chip for those people too stupid to wait until Zen. Zen is the only thing that matters."

    Now, because this review exists, people as yet uninformed have concrete data to avoid decisions that might make them look (as you put it) stupid. There is very much a point.
  • Byte - Thursday, July 14, 2016 - link

    Zen will probably be the RX480 in the CPU world. Better performance, still trounced by the competition, but competently priced.
  • looncraz - Friday, July 15, 2016 - link

    That would be an improvement on the current situation. AMD is pricing their CPUs quite poorly right now.

    An Intel Celeron G3900 is $50 right now. AMD's closest competition is the A6-7400k - at $55.

    Both are dual cores, both are 65W, both have middling (but usable) graphics performance... quite similar at first glance... except the Intel runs at 2.8Ghz and the AMD runs at 3.5Ghz w/ 3.9Ghz turbo and can rather easily exceed 4Ghz when overclocked.

    Sounds like AMD should be taking home the gold on that one, until you find that the Celeron is nearly 25% faster in single threaded programs and is ~40% faster in multi-threaded programs... Bad deal going for the AMD... especially since the same board that hosts the Celeron can accept much faster CPUs and the AMD board simply doesn't have notably more powerful options available - you can upgrade to a quad core, but you won't be getting better single threaded performance no matter how hard you try. You might break even around 5Ghz, if you can manage it...

    AMD has a 40% clock-speed advantage out the gate, but loses by a large margin.
  • bananaforscale - Friday, July 15, 2016 - link

    You know what's funny? The fact that if I want to get a CPU that's faster than the FX-6100 I bought almost 5 years ago I still have to pay more than what I paid for it. Sure, Intel gives better single thread performance but I'd get fewer cores and no overclockability. Then there's the fact that I've been running that original Bulldozer with a 20% OC and it seems more stable than at stock clocks.

    Comparing single data points tells nobody a thing. Anyway, isn't that A6 in your comparison unlocked? :P
  • wumpus - Friday, July 15, 2016 - link

    I'm sure you missed an FX-8320 sale, or you really nailed the low point. Unfortunately Intel can match AMD's performance at nearly the same price, and is cutting off AMD's air supply that way.
  • artk2219 - Monday, July 18, 2016 - link

    Whats crazy is that Microcenter sells the FX 8320E's for $89.99. They also have a motherboard bundle option that you can get for $125 to $170 depending on which board you choose. Theoretically you can get a processor, motherboard, cooler, and memory for the price of a non-K core I5, or just a motherboard and processor for the price of an I3. The unfortunate thing is that not everyone has a microcenter near them, but for the ones that do you can get quite the deal, especially since those 8320E's will easily OC to FX 8350 levels, and more likely 4.2 to 4.6 from a stock clock of 3.2
  • BlueBlazer - Friday, July 15, 2016 - link

    From the leaks plus AMD's vague announcements, all points to AMD's Zen is going to be quite late (right into 2017). Why put use 28nm "placeholder" for AM4 if Zen is due soon? Also Global Foundries only has 14nm LPP which is a low power process. That may mean the frequency is going to be low (just look at the chips made on 14nm LPP like Qualcomm's Snapdragon 820, or even AMD's latest Radeon RX480). Reference http://semiengineering.com/high-performance-and-lo... quote "The “LP” processes are optimized for low power and feature design rules targeted for the lowest leakage, support lower operative voltages, and tend to have the slowest transistors of the three options".
  • wiboonsin - Sunday, November 12, 2017 - link

    I have read your article, it is very informative and helpful for me.I admire the valuable information you offer in your articles. Thanks for posting http://www.fanaticrunningwear.com/

Log in

Don't have an account? Sign up now