Original Link: http://www.anandtech.com/show/6347/amd-a10-5800k-a8-5600k-review-trinity-on-the-desktop-part-2
AMD A10-5800K & A8-5600K Review: Trinity on the Desktop, Part 2
by Anand Lal Shimpi on October 2, 2012 1:45 AM EST
Although AMD's second-generation mainstream APU platform, codename Trinity, launched months ago in notebooks, the official desktop launch is today. Rumor has it that AMD purposefully delayed the desktop Trinity launch to clear out unsold Llano inventories in the channel. Although selling APUs in notebooks is pretty easy, convincing desktop users to forgo the discrete GPU option (and ignore Intel) has been a tough battle for AMD. I keep going back to two slides that show us where AMD wants to go and the cores it'll take to get there:
The ultimate goal is this beautiful cohesive operation between CPU and GPU on a single die. That future will require a lot of software support, not only at the application level but also at the OS level. And I'm not talking about Windows 8. We're still far away from this APU dominated future, but AMD is marching in that direction. The second slide shows the x86 cores that we'll see from AMD along the way. AMD is still playing catch-up in the x86 CPU space and it's got a lot of lost time to make up for. There's no hiding the fact this is going to be a multi-year effort to simply get close to Intel's single-threaded x86 performance. Through pricing, leveraging its GPU technology and throwing more transistors at the problem AMD can still deliver competitive solutions, but it's not going to be a walk in the park.
Last week we took a look at the GPU side of the desktop Trinity APUs. We looked at the top end 384-core Radeon HD 7660D configuration as well as the slightly slower 256-core Radeon HD 7560D GPU, both of which easily outperformed Intel's HD 4000 and HD 2500. As far as processor graphics go, Trinity on the desktop maintains a healthy lead over Intel. There's still a place for discrete GPUs but that's pretty much at the $100 and above price points.
Today we're able to talk about pricing and x86 CPU performance among other things. The good news on that front is the most expensive Trinity APU is fully unlocked and is priced at $122:
|AMD Socket-FM2 Lineup|
|Processor||Modules/Cores||CPU Clock Base/Turbo||L2 Cache||GPU||TDP||Price|
|A10-5800K||2 / 4||3.8 / 4.2 GHz||4MB||384 cores @ 800MHz||100W||$122|
|A10-5700||2 / 4||3.4 / 4.0 GHz||4MB||384 cores @ 760MHz||65W||$122|
|A8-5600K||2 / 4||3.6 / 3.9 GHz||4MB||256 cores @ 760MHz||100W||$101|
|A8-5500||2 / 4||3.2 / 3.7 GHz||4MB||256 cores @ 760MHz||65W||$101|
|A6-5400K||1 / 2||3.6 / 3.8 GHz||1MB||192 cores @ 760MHz||65W||$67|
|A4-5300||1 / 2||3.4 / 3.6 GHz||1MB||128 cores @ 724MHz||65W||$53|
|Athlon X4 750K||2 / 4||3.4 / 4.0 GHz||4MB||N/A||100W||$81|
|Athlon X4 740||2 / 4||3.2 / 3.7 GHz||4MB||N/A||65W||$71|
Compare this to Llano's launch where the top end SKU launched at $135 and you'll see that AMD is somewhat getting with the times. I would still like to see something closer to $100 for the A10-5800K, but I find that I'm usually asking for a better deal than what most CPU makers are willing to give me.
AMD's competitive target is Intel's newly released Ivy Bridge Core i3 processors. There are only five Core i3s on the market today, four of which use Intel's HD 2500 graphics. The cheapest of the lineup is the Core i3 3220 with two cores running at 3.3GHz for $125. Intel disables turbo and other features (there's effectively no overclocking on these parts), which AMD is attempting to exploit by pitting its Trinity K-series SKUs (fully unlocked) against them. AMD's TDPs are noticeably higher (100W for the higher end K-series parts compared to 55W for the Core i3s). Intel will easily maintain the power advantage as a result under both CPU and GPU load, although AMD's GPU does deliver more performance per watt. Power consumption is a major concern of AMD's at this point. Without a new process node to move to for a while, AMD is hoping to rely on some design tricks to improve things in the future.
At the low end of the stack there are also two Athlon X4s without any active GPU if you just want a traditional Trinity CPU.
This will be our last CPU/APU review on the current test platform/software configuration. The next major CPU review will see a move to a brand new testbed running Windows 8. As always you can get access to far more numbers than what we report here if you use our performance comparison engine: Bench. Of course if you want to see the GPU and GPU Compute performance of AMD's Trinity APU check out part one of our coverage.
|Motherboard:||ASUS P8Z68-V Pro (Intel Z68), ASUS Crosshair V Formula (AMD 990FX), Gigabyte GA-F2A85X-UP4 (AMD A85X), Intel DZ77GA-70K (Intel Z77)|
|Hard Disk:||Intel X25-M SSD (80GB), Crucial RealSSD C300, OCZ Agility 3 (240GB)|
|Memory:||2 x 4GB G.Skill Ripjaws X DDR3-1600 9-9-9-20|
|Video Card:||ATI Radeon HD 5870 (Windows 7), AMD Processor Graphics, Intel Processor Graphics|
|Video Drivers:||AMD Catalyst 12.8|
|Desktop Resolution:||1920 x 1200|
|OS:||Windows 7 x64|
Trinity CPU Performance: The Good and the Bad
We're going to start our performance investigation a little out of order. The big question on everyone's mind is how much single threaded performance has improved over Bulldozer, and whether it's enough to actually make Trinity faster than Llano across the board. We'll use Cinebench 11.5 as it has both single and multithreaded test options:
The good news is that single threaded performance is definitely up compared to Llano. Piledriver likely has something to do with this, but so does the fact that the A10 can run at up to 4.2GHz (~4GHz typically) with one of its cores active compared to the 2.9GHz clock speed of the A8-3850. Compared to the Bulldozer based FX-8150 there's a slight (~6%) increase in single threaded performance. Although I don't expect anyone will be cross shopping a Trinity APU and an FX CPU, it's important to keep an eye on progress here as we'll eventually get a high-end quad-module/eight-core Piledriver CPU.
Note that compared to even previous generation, low-end Intel CPUs without turbo there's a huge gap in single threaded performance. If we look at the gap AMD has to make up vs. Ivy Bridge it's not pretty. Intel's Core i3 3220 manages a 27% performance advantage over the A10-5800K. Even if Steamroller is able to deliver a 15% increase in performance at the same clock speed, there will still be a gap. And we're not even talking about how Haswell will grow this gap. For the foreseeable future I don't see AMD closing the single threaded performance gap. Jim Keller's job is to fix this problem, but it'll probably take 2 - 3 years to get there.
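The arithmetic behind that claim is easy to check. The sketch below assumes, purely for illustration, that Intel's performance stays where it is today (no Haswell) and that a hypothetical 15% Steamroller gain applies in full to this single threaded test. The function name `remaining_gap` is made up for the example.

```python
def remaining_gap(intel_lead: float, amd_gain: float) -> float:
    """Intel's lead (as a fraction) left over after AMD speeds up by amd_gain."""
    # Normalize AMD's current score to 1.0, so Intel sits at 1 + intel_lead.
    intel_score = 1.0 + intel_lead
    amd_score = 1.0 + amd_gain  # assumption: the whole gain shows up in this test
    return intel_score / amd_score - 1.0


# Figures from the text: 27% Core i3 3220 lead, hypothetical 15% uplift.
gap = remaining_gap(0.27, 0.15)
print(f"Remaining Intel lead: {gap:.1%}")
```

Under those assumptions Intel would still hold a lead of roughly 10%, before counting any improvement on Intel's side.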
The multithreaded test shows the other end of reality: in heavily threaded floating point workloads it's possible that we'll see a regression compared to Llano. Remember the Bulldozer/Piledriver architecture prioritizes integer over floating point performance. Truth be told this regression is pretty rare in our tests, but until we get to Steamroller we will still see these types of situations.
Throw more threads at the problem, however, and even with a floating point workload Intel can't pull ahead. The A10 offers similar performance to the Core i3 3220 at a lower price. Your decision here would come down to the rest of the factors: single threaded performance, processor graphics performance, overclocking capabilities and power consumption. Intel and AMD each win two of those, so it's really a matter of what matters most to you.
A heavily threaded FP workload doesn't really play to AMD's advantages though. What happens when you get a heavily threaded integer workload? The 7-zip benchmark gives us just that:
Here AMD manages a 16% performance advantage over the Core i3 3220. I'd even go as far as to say that Trinity would likely beat any dual-core Intel machine here. The performance advantage is somewhat artificial as Intel purposefully removes turbo from its dual-core desktop CPUs. This should be AMD's best foot forward, but once again it'll likely take Steamroller for this design to start to make sense.
Speaking of artificial product segmentation, one major feature Intel takes away when you get down to the dual-core desktop i3 level is AES-NI support. Hardware accelerated AES support is something that you get only with the more expensive Core i5/i7s. With Trinity, you get AES-NI support for the entire stack. The result is much better performance in those applications that depend on it:
Like most of the advantages we've talked about thus far, there are really very specific use cases where Trinity makes sense over a similarly priced Intel CPU.
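As an aside, if you want to see whether a given x86 CPU exposes AES-NI, Linux lists an `aes` entry on the `flags` line of `/proc/cpuinfo` when the instructions are available. The sketch below is only an illustration: the helper name `has_aes_flag` and the abbreviated sample text are made up, and our own testbed runs Windows 7, where you would use a different tool.

```python
def has_aes_flag(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Match the whole token so that e.g. 'vaes' does not count as 'aes'.
        if key.strip() == "flags" and "aes" in value.split():
            return True
    return False


# Made-up, heavily abbreviated samples (real flag lists are much longer).
with_aes = "model name\t: Example CPU A\nflags\t\t: fpu sse2 aes avx"
without_aes = "model name\t: Example CPU B\nflags\t\t: fpu sse2 avx"
print(has_aes_flag(with_aes), has_aes_flag(without_aes))
```

On a real Linux system you would pass in the contents of `/proc/cpuinfo` instead of the sample strings.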
Although not the best indication of overall system performance, the SYSMark 2012 suite does give us a good idea of lighter workloads than we're used to testing.
AMD does surprisingly well here in SYSMark 2012. The Core i3 3220 manages a 12% advantage over the 5800K, but that's not as much as we'd normally expect given the significant single threaded performance deficit we pointed out earlier. Once again, whether or not Trinity makes sense for you depends on how much you value processor graphics performance.
Content Creation Performance
Adobe Photoshop CS4
To measure performance under Photoshop CS4 we turn to the Retouch Artists’ Speed Test. The test does basic photo editing; there are a couple of color space conversions, many layer creations, color curve adjustment, image and canvas size adjustment, unsharp mask, and finally a gaussian blur performed on the entire image.
The whole process is timed and thanks to the use of Intel's X25-M SSD as our test bed hard drive, performance is far more predictable than back when we used to test on mechanical disks.
Time is reported in seconds and the lower numbers mean better performance. The test is multithreaded and can hit all four cores in a quad-core machine.
Our Photoshop workload still runs better on Intel hardware, but the gap in performance between the 5800K and 3220 is smaller than it was between the FX-8150 and 2500K last year. While Bulldozer was pretty much unrecommendable, Trinity approaches tradeoff territory.
3dsmax 9 & POV-ray
Today's desktop processors are more than fast enough to do professional level 3D rendering at home. To look at performance under 3dsmax we ran the SPECapc 3dsmax 8 benchmark (only the CPU rendering tests) under 3dsmax 9 SP1. The results reported are the rendering composite scores.
Once again in a heavily threaded FP benchmark, the A10 and Core i3 perform very similarly. POV-Ray is another example of this below:
File Compression/Decompression Performance
Par2 is an application used for reconstructing downloaded archives. It can generate parity data from a given archive and later use it to recover the archive.
Chuchusoft took the source code of par2cmdline 0.4 and parallelized it using Intel’s Threading Building Blocks 2.1. The result is a version of par2cmdline that can spawn multiple threads to repair par2 archives. For this test we took a 708MB archive, corrupted nearly 60MB of it, and used the multithreaded par2cmdline to recover it. The scores reported are the repair and recover time in seconds.
Our multithreaded Par2 recovery test shows AMD with a small advantage over the Core i3 3220, although it obviously can't touch any of the more expensive quad-core parts.
Excel Math Performance
Not all heavily threaded FP applications are easy wins for AMD. In our Monte Carlo simulation benchmark the 3220 manages a decent lead over the A10-5800K.
Our old Sorenson Squeeze test is one area where we see a slight regression compared to Llano. Like I mentioned earlier, this isn't super common but it does happen from time to time given the dramatic architecture difference between Llano and Trinity.
Video Transcoding Performance
x264 HD 3.03 Benchmark
Graysky's x264 HD test uses x264 to encode a 4Mbps 720p MPEG-2 source. The focus here is on quality rather than speed, thus the benchmark uses a 2-pass encode and reports the average frame rate in each pass.
CPU based video transcode performance is as good as it can get from AMD here given the 2/4 module/core setup of these Trinity APUs. Intel's Core i3 3220 is a bit slower than the A10-5800K. We're switching to a much newer version of the x264 HD benchmark for our new test suite (5.0.1). Some early results are below if you want to see how things change under the new test:
|x264 HD 5.0.1 Benchmark|
|CPU||1st Pass||2nd Pass|
|AMD A10-5800K (3.8GHz)||33.5 fps||7.41 fps|
|AMD A8-5600K (3.6GHz)||32.2 fps||7.12 fps|
|Intel Core i3 3220 (3.3GHz)||35.2 fps||6.61 fps|
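To put the early x264 HD 5.0.1 numbers in relative terms, here is a small sketch that expresses each AMD result as a percentage difference from the Core i3 3220. The frame rates are copied from the table above; the `results` dictionary and the `relative_to` helper are just illustrative names.

```python
# Frames per second from the x264 HD 5.0.1 table: (1st pass, 2nd pass).
results = {
    "AMD A10-5800K": (33.5, 7.41),
    "AMD A8-5600K": (32.2, 7.12),
    "Intel Core i3 3220": (35.2, 6.61),
}


def relative_to(baseline: str, table: dict) -> dict:
    """Fractional difference of every other entry versus the baseline, per pass."""
    base_first, base_second = table[baseline]
    return {
        name: (first / base_first - 1.0, second / base_second - 1.0)
        for name, (first, second) in table.items()
        if name != baseline
    }


for name, (p1, p2) in relative_to("Intel Core i3 3220", results).items():
    print(f"{name}: 1st pass {p1:+.1%}, 2nd pass {p2:+.1%}")
```

With these figures the A10-5800K trails the Core i3 3220 by about 5% in the first pass and leads by about 12% in the second.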
Discrete GPU Gaming Performance
Although likely not the target market for someone buying a Trinity APU, we looked at performance of AMD's latest APU when paired with a high-end discrete GPU. The end result is a total loss for Trinity. If you're going to use processor graphics Trinity is a clear winner, but if you plan on pairing the APU with a high end discrete GPU you're much better off with the Core i3 3220.
With Sandy Bridge Intel killed budget overclocking by completely clock locking all CPUs without turbo boost enabled. While you used to be able to buy an entry level CPU and overclock it quite nicely, Intel moved all overclocking to its higher priced parts. As a gift to the overclocking community, Intel ramped up the presence of its fully unlocked K-series parts. Anything with a K at the end shipped with a fully unlocked clock multiplier, at a small price premium. Given that Intel hadn't shipped unlocked CPUs since the days of the original Pentium, this was a welcome move on its part. What would really be nice is the addition of some lower priced K SKUs; unfortunately, we won't get that unless there's significant competitive pressure from AMD.
Trinity doesn't have what it takes to really force Intel into doing such a thing, but that doesn't mean AMD won't try. The Trinity lineup includes AMD's own K-series SKUs that, like their Intel counterparts, ship fully unlocked. From $67 all the way up to $122, AMD is offering unlocked Trinity APUs. The value of these parts really depends on just how overclockable Trinity is to begin with. The Bulldozer/Piledriver architecture is designed to push frequency; however, AMD is already shipping these things at very close to 4GHz. Take AMD's turbo frequencies into account and you're already at 4.2GHz with the A10-5800K. How much additional headroom is there?
With a stock cooler and not a ton of additional voltage, it looks like there's another 5 - 15% depending on whether you're comparing base clocks or max turbo clocks. With an extra 0.125V (above the 1.45V standard core voltage setting) I was able to hit 4.4GHz on the A10-5800K. I could boot into Windows at 4.5GHz, but the system wasn't stable. Although I could POST at 4.6GHz, Windows was highly unstable at that frequency. With more exotic cooling I do believe I could probably make 4.5GHz work on the A10-5800K.
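The 5 - 15% range falls straight out of the clock speeds quoted above. As a quick sketch (the `headroom` helper and the constant names are illustrative, the numbers are the A10-5800K figures from this review):

```python
def headroom(overclock_ghz: float, reference_ghz: float) -> float:
    """Fractional frequency gain of an overclock over a reference clock."""
    return overclock_ghz / reference_ghz - 1.0


BASE_GHZ, TURBO_GHZ, STABLE_OC_GHZ = 3.8, 4.2, 4.4  # A10-5800K, from the text
STOCK_VCORE, EXTRA_VCORE = 1.45, 0.125              # volts, from the text

print(f"vs. base clock:  {headroom(STABLE_OC_GHZ, BASE_GHZ):.1%}")
print(f"vs. max turbo:   {headroom(STABLE_OC_GHZ, TURBO_GHZ):.1%}")
print(f"core voltage:    {STOCK_VCORE + EXTRA_VCORE:.3f} V")
```

Measured against the 3.8GHz base clock, 4.4GHz is a gain of about 16%; measured against the 4.2GHz max turbo it is only about 5%.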
The extra frequency isn't enough to erase the single threaded performance gap between the A10 and Intel's Core i3 3220 however:
The only way AMD is going to close this gap is through a serious focus on improving single threaded performance in future architectures.
Intel has a full process node advantage when you compare Ivy Bridge and Trinity. As a result of that plus an architectural efficiency advantage, you just get much better power consumption from the Core i3 than you do with Trinity. Trinity's idle power is very good, but under heavy CPU load it consumes considerably more power. You're basically looking at quad-core Ivy Bridge levels of power usage under load but performance closer to that of a dual-core Ivy Bridge. AMD really needs a lot of design level efficiency improvements to get power consumption under control. Compared to Llano, Trinity seems a bit more efficient, so there's an actual improvement there.
Of course on the processor graphics side the story is much closer with Trinity being a bit more power hungry than Ivy Bridge, but not nearly by this margin.
I have to admit, Trinity's CPU performance brought it a lot closer to Intel's Core i3 3220 than I expected. In the worst case there's still a huge gap in single threaded performance, but even SYSMark 2012 only shows Intel's Core i3 3220 with a 12% performance advantage. Multithreaded workloads do reasonably well on Trinity as well. Intel pulls ahead in some, AMD does in others, and there's another selection of applications/workloads where we see performance parity between similarly priced Trinity and Ivy Bridge parts. A big part of all of this is Intel disabling features on its Core i3 (the lack of turbo hurts), but Piledriver's high clock speeds and AMD's pricing strategy both play a role here as well.
The big exception to all of this is high-end gaming performance. If you're planning on pairing a beefy GPU with a cheap CPU, you're much better off going with Intel than AMD at this point. Single threaded performance is still far too important to most gaming workloads for the recommendation to be anything different.
As I mentioned earlier, Trinity's CPU performance puts the buying decision squarely in the tradeoff evaluation zone. Once again what matters the most is how important Trinity's GPU is to you. AMD holds a clear advantage there if you're going to use it, otherwise the decision is heavily weighted towards Intel. Intel holds a power consumption advantage and a clear single threaded performance advantage, while there are some specific workloads that will do better on Trinity (e.g. AES-NI accelerated apps, heavily threaded integer applications).
Overall Trinity is a step forward from Llano. It's not enough to make the job of recommending the APU any less complex than what I've outlined above however. Depending on what you plan on doing with your system, Trinity is either going to be perfect or a distant second.
What I am happy to see is AMD putting a little competitive pressure on Intel here. Offering unlocked K SKUs, features like AES-NI and great GPU performance at these price points is important. I don't believe Trinity is strong enough on the CPU side to really force Intel to do the same with the Core i3, but we do need AMD to keep doing this and getting better each time.