Improved Turbo

Trinity features a much improved version of AMD's Turbo Core technology compared to Llano. First and foremost, both CPU and GPU turbo are now supported. In Llano, only the CPU cores could turbo up if there was additional TDP headroom available, while the GPU ran no higher than its maximum specified frequency. In Trinity, if the CPU cores aren't using all of their allocated TDP but the GPU is under heavy load, the GPU can exceed its typical maximum frequency to capitalize on the available TDP. The same obviously works in reverse.
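To make that power-sharing behavior concrete, here's a hypothetical sketch of how a shared-TDP arbiter might hand unused CPU budget to the GPU. All names, numbers, and the allocation policy are illustrative assumptions, not AMD's actual firmware logic:

```c
/* Hypothetical shared-TDP turbo arbitration sketch; illustrative only,
 * not AMD's firmware. */
#include <stdio.h>

#define PACKAGE_TDP_W 35.0  /* assumed 35W mobile part */

typedef struct {
    double cpu_power_w;   /* estimated CPU power draw */
    double gpu_power_w;   /* estimated GPU power draw */
} apu_telemetry;

/* Return the extra wattage the GPU may claim: whatever the CPU
 * isn't using out of the shared package budget. */
double gpu_turbo_headroom(const apu_telemetry *t)
{
    double headroom = PACKAGE_TDP_W - t->cpu_power_w - t->gpu_power_w;
    return headroom > 0.0 ? headroom : 0.0;
}

int main(void)
{
    /* CPU nearly idle, GPU under heavy load: the GPU can turbo up. */
    apu_telemetry t = { .cpu_power_w = 6.0, .gpu_power_w = 20.0 };
    printf("GPU may draw %.1f W extra\n", gpu_turbo_headroom(&t));
    return 0;
}
```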

Under the hood, the microcontroller that monitors all power consumption within the APU is much more capable. In Llano, the Turbo Core microcontroller looked at activity on the CPU/GPU and performed a static allocation of power based on that data. In Trinity, AMD implemented a physics-based thermal calculation model using fast transforms. The model takes power and translates it into a dynamic temperature calculation. Power is still estimated based on workload, which AMD claims has less than a 1% error rate, but the new model derives accurate temperatures from those estimates. The thermal model is accurate to within 2°C, in real time. Having more accurate thermal data allows the turbo microcontroller to respond more quickly, which should allow frequencies to scale up and down more effectively.
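AMD hasn't disclosed the details of its model, but the general idea of turning an estimated power trace into a live temperature estimate can be illustrated with a simple first-order RC thermal model. This is a deliberate simplification (the real controller reportedly uses fast transforms), and every constant below is made up for illustration:

```c
/* Minimal first-order RC thermal model: a simplification of what a turbo
 * controller might do. All constants are assumed, not AMD's values. */
#include <stdio.h>

#define R_TH   1.5    /* thermal resistance, degC per watt (assumed) */
#define C_TH   2.0    /* thermal capacitance, joules per degC (assumed) */
#define T_AMB 35.0    /* ambient/baseline temperature, degC (assumed) */

/* Advance the die-temperature estimate by dt seconds given the current
 * estimated power draw:  C * dT/dt = P - (T - T_amb) / R            */
double step_temperature(double temp_c, double power_w, double dt)
{
    double d_temp = (power_w - (temp_c - T_AMB) / R_TH) / C_TH;
    return temp_c + d_temp * dt;
}

int main(void)
{
    double temp = T_AMB;
    /* Simulate 10 seconds of a steady 25W load, sampled every 10ms. */
    for (int i = 0; i < 1000; i++)
        temp = step_temperature(temp, 25.0, 0.01);
    printf("estimated die temp after 10s: %.1f C\n", temp);
    return 0;
}
```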

At the end of the day this should improve performance, although it's difficult to compare directly to Llano since so much has changed between the two APUs. Just as with Llano, AMD specifies nominal and max turbo frequencies for the Trinity CPU/GPU. 

A Beefy Set of Interconnects

The holy grail for AMD (and Intel for that matter) is a single piece of silicon with CPU and GPU style cores that coexist harmoniously, each doing what they do best. We're not quite there yet, but in pursuit of that goal it's important to have tons of bandwidth available on chip.

Trinity still features two 64-bit DDR3 memory controllers with support for up to DDR3-1866 speeds. The controllers add support for 1.25V memory. Notebook-bound Trinities (Socket FS1r2 and Socket FP2) support up to 32GB of memory, while the desktop variants (Socket FM2) can handle up to 64GB.
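As a quick sanity check, peak theoretical memory bandwidth follows directly from those specs; the short calculation below assumes dual-channel DDR3-1866 and nothing else:

```c
/* Back-of-the-envelope peak bandwidth for dual-channel DDR3-1866. */
#include <stdio.h>

int main(void)
{
    double transfers_per_sec = 1866e6;  /* DDR3-1866: 1866 MT/s */
    double bytes_per_transfer = 8.0;    /* 64-bit channel = 8 bytes */
    int channels = 2;

    double peak = transfers_per_sec * bytes_per_transfer * channels;
    printf("peak DRAM bandwidth: %.1f GB/s\n", peak / 1e9);  /* ~29.9 GB/s */
    return 0;
}
```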

HyperTransport is gone as an external interconnect, leaving only PCIe for off-chip IO. The Fusion Control Link is a 128-bit (each direction) interface giving off-chip IO devices access to system memory. Trinity also features a 256-bit (in each direction, per memory channel) Radeon Memory Bus (RMB) that gives the GPU direct access to the DRAM controllers. The generous width of this bus likely implies that it's used for CPU/GPU communication as well.

IOMMU v2 is also supported by Trinity, giving supported discrete GPUs (e.g. Tahiti) access to the CPU's virtual memory. With Llano, you had to take data from disk, copy it to memory, copy it again from the CPU's address space to pinned memory accessible by the GPU, and only then could the GPU bring it into its frame buffer. With access to the CPU's virtual address space, the data now goes from disk, to memory, then directly to the GPU's memory, skipping that intermediate memory-to-memory copy. Eventually we'll get to the point where there's truly one unified address space, but steps like these are what will get us there.
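The software-side payoff is easiest to see in OpenCL terms. The sketch below is our illustration, not AMD's driver code: it contrasts the staged-copy path with a zero-copy buffer created via CL_MEM_USE_HOST_PTR, which lets a shared-memory GPU read the host allocation in place:

```c
/* Illustrative sketch: staged copy vs. zero-copy buffer in OpenCL 1.x. */
#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

int main(void)
{
    cl_platform_id platform;
    cl_device_id device;
    cl_int err;

    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, &err);
    cl_command_queue q = clCreateCommandQueue(ctx, device, 0, &err);

    size_t size = 1 << 20;
    float *host_data = malloc(size);    /* stand-in for data read from disk */

    /* Old path: allocate device-side memory, then copy host -> device. */
    cl_mem staged = clCreateBuffer(ctx, CL_MEM_READ_ONLY, size, NULL, &err);
    clEnqueueWriteBuffer(q, staged, CL_TRUE, 0, size, host_data,
                         0, NULL, NULL);

    /* Zero-copy path: the GPU addresses the host allocation directly,
     * skipping the intermediate memory-to-memory copy. */
    cl_mem shared = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR,
                                   size, host_data, &err);

    clReleaseMemObject(staged);
    clReleaseMemObject(shared);
    clReleaseCommandQueue(q);
    clReleaseContext(ctx);
    free(host_data);
    return 0;
}
```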

The Trinity GPU

Trinity's GPU is probably the most well understood part of the chip, seeing as how it's basically a cut down Cayman from AMD's Northern Islands family. The VLIW4 design features 6 SIMD engines, each with 16 VLIW4 arrays, for a total of up to 384 cores. The A10 SKUs get 384 cores, while the lower end A8 and A6 parts get 256 and 192, respectively. FP64 is supported, but at 1/16 the FP32 rate.

As AMD never released any low-end Northern Islands VLIW4 parts, Trinity's GPU is somewhat unique. It technically has fewer cores than Llano's GPU, but as we saw with AMD's transition from VLIW5 to VLIW4, the loss didn't really impact performance so much as it drove up efficiency. Remember that most of the time the 5th unit in AMD's VLIW5 architectures went unused.

The design features 24 texture units and 8 ROPs, in line with what you'd expect from what's effectively 1/4 of a Cayman/Radeon HD 6970. Clock speeds are obviously lower than a full blown Cayman, but not by a ton. Trinity's GPU runs at a normal maximum of 497MHz and can turbo up as high as 686MHz.
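Plugging the quoted specs into a back-of-the-envelope calculation gives the GPU's peak rates at the turbo clock. The assumption that each VLIW4 lane issues one fused multiply-add (2 FLOPs) per clock is ours:

```c
/* Peak-rate arithmetic from the specs quoted above, at the turbo clock. */
#include <stdio.h>

int main(void)
{
    int simds = 6, vliw4_per_simd = 16, lanes = 4;
    int cores = simds * vliw4_per_simd * lanes;   /* 6 * 16 * 4 = 384 */

    double turbo_ghz = 0.686;
    /* Assumed: one fused multiply-add = 2 FLOPs per lane per clock. */
    double gflops = cores * 2 * turbo_ghz;        /* ~526.8 GFLOPS FP32 */

    double gtexels = 24 * turbo_ghz;              /* ~16.5 GTexels/s */
    double gpixels = 8 * turbo_ghz;               /* ~5.5 GPixels/s */

    printf("%d cores, %.1f GFLOPS, %.1f GT/s, %.1f GP/s\n",
           cores, gflops, gtexels, gpixels);
    return 0;
}
```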

Trinity includes AMD's HD Media Accelerator, which includes accelerated video decode (UVD3) and encode (VCE) components. Trinity borrows Graphics Core Next's Video Codec Engine (VCE), which is actually functional in the hardware/software we have here today. Don't get too excited though; the VCE-enabled software we have today won't take advantage of the identical hardware in discrete GCN GPUs. AMD tells us this is purely a matter of having the resources to prioritize Trinity first, and that discrete GPU VCE support is coming.

Comments

  • Taft12 - Tuesday, May 15, 2012 - link

    He said "better".

    http://ir.amd.com/phoenix.zhtml?c=74093&p=irol...

    "Linux OS supports manual switching which requires restart of X-Server to switch between graphics solutions."

    They ain't there yet!
  • JarredWalton - Tuesday, May 15, 2012 - link

    Enduro sounds like it's just a renamed "AMD Dynamic Switchable Graphics" solution. I haven't had a chance to test it yet, unfortunately, but I can say that the previous solution is still very weak. And you still don't get separate driver updates from AMD and Intel.
  • Spunjji - Wednesday, May 16, 2012 - link

    Drivers are the big deal here. I like that I get standard drivers using my Optimus laptop.

    What I don't like is that it f#@!s up Aero constantly and occasionally performs other bizarre, unpredictable manoeuvres.
  • ToTTenTranz - Tuesday, May 15, 2012 - link

    Greetings,

    Is it possible to provide some battery life results with gaming?

    It's true that an Intel+nVidia Optimus solution should be better for both plugged-in gaming and wireless productivity (more expensive too, but that's been covered in the review).
    However, a 35W Trinity should consume quite a bit less power than a 35W Intel CPU + 35W nVidia GPU, so it might be a worthy tradeoff for some.

    Furthermore, when are we to expect Hybrid Crossfire results with Trinity+Turks? Is there any laptop OEM with that on the roadmap?
    That should give us a better comparison to Ivy Bridge + GK107 solutions, as it would provide better gaming performance at a rather small price premium ($50 at most?).
  • x264fan - Tuesday, May 15, 2012 - link

    Thanks for the nice review, author, but let me share some very important information regarding your test.

    1. The x264 HD Benchmark Ver. 4.0 you used relies on quite an old x264.exe for encoding. For Bulldozer/Piledriver it is important to replace it with a newer one containing specific assembler optimisations, which give a nice performance boost on AMD processors by using the new instructions introduced in those CPUs. You can see how many there are here:
    http://git.videolan.org/gitweb.cgi?p=x264.git;a=sh...

    I would suggest downloading a new x264 build from x264.nl and replacing it, then running the benchmark again. It would also show you how beneficial the new instructions are.

    Another suggestion would be to run this benchmark using the x64 build of x264 through the x86 Avisynth wrapper avs4x264mod.exe. That way you can see how much difference the x64 instructions make.

    In fact, x264 is so nicely optimised it can be used for CPU testing.

    2. You used Media Player Classic Home Cinema for measuring playback of H.264 streams and battery life. So do I; unfortunately, every time I try to use it with DXVA acceleration on my i7-2630 laptop, I end up with terrible artefacts on lower bitrate content. Blocks float around and destroy picture quality. It's not as visible on Blu-ray content, where the picture is more recompressed than recreated using x264 transformations, but it is still there. My point is that if Intel's decoding/drivers are buggy enough to make DXVA mode this unusable, why would anyone want to measure battery life in this mode?
    Without DXVA the Intel numbers would not be so good, but so far this is the only mode being used.

    3. I must say I am amazed how good the HD 4000 is, but what about picture quality? From time to time we see reports that NVIDIA or AMD has cheated in drivers, sacrificing picture quality, so how about Intel...

    I hope you read my comment and update your test.
  • JarredWalton - Tuesday, May 15, 2012 - link

    So, help me out here: where do I get the actual x264 executables if I want to run an updated version of the x264 HD test? We've tried to avoid updating to newer releases just so that we could compare results with previously tested CPUs, but perhaps it's time to cut the strings. What I'd like is a single EXE that works optimally for Sandy Bridge, Ivy Bridge, Llano, and Trinity architectures. And I'm not interested in downloading source code, trying to get a compiled version to work, etc. -- I gave up being a software developer over a decade ago and haven't looked back. :-)
  • x264fan - Wednesday, May 16, 2012 - link

    http://x264.nl has the newest semi-official build. It contains all current optimisations for every CPU, but since it's command line you can turn them on and off. I also heard that this week there will be a new HD Benchmark 5.0 which will have the newest build in it.
  • plonk420 - Monday, July 9, 2012 - link

    the problem with this is that then the test isn't strictly "x264 hd benchmark version x.00" ... and would be harder to compare to other runs of the same test.

    if they did this in ADDITION to v4.00 or whatever (and VERY clearly noted the changes), that might be some useful data.
  • jabber - Tuesday, May 15, 2012 - link

    ....how about adding a line/area to the benchmark graphs that stands for "Beyond this point performance is pointless/unnoticeable to the user".

    That way we can truly tell if we can save ourselves a boat load of cash. All-out performance is great and all, but I don't run benchmarks all day like some here, so it's not that important. I just need to know: will it do the job?

    Or would that be bad for the sponsors?
  • bji - Tuesday, May 15, 2012 - link

    It is an interesting idea, but it would be such incredible fodder for fanboys to flame about, and even reasonable people would have a hard time deciding where that line should be drawn.

    I think the answer to your basic question is that any mobile CPU in the Llano/Trinity/Sandy Bridge/Ivy Bridge lines will be more than sufficient for you or any other user *unless* you have a specific task that you know is highly CPU-intensive and requires all the CPU you can get.
