Improved Turbo

Trinity features a much-improved version of AMD's Turbo Core technology compared to Llano. First and foremost, both CPU and GPU turbo are now supported. In Llano, only the CPU cores could turbo up if there was additional TDP headroom available, while the GPU cores ran no higher than their max specified frequency. In Trinity, if the CPU cores aren't using all of their allocated TDP but the GPU is under heavy load, the GPU can exceed its typical max frequency to capitalize on the available TDP. The same works in reverse: when the GPU is lightly loaded, the CPU cores can use the spare budget.

Under the hood, the microcontroller that monitors all power consumption within the APU is much more capable. In Llano, the Turbo Core microcontroller looked at activity on the CPU/GPU and performed a static allocation of power based on this data. In Trinity, AMD implemented a physics-based thermal calculation model using fast transforms. The model takes power and translates it into a dynamic temperature calculation. Power is still estimated based on workload, an estimate that AMD claims has less than a 1% error rate, but the new model derives accurate temperatures from those estimates. The thermal model is accurate to within 2°C, in real time. Having more accurate thermal data allows the turbo microcontroller to respond more quickly, which should allow frequencies to scale up and down more effectively.
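To make the idea of turning a power estimate into a running temperature concrete, here is a minimal sketch using a first-order (lumped RC) thermal model. This is not AMD's model, whose details and "fast transforms" aren't public; the thermal resistance, thermal capacitance, ambient temperature, and temperature limit below are all made-up illustrative values, and the function names are hypothetical. The only figure taken from the text is the 2°C accuracy, used here as a safety margin.

```python
import math

def step_temperature(temp_c, power_w, dt_s, r_th=1.5, c_th=10.0, ambient_c=40.0):
    """Advance a first-order thermal model by dt_s seconds.

    r_th (C/W), c_th (J/C) and ambient_c are illustrative assumptions,
    not measured Trinity parameters.
    """
    steady_state = ambient_c + power_w * r_th  # temperature if power_w were held forever
    tau = r_th * c_th                          # thermal time constant in seconds
    decay = math.exp(-dt_s / tau)
    # Exact solution of dT/dt = (steady_state - T) / tau over one time step.
    return steady_state + (temp_c - steady_state) * decay

def turbo_allowed(temp_c, limit_c=95.0, margin_c=2.0):
    """Allow turbo only while the estimate is below the limit minus a margin.

    margin_c mirrors the 2C model accuracy quoted above; limit_c is hypothetical.
    """
    return temp_c < limit_c - margin_c

# Feed a constant 25 W power estimate into the model, one second at a time.
temp = 40.0
for _ in range(5):
    temp = step_temperature(temp, power_w=25.0, dt_s=1.0)
print(f"{temp:.1f} C, turbo allowed: {turbo_allowed(temp)}")
```

The point of the sketch is only that a temperature estimate with a known error bound lets the controller run closer to the limit than a static power allocation would.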

At the end of the day this should improve performance, although it's difficult to compare directly to Llano since so much has changed between the two APUs. Just as with Llano, AMD specifies nominal and max turbo frequencies for the Trinity CPU/GPU. 

A Beefy Set of Interconnects

The holy grail for AMD (and Intel for that matter) is a single piece of silicon with CPU and GPU style cores that coexist harmoniously, each doing what they do best. We're not quite there yet, but in pursuit of that goal it's important to have tons of bandwidth available on chip.

Trinity still features two 64-bit DDR3 memory controllers with support for up to DDR3-1866 speeds. The controllers add support for 1.25V memory. Notebook-bound Trinities (Socket FS1r2 and Socket FP2) support up to 32GB of memory, while the desktop variants (Socket FM2) can handle up to 64GB.
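For a sense of scale, the theoretical peak bandwidth of that memory configuration can be worked out directly from the channel count, bus width, and transfer rate. The sketch below is plain arithmetic with a hypothetical helper name; it treats DDR3-1866 as 1866 million transfers per second and ignores real-world efficiency losses.

```python
def peak_bandwidth_gbs(channels, bus_width_bits, mega_transfers_per_sec):
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return channels * bytes_per_transfer * mega_transfers_per_sec * 1e6 / 1e9

# Two 64-bit channels of DDR3-1866, as described above.
print(f"{peak_bandwidth_gbs(2, 64, 1866):.1f} GB/s")
```

That works out to roughly 29.9 GB/s, shared between the CPU cores and the GPU.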

HyperTransport is gone as an external interconnect, leaving only PCIe for off-chip IO. The Fusion Control Link is a 128-bit (each direction) interface giving off-chip IO devices access to system memory. Trinity also features a 256-bit (in each direction, per memory channel) Radeon Memory Bus (RMB) that gives the GPU direct access to the DRAM controllers. The excessive width of this bus likely implies that it's also used for CPU/GPU communication.

IOMMU v2 is also supported by Trinity, giving supported discrete GPUs (e.g. Tahiti) access to the CPU's virtual memory. In Llano, you had to take data from disk, copy it to memory, then copy it from the CPU's address space to pinned memory that's accessible by the GPU; only then could the GPU bring it into its frame buffer. With access to the CPU's virtual address space, the data now goes from disk, to memory, then directly to the GPU's memory, skipping that intermediate mem-to-mem copy. Eventually we'll get to the point where there's truly one unified address space, but steps like these are what will get us there.

The Trinity GPU

Trinity's GPU is probably the best understood part of the chip, seeing as how it's basically a cut-down Cayman from AMD's Northern Islands family. The VLIW4 design features 6 SIMD engines, each with 16 VLIW4 arrays, for a total of up to 384 cores. The A10 SKUs get 384 cores while the lower end A8 and A6 parts get 256 and 192, respectively. FP64 is supported but at 1/16 the FP32 rate.

As AMD never released any low-end Northern Islands VLIW4 parts, Trinity's GPU is a bit unique. It technically has fewer cores than Llano's GPU, but as we saw with AMD's transition from VLIW5 to VLIW4, the loss didn't really impact performance but rather drove up efficiency. Remember that most of the time that 5th unit in AMD's VLIW5 architectures went unused.

The design features 24 texture units and 8 ROPs, in line with what you'd expect from what's effectively 1/4 of a Cayman/Radeon HD 6970. Clock speeds are obviously lower than a full-blown Cayman, but not by a ton. Trinity's GPU runs at a normal maximum of 497MHz and can turbo up as high as 686MHz.
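The GPU figures above follow from simple multiplication, which the short sketch below spells out. Two things in it are assumptions rather than numbers from AMD: peak throughput is computed at 2 FLOPs per core per clock (one multiply-add) and one texel per texture unit per clock, and the SIMD engine counts for the A8 and A6 (4 and 3) are inferred from their core counts. The helper names are hypothetical.

```python
LANES_PER_VLIW4 = 4    # each VLIW4 array has four lanes ("cores")
ARRAYS_PER_SIMD = 16   # VLIW4 arrays per SIMD engine, as stated above

def shader_cores(simd_engines):
    """Total cores for a given number of SIMD engines."""
    return simd_engines * ARRAYS_PER_SIMD * LANES_PER_VLIW4

def peak_gflops(cores, clock_mhz, flops_per_core_per_clock=2):
    """Theoretical FP32 peak; 2 FLOPs per clock is an assumption (multiply-add)."""
    return cores * flops_per_core_per_clock * clock_mhz / 1000

def texel_rate_gtexels(texture_units, clock_mhz):
    """Theoretical texel rate, assuming one texel per unit per clock."""
    return texture_units * clock_mhz / 1000

# A10 (6 SIMDs), plus inferred A8 (4 SIMDs) and A6 (3 SIMDs) configurations.
for name, simds in (("A10", 6), ("A8", 4), ("A6", 3)):
    print(name, shader_cores(simds), "cores")

fp32 = peak_gflops(shader_cores(6), 686)  # at the 686MHz max turbo clock
print(f"FP32 peak: {fp32:.1f} GFLOPS, FP64 at 1/16 rate: {fp32 / 16:.1f} GFLOPS")
print(f"Texel rate: {texel_rate_gtexels(24, 686):.1f} GTexels/s")
```

These are upper bounds at the maximum turbo clock; sustained numbers depend on how long the GPU can actually hold 686MHz within the shared TDP.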

Trinity includes AMD's HD Media Accelerator, which comprises accelerated video decode (UVD3) and encode (VCE) components. Trinity borrows Graphics Core Next's Video Codec Engine (VCE), and it is actually functional in the hardware/software we have here today. Don't get too excited though; the VCE-enabled software we have today won't take advantage of the identical hardware in discrete GCN GPUs. AMD tells us this is purely a matter of having the resources to prioritize Trinity first, and that discrete GPU VCE support is coming.


271 Comments


  • Khato - Tuesday, May 15, 2012 - link

    Really? The A10-4600m is going to be a $126 chip? 'Cause that's what a third of the tray price for an i7-3720QM is.
  • BSMonitor - Tuesday, May 15, 2012 - link

    You get 1/3 the performance on the CPU side.
  • bji - Tuesday, May 15, 2012 - link

    I don't know why I am bothering to respond to you, because your comments are all worthless, but I'd like to point out to anyone else who might be reading that the CPU performance numbers are a lot closer to 1/2 to 2/3 of the performance on the CPU side than 1/3.

    And 1/2 to 2/3 of Ivy Bridge CPU performance is *definitely* fast enough for 95% of users in 95% of circumstances, despite what trolls are claiming.
  • bji - Tuesday, May 15, 2012 - link

    Sorry, forget I said 2/3. That was just one benchmark. Let's just leave it at 1/2.

    I think my point is still valid. 1/2 of Ivy Bridge performance at 1/3 cost is going to be very acceptable to the vast majority of people.
  • JarredWalton - Tuesday, May 15, 2012 - link

    But the problem is you have to buy the whole laptop. If IVB goes for $350 and Trinity for $115, but the rest of the laptop ends up being $400, that means you get half the performance for 70% of the cost. And when Intel ships DC IVB chips for $150, we might be looking at 70% of the performance for 90% of the cost.

    My biggest fear with Trinity (if you couldn't tell from the conclusion) is that the laptop OEMs will price it too high. I think A10 is a decent part, provided you can get a reasonable set of laptop hardware for $600 or less. Anyway, we'll have to see what actually comes out and how much it costs.
  • bji - Tuesday, May 15, 2012 - link

    Very good points. Then we have to throw in the question of how much the extra performance is worth to the user. We'd all take extra performance for free (assuming that it didn't come at a cost of heat or battery life or other features), but would you pay 10% more for more performance that you knew you didn't need? I don't think most consumers really think in these terms of course, marketing will sell these parts, not logic, but if we're trying to make price and value comparisons, we need to be aware that the goal is to get what you need for the least money, not more than you need for the least amount more money.
  • JarredWalton - Tuesday, May 15, 2012 - link

    I'd take Trinity with an SSD over Sandy Bridge with an HDD, provided I could get a good LCD and build quality thrown into the mix. Maybe HP will deliver with the upcoming Envy Sleekbooks?
  • mrdude - Tuesday, May 15, 2012 - link

    HP offered this with the Llano, granted they charge $150 for a 1080p screen... You can also opt to buy an aftermarket 1080p screen and DIY. The Asus Llano line was extremely popular because you can buy a $70 1080p matte finish screen and upgrade a crossfired Llano. For ~$600 you got great gaming performance and a 1080p screen. Those things sold like hotcakes too.

    Jarred, I think you neglected quite a bit in this review. The improvements we've seen in Llano > Trinity actually outweigh the improvements we've seen in SB > Ivy yet the latter also has the advantage of a die shrink. The perf-per-watt improvements are by far the biggest shocker here and are nothing short of unbelievable if you consider Bulldozer's power consumption.

    While I understand using the 3720QM for the HD4000 benchmarks, why not delve into examining the Piledriver cores? There's very little info at all there with respect to what changed and what got better. What we got instead were synthetic benchmarks and a re-cap of the scores instead of some actual info. Hell, a monkey can run a benchmark but can that monkey run some meaningful benchmarks that test cache latency? AVX performance? Stress the IMC? Instead you're stating something that should be obvious (the weird multi-threaded Cinebench score that actually makes sense when you consider it's a CMT design in Trinity, therefore it lacks 2 FPUs compared to Llano) and that's supposed to be surprising?

    I can understand wanting to get a review out in time and giving us a rough idea of performance, but this is Anandtech. We expect a bit more than "these are the scores and these are the numbers. Onto the next benchmark."
  • Spunjji - Wednesday, May 16, 2012 - link

    I hear this.
  • mikato - Wednesday, May 16, 2012 - link

    I agree too (though the monkey part was a bit much). Maybe we can see a more in depth analysis of results, similar to Anandtech's treatment of AMD's new architecture but with hard results leading the analysis.
