PCI Express 3.0: More Bandwidth For Compute

It may seem like it’s still fairly new, but PCI Express 2.0 is actually a relatively old addition to motherboards and video cards. AMD first added support for it with the Radeon HD 3870 back in late 2007, so it has been just over four years since video cards made the jump. PCI Express 3.0, meanwhile, has been in the works for some time now, and although it hasn’t been four years it feels like it has been much longer. PCIe 3.0 motherboards only became available last month with the launch of the Sandy Bridge-E platform, and now the first PCIe 3.0 video cards are arriving with Tahiti.

But at first glance it may not seem like PCIe 3.0 is all that important. Additional PCIe bandwidth has proven to be generally unnecessary for gaming, as single-GPU cards typically benefit by only a couple of percent (if at all) when moving from PCIe 2.1 x8 to x16. There will of course come a time when games need more PCIe bandwidth, but right now PCIe 2.1 x16 (8GB/sec) handles the task with room to spare.

So why is PCIe 3.0 important then? It’s not the games, it’s the computing. GPUs have a great deal of internal memory bandwidth (264GB/sec on Tahiti; more with cache), but shuffling data between the GPU and the CPU is a high-latency, heavily bottlenecked process that tops out at 8GB/sec under PCIe 2.1. And since GPUs are still specialized devices that excel at parallel code execution, there are many workloads that need to constantly move data between the GPU and the CPU to maximize parallel and serial code execution. As it stands today GPUs are really only well suited for workloads that involve sending work to the GPU and keeping it there; heterogeneous computing is a luxury there isn’t bandwidth for.
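To put that gap in perspective, here is a back-of-the-envelope sketch of how long it takes just to move data at these rates. The 256MiB buffer is a hypothetical working set chosen for illustration, not a figure from our testing:

```python
def transfer_ms(nbytes, gb_per_sec):
    """Time in milliseconds to move nbytes over a link of the given bandwidth (decimal GB/sec)."""
    return nbytes / (gb_per_sec * 1e9) * 1e3

buf = 256 * 1024**2                 # hypothetical 256MiB working set

over_pcie2 = transfer_ms(buf, 8)    # PCIe 2.1 x16: ~33.6ms just to cross the bus
over_pcie3 = transfer_ms(buf, 16)   # PCIe 3.0 x16: ~16.8ms
on_card    = transfer_ms(buf, 264)  # Tahiti's local memory: ~1ms
```

Any workload that has to round-trip its data spends far longer on the bus than the GPU spends touching that data locally, which is why the bus, not the GPU, sets the ceiling.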

The long-term solution of course is to bring the CPU and the GPU together, which is what Fusion does. CPU/GPU bandwidth in Llano alone is over 20GB/sec, and latency is greatly reduced since the CPU and GPU are on the same die. But that doesn’t change the fact that AMD also wants to bring some of these same benefits to discrete GPUs, which is where PCIe 3.0 comes in.

With PCIe 3.0, transport bandwidth is again being doubled, from 500MB/sec per lane in each direction to 1GB/sec per lane in each direction, which for an x16 device means doubling the available bandwidth from 8GB/sec to 16GB/sec. This is accomplished by increasing the signaling rate of the underlying bus from 5GT/sec to 8GT/sec, while cutting encoding overhead from 20% (8b/10b encoding) to roughly 1.5% through the use of a far more efficient 128b/130b encoding scheme. Latency, meanwhile, doesn’t change – it’s largely a product of physics and physical distances – but merely doubling the bandwidth can greatly improve performance for bandwidth-hungry compute applications.
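The arithmetic behind those figures is straightforward; a quick sketch (the function name is ours):

```python
def pcie_bandwidth_gbs(lanes, gt_per_sec, payload_bits, total_bits):
    """One-way bandwidth in GB/sec: lanes x transfer rate x encoding efficiency, 8 bits per byte."""
    return lanes * gt_per_sec * (payload_bits / total_bits) / 8

gen2_x16 = pcie_bandwidth_gbs(16, 5, 8, 10)     # 8b/10b encoding    -> 8.0 GB/sec
gen3_x16 = pcie_bandwidth_gbs(16, 8, 128, 130)  # 128b/130b encoding -> ~15.75 GB/sec
```

Note that 128b/130b leaves the actual x16 figure at about 15.75GB/sec, which rounds up to the headline 16GB/sec.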

As with any specialized change like this the benefit is going to depend heavily on the application in question, but AMD is confident that there are applications that will completely saturate PCIe 3.0 (and then some), and it’s easy to imagine why.

Even among our limited selection of compute benchmarks we found something that directly benefited from PCIe 3.0. AESEncryptDecrypt, a sample application from AMD’s APP SDK, demonstrates AES encryption performance by running it on square image files. Throwing it a large 8K x 8K image not only creates a lot of work for the GPU, but a lot of PCIe traffic too. In our case simply enabling PCIe 3.0 improved performance by 9%, from 324ms down to 297ms.
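The 9% figure falls out of the two runtimes directly:

```python
pcie2_ms, pcie3_ms = 324.0, 297.0

speedup    = pcie2_ms / pcie3_ms               # ~1.09: ~9% more throughput
time_saved = (pcie2_ms - pcie3_ms) / pcie2_ms  # ~8.3% shorter runtime
```

(Throughput gain and runtime reduction are two views of the same result; we quote the former.)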

Ultimately, having more bandwidth is not only going to improve compute performance for AMD, it’s also going to give the company a critical edge over NVIDIA for the time being. Kepler will no doubt ship with PCIe 3.0, but that’s months down the line. In the meantime, users and organizations with high-bandwidth compute workloads have Tahiti.

292 Comments

  • gevorg - Thursday, December 22, 2011 - link

    37.9dB is a horrible testbed for noise testing! WTF!
  • mavere - Thursday, December 22, 2011 - link

    Seriously!

    With the prevalence of practically silent PSUs, efficient tower heatsinks, and large quiet fans, I cannot fathom why the noise floor is 37.9 dB.
  • Finally - Thursday, December 22, 2011 - link

    As usual, AT is shooting straight for the brain-dam, I mean, ENTHUSIAST crowd feat. a non-mentioned power supply that should be well around 1000W in order to drive over-priced CPUs as well as quadruple GPU setups.
    If you find that horrendous they will offer you not to read this review, but their upcoming HTPC review where they will employ the same 1000W power supply...
  • B3an - Thursday, December 22, 2011 - link

    *face palm*

    1: 1000+ Watt PSU's are normally more quiet if anything as they're better equipped to deal with higher power loads. When a system like this uses nowhere near the PSU's full power the fan often spins at a very low RPM. Some 1000+ PSUs will just shut the fan off completely when a system uses less than 30% of its power.

    2: It's totally normal for a system to be around 40 dB without including the graphics cards. Two or three fans alone normally cause this much noise even if they're large low-RPM fans. Then you have noise levels from surroundings, which even in a "quiet" room are normally more than 15 dB.

    3: Grow some fucking brain cells kids.
  • andymcca - Thursday, December 22, 2011 - link

    1) If you were a quiet computing enthusiast, you would know that the statement
    "1000+ Watt PSU's are normally more quiet if anything"
    is patently false. 1000W PSUs are necessarily less efficient at realistic loads (<600W at full load in single GPU systems). This is a trade-off of optimizing for efficiency at high wattages. There is no free lunch in power electronics. Lower efficiency yields more heat, which yields more noise, all else being equal. And I assure you that a high-end silent/quiet PSU is designed for low air flow and uses components at least as high in quality as their higher-wattage (non-silent/non-quiet) competitors. Since the PSU is not described (a problem which has been brought up many times in the past concerning AT reviews), who knows?

    2) 40dB is fairly loud if you are aiming for quiet operation. Ambient noise in a quiet room can be roughly 20dB (provided there is not a lot of ambient outdoor noise). 40dB is roughly the amplitude of conversation in a quiet room (non-whispered). A computer that hums as loud as I talk is pretty loud! I'm not sure if your opinion is informed by any empirical experience, but for precise comparison of different sources the floor should be at minimum 20dB below the sources in question.

    3) You have no idea what the parent's age or background is, but your comment #3 certainly implies something about your maturity.
  • formulav8 - Tuesday, February 21, 2012 - link

    Seriously, grow up. You're a nasty mouth as well.
  • piroroadkill - Thursday, December 22, 2011 - link

    Haha, yeah.

    Still, I guess we have to leave that work to SPCR.
  • Kjella - Thursday, December 22, 2011 - link

    High-end graphics cards are even noisier, so who cares? A 250W card won't be quiet no matter what. Using an overclocked Intel Core i7 3960X is obviously so the benchmarks won't be CPU limited, not to make a quiet PC.
  • Ryan Smith - Thursday, December 22, 2011 - link

    Our testing methodology only has us inches from the case (an open case I should add), hence the noise from our H100 closed loop radiator makes itself known. In any case these numbers aren't meant to be absolutes, we only use them on a relative basis.
  • MadMan007 - Thursday, December 22, 2011 - link

    [AES chart] on page 7?
