Musing About Memory Bandwidth & The Test

As we discussed in our introduction, NVIDIA is launching the GeForce GT 640 exclusively as a DDR3 part. The resulting lack of memory bandwidth is going to hold back the card’s performance, and while we don’t have a GDDR5 card at this time, it will still be easy to see the performance impact once we jump into our gaming performance section and compare the GT 640 to other cards such as the GTS 450.

In the meantime, however, we wanted to better visualize just how little memory bandwidth the DDR3 GT 640 has. It’s one thing to say that the card has 28GB/sec of memory bandwidth, but how does that compare to other cards? To answer that we drew up a couple of graphs based on the ratio of theoretical memory bandwidth to theoretical performance.

The first graph is the ratio of memory bandwidth to color ROP throughput (bytes per color operation), an entirely synthetic metric but a great illustration of the importance of memory bandwidth. Render operations are the biggest consumers of memory bandwidth, so this is where DDR3 cards typically choke.

Because of the nature of this graph cards are all over the place, with particularly unusual configurations (such as the 4 ROP GT 440) appearing near the top. Still, high-end cards such as the GTX 680 and Radeon HD 7970 have among the highest ratios of memory bandwidth to color render operations, while lower-end cards generally have lower ratios. At 1.97 B/cOP, the DDR3 GT 640 has the lowest ratio by far, coming in at only 66% of the ratio found on the next-lowest card, the GeForce GT 440 OEM. Thanks to its high clockspeed and 16 ROPs, the DDR3 GT 640 is more starved for bandwidth relative to its ROP throughput than any card before it.
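For reference, this ratio is straightforward to reproduce from theoretical specs. Below is a minimal Python sketch; the ~28.5GB/sec bandwidth and 900MHz core clock used here are the commonly cited figures for the DDR3 GT 640 and should be treated as assumptions, with slightly different spec rounding accounting for the small gap from the 1.97 B/cOP figure above.

```python
# Bytes of memory bandwidth available per theoretical color ROP operation.
# Spec figures (bandwidth, ROP count, core clock) are assumptions based on
# commonly cited numbers for the DDR3 GT 640.

def bytes_per_color_op(bandwidth_gb, rops, core_clock_mhz):
    """Theoretical memory bytes per color operation (B/cOP)."""
    color_ops_per_sec = rops * core_clock_mhz * 1e6   # peak pixel rate
    bytes_per_sec = bandwidth_gb * 1e9
    return bytes_per_sec / color_ops_per_sec

# DDR3 GT 640: ~28.5GB/sec, 16 ROPs, 900MHz core clock
print(round(bytes_per_color_op(28.5, 16, 900), 2))   # -> 1.98
```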

Our second graph is the ratio of memory bandwidth to shader operations (bytes per FLOP). Shaders aren’t nearly as bandwidth constrained as ROPs thanks to liberal use of register files and caches, but I wanted to take a look at more than just one ratio. Compared to our ROP chart the GT 640 isn’t nearly as much of an outlier, but it still has the lowest ratio of memory bandwidth per FLOP out of all of the cards in our charts.
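The FLOP ratio can be sketched the same way. The 384 CUDA cores at 900MHz (counting 2 FLOPs per core per clock via FMA) are likewise assumed figures for the DDR3 GT 640:

```python
# Bytes of memory bandwidth per theoretical single precision FLOP.
# Core count and clock are assumed from commonly cited GT 640 specs.

def bytes_per_flop(bandwidth_gb, cuda_cores, core_clock_mhz):
    """Theoretical memory bytes per FLOP, counting 2 FLOPs/core/clock (FMA)."""
    flops_per_sec = cuda_cores * 2 * core_clock_mhz * 1e6
    return bandwidth_gb * 1e9 / flops_per_sec

# DDR3 GT 640: ~28.5GB/sec, 384 CUDA cores at 900MHz
print(round(bytes_per_flop(28.5, 384, 900), 3))   # -> 0.041
```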

Now to be clear, not all of this is NVIDIA’s fault. Memory speeds have not kept pace with Moore’s Law, so GPU performance has been growing faster than memory bandwidth for quite some time, leading to a general downward trend. But the GT 640 is a special card in this respect: there has never been a card this starved for memory bandwidth. This is compounded by the fact that while GDDR5 speeds have at least been increasing at a modest rate over the years as GPU memory controllers and memory chip production have improved, DDR3 memory speeds have been locked in the 1.6GHz-1.8GHz range for years. Simply put, the gap between GDDR5 and DDR3 has never been greater. Even a conservative memory clock of 4.5GHz would give a GDDR5 card 2.5 times the memory bandwidth of a typical DDR3 card.
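That 2.5x figure follows directly from the data rates, since both memory types would sit on the same bus here. A quick sketch, with the 128-bit bus width assumed:

```python
# Peak bandwidth scales linearly with effective data rate for a fixed bus width.

def bandwidth_gbps(data_rate_ghz, bus_width_bits=128):
    """Peak memory bandwidth in GB/sec (data rate x bus width)."""
    return data_rate_ghz * bus_width_bits / 8   # bits -> bytes

ddr3  = bandwidth_gbps(1.8)   # top of the DDR3 range: 28.8GB/sec
gddr5 = bandwidth_gbps(4.5)   # conservative GDDR5 clock: 72GB/sec
print(round(gddr5 / ddr3, 1))   # -> 2.5
```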

Of course there’s still a place in the world for DDR3 cards, particularly in very low power situations, but that place is shrinking every day. If and when it arrives, we expect the GDDR5 GT 640 to quickly trounce the DDR3 version in virtually all gaming scenarios. DDR3 on a card this (relatively) power hungry and with this many ROPs just doesn’t make a lot of sense.

The Test

For our test we’re using the latest NVIDIA drivers available at the time our benchmarks were taken (301.42), and for AMD’s cards the new Catalyst 12.6 betas. For analysis purposes we’ve thrown in a couple of additional cards that we don’t normally test, such as the GDDR5 GeForce GT 240. The GDDR5 GT 240 pairs 43.5GB/sec of memory bandwidth with a far older and less powerful GPU, which makes it an interesting point of comparison for measuring progress.

CPU: Intel Core i7-3960X @ 4.3GHz
Motherboard: EVGA X79 SLI
Chipset Drivers: Intel 9.2.3.1022
Power Supply: Antec True Power Quattro 1200
Hard Disk: Samsung 470 (256GB)
Memory: G.Skill Ripjaws DDR3-1867 4 x 4GB (8-10-9-26)
Case: Thermaltake Spedo Advance
Video Cards: AMD Radeon HD 7770
AMD Radeon HD 7750-800
AMD Radeon HD 6670
AMD Radeon HD 5750
NVIDIA GeForce GT 640 DDR3
NVIDIA GeForce GTX 550 Ti
NVIDIA GeForce GTS 450
NVIDIA GeForce GT 440
NVIDIA GeForce GT 240
Video Drivers: NVIDIA ForceWare 301.42
AMD Catalyst 12.6 Beta
OS: Windows 7 Ultimate 64-bit

60 Comments

  • cjs150 - Thursday, June 21, 2012 - link

    "God forbid there be a technical reason for it.... "

    Intel and Nvidia have had several generations of chips to fix any technical issues and didn't (though HD4000 is good enough). AMD has been pretty close to the correct frame rate for a while.

    But it is not enough to have the capability to run at the correct frame rate if you make it too difficult to change the frame rate to the correct setting. That is not a hardware issue, just bad software design.
  • UltraTech79 - Wednesday, June 20, 2012 - link

    Anyone else really disappointed in 4K still being standardized around 24 fps? I thought 60 would be the minimum standard by now, with 120 in higher-end displays. 24 is crap. Anyone that has seen a movie recorded at 48+ FPS knows what I'm talking about.

    This is like putting shitty unleaded gas into a super high-tech racecar.
  • cjs150 - Thursday, June 21, 2012 - link

    You do know that Blu-ray is displayed at 23.976 FPS? That looks very good to me.

    Please do not confuse screen refresh rates with frame rates. Screen refresh on most large TVs runs at between 60 and 120 Hz; anything below 60 tends to look crap. (If you want real crap, try running American TV on a European PAL system - I mean crap in a technical sense, not creatively!)

    I must admit that having an fps of 23.976 rather than some round number such as 24 (or higher) is rather daft, and some new films are coming out with much higher frame rates. I have a horrible recollection that the reason for such an odd number is historic - something to do with the length of 35mm film needed per second. The problem is I cannot remember whether that was simply because 35mm film was expensive and this was the minimum to provide smooth movement, or whether it goes right back to the days when film had a tendency to catch light, and this was the maximum speed you could put a film through a projector without friction causing it to catch fire. No doubt there is an expert on this site who could explain precisely why we ended up with such a silly number as the standard.
  • UltraTech79 - Friday, June 22, 2012 - link

    You are confusing things here. I clearly said 120(fps) would need higher end displays (120Hz) I was rounding up 23.976 FPS to 24, give me a break.

    That it looks good /to you/ is wholly irrelevant. Do you realize how many people said "it looks very good to me" about SD when resisting the HD movement? Or how many will say it again about 1080p, thinking 4K is too much? It's a ridiculous mindset.

    My point was that we are upping the resolution but leaving another very important aspect in the dust. Even audio is moving faster than framerates in movies, and now that most places are switching to digital, the cost to go to the next step has dropped dramatically.
  • nathanddrews - Friday, June 22, 2012 - link

    It was NVIDIA's choice to only implement 4K @ 24Hz (23.xxx) due to limitations of HDMI. If NVIDIA had optimized around DisplayPort, you could then have 4K @ 60Hz.

    For computer use, anything under 60Hz is unacceptable. For movies, 24Hz has been the standard for a century - all film is 24fps and most movies are still shot on film. In the next decade, there will be more and more films that will use 48, 60, even 120fps. Cameron was cock-blocked by the studio when he wanted to film Avatar at 60fps, but he may get his wish for the sequels. Jackson is currently filming The Hobbit at 48fps. Eventually all will be right with the world.
  • karasaj - Wednesday, June 20, 2012 - link

    If we wanted to use this to compare a 640M or 640M LE to the GT640, is this doable? If it's built on the same card, (both have 384 CUDA cores) can we just reduce the numbers by a rough % of the core clock speed to get rough numbers that the respective cards would put out? I.E. the 640M LE has a clock of 500mhz, the 640M is ~625Mhz. Could we expect ~55% of this for the 640M LE and 67% for the 640M? Assuming DDR3 on both so as not to have that kind of difference.
  • Ryan Smith - Wednesday, June 20, 2012 - link

    It would be fairly easy to test a desktop card at a mobile card's clocks (assuming memory type and functional unit count was equal) but you can't extrapolate performance like that because there's more to performance than clockspeeds. In practice performance shouldn't drop by that much since we're already memory bandwidth bottlenecked with DDR3.
  • jstabb - Wednesday, June 20, 2012 - link

    Can you verify if creating a custom resolution breaks 3D (frame packed) blu-ray playback?

    With my GT430, once a custom resolution has been created for 23/24hz, that custom resolution overrides the 3D frame-packed resolution created when 3D vision is enabled. The driver appeared to have a simple fall through logic. If a custom resolution is defined for the selected resolution/refresh rate it is always used, failing that it will use a 3D resolution if one is defined, failing that it will use the default 2D resolution.

    This issue made the custom resolution feature useless to me with the GT430 and pushed me to an AMD solution for their better OOTB refresh rate matching. I'd like to consider this card if the issue has been resolved.

    Thanks for the great review!
  • MrSpadge - Wednesday, June 20, 2012 - link

    It consumes about as much power as the HD 7750-800, yet performs miserably in comparison. This is an amazing win for AMD, especially comparing the GTX 680 and HD 7970!
  • UltraTech79 - Wednesday, June 20, 2012 - link

    This performs about as well as an 8800 GTS for twice the price. Or half the performance of a GTX 460 for the same price.

    These should have been priced at 59.99.
