Synthetics

As always we’ll also take a quick look at synthetic performance. These tests mainly serve as a canary for finding important architectural and configuration changes.

Synthetic: TessMark, Image Set 4, 64x Tessellation

Tessellation performance has scaled very closely with the change in SMMs and clock speeds, just as we would expect here.

Synthetic: 3DMark Vantage Texel Fill

Texel throughput has also taken a hit in accordance with the loss of SMMs and clock speed. Based on gaming performance the GTX 970 doesn’t appear to be too badly handicapped here, but it definitely doesn’t have much in the way of texel throughput to spare.

Synthetic: 3DMark Vantage Pixel Fill

Pixel throughput on the other hand ends up being extremely odd and not at all what we were expecting. The GTX 970 takes an incredible dive here, with its pixel fillrate dropping by 26%. At a high level this test is bound by memory bandwidth and ROP throughput, and both of these factors should be identical between GTX 980 and GTX 970. Instead we see GTX 970 lose more performance than should theoretically be possible, as the 26% drop is larger than the combined effect of its clock speed and SMM deficits.
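As a rough sanity check on that claim, the sketch below multiplies out the SMM and clock speed deficits. The SMM counts (16 vs. 13) and the base/boost clocks are assumed from NVIDIA's published reference specifications rather than measured in this test, and the helper name is purely illustrative.

```python
# Rough upper bound on how much GTX 970 should lose to GTX 980 if a test
# scaled perfectly with both SMM count and clock speed.
# Assumed inputs: NVIDIA reference specifications, not values measured here.
GTX_980 = {"smms": 16, "base_mhz": 1126, "boost_mhz": 1216}
GTX_970 = {"smms": 13, "base_mhz": 1050, "boost_mhz": 1178}


def expected_drop(slow, fast, clock_key):
    """Fractional loss if throughput scaled linearly with SMMs x clock."""
    smm_ratio = slow["smms"] / fast["smms"]
    clock_ratio = slow[clock_key] / fast[clock_key]
    return 1.0 - smm_ratio * clock_ratio


OBSERVED_DROP = 0.26  # pixel fill deficit reported in the text

for key in ("base_mhz", "boost_mhz"):
    drop = expected_drop(GTX_970, GTX_980, key)
    print(f"{key}: expected drop at most {drop:.1%}, observed {OBSERVED_DROP:.0%}")
```

Under these assumptions the combined deficit works out to roughly 24% at base clocks and 21% at boost clocks, both short of the 26% observed. Real cards rarely sit exactly at either clock, so this is only a bound, not a prediction.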

At this point we’re still trying to figure out exactly what’s going on. We have no other evidence that there’s a difference in ROP throughput or memory bandwidth between the GTX 980 and GTX 970, so it is not clear to us where the difference lies. One possibility is that this is somehow bottlenecked at the Raster Engine level – where each of the four engines accounts for 25% of the work – but the pigeonhole principle rules out NVIDIA having disabled an entire GPC: GTX 970’s 13 active SMMs cannot fit into three GPCs of four SMMs each, so at least 1 SMM must be active in each GPC partition. This matter will require further research.
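The pigeonhole argument can be checked by brute force. The sketch below assumes GM204's layout of four GPCs with four SMMs each and 13 active SMMs on GTX 970 (NVIDIA's published configuration, not something measured here), then enumerates every way the active SMMs could be distributed. The function name is hypothetical.

```python
from itertools import product

GPCS = 4            # assumed: GM204 has four GPCs
SMMS_PER_GPC = 4    # assumed: four SMMs per GPC, 16 in total
ACTIVE_SMMS = 13    # assumed: GTX 970 ships with 13 SMMs enabled


def valid_layouts(gpcs, per_gpc, active):
    """Every per-GPC count of active SMMs that sums to the active total."""
    return [
        layout
        for layout in product(range(per_gpc + 1), repeat=gpcs)
        if sum(layout) == active
    ]


layouts = valid_layouts(GPCS, SMMS_PER_GPC, ACTIVE_SMMS)
fewest_in_any_gpc = min(min(layout) for layout in layouts)
print(f"{len(layouts)} layouts; emptiest GPC holds {fewest_in_any_gpc} SMM(s)")
```

Every possible layout leaves at least one SMM active in each GPC, so under these assumptions no GPC can be fully disabled. That says nothing about how much work each raster engine actually does, only that none of them can be sitting behind an empty GPC.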


155 Comments


  • Casecutter - Friday, September 26, 2014

    I’m confident that if we had two of what were the normal "AIB OC customs" of both a 970 and a 290, things between them might not appear so skewed. First, as much as folks want this level of card to get them into 4K, they’re not... So it really just boils down to seeing what similarly generic OC customs offer, and I’d say they "spar back and forth" @2560x1440 depending on the titles.

    As to power I wish these reviews would halt the inadequate testing like it’s still 2004! The power (complete PC) should be recorded for each game benchmarked, logging in real time the oscillation of power in milliseconds, then outputting the 'mean' over the test duration. As we know, boost frequency fluctuates differently in every title, so the 'mean' for each game is different. Then each 'mean' can be added and the average over the number of titles would offer the most straightforward evaluation of power while gaming. Also, as most folks today "Sleep" their computers (and not many idle for more than 10-20min), I believe the best calculation for power is what a graphics card "suckles" while doing nothing for something like 80% of each month. I’d rather see how AMD ZeroCore impacts a machine’s power usage over a month’s time, versus the savings only during gaming. Consider gaming 3hr a day, which constitutes 12.5% of a month: does the 25% difference in power while gaming beat the 5W saved with ZeroCore for 80% of that month? Saving energy while using and enjoying something is fine, although wasting watts while doing nothing is incomprehensible.
  • Impulses - Sunday, September 28, 2014

    Ehh, I recently bought 2x custom 290, but I've no doubt that even with a decent OC the 970 can at the very least still match it in most games... I don't regret the 290s, but I also only paid $350/360 for my WF Gigabyte cards; had I paid closer to $400 I'd be kicking myself right about now.
  • Iketh - Monday, September 29, 2014

    most PCs default to sleeping during long idles and most people shut it off
  • dragonsqrrl - Friday, September 26, 2014

    Maxwell truly is an impressive architecture, I just wish Nvidia would stop further gimping double precision performance relative to single precision with each successive generation of their consumer cards. GF100/110 were capped at 1/8, GK110 was capped at 1/24, and now GM204 (and likely GM210) is capped at 1/32... What's still yet to be seen is how they're capping the performance on GM204, whether it's a hardware limitation like GK104, or a clock speed limitation in firmware like GK110.

    Nvidia: You peasants want any sort of reasonable upgrade in FP64 performance? Pay up.
  • D. Lister - Friday, September 26, 2014

    "Company X: You peasants want any sort of reasonable upgrade in product Y? Pay up."

    Well, that's capitalism for ya... :p. Seriously though, if less DP ability means a cheaper GPU then as a gamer I'm all for it. If a dozen niche DP hobbyists get screwed over, and a thousand gamers get a better deal on a gaming card then why not? Remember what all that bit mining nonsense did to the North American prices of the Radeons?
  • D. Lister - Friday, September 26, 2014

    Woah, it seems they do tags differently here at AT :(. Sorry if the above message appears improperly formatted.
  • Mr Perfect - Friday, September 26, 2014

    It's not you, the italic tag throws in a couple extra line breaks. Bold might too, I seem to remember that mangling a post of mine in the past.
  • D. Lister - Sunday, September 28, 2014

    Oh, okay, thanks for the explanation :).
  • wetwareinterface - Saturday, September 27, 2014

    this^

    you seem to be under the illusion that nvidia intended to keep shooting themselves in the foot forever in regards to releasing their high end gpgpu chip under a gaming designation and relying on the driver (which is easy to hack) to keep people from buying a gamer card for workstation loads. face it they wised up and charge extra for fp64 and the higher ram count now. no more cheap workstation cards. the benefit as already described is cheaper gaming cards that are designed to be more efficient at gaming and leave the workstation loads to the workstation cards.
  • dragonsqrrl - Saturday, September 27, 2014

    This is only partially true, and I think D. Lister basically suggested the same thing so I'll just make a single response for both. The argument for price and efficiency would really only be the case for a GK104 type scenario, where on-die FP64 performance is physically limited to 1/24 FP32 due to there being 1/24 the CUDA cores. But what about GK110? There is no reason to limit it to 1/24 SP other than segmentation. There's pretty much no efficiency or price argument there, and we see proof of that in the Titan, which is no less efficient at gaming and really no more expensive to manufacture outside the additional memory and maybe some additional validation. In other words there's really no justification (or at least certainly not the justification you guys are suggesting) for why the GTX780 Ti couldn't have had 1/12 SP with 3GB GDDR5 at the same $700 MSRP, for instance. Of course other than further (and in my opinion unreasonable) segmentation.

    This is why I was wondering how they're capping performance in GM204.
