Compute

Shifting gears, let’s take a look at compute performance on the GTX 1070 Ti.

As the GTX 1070 Ti is another GP104 SKU – and a fairly straightforward one at that – there shouldn’t be any surprises here. Relative to the GTX 1070, all of NVIDIA's changes favor compute performance, so we should see a decent bump here. However, it won't change the fact that ultimately the GTX 1070 Ti will still come in below the GTX 1080, which has more SMs and a higher average clockspeed (never mind the benefits of more memory bandwidth).

Starting us off for our look at compute is Blender, the popular open source 3D modeling and rendering package. To examine Blender performance, we're running BlenchMark, a script and workload set that measures how long it takes to render a scene. BlenchMark uses Blender's internal Cycles render engine, which is GPU accelerated on both NVIDIA (CUDA) and AMD (OpenCL) GPUs.

Compute: Blender 2.79 - BlenchMark

As you might expect, the GTX 1070 Ti's performance shoots ahead of the GTX 1070's due to the additional enabled SMs of this new video card SKU. In fact, it technically outpaces the GTX 1080 by a single second, which, although eye-popping, is within our margin of error. What it can't do, however, is overtake AMD's lead here, with the NVIDIA cards trailing the Vega family by quite a bit.

For our second set of compute benchmarks we have CompuBench 2.0, the latest iteration of Kishonti's GPU compute benchmark suite. CompuBench offers a wide array of different practical compute workloads, and we’ve decided to focus on level set segmentation, optical flow modeling, and N-Body physics simulations.

Compute: CompuBench 2.0 - Level Set Segmentation 256

Compute: CompuBench 2.0 - N-Body Simulation 1024K

Compute: CompuBench 2.0 - Optical Flow

In all three sub-tests, the GTX 1070 Ti makes modest gains. Overall, performance is now quite close to the GTX 1080, which makes sense given the relatively small gap in on-paper compute performance between the two cards. This also means that, at least in the case of these benchmarks, the lack of additional memory bandwidth isn't hurting the GTX 1070 Ti too much. However, looking at the broader picture, all of the NVIDIA GP104 cards are trailing AMD's Vega family outside of the more equitable level set segmentation sub-test.
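To put a rough number on that on-paper gap, here is a minimal Python sketch of the usual theoretical FP32 throughput estimate: CUDA cores times two FLOPs per clock (one fused multiply-add) times clockspeed. The SM counts, the 128 cores per SM figure, and the rated boost clocks are assumptions taken from NVIDIA's public specifications rather than from this page, and real cards typically run above their rated boost clock, so treat the results as a ballpark only.

```python
from dataclasses import dataclass

# Assumption: Pascal GP104 has 128 FP32 CUDA cores per SM (not stated in the article).
CORES_PER_SM = 128


@dataclass(frozen=True)
class Card:
    name: str
    sms: int        # enabled SMs (assumed from public specs)
    boost_mhz: int  # rated boost clock in MHz (assumed from public specs)

    def fp32_tflops(self) -> float:
        """Theoretical FP32 throughput: cores * 2 FLOPs (one FMA) * clock."""
        flops = self.sms * CORES_PER_SM * 2 * self.boost_mhz * 1e6
        return flops / 1e12


def relative_to(card: Card, baseline: Card) -> float:
    """On-paper throughput of `card` as a fraction of `baseline`."""
    return card.fp32_tflops() / baseline.fp32_tflops()


GTX_1070 = Card("GTX 1070", sms=15, boost_mhz=1683)
GTX_1070_TI = Card("GTX 1070 Ti", sms=19, boost_mhz=1683)
GTX_1080 = Card("GTX 1080", sms=20, boost_mhz=1733)

for card in (GTX_1070, GTX_1070_TI, GTX_1080):
    share = relative_to(card, GTX_1080)
    print(f"{card.name}: {card.fp32_tflops():.2f} TFLOPS ({share:.0%} of GTX 1080)")
```

On these assumed figures the GTX 1070 Ti lands at roughly 92% of the GTX 1080's on-paper FP32 throughput, versus roughly 73% for the GTX 1070. This estimate ignores memory bandwidth entirely, which is the other difference between the cards.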

Moving on, our third compute benchmark is the next generation release of FAHBench, the official Folding @ Home benchmark. Folding @ Home is the popular Stanford-backed research and distributed computing initiative that distributes work to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance.

Compute: Folding @ Home Single Precision

The GTX 1080 and GTX 1070 were already fairly close on this benchmark, so there's not a lot of room for the GTX 1070 Ti to stand out. Interestingly this is another case where performance actually slightly exceeds the GTX 1080 – though again within the margin of error – which further affirms just how close the compute performance of the new card is to the GTX 1080.

Our final compute benchmark is Geekbench 4's GPU compute suite. A multi-faceted test suite, Geekbench 4 runs seven different GPU sub-tests, ranging from face detection to FFTs, and then averages out their scores via their geometric mean. As a result Geekbench 4 isn't testing any one workload, but rather is an average of many different basic workloads.
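For readers unfamiliar with the geometric mean, the following is a minimal Python sketch of that kind of averaging. It is not Geekbench's actual scoring code, and the sub-test scores are hypothetical values chosen for illustration, not measured results.

```python
import math


def geometric_mean(scores):
    """Return the geometric mean of a sequence of positive scores."""
    scores = list(scores)
    if not scores:
        raise ValueError("need at least one score")
    if any(s <= 0 for s in scores):
        raise ValueError("scores must be positive")
    # Average the logarithms and exponentiate, which avoids overflow
    # from multiplying many large scores together.
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


# Hypothetical sub-test scores (illustration only, not benchmark data).
balanced = [100.0, 200.0, 400.0]
one_outlier = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 128.0]

print(geometric_mean(balanced))     # approximately 200
print(geometric_mean(one_outlier))  # approximately 2
```

The second case shows why a geometric mean suits a multi-workload score: a single very high sub-test result moves the total far less than it would under an arithmetic mean, which for the same seven values would be 19.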

Compute: Geekbench 4 - GPU Compute - Total Score

As with our other benchmarks, the GTX 1070 Ti more or less bridges the gap between the GTX 1080 and GTX 1070, falling just a few percent short of the GTX 1080 in performance. This is a test where NVIDIA was already doing better than average, and now with its increased SM count, the GTX 1070 Ti has enough compute performance to surpass AMD's RX Vega 64, something the regular GTX 1070 could not do.

78 Comments

  • Morawka - Thursday, November 2, 2017

    Nvidia has a ton of flawless GP104 dies stockpiled and the 1080's are selling because they use GDDR5X memory which is slower for mining. This makes perfect sense if what I describe is true. You get rid of all those extra GP104 dies by pairing it with lower latency GDDR5. This card was built with miners in mind, particularly with the GDDR5 implementation.
  • Morawka - Thursday, November 2, 2017

    **Miners are not buying gtx 1080 due to slower GDDR5X. Nvidia re-engineers 1080 for better mining performance.
  • CiccioB - Friday, November 3, 2017

    Your logic is somehow faulty.
    The chip mounted on this 1070Ti is far from allowing them to recycle any stockpile of defective chips: it requires the chip to be fully functional except for a single SM (5% of the chip).
    nvidia could sell many more faulty chips with the original 1070 card at whatever price, seeing as miners ask for them as if they were slices of bread.
    What nvidia is doing here is just creating a card that on benchmarks runs better than the competitor's card using slightly faulty chips (or disabling them on purpose), selling the card at a higher price than the 1070 and just a little below the 1080.

    If nvidia had lots of defective GP104 to get rid of, they could just have created a 1060Ti. But that would be a useless card that would compete with none but the 1070, that is, they would lose money by doing so.
  • Kevin G - Thursday, November 2, 2017

    AMD has had a long road to get everything integrated together but things finally seem to be falling into place with a common on-die fabric. Their previous SoC designs still had a proprietary bus for the on-die GPU. Infinity fabric is also being pushed to the GPU team for scaling up their designs as well. Long term, I would predict some GPUs falling into the same sockets as their processors for HPC workloads. This long term idea probably won't happen until they use multiple GPU dies on a single interposer.

    AMD had to do all this work while the company was in the red but it looks like the results are paying off with a competitive CPU architecture again and some gains on the GPU side too. They're back in the black but I don't think AMD can afford to allocate the resources to a pure enterprise compute project. They won't ignore that market, but the base architecture will stem from the gaming side.
  • cwolf78 - Thursday, November 2, 2017

    I agree it's too expensive as well. I think this should have been set at $399. The sad part is that it's going to be impossible to find at even the MSRP in short order. I'm content to stick with my overclocked GTX 970 until this mining fad is over - or until the current crypto formats become resistant to GPU-based mining.
  • extide - Thursday, November 2, 2017

    Honestly, I see nvidia phasing out the vanilla 1070 over the next few months and then sliding the 1070 Ti into the $399 price slot.
  • CiccioB - Thursday, November 2, 2017

    This means nvidia is not going to sell GP104 with more than a single SM broken... which is a great waste of money for them. So no, they will keep the old card in stock and this Ti version, in fact, will be rarely found, as it is created just for looking better on charts against an unavailable card from the competitor. So it does not need to be produced in mass (and they are probably cutting that single SM on purpose on a perfectly good die, as I do not think that the availability of GP104 with 1 bad SM can be higher than with 5 bad SM).
  • zepi - Thursday, November 2, 2017

    Such chips could still be used as GTX1070M, so there is still a product where they can be used.

    Not to mention that at this point of the GP104 manufacturing they should be having very nice yields already.
  • DanNeely - Thursday, November 2, 2017

    There's also the Quadro P4000 which at 14 SM enabled is the lowest end GP104 part on the market.

    But yeah the 1070 TI only having a single disabled SM almost certainly speaks to much higher yields since the product first launched a year and a half ago.
  • znd125 - Thursday, November 2, 2017

    Great writing and review.
