The GeForce GTX 1060 Founders Edition & ASUS Strix GTX 1060 Review
by Ryan Smith on August 5, 2016 2:00 PM EST
Shifting gears, let’s take a look at compute performance on GTX 1060.
As we already had the chance to categorize the Pascal architecture’s compute performance in our GTX 1080 review, there shouldn’t be any surprises here. But it will be interesting to see whether the GTX 1060’s higher ratio of memory bandwidth per FLOP materially impacts overall compute performance.
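To put a rough number on that bandwidth-per-FLOP ratio, here is a minimal sketch. The core counts, boost clocks, and memory bandwidth figures are the cards' published specifications as we understand them, not numbers taken from this page, so treat them as assumptions. The helper names are ours, purely for illustration.

```python
# Published specifications, NOT taken from this page; treat them as assumptions.
SPECS = {
    "GTX 1060": {"cores": 1280, "boost_mhz": 1708, "bandwidth_gbs": 192.0},
    "GTX 1080": {"cores": 2560, "boost_mhz": 1733, "bandwidth_gbs": 320.0},
}


def peak_fp32_gflops(cores, boost_mhz):
    """Theoretical FP32 rate: one fused multiply-add (2 FLOPs) per core per clock."""
    return 2 * cores * boost_mhz / 1000.0


def bytes_per_flop(spec):
    """Memory bandwidth (GB/s) divided by peak FP32 throughput (GFLOPS)."""
    return spec["bandwidth_gbs"] / peak_fp32_gflops(spec["cores"], spec["boost_mhz"])


if __name__ == "__main__":
    for name, spec in SPECS.items():
        print(f"{name}: {bytes_per_flop(spec):.4f} bytes per FLOP")
```

Under these assumed figures, GTX 1060 comes out with somewhat more memory bandwidth per theoretical FLOP than GTX 1080. These are paper numbers based on boost clocks, so real-world ratios will shift with actual clockspeeds.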
Starting us off for our look at compute is LuxMark 3.1, the latest version of the official benchmark of LuxRender. LuxRender’s GPU-accelerated rendering mode is an OpenCL-based ray tracer that forms a part of the larger LuxRender suite. Ray tracing has become a stronghold for GPUs in recent years as ray tracing maps well to GPU pipelines, allowing artists to render scenes much more quickly than with CPUs alone.
While GTX 1060 could hang with GTX 980 in gaming benchmarks, the compute benchmarks don’t start off the same way: here the last-generation flagship holds a lead of about 17%. Unfortunately for NVIDIA, GTX 980’s level of performance is about where GTX 1060 needed to be to best RX 480; instead it ends up trailing the AMD competition. Otherwise, the performance gain versus the GTX 960 stands at 65%.
For our second set of compute benchmarks we have CompuBench 1.5, the successor to CLBenchmark. CompuBench offers a wide array of different practical compute workloads, and we’ve decided to focus on face detection, optical flow modeling, and particle simulations.
As with GTX 1080, relative performance is all over the place. GTX 1060 wins at face detection, loses at optical flow, and wins again at particle simulation. Even the gains versus GTX 960 are a bit uneven, though at the end of the day GTX 1060 ends up being significantly faster than its predecessor in all three sub-benchmarks.
Moving on, our third compute benchmark is the next-generation release of FAHBench, the official Folding@Home benchmark. Folding@Home is the popular Stanford-backed research and distributed computing initiative that has work distributed to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance. Each precision has two modes, explicit and implicit, the difference being whether water atoms are included in the simulation; explicit mode includes them, which adds quite a bit of work and overhead. This is another OpenCL test, utilizing the OpenCL path for FAHCore 21.
Finally, in Folding@Home, we see the usual split between single precision and double precision performance. GTX 1060 is solidly in the lead when using FP32, but NVIDIA’s poor FP64 rate means that if double precision is needed, RX 480 will pull ahead.
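For a sense of scale on the FP64 gap, the sketch below estimates theoretical peak rates. The shader counts, boost clocks, and FP64:FP32 ratios (1/32 for GTX 1060, 1/16 for RX 480) are published specifications as we understand them, not figures from this page, so they are assumptions. The function and dictionary names are hypothetical.

```python
# Assumed published specs and FP64:FP32 ratios; none of these come from this page.
CARDS = {
    "GTX 1060": {"shaders": 1280, "boost_mhz": 1708, "fp64_ratio": 1 / 32},
    "RX 480": {"shaders": 2304, "boost_mhz": 1266, "fp64_ratio": 1 / 16},
}


def peak_gflops(card, double_precision=False):
    """Theoretical peak rate: 2 FLOPs (one FMA) per shader per clock, scaled for FP64."""
    fp32 = 2 * card["shaders"] * card["boost_mhz"] / 1000.0
    return fp32 * card["fp64_ratio"] if double_precision else fp32


if __name__ == "__main__":
    for name, card in CARDS.items():
        fp32 = peak_gflops(card)
        fp64 = peak_gflops(card, double_precision=True)
        print(f"{name}: {fp32:.0f} GFLOPS FP32, {fp64:.0f} GFLOPS FP64")
```

Peak rates are only a rough guide. On paper RX 480 also has the higher FP32 rate under these assumptions, yet GTX 1060 leads the measured FP32 test, so the sketch illustrates only the size of the FP64 deficit, not the benchmark outcome.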