Compute Performance

With Haswell, Intel enables full OpenCL 1.2 support in addition to DirectX 11.1 and OpenGL 4.0. Given the ALU-heavy GPU architecture, I was eager to find out how well Iris Pro did in our compute suite.

As always, we'll start with our DirectCompute game example, Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of its texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. While DirectCompute is used in many games, this is one of the few games with a benchmark that can isolate the use of DirectCompute and its resulting performance.

Compute: Civilization V

Iris Pro does very well here, tying the GT 640 but losing to the 650M. The latter holds a 16% performance advantage, which I can only assume has to do with memory bandwidth given near identical core/clock configurations between the 650M and GT 640. Crystalwell is clearly doing something, though: Intel's HD 4600 delivers less than 1/3 the performance of Iris Pro 5200 despite having half the execution resources, a bigger gap than the difference in execution resources alone would explain.

Our next benchmark is LuxMark 2.0, the official benchmark of SmallLuxGPU 2.0. SmallLuxGPU is an OpenCL accelerated ray tracer that is part of the larger LuxRender suite. Ray tracing has become a stronghold for GPUs in recent years as ray tracing maps well to GPU pipelines, allowing artists to render scenes much more quickly than with CPUs alone.

Compute: LuxMark 2.0

Moving to OpenCL, we see huge gains from Intel. Kepler wasn't NVIDIA's best compute part, but Iris Pro really puts everything else to shame here. We see near perfect scaling from Haswell GT2 to GT3. Crystalwell doesn't appear to be doing much here; the gains come from the additional ALUs.
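To make "near perfect scaling" concrete, here is a minimal sketch of how scaling efficiency could be computed from two benchmark scores. The function name and the scores are hypothetical placeholders, not measured LuxMark results, and the sketch assumes 20 EUs for GT2 and 40 EUs for GT3 while ignoring any clock speed differences between the parts.

```python
def scaling_efficiency(base_score, scaled_score, base_units, scaled_units):
    """Return measured speedup divided by ideal speedup (1.0 means perfect scaling)."""
    speedup = scaled_score / base_score   # how much faster the larger GPU actually is
    ideal = scaled_units / base_units     # how much faster it would be if scores tracked EU count
    return speedup / ideal


# Placeholder scores for illustration only; these are NOT the article's results.
gt2_score = 100.0
gt3_score = 190.0

eff = scaling_efficiency(gt2_score, gt3_score, base_units=20, scaled_units=40)
print(f"Scaling efficiency: {eff:.0%}")
```

A value close to 1.0 would suggest the workload is limited by ALU throughput rather than by memory bandwidth, which is consistent with Crystalwell contributing little in this test.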

Our third benchmark set comes from CLBenchmark 1.1. CLBenchmark contains a number of subtests; we’re focusing on the most practical of them, the computer vision test and the fluid simulation test. The former is a useful proxy for computer imaging tasks where systems are required to parse images and identify features (e.g. humans), while fluid simulations are common in professional graphics work and games alike.

Compute: CLBenchmark 1.1 Computer Vision

Compute: CLBenchmark 1.1 Fluid Simulation

Once again, Iris Pro does a great job here, outpacing everything else by roughly 70% in the Fluid Simulation test.

Our final compute benchmark is Sony Vegas Pro 12, an OpenGL and OpenCL video editing and authoring package. Vegas can use GPUs in a few different ways, the primary uses being to accelerate the video effects and compositing process itself, and in the video encoding step. With video encoding being increasingly offloaded to dedicated DSPs these days, we’re focusing on the editing and compositing process, rendering to a low CPU overhead format (XDCAM EX). This specific test comes from Sony, and measures how long it takes to render a video.

Compute: Sony Vegas Pro 12 Video Render

Iris Pro rounds out our compute comparison with another win. In fact, all of the Intel GPU solutions do a good job here.

177 Comments

  • Phrontis - Wednesday, June 5, 2013 - link

    I can't wait for one of these on a mITX board with 3 decent monitor outputs. There's enough power for the sort of things I do, if not for gaming.

    Phrontis
  • khanov - Friday, June 7, 2013 - link

    Without a direct comparison between HD 5000/5100 and Iris Pro 5200 with Crystalwell, how can we conclude that Crystalwell has any effect in any of the game benchmarks? While it clearly is of benefit in some compute tasks, in the game benchmarks you only compare to HD 4600 with half as many EUs and to Nvidia and AMD with their different architectures.

    We really need to see Iris Pro 5200 vs HD5100 to get an apples to apples comparison and be able to determine if Crystalwell is worth the extra money.
  • MODEL3 - Sunday, June 9, 2013 - link

    Haswell ULT GT3 (Dual-Core+GT3) = 181mm^2 and 40 EU Haswell GPU is 174mm^2.
    7mm^2 for everything else except GT3?
  • n13L5 - Tuesday, June 11, 2013 - link

    "An Ultrabook SKU with Crystalwell would make a ton of sense, but given where Ultrabooks are headed (price-wise) I’m not sure Intel could get any takers."

    They sure seem to be going up in price, rather than down at the moment...
  • anandfan86 - Tuesday, June 18, 2013 - link

    Intel has once again made their naming so confusing that even their own marketing weasels can't get it right. Notice that the Intel slide titled "4th Gen Intel Core Processors H-Processors Line" calls the graphics in the i7-4950HQ and i7-4850HQ "Intel HD Graphics 5200" instead of the correct name which is "Intel Iris Pro Graphics 5200". This slide calls the graphics in the i7-4750HQ "Intel Iris Pro Graphics 5200" which indicates that the slide was made after the creation of that name. It is little wonder that most media outlets are acting as if the biggest tech news of the month is the new pastel color scheme in iOS 7.
  • Myoozak - Wednesday, June 26, 2013 - link

    The peak theoretical GPU performance calculations shown are wrong for Intel's GFLOPS numbers. Correct numbers are half of what is shown. The reason is that Intel's execution units are made up of an integer vec4 processor and a floating-point vec4 processor. This article correctly states it has a 2xvec4 SIMD, but does not point out that half is integer and half is floating-point. For a GFLOPS computation, one should only include the floating-point operations, which means only half of that execution unit's silicon is getting used. The reported computation performance would only be correct if you had an algorithm with a perfect mix of integer & float math that could be co-issued. To compare apples to apples, you need to stick to GFLOPS numbers, and divide all the Intel numbers in the table by 2. For example, peak FP ops on the Intel HD4000 would be 8, not 16. Compared this way, Intel is not stomping all over AMD & nVidia for compute performance, but it does appear they are catching up.
  • alexcyn - Tuesday, August 6, 2013 - link

    I heard that Intel 22nm process equals TSMC 26nm, so the difference is not that big.
  • Doughboy(^_^) - Friday, August 9, 2013 - link

    I think Intel could push their yield way up by offering 32MB and 64MB versions of Crystalwell for i3 and i5 processors. They could charge the same markup for the 128, but sell the 32/64 for cheaper. It would cost Intel less and probably let them take even further market share from low-end dGPUs.
  • krr711 - Monday, February 10, 2014 - link

    It is funny how a non-PC company changed the course of Intel forever for the good. I hope that Intel is wise enough to use this to spring-board the PC industry to a new, grand future. No more tick-tock nonsense arranged around sucking as many dollars out of the customer as possible, but give the world the processing power it craves and needs to solve the problems of tomorrow. Let this be your heritage and your profits will grow to unforeseen heights. Surprise us!