IGP Compute

One of the touted benefits of Haswell is the compute capability afforded by the IGP. For anyone using DirectCompute or C++ AMP, the compute units of the HD 4600 can be exploited as easily as any discrete GPU, although efficiency might come into question. As some of the benchmarks below show, some of our computational software runs faster on the IGP than on the CPU (particularly in the highly multithreaded scenarios).

Grid Solvers - Explicit Finite Difference on IGP

As before, we test both 2D and 3D explicit finite difference simulations with 2^n nodes in each dimension, using OpenMP as the threading operator in single precision. The grid is isotropic and the boundary conditions are sinks. We iterate through a series of grid sizes, and results are shown in terms of ‘million nodes per second’ where the peak value is given in the results – higher is better.
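To make the 2D case concrete, here is a minimal sketch of one explicit finite difference update on a square grid with sink boundaries. It is not the benchmark code: the diffusion-type update, the coefficient `alpha`, and the names `step_2d` and `million_nodes_per_second` are assumptions made for illustration, and it runs in plain double-precision Python rather than single precision on the IGP.

```python
def step_2d(grid, alpha=0.2):
    """One explicit (forward-time, centred-space) update of a square 2D grid.

    Assumptions for this sketch: a diffusion-type equation and
    alpha = D*dt/dx^2, which must stay <= 0.25 for stability in 2D.
    Boundary nodes are sinks, so they are held at zero.
    """
    n = len(grid)
    new = [[0.0] * n for _ in range(n)]  # edges stay at 0.0 (sink boundary)
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            # Five-point stencil: each node only reads its four neighbours.
            laplacian = (grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]
                         - 4.0 * grid[i][j])
            new[i][j] = grid[i][j] + alpha * laplacian
    return new


def million_nodes_per_second(n, steps, seconds):
    """Throughput for an n x n grid updated `steps` times in `seconds`."""
    return n * n * steps / seconds / 1e6


# Usage: a grid with 2^3 nodes per side and a single point source.
n = 2 ** 3
grid = [[0.0] * n for _ in range(n)]
grid[4][4] = 1.0
grid = step_2d(grid)
```

Every interior node is updated independently from the previous time step, which is why this kind of solver maps naturally onto thousands of GPU threads.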

Explicit Finite Difference Solver (2D) on IGP

Explicit Finite Difference Solver (3D) on IGP

N-Body Simulation on IGP

As with the CPU compute, we run a simulation of 10240 particles of equal mass - the output for this code is in terms of GFLOPs, and the result recorded was the peak GFLOPs value.
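For readers unfamiliar with how an N-body run turns into a GFLOPs figure, the sketch below shows an all-pairs gravitational step and one way of converting interaction counts into GFLOPs. The softening term, the unit gravitational constant, and the figure of 20 floating point operations per interaction are assumptions for illustration only; the article does not state how its code counts operations.

```python
import math

# Assumed cost of one pairwise interaction; a common convention,
# not a number taken from the benchmark itself.
FLOPS_PER_INTERACTION = 20


def accelerations(pos, mass=1.0, softening=1e-3):
    """All-pairs accelerations for equal-mass particles (G = 1 assumed)."""
    eps2 = softening * softening
    acc = []
    for i, (xi, yi, zi) in enumerate(pos):
        ax = ay = az = 0.0
        for j, (xj, yj, zj) in enumerate(pos):
            if i == j:
                continue  # a particle exerts no force on itself
            dx, dy, dz = xj - xi, yj - yi, zj - zi
            r2 = dx * dx + dy * dy + dz * dz + eps2
            inv_r3 = mass / (r2 * math.sqrt(r2))
            ax += dx * inv_r3
            ay += dy * inv_r3
            az += dz * inv_r3
        acc.append([ax, ay, az])
    return acc


def gflops(n_particles, steps, seconds,
           flops_per_interaction=FLOPS_PER_INTERACTION):
    """Convert a timed run into GFLOPs under the assumed per-pair cost."""
    interactions = n_particles * (n_particles - 1) * steps
    return interactions * flops_per_interaction / seconds / 1e9
```

With 10240 particles each step involves roughly 105 million pairwise interactions, so the workload is dominated by arithmetic rather than memory traffic.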

N-Body Simulation on IGP

Matrix Multiplication on IGP

Matrix Multiplication occurs in a number of mathematical models, and is typically designed to avoid memory accesses where possible and optimize for a number of reads and writes depending on the registers available to each thread or batch of dispatched threads. Here we have a crude MatMul implementation, and iterate through a variety of matrix sizes to find the peak speed. Results are given in terms of ‘million nodes per second’ and a higher number is better.
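As a rough illustration of what a crude MatMul looks like, the sketch below computes each output element with a plain triple loop and then sweeps a few matrix sizes to report the best rate. The helper names `matmul` and `peak_rate` are hypothetical, and counting one output element as one 'node' is an assumption, since the article does not define the unit.

```python
import time


def matmul(a, b):
    """Naive matrix product: one dot product per output element."""
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0  # accumulate locally, write to c only once
            for k in range(inner):
                acc += a[i][k] * b[k][j]
            c[i][j] = acc
    return c


def peak_rate(sizes):
    """Best rate over the given square sizes, in million elements/second.

    Assumption: one output element counts as one 'node'.
    """
    best = 0.0
    for n in sizes:
        a = [[1.0] * n for _ in range(n)]
        b = [[1.0] * n for _ in range(n)]
        start = time.perf_counter()
        matmul(a, b)
        elapsed = time.perf_counter() - start
        if elapsed > 0.0:
            best = max(best, n * n / elapsed / 1e6)
    return best
```

Keeping the running sum in a local variable mirrors the idea of holding intermediate values in registers; on a GPU each `(i, j)` pair would typically be handled by its own thread.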

Matrix Multiplication on IGP

3D Particle Movement on IGP

This is similar to our 3DPM Multithreaded test, except that we run the fastest of our six movement algorithms with several million threads, each moving a particle in a random direction for a fixed number of steps. Final results are given in million movements per second, and a higher number is better.
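The per-thread work can be sketched as follows: each particle takes a fixed number of unit-length steps, each in a freshly sampled random direction. The sampling method used here (uniform z plus a uniform angle) is only one possible choice and is an assumption; the article does not say which of the six algorithms is the fastest, and all function names are hypothetical.

```python
import math
import random


def random_unit_vector(rng):
    """Uniform direction on the unit sphere (one possible sampling method)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return r * math.cos(phi), r * math.sin(phi), z


def move_particle(steps, rng):
    """Work done by one 'thread': move a single particle `steps` times."""
    x = y = z = 0.0
    for _ in range(steps):
        dx, dy, dz = random_unit_vector(rng)
        x, y, z = x + dx, y + dy, z + dz
    return x, y, z


def million_movements_per_second(particles, steps, seconds):
    """Each step of each particle counts as one movement."""
    return particles * steps / seconds / 1e6


# Usage: a handful of particles stands in for several million GPU threads.
rng = random.Random(42)
final_positions = [move_particle(100, rng) for _ in range(10)]
```

Because particles never interact, the problem needs no communication between threads, which suits the IGP's execution units well.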

3D Particle Movement on IGP


26 Comments


  • Hairs_ - Saturday, December 14, 2013 - link

    A crossfire test is absolutely a valid metric, and testing with older generations of cpu and gpu is something I'm wholly in favour of. My issue is that if someone was buying a dual gpu setup, they would match the cards. If they upgraded after a time differential due to a budget concern, they might mix generations. I don't see a scenario where someone is buying dual+single, however, because if you had the budget to go for a top of the line dual card, you'd be better off with two cheaper cards in crossfire.

    Would people still have 4 gen old setups? Absolutely! Would they mix and match cards to suit a tight budget? Sure. Would any retail buyer follow *this* pattern? Vanishingly unlikely.
  • Egg - Sunday, December 15, 2013 - link

    Wait, why wouldn't you use a 5970 and a 5870? Sure you lose a bit on clock speeds but it's exactly the same as triple 5870 otherwise, isn't it? And you save some slots.

    Perhaps someone wanted to do a really expensive water cooled microATX build with 4 slots. IDK, but it doesn't sound that farfetched in practice.

    I haven't done this, or CF'ed anything at all, so this is just my two cents...
  • Hairs_ - Sunday, December 15, 2013 - link

    Why? Like I said, there's no physical reason why you *couldn't*, but people use crossfire/sli for the same reason people overclock: to get top-line performance out of a smaller budget. The other use scenario is someone who wants bragging rights and doesn't care about the cost.

    For the primary, that user is not going to pay for a top line dual card when they could get similar numbers from two lower end cards. For the secondary use case, if money is no object (which it would have to be), why wouldn't you have bought two 5970's?

    This example doesn't fit into any consumer behavior, so I have no idea why it would be used as a test.
  • Giffs - Friday, December 13, 2013 - link

    No idea if there is any point in using WinRAR 4.2 when WinRAR 5.01 is available.
    Why not use the latest version? Wouldn't it bring improvements on the software side and more accurate results of some sort?
  • sinPiEqualsZero - Saturday, December 14, 2013 - link

    Thanks for the writeup. I'm shopping for memory and am happy I got to read this first.

    Also, I noticed an issue: " More expensive kits do not always equal performance, and as our benchmarks go, higher specification kits might also have little affect" should be effect with an e.

    Thanks, Ian!
  • Hairs_ - Saturday, December 14, 2013 - link

    There's no "might" about it. High spec kits make NO difference.
  • Hairs_ - Saturday, December 14, 2013 - link

    I wonder how it is that the entirely arbitrary "Performance Index" which was devised for memory testing isn't at all borne out by real-world data. Yet the bald statement

    "From the data in our memory overview, it was clear that any kit with a performance index of less than 200 was going to have issues on certain benchmarks. The Corsair kit has a PI of 240, which is at the higher end of the spectrum."

    is still maintained.

    There are no facts to back this statement up, as proved in the tests. Are Anandtech reviews going to continue to ignore factual data in favour of preconceived assumptions? I hope not.
  • Ytterbium - Saturday, December 14, 2013 - link

    Explicit Finite Difference: in this graph you have 1333 C9 mid pack and 1866 C9 at the bottom. I assume this is a typo?
  • Hairs_ - Saturday, December 14, 2013 - link

    If you look at all the graphs, the results aren't consistent at all. The kit that tops one graph can be at the bottom of the next. Furthermore, the differences between top and bottom scoring kits are negligible in almost all tests, so many of the differences in rank can be due to statistical variance rather than a meaningfully measured performance difference.

    E.g. in many tests, the fastest kit in terms of headline mhz (3ghz) is beaten by theoretically slower stuff.
  • Gen-An - Saturday, December 14, 2013 - link

    I wonder if Corsair has purposely set out to make this kit look bad. Every single review I've seen of the Vengeance Pro 2x8GB 2400C10 kit has been Ver4.21, which uses Samsung 4Gbit B-die ICs and are infamous for not being able to clock much higher than about DDR3-2500 or so. I have four sticks of these and they are Ver5.29 using Hynix 4Gbit MFR and I've done Super Pi 32M runs at DDR3-3000 12-15-15-45 and they are rock solid at DDR3-2666 11-13-13.
