OpenFoam

Several of our readers have already suggested that we look into OpenFoam. That's easier said than done, as good benchmarking means you have to master the sofware somewhat. Luckily, my lab was able to work with the professionals of Actiflow. Actiflow specialises in combining aerodynamics and product design. Calculating aerodynamics involves the use of CFD software, and Actiflow uses OpenFoam to accomplish this. To give you an idea what these skilled engineers can do, they worked with Ferrari to improve the underbody airflow of the Ferrari 599 and increase its downforce.

The Ferrari 599: an improved product thanks to Openfoam.

We were allowed to use one of their test cases as a benchmark, but we are not allowed to discuss the specific solver. All tests were done on OpenFoam 2.2.1 and openmpi-1.6.3.

Many CFD calculations do not scale well on clusters, unless you use InfiniBand. InfiniBand switches are quite expensive and even then there are limits to scaling. We do not have an InfiniBand switch in the lab, unfortunately. Although it's not as low latency as InfiniBand, we do have a good 10G Ethernet infrastructure, which performs rather well.

So we added a fifth configuration to our testing: the quad-node Intel Server System H2200JF. The only CPU that we have eight of right now is the Xeon E5-2650L 1.8GHz. Yes, it is not perfect, but this is the start of our first clustered HPC benchmark. This way we can get an of idea whether or not the Xeon E7 v2 platform can replace a complete quad-node cluster system and at the same time offer much higher RAM capacity.

OpenFoam test

The results are pretty amazing: the quad Xeon E7-4980 v2 runs circles around our quad-node HPC cluster. Even if we were to outfit it with 50% higher clocked Xeons, the quad Xeon E7 v2 would still be the winner. Of course, there is no denying that our quad-node cluster is a lot cheaper to buy. Even with an InfiniBand switch, an HPC cluster with dual socket servers is a lot cheaper than a quad socket Intel Xeon E7 v2.

However, this bodes well for the soon to be released Xeon E5-46xx v2 parts. QPI links are even lower latency than InfiniBand. But since we do not have a lot of HPC testing experience, we'll leave it up to our readers to discuss this in more detail.

Another interesting detail is that the Xeon 2650L at 1.8GHz is about twice as fast as a Xeon L5650. We found AVX code inside OpenFoam 2.2.1, so we assume that this is one of the cases where AVX improves FP performance tremendously. Seasoned OpenFoam users, let us know whether is the accurate assessment.

SAP S&D Benchmark Conclusion
Comments Locked

125 Comments

View All Comments

  • Kevin G - Saturday, February 22, 2014 - link

    Not 100% sure since I'm not an IEEE member to view it, but this paper maybe the source for the POWER7+ figures:
    http://ieeexplore.ieee.org/xpl/articleDetails.jsp?...
  • Phil_Oracle - Monday, February 24, 2014 - link

    TDP is great for comparing chip to chip, but what really matters is system performance/watt. And although Intel's latest Xeon E7 v2 may have better TDP specs than either Power7+ or SPARC T5, when you look at the total system performance/watt, SPARC T5 actually leads today due to its higher throughput, core count, 4 x more threads, built-in encryption engines and higher optimization with the Oracle SW stack.
  • Flunk - Friday, February 21, 2014 - link

    8 core consumer chips now please. If you have to take the GPU off go for it.
  • DanNeely - Friday, February 21, 2014 - link

    Assuming you mean 8 identical cores, until mainstream consumer apps appear that can use more CPU resources than the 4HT cores in Intel's high end consumer chips but which can't benefit from GPU acceleration become common it's not going to happen.

    I suppose Intel could do a big.little type implementation with either core and atom or atom and the super low power 486ish architecture they announced a few months ago in the future. But in addition to thinking it was worthwhile for the power savings, they'd also need to license/work around arm's patents. I suppose a mobile version might happen someday; but don't really see a plausible benefit for laptop/desktop systems that don't need continuous connected standby like phones do.
  • Kevin G - Friday, February 21, 2014 - link

    Intel hasn't announced any distinct plans to go this route, they're at least exploring the idea at some level. The SkyLake and Knights Landing are to support the same ISA extensions and in principle a program could migrate between the two types of cores.
  • StevoLincolnite - Saturday, February 22, 2014 - link

    Er. You don't need apps to use more than 4 threads to make use of an 8 core processor.
    Whatever happened to running several demanding applications at once? Surely I am not the only one who does this...
    My Sandy-Bridge-E processor being a few years old is starting to show it's age in such instances, I would cry tears of blood for an 8-Core Haswell based processor to replace my current 6-core chip.
  • psyq321 - Monday, March 10, 2014 - link

    Well, you can buy bigger Ivy Bridge EP Xeon CPU and fit it in your LGA2011 system.

    This way you can go up to 12 cores and not have to wait for 8-core Haswell E.
  • SirKnobsworth - Friday, February 21, 2014 - link

    8 core Haswell-E chips are due out later this year. You can already buy 6 core Ivy Bridge-E chips with no integrated graphics.
  • TiGr1982 - Friday, February 21, 2014 - link

    Did you know:
    Haswell-E is supposed to be released in Q3 this year, to have up to 8 Haswell cores with HT, fit in the new revision of Socket LGA2011 (incompatible with the current desktop LGA2011), and work with DDR4 and X99 chipset. No GPU there, since it's a byproduct of server Haswell-EP.
  • Harry Lloyd - Friday, February 21, 2014 - link

    That will not help much, unless they release a 6-core chip for around 300 $, replacing the lowest LGA2011 4-core chips. It is about time.

Log in

Don't have an account? Sign up now