Energy and HPC

AVX/FP-intensive applications are known to be real power hogs. How bad can it get? We used the OpenFOAM test and measured both average and maximum power (the 95th percentile). Average power tells us how much energy each HPC job will consume, while maximum power is important because you have to allocate enough amps to your rack to feed your HPC server or cluster.
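
To make the distinction concrete, here is a minimal sketch (not the actual tooling used in this review) of how both metrics can be derived from a log of wall-power samples; the sample values below are invented purely for illustration.

```python
import statistics

# Hypothetical wall-power samples (watts, one reading per second) from a power meter log.
samples_w = [312, 540, 655, 690, 702, 698, 688, 640, 410, 705,
             699, 680, 692, 701, 695, 688, 670, 655, 630, 600]

avg_power_w = statistics.mean(samples_w)                 # drives the energy used per job
p95_power_w = statistics.quantiles(samples_w, n=20)[-1]  # "maximum" power (95th percentile)

print(f"average: {avg_power_w:.0f} W, 95th percentile: {p95_power_w:.0f} W")
```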

HPC maximum performance power consumption

This confirms there is more going on than just the fact that our "Wildcat Pass" server consumes more than the Supermicro server in this test. At peak, the Xeon E5-2699 v3 consumes almost 450W (!!) more than at idle. Even if we assume that the fans take 100W, that means that 350W is going to the CPUs. That's around 175W per socket, and even though it's measured at the wall and thus includes the voltage regulators, that's a lot of power. The Xeon E5-2699 v3 is a massive powerhouse, but it's one that needs a lot of amps to perform its job.

Interestingly, the Xeon E5-2695 v3 also uses more power than all the previous Xeons. The contrast with our Drupal power measurements is very telling. In the Drupal test, the CPU was able to let many of the cores sleep much of the time. In OpenFOAM, all the cores are working at full bore, so the superior power savings of the Haswell cores' deep sleep states do not matter much. But which CPU is the winner? To make this clearer, we have to calculate the actual energy consumed (average power × run time).
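
As a reminder of the arithmetic behind the next chart: energy per job is simply average power multiplied by run time. The quick sketch below uses made-up numbers (not our measured results) to show why a faster but hungrier chip can still come out ahead on energy.

```python
# Illustrative placeholders only: (average power in W, run time in s) per job.
jobs = {
    "slower, lower-power CPU": (330, 1500),
    "faster, higher-power CPU": (640, 700),
}

for cpu, (avg_w, seconds) in jobs.items():
    energy_wh = avg_w * seconds / 3600  # watt-seconds (joules) -> watt-hours
    print(f"{cpu}: {energy_wh:.0f} Wh per job")
```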

Total HPC Energy Consumption per job

When we look at how much energy is consumed to get the job done, the picture changes. The old Xeon "Sandy Bridge EP" is far behind; it is clear that Intel has improved AVX efficiency quite a bit. The low-power Xeon E5-2650L v3 is the clear winner. In second place, the fastest Xeon on the planet actually saves energy compared to the older Xeons, as long as you can provide the peak amps.

Comments

  • bsd228 - Friday, September 12, 2014 - link

    Now go price memory for M-class Sun servers... even small upgrades run to five figures, and going four years back, a mid-sized M4000-type server was going to cost you around $100k with moderate amounts of memory.

    And take up a large portion of the rack. Whereas you can stick two of these 18-core guys in a 1U server and have 10 of them (180 cores) for around the same sort of money.

    Big iron still has its place, but the economics will always be lousy.
  • platinumjsi - Tuesday, September 9, 2014 - link

    ASRock are selling boards with DDR3 support, any idea how that works?

    http://www.asrockrack.com/general/productdetail.as...
  • TiGr1982 - Tuesday, September 9, 2014 - link

    Well... ASRock is generally famous for "marrying" different generations of hardware.
    But here, since this is about DDR RAM, which is governed by the CPU itself (the memory controller is inside the CPU), my only guess is that the Xeon E5 v3 may have a dual-mode memory controller (supporting either DDR4 or DDR3), similar to what the Phenom II had back in 2009-2011, which supported either DDR2 or DDR3 depending on where you plugged it in.

    If so, then the performance of the E5 v3 with DDR3 will probably be somewhat inferior to DDR4.
  • alpha754293 - Tuesday, September 9, 2014 - link

    No LS-DYNA runs? And yes, for HPC applications, you actually CAN have too many cores (because you can't keep the working cores pegged with work/something to do, so you end up with a lot of data migration between cores, which is bad, since moving data means that you're not doing any useful work ON the data).

    And how you decompose the domain (for both LS-DYNA and CFD) makes a HUGE difference in total runtime performance.
  • JohanAnandtech - Tuesday, September 9, 2014 - link

    No, I hope to get that one done in the more Windows/ESXi-oriented review.
  • Klimax - Tuesday, September 9, 2014 - link

    Nice review. Next stop: Windows Server. (And MS-SQL..)
  • JohanAnandtech - Tuesday, September 9, 2014 - link

    Agreed. PCIe Flash and SQL Server look like a nice combination to test these new Xeons.
  • TiGr1982 - Tuesday, September 9, 2014 - link

    Xeon 5500 series (Nehalem-EP): up to 4 cores (45 nm)
    Xeon 5600 series (Westmere-EP): up to 6 cores (32 nm)
    Xeon E5 v1 (Sandy Bridge-EP): up to 8 cores (32 nm)
    Xeon E5 v2 (Ivy Bridge-EP): up to 12 cores (22 nm)
    Xeon E5 v3 (Haswell-EP): up to 18 cores (22 nm)

    So, in this progression, the core count increases by 50% (1.5 times) almost every generation.

    So, what's gonna be next:

    Xeon E5 v4 (Broadwell-EP): up to 27 cores (14 nm) ?

    Maybe four rows with 5 cores and one row with 7 cores (4 x 5 + 7 = 27) ?
  • wallysb01 - Wednesday, September 10, 2014 - link

    My money is on 24 cores.
  • SuperVeloce - Tuesday, September 9, 2014 - link

    What's the story with the 2637 v3? Only 4 cores and the same frequency and $1k price as the 6-core 2637 v2? By far the most pointless CPU on the list.
