LS-DYNA Power Consumption

For HPC buyers, peak power tends to be a very important metric. Because HPC systems run at or close to 100% CPU load, power draw stays at its peak for long periods, so peak power also determines the cooling and energy requirements. This is in sharp contrast with most other servers, where sizing power and amperage for the peak load of a complete rack is considered wasteful, as it is highly unlikely that all servers will hit 100% CPU load at the same time. We took the 95th percentile of our power numbers.
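As a rough illustration of what taking the 95th percentile means, here is a minimal sketch. The article does not say which percentile definition was used, so the nearest-rank method is an assumption, and the readings are hypothetical values, not measured data.

```python
import math

def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers pct percent of the samples
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]

# Hypothetical wall-power readings in watts (illustrative only).
readings = [512, 518, 520, 517, 519, 521, 516, 520, 522, 519,
            518, 520, 521, 519, 517, 520, 523, 519, 518, 560]

p95 = percentile_nearest_rank(readings, 95)
print(f"max = {max(readings)} W, 95th percentile = {p95} W")
```

In this made-up series the single 560W spike sets the maximum, while the 95th percentile stays at the level the system sustains.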

LS-DYNA Peak Power consumption

Note that the Xeon E5 numbers are not directly comparable to the Opteron numbers as the CPUs are tested in servers with different form factors. We will tackle that in the next test. Let us focus on the Opteron results for now.

AMD has made some real progress here. At the same clock, the total power consumption is 6% lower. Even at a 200MHz higher clock, the peak power is very slightly, but consistently, lower (2%).

Of course, we also want to compare the AMD and Intel CPUs directly. To do this, we always run the fans at maximum speed. That way, the fans always consume the same amount of power. We then test with one and two CPUs, while keeping the amount of memory (64GB) the same. This way we measure how much extra power you consume at the wall when you add a second CPU. This number thus includes the voltage regulators (which can amount to up to 10% of the total server power) and the PSU inefficiency.
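The one-CPU versus two-CPU method boils down to a subtraction of two wall readings taken under otherwise identical conditions. A minimal sketch, with a hypothetical function name and hypothetical readings:

```python
def second_cpu_power_at_wall(one_cpu_watts, two_cpu_watts):
    """Extra wall power drawn when the second CPU is added.

    The result includes voltage regulator and PSU losses, because both
    readings are taken at the wall socket.
    """
    if two_cpu_watts < one_cpu_watts:
        raise ValueError("two-CPU reading should not be below the one-CPU reading")
    return two_cpu_watts - one_cpu_watts

# Hypothetical peak readings in watts: same memory (64GB), fans at maximum speed.
one_cpu = 210.0
two_cpu = 310.0

delta = second_cpu_power_at_wall(one_cpu, two_cpu)
print(f"Second CPU adds {delta:.0f} W at the wall")
```

Because fans and memory are held constant, they cancel out in the subtraction; only the second CPU and its share of the conversion losses remain.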

LS-DYNA Peak CPU Power consumption

The Intel Xeon has a TDP of 95W, but even with a very FP-intensive application it does not get anywhere near that number. About 75W of those 94W are consumed by the CPU itself, as measured by our hardware monitoring software, which reads out the MSRs. We are still working on a version for the AMD platform (AMD's documentation is a bit late), but we estimate that the Opteron 6376 consumes about 110W and the Opteron 6380 needs about 120W. That means that AMD's top CPUs probably consume a bit more than their TDP indicates if you push the FP unit hard.
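To make the Xeon arithmetic explicit, the sketch below splits the wall figure into the CPU package and everything else. It assumes the 94W is the wall-side power added by the second Xeon and the 75W is the package power reported by the MSRs; the function name is hypothetical.

```python
def split_wall_delta(wall_delta_watts, package_watts):
    """Split wall-side power into the CPU package and the remaining losses.

    Returns (losses_watts, losses_share), where the losses cover whatever
    is not consumed inside the CPU package (voltage regulators, PSU).
    """
    if not 0 <= package_watts <= wall_delta_watts:
        raise ValueError("package power must lie between 0 and the wall delta")
    losses = wall_delta_watts - package_watts
    return losses, losses / wall_delta_watts

# Figures quoted in the text for the Xeon.
losses, share = split_wall_delta(94, 75)
print(f"{losses} W outside the CPU package ({share:.0%} of the wall figure)")
```

Under these assumptions, roughly a fifth of the extra wall power never reaches the CPU.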

We also tried to measure idle power. Take the numbers with a grain of salt, but we measured about 19-20W for the Opteron 6380 (p-states disabled), 17-18W for the Opteron 6376 and 16-17W for the Xeon.

55 Comments

  • Sivar - Wednesday, February 20, 2013

    Please go away. You don't add any new information to the discussion.

    Your writing is that of a teenager who knows nothing of processor architecture, the brilliant engineers at both AMD and Intel, or the competitive landscape.

    You present no data, only misinformed opinion. You reduce the quality of this discussion, and have shown no interest in improving your knowledge.
  • JamesAnthony - Wednesday, February 20, 2013

    In the article it mentions you were using the E5-2660 CPU (8 core 2.2 GHz) 95W, in a Dell PowerEdge R720 server

    It may have been a lot more useful to also have included the E5-2680 (8 core 2.7 GHz) and the E5-2690 (8 Core 2.9 GHz) as while they are 130W parts, they are ones that are often used in the PowerEdge R720 and from what we find in a lot of server sales the higher performance ones are very popular for transactional database servers and payment processing servers.

    If you want to go head to head on Intel's top part vs AMD's top part, then it would seem it should be the E5-2690 vs 6386 SE
  • JohanAnandtech - Wednesday, February 20, 2013

    We all know that when you want top performance, Intel is the way to go. So I don't really see the point; even AMD will tell you that the 6376 and 6380 are their most competitive parts. It is pretty obvious that the E5-2690 2.9 GHz will be faster and consume less than a 6386 SE. I don't think our readers really need to see numbers on that.

    And I really doubt that the E5-2690 sells that much. Most reports say that the top bins with the highest TDP are less than 5% of the total sales.
  • lwatcdr - Wednesday, February 20, 2013

    Wow this is about the most gibberish I have seen in a post ever.
    Good heavens you are an idiot.
    Let's just tear this post to bits so this person will NEVER post on here again.
    "No, it's worth per dollar that you have paid to buy Intel based servers. Intel is more reliable because it has Hyperthreading so you can reduce the latencies that will occur in every workloads."
    Hyperthreading has nothing to do with reliability. So that was a waste of bandwidth.
    "Unlike AMD's engineers who can not design a microprocessor properly. It was AMD's own fault why AMD did not have money like Intel"
    May I introduce you to Titan http://www.olcf.ornl.gov/titan/ The world's most powerful computer, and powered by AMD CPUs. AKA yeah, I think that AMD can actually do pretty well at designing CPUs, so this part of your post is also pure manure.
    "Look 99% Bank's in the world uses Intel based ATM as Intel processor can send information without any error." And here we can see that you understand nothing about digital theory or communications. Again a waste of bandwidth.
    "That is why IBM itself does not use Power based processors for its ATM machine because its CEO has admitted that its engineers are not capable to design a lower power processor. So, IBM uses Intel as the standard processor to exchange information between ATM machine to server, so every digits that sent will come in exact same digits when it has been received."
    The IBM Power line is for high-end systems, not for ATM machines. Odds are good that many banks use Power-based systems for handling ATM transactions. IBM uses Intel or AMD because it is cheap and you can get standard boards. As to the every digit sent nonsense. IT IS DIGITAL you MORON. The communications links have error checking and correction, not the CPUs. Please NEVER WASTE OUR TIME AGAIN, YOU KNOW NOTHING OF VALUE ON THIS SUBJECT.
  • toyotabedzrock - Wednesday, February 20, 2013

    Something is wrong with the LZMA benchmarks.

    Can you do a real-world test? There are scripts out there to do this.

    LZMA is built around the idea that decompression is supposed to be much faster than compression.
  • JohanAnandtech - Wednesday, February 20, 2013

    From the 7zip manual:

    "The benchmark shows a rating in MIPS (million instructions per second). The rating value is calculated from the measured speed, and it is normalized with results of Intel Core 2 CPU with multi-threading option switched off. "

    So that is the reason why the compression MIPS values are of the same order as the decompression values. The decompression "MB/s" values are indeed 10x or more higher than the compression values.
  • Oldboy1948 - Thursday, February 21, 2013

    It is an interesting bench, and if cache and memory are fast, decompress and compress will be very close. It looks better for Bulldozer in this:
    http://www.7-cpu.com/

    ARM has a long way to go if it will be a server one day.
  • extide - Wednesday, February 20, 2013

    Can we PLEASE get folding@home benches?! musky on the hardocp forums has come up with a system where you can run repeatable benchmarks. Myself as well as many others would really love to see F@H benches on systems like this!
  • JohanAnandtech - Wednesday, February 20, 2013

    Ok, Link? :-)
  • alpha754293 - Wednesday, February 20, 2013

    Because of the way that the current Opteron architecture is (1 FPU per module), did you run with the number of LS-DYNA processes equal to the number of FPUs on chip or did you run it based on per "core" (i.e. 2 processes per module)?
