LS-DYNA

LS-DYNA is a "general purpose structural and fluid analysis simulation software package capable of simulating complex real world problems", developed by the Livermore Software Technology Corporation (LSTC). It is used by the automobile, aerospace, construction, military, manufacturing and bioengineering industries. Even simple simulations take hours to complete, so even a small performance increase results in tangible savings. In addition, many of our readers have been asking us to perform some benchmarking with HPC workloads. Those are reasons enough to include our own LS-DYNA benchmarking.

These numbers are not directly comparable with AMD's and Intel's benchmarks, as we did not perform any special tuning beyond using the message passing interface (MPI) version of LS-DYNA (ls971_mpp_hpmpi) to run the solver and get maximum scalability. This is the HP-MPI version of LS-DYNA 9.71.

Our first test is a refined revised Neon crash test simulation.

LS-Dyna Neon-Refined Revised

This is one of the few benchmarks (besides SAP) where the Opteron 6276 outperforms the older Opteron 6174 by a tangible margin (about 20%) and is significantly faster than the Xeon 5600: 40%, to be more precise. However, the direct competitor of the 6276, the Xeon E5-2630, will do a bit better (see the E5-2660 6C score). If you are aiming for the best performance, it is impossible to beat the best Xeons: the Xeon E5-2660 offers 26% better performance and the E5-2690 is 46% faster. It is interesting to note that LS-DYNA does not scale well with clock speed: the 32% higher clock speed of the Xeon E5-2690 results in only a 15% speed increase.
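As a rough sanity check on the clock speed remark, the sketch below redoes the arithmetic. It is our own illustration and not part of the benchmark: it assumes the 26% and 46% leads are measured against the same baseline, and it uses the nominal clock speeds of 2.2 GHz (E5-2660) and 2.9 GHz (E5-2690). The function names are hypothetical.

```python
def relative_gain(new: float, old: float) -> float:
    """Fractional increase of `new` over `old` (0.25 means 25% higher)."""
    return new / old - 1.0


def scaling_efficiency(speed_gain: float, resource_gain: float) -> float:
    """Share of a resource increase that shows up as extra performance."""
    return speed_gain / resource_gain


# Assumption: nominal (non-turbo) clock speeds in GHz.
clock_gain = relative_gain(2.9, 2.2)

# Assumption: both leads are relative to the same baseline score of 1.0.
speed_gain = relative_gain(1.46, 1.26)

efficiency = scaling_efficiency(speed_gain, clock_gain)

print(f"Clock speed increase: {clock_gain:.1%}")
print(f"Performance increase: {speed_gain:.1%}")
print(f"Scaling efficiency:   {efficiency:.2f}")
```

Under those assumptions, only about half of the extra clock speed translates into extra LS-DYNA performance.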

A few other interesting things to note: we saw only a very small performance increase (+5%) due to Hyperthreading. Memory bandwidth does not seem to be critical either, as performance increased by only 6% when we replaced DDR3-1333 with DDR3-1600. If LS-DYNA were severely bottlenecked by memory speed, we should have seen a performance increase close to 20% (1600 vs 1333).
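To put a number on how weakly memory speed matters here, the sketch below applies a simple Amdahl-style model to the two figures above. This is our own back-of-the-envelope estimate, not a measurement: it assumes that a fixed share of the runtime scales perfectly with memory bandwidth and that the rest does not change at all. The function name is hypothetical.

```python
def bandwidth_bound_fraction(observed_speedup: float, bandwidth_ratio: float) -> float:
    """Estimate the share of runtime that scales with memory bandwidth.

    Model (an assumption): new_time = (1 - f) + f / bandwidth_ratio,
    with the old time normalized to 1. Solving for f gives the result.
    """
    time_saved = 1.0 - 1.0 / observed_speedup
    max_saving = 1.0 - 1.0 / bandwidth_ratio
    return time_saved / max_saving


# DDR3-1600 versus DDR3-1333: roughly 20% more theoretical bandwidth.
bandwidth_ratio = 1600 / 1333

# Observed: 6% higher performance with the faster memory.
fraction = bandwidth_bound_fraction(1.06, bandwidth_ratio)

print(f"Bandwidth increase: {bandwidth_ratio - 1:.0%}")
print(f"Estimated bandwidth-bound share of runtime: {fraction:.0%}")
```

With this simplified model, about a third of the runtime would be bound by memory bandwidth, which fits the observation that faster DIMMs help only a little.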

CMT boosted the Opteron 6276's performance by up to 33%, which seems odd at first, since LS-DYNA is a typical floating point intensive application. As the shared floating point unit "outsources" loads and stores to the integer cores, the most logical explanation is that LS-DYNA is limited by load/store bandwidth. This is in sharp contrast with, for example, 3DS Max, where the additional overhead of 16 extra threads slowed the shared FP unit down instead of speeding it up.

Also, the CPUs of both vendors seem to have made good use of their turbo capabilities. The AMD Opteron was running at 2.6 GHz most of the time, the Xeon E5-2690 at 3.3 GHz and the Xeon E5-2660 at 2.6 GHz.

The second test is the "Three Vehicle Collision Test" simulation, which runs a lot longer.

LS-Dyna Three Vehicle Collision Test

The three vehicle collision test does not change the benchmarking picture; it confirms our earlier findings. The Opteron Interlagos does well, but the Xeon E5 is the new HPC champion.


81 Comments


  • think-ITB-live-OTB - Tuesday, March 6, 2012 - link

    Can i ask you a question? do you at least get paid when you bend over for Intel?

    These are Server Chips - who cares about single-threaded application performance.. or Corporate IPOs. AMD has delivered far greater TCO/performance than Intel has for at least a Decade and running.

    You want to praise a company like a Deity? ARM Holdings. nuff said. They can design a 35 dollar computer that can decode H.264 better than Intel can on SoCs that run 4x's the price. Currently have more Chips in more devices than in Intels entire history and Push Power envelopes far beyond anything Intel could ever muster.

    Just you wait before the Storm ARM and its Licensees unleash as it will eventually take over ALL markets including the Server space (Calxeda much?). Oh and as for Apple. (an ARM Licensee itself... i can see them moving to in-house ARM designs pretty soon). 4-6-8 Core Cortex A15 (with A7 core for low power iPod/tablet sync) Macbook Airs anyone?

    Intel is becoming the strongest of the Dinosaurs. But even the T-Rex fell eventually.
  • swizeus - Wednesday, March 7, 2012 - link

    We have been using the Flemish/Dutch Web 2.0 website Nieuws.be as a benchmark for some time. 99% of the loads on the database are selects and about 5% of them are stored procedures.

    The database is loaded 104%. is it possible ?
  • JohanAnandtech - Wednesday, March 7, 2012 - link

    Stored procedures can contain selects :-)
  • fredisdead - Saturday, April 7, 2012 - link

    From the 'article' .....

    'The Opteron might also have a role in the low end, price sensitive HPC market, where it still performs very well. It won't have much of chance in the high end clustered one as Intel has the faster and more power efficient PCIe interface'

    Well, if that's the case, why exactly would AMD be scoring so many design wins with Interlagos. Including this one ...

    http://www.pcmag.com/article2/0,2817,2394515,00.as...

    http://www.eweek.com/c/a/IT-Infrastructure/Cray-Ti...

    U think those guys at Cray were going for low performance ? In fact, seems like AMD has been rather cleaning up in the HPC market since the arrival of Interlagos. And the markets have picked up on it, AMD stock is thru the roof since the start of the year. Or just see how many Intel processors occupy the top 10 supercomputers on the planet. Nuff said ...
  • InsaneScientist - Wednesday, March 7, 2012 - link

    Johan, where in the specs where you have this line:
    Transistors (Billion) 2,26 2x 1,2 2x 904 1,17

    I sure hope that 2x 904 (Billion) is a typo... otherwise AMD has some serious explaining to do. ;)

    Should be 2x ,904 (I think? Would be 2x .904 for me, I assume you follow the same rules...)
  • iliev - Wednesday, March 7, 2012 - link

    Page 5, Benchmark Configuration

    R2208GZ4GSSPP specs table... E5-2660 is 2.2Ghz, and not 2.9GHz
  • dodge776 - Wednesday, March 7, 2012 - link

    Hi Johan,
    Always look forward to reading your server reviews at AT, but no SAPS benchmarks this time?
  • ppennisi - Wednesday, March 7, 2012 - link

    For maximum VMware performance on Opteron Interlagos cpu under VMWARE it's better to disable C1E and enable, where available, HPC mode.

    I found on a fresh installation of ESXi 5.0 on a Dell R715 that leaving C1E enabled literally crippled VM performance.
  • boudini - Thursday, March 8, 2012 - link

    I'm not sure I would recommend using iray as a reliable benchmark renderer in 3ds max. It is not a self configuring mental ray, but an unbiased renderer which behaves fairly differently to mental ray, and most other renderers such as vray, final render and brazil. It is comparable to maxwell and fryrender, but is very new compared to those two longer established unbiased render engines. It also attempts to use the gpu to add to its calculations as well - which could significantly skew results.

    Using mental ray or vray might well give you quite a different result, and besides I don't think iray is widely used in the industry.
  • omega4711 - Friday, March 9, 2012 - link

    This. The results of iray are mostly dependent on the GPU. The lack of proper scaling certainly isn't due to Amdahl's law. Just use mentalray with small enough render buckets and you can easily satisfy 64+ threads.

    Also, due to the limitations of iray, it can (at this moment) only be used in about 1-3% of real world scenarios.

    Please, for all the people that care about these benchmarks, use mentalray and/or vray.

    Otherwise, it's a brilliant article.
