Nehalem EX Confusion

One reason the Xeon X7560 did not show its full potential at launch was a small error in the firmware of the Dell R810 test platform, which caused the memory subsystem to underperform. As a result, some bandwidth-sensitive benchmarks, including many HPC applications, ran slower than they should have. Intel claimed that a dual-CPU configuration should be able to reach 39GB/s and a quad-CPU configuration up to 70GB/s. We could not reach those STREAM numbers because we test with our somewhat older STREAM binary, as described here. Using the same binary as before lets us compare our findings with all our previous measurements.
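For readers unfamiliar with what the TRIAD figure represents, here is a minimal sketch of the kernel and of how a bandwidth number is derived from it. It is not the STREAM binary used for the measurements in this article. The function names are made up for illustration, and the 8-byte element size assumes double-precision arrays. STREAM counts three array accesses per element for TRIAD (two reads and one write).

```python
import time
from array import array


def triad(a, b, c, q):
    """TRIAD kernel: a[i] = b[i] + q * c[i] for every element."""
    for i in range(len(a)):
        a[i] = b[i] + q * c[i]


def triad_bytes(n, bytes_per_element=8):
    """Bytes counted for one TRIAD pass: two reads and one write per element."""
    return 3 * n * bytes_per_element


def bandwidth_gb_per_s(n, seconds, bytes_per_element=8):
    """Convert one timed TRIAD pass over n elements into GB/s (1 GB = 1e9 bytes)."""
    return triad_bytes(n, bytes_per_element) / seconds / 1e9


if __name__ == "__main__":
    n = 1_000_000  # illustrative size; a real run uses arrays far larger than the caches
    a = array("d", [0.0]) * n
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n

    start = time.perf_counter()
    triad(a, b, c, 3.0)
    elapsed = time.perf_counter() - start

    print(f"TRIAD: {bandwidth_gb_per_s(n, elapsed):.3f} GB/s")
```

A pure Python loop is limited by the interpreter rather than by memory, so the number it prints says nothing about the hardware. The sketch only shows what is being counted. The real benchmark runs a compiled kernel across all threads, which is why the results depend on the binary used.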

We reran our STREAM benchmarks on the new QSSC-S4R server system.

[Chart: STREAM TRIAD on 64-bit Linux, maximum threads. An asterisk marks new measurements.]

The new results show that available memory bandwidth is about 21% higher (29GB/s) than what we previously measured on the Dell R810 (24GB/s). That means that many benchmark results published at the launch of the Xeon 7500 and obtained on the Dell R810 were too low, especially the HPC ones. The Xeon X7560 will not be able to beat the quad Opteron 6174 in raw bandwidth, but it is far from a bandwidth-starved platform.
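The "about 21%" figure follows directly from the two rounded bandwidth numbers quoted above. As a quick check, the helper below (a hypothetical name, used only for this illustration) computes the relative gain.

```python
def relative_gain_percent(new, old):
    """Percentage by which `new` exceeds `old`."""
    return (new - old) / old * 100.0


# Rounded STREAM TRIAD results from the text, in GB/s.
qssc_s4r = 29.0
dell_r810 = 24.0

gain = relative_gain_percent(qssc_s4r, dell_r810)
print(f"{gain:.1f}% higher")  # prints "20.8% higher"
```

Because the inputs are rounded to whole GB/s, the result is only approximate.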

51 Comments

  • duploxxx - Thursday, September 2, 2010 - link

    Looking at the differences between olap/oltp and web it is very clear that this web based test:

    The MCS eFMS portal, a real-world facility management web application, has been discussed in detail here. It is a complex IIS, PHP, and FastCGI site running on top of Windows 2003 R2 32-bit. Note that these two VMs run in a 32-bit guest OS, which impacts the VM monitor mode. We left this application running on Windows 2003, as virtualization allows you to minimize costs by avoiding unnecessary upgrades. We use three MCS VMs, as web servers are more numerous than database servers in most setups. Each VM gets two vCPUs and 2GB of RAM space.

    is really in favor of Intel CPUs. This actually throws the final result off a bit....

    Database-wise, it would actually mean that you can order an L5640 or a 6136 and get about the same virtualization performance. The difference you see is therefore due only to the behavior and results of the web-based VMs. I think it is clear that although vApus is a nice benchmark, it should be expanded with different kinds of applications; in the end, the web-based workload leads to a wrong overall conclusion.
