Nehalem EX Confusion

One of the reasons that the Xeon X7560 did not show its full potential at launch was a small error in the firmware of the Dell R810 testing platform, which caused the memory subsystem to underperform. As a result, some of the bandwidth-sensitive benchmarks, including many HPC applications, did not perform optimally. Intel claimed that a dual-CPU configuration should be able to reach 39GB/s and a quad-CPU configuration up to 70GB/s. We could not reach those Stream numbers because we test with our somewhat older Stream binary, which we have described before. Using the same Stream binary as before allows us to compare our findings with all our previous measurements.
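
For readers unfamiliar with what the benchmark measures, the Triad kernel can be sketched roughly as follows. This is a minimal single-threaded Python illustration, not the Stream binary we use. The names `triad`, `triad_bytes`, and `measure_triad` are made up for this sketch, and interpreter overhead means the figure it reports says nothing about the hardware limits discussed here. It assumes 8-byte doubles and the usual Stream accounting of 24 bytes per Triad iteration (two reads and one write).

```python
import time
from array import array


def triad(a, b, c, q):
    """Triad kernel: a[i] = b[i] + q * c[i] for every element."""
    for i in range(len(a)):
        a[i] = b[i] + q * c[i]


def triad_bytes(n, word_bytes=8):
    """Bytes counted for one Triad pass: two reads and one write per element."""
    return 3 * word_bytes * n


def measure_triad(n=200_000, q=3.0):
    """Time one Triad pass over n doubles; return (seconds, GB/s)."""
    a = array("d", [0.0]) * n
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    start = time.perf_counter()
    triad(a, b, c, q)
    elapsed = time.perf_counter() - start
    # Bandwidth is bytes moved divided by elapsed time, reported in GB/s.
    return elapsed, triad_bytes(n) / elapsed / 1e9


if __name__ == "__main__":
    seconds, gb_per_s = measure_triad()
    print(f"Triad pass: {seconds:.4f} s, {gb_per_s:.3f} GB/s (interpreter-bound)")
```

The real benchmark runs the same loop in compiled code across all hardware threads, on arrays far larger than the caches, which is why the choice of binary and compiler can change the reported bandwidth.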

We reran our stream benchmarks on the new QSSC-S4R server system.

Chart: Stream TRIAD on 64-bit Linux, maximum threads (* marks new measurements)

The new results show that available memory bandwidth is about 21% higher (29GB/s) than what we previously measured on the Dell R810 (24GB/s). That means that many benchmark results published at the launch of the Xeon 7500 and obtained on the Dell R810 were too low, especially the HPC ones. The Xeon X7560 will not be able to beat the quad Opteron 6174 when it comes to raw bandwidth, but it is far from a bandwidth-starved platform.
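
The 21% figure follows directly from the two measurements quoted above. As a quick check, here is a small Python sketch; the helper name `percent_gain` is only for illustration.

```python
def percent_gain(old, new):
    """Relative increase of new over old, in percent."""
    return (new - old) / old * 100.0


r810_gbs = 24.0      # Stream TRIAD previously measured on the Dell R810
qssc_s4r_gbs = 29.0  # Stream TRIAD measured on the QSSC-S4R

gain = percent_gain(r810_gbs, qssc_s4r_gbs)
print(f"Bandwidth gain: {gain:.1f}%")  # prints "Bandwidth gain: 20.8%"
```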


51 Comments


  • davegraham - Wednesday, August 11, 2010

    which is actually why you should be using a Cisco C460 for this type of test.

    dave
  • MySchizoBuddy - Wednesday, August 11, 2010

    Is there an exact correlation between the number of cores and VMs? How many VMs can a 48-core system support?

    Let's assume you want 100 systems virtualized. What's the minimum number of cores that will handle those 100 VMs?
  • dilidolo - Wednesday, August 11, 2010

    Depends on how many vCPUs and how much memory you assign to each VM, and how much physical memory your server has. CPU is rarely the bottleneck; memory and storage are.

    Then not all the VMs have the same workload. So no one can really answer your question.
  • davegraham - Wednesday, August 11, 2010

    was going to say that a small amount of memory oversubscription is "ok" depending on the workload but you'd want that buffered with something a little more powerful than spinning disk (SSD, for example).
  • tech6 - Wednesday, August 11, 2010

    The parameters for determining the optimal configuration for VMWare go well beyond just which CPU is faster. I like the AT stories about server tech but there need to be broader considerations of server features.

    1. Many applications are memory limited and not CPU bound, so memory flexibility may trump CPU power. That is why 256GB with a dual 75xx or 6xxx series CPU in an 810 may well be a better choice than either a quad-socket or dual-socket 56xx configuration.

    2. Software licensing is a big part of choosing the server, as it is often licensed per socket. Sometimes more cores and more memory are cheaper than more sockets.

    3. Memory reliability is another major issue. Large amounts of plain ECC memory will most likely result in problems 2-3 years after deployment. The platforms available with the 6xxx and 75xx series CPUs support memory reliability features that often make it a better choice for VM data centers.

    4. Power and density are another major issue; they drive data center costs and must be given consideration when reviewing servers.
  • don_k - Wednesday, August 11, 2010

    Would like to see some non-windows VM benchmarks as well as a different virtualisation application used and by extension an SQL server that does not come from microsoft. Also would like to see benchmarks on para-virtualised VMs along with full hardware virtualised VMs.

    The review as is is quite meaningless to anyone that does not run windows VMs and/or does not use VMware.

    You do have oracle on a windows VM so maybe oracle on a solaris/bsd VM as well as oracle on a linux para-virtualised guest.

    There is also no mention of how, if at all, the VMs were optimised for the workloads they are running. In particular and most importantly how are the DBs using the disks? Where is the data and where are the logs? How are the disks passed on to the VM (local file, separate partition, virtual volume, full access to one/more drives etc etc).

    Way too many variables to make any kind of an accurate conclusion in my opinion.
  • phoenix79 - Wednesday, August 11, 2010

    I'm curious as to why you didn't include a quad-socket Magny-Cours system. I would have been very interested to see how it would have stacked up in this article.
  • Stuka87 - Wednesday, August 11, 2010

    Ditto, I would like to see the best from each CPU maker. To really see which has the best price:performance ratio.
  • davegraham - Wednesday, August 11, 2010

    if vApus II was available i could run it on my Magny-Cours.

    dave
  • JohanAnandtech - Thursday, August 12, 2010

    The Dell R815 and quad MC deserve an article on their own.
