Virtualization and Consolidation

VMmark—which we discussed in detail here—tries to measure typical consolidation workloads: a combination of a light mail server, database, fileserver, and website with a somewhat heavier Java application. One VM is just sitting idle, representative of workloads that have to be online but which perform very little work (for example, a domain controller). In short, VMmark goes for the scenario where you want to consolidate lots and lots of smaller apps on one physical server.

VMware VMmark

Very little VMmark benchmark data has been available so far, but it is obvious that this is the favorite playing ground of the Xeon 7500. It outperforms an octal 2.8GHz Opteron by a large margin. Granted, the octal Opterons scale pretty badly in most applications, but VMmark is not one of them. It is reasonable to expect that a quad twelve-core Opteron 6100 series will outperform the older, higher-clocked octal six-core Opterons in many applications, including SAP, OLTP, and data mining benchmarks. After all, the communication between the cores has vastly improved. But VMmark runs many small independent applications, each of which usually stays on a single node, so the chances are slim that the quad Opteron 6100 will come even close to the quad Xeon X7560.

vApus Mark I: Performance-Critical Virtualized Applications

As we've discussed previously, our vApus Mark I benchmark is due for a major overhaul. We found out that the 24 cores of the Opteron 6172 were not at the expected 85-95% CPU load, and thus the numbers reported were below the potential of the twelve-core Opteron. To get an idea of where the Xeon X7560 would land, we disabled Hyper-Threading on it, as our test is capable of stressing 16 cores/threads easily. In that configuration, the dual Xeon X7560 was about 5% slower than the dual Xeon X5670 with Hyper-Threading enabled, and about 13% faster than the dual octal-core Opteron 6136 2.4GHz. Considering that we found that performance is about 15% higher due to Hyper-Threading, we estimate that the dual Xeon X7560 at 2.26GHz with Hyper-Threading is about 10% faster than a dual Xeon X5670 at 2.93GHz, and about 29% faster than the dual octal-core 2.4GHz Opteron 6136. So core per core, clock per clock, the Xeon X7560 probably has a performance advantage over the Opteron in the neighborhood of 30%. Once vApus Mark II is ready, we'll provide more accurate numbers.
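The estimate above is simple enough to reproduce. The sketch below is only an illustration of that arithmetic: it assumes the roughly 15% Hyper-Threading gain applies multiplicatively to the ratios measured with Hyper-Threading disabled, and the constant and function names are made up for this example, not part of vApus Mark I.

```python
# Rounded ratios quoted in the text, measured with Hyper-Threading disabled
# on the dual Xeon X7560. All names here are illustrative.
X7560_NO_HT_VS_X5670 = 0.95    # about 5% slower than the dual Xeon X5670 (HT on)
X7560_NO_HT_VS_OPTERON = 1.13  # about 13% faster than the dual Opteron 6136
HT_GAIN = 1.15                 # about 15% uplift attributed to Hyper-Threading


def with_ht(ratio_without_ht, ht_gain=HT_GAIN):
    """Scale a no-HT performance ratio by the assumed Hyper-Threading gain."""
    return ratio_without_ht * ht_gain


def percent_faster(ratio):
    """Express a performance ratio as a percentage advantage."""
    return (ratio - 1.0) * 100.0


vs_x5670 = with_ht(X7560_NO_HT_VS_X5670)
vs_opteron = with_ht(X7560_NO_HT_VS_OPTERON)

print(f"vs dual Xeon X5670:   {percent_faster(vs_x5670):.1f}% faster")
print(f"vs dual Opteron 6136: {percent_faster(vs_opteron):.1f}% faster")
```

With these rounded inputs the results come out at roughly 9% and 30%, within a point of the "about 10%" and "about 29%" estimates quoted above. The small differences come from rounding the measured percentages.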

However, that is not enough to win the price/performance or performance/watt comparison. An octal-core Xeon X7560 costs four times more than a similar (clock speed, core count) Opteron 6136, and the server consumes a lot more power.


23 Comments

  • JohanAnandtech - Tuesday, April 13, 2010 - link

    "Damn, Dell cut half the memory channels from the R810!"

    You read too fast again :-). Only in quad CPU config. In dual CPU config, you get 4 memory controllers, which each connect to two SMBs. So in a dual config, you get the same bandwidth as you would in another server.

    The R810 targets those that are not after the highest CPU processing power, but want the RAS features and 32 DIMM slots. AFAIK,
  • whatever1951 - Tuesday, April 13, 2010 - link

    2 channels of DDR3-1066 per socket in a fully populated R810 and if you populate 2 sockets, you get the flex memory routing penalty...damn..............!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! R810 sucks.
  • Sindarin - Tuesday, April 13, 2010 - link

    whatever1951 you lost me @ Hello.........................and I thought Sauron was tough!! lol
  • JohanAnandtech - Tuesday, April 13, 2010 - link

    "It is hard to imagine 4 channels of DDR3-1066 to be 1/3 slower than even the westmere-eps."

    On one side you have a parallel half duplex DDR-3 DIMM. On the other side of the SMB you have a serial full duplex SMI. The buffers might not perform this transition fast enough, and there has to be some overhead. I also am still searching for the clockspeed of the IMC. The SMIs are on a different (I/O) clockdomain than the L3-cache.

    We will test with Intel's / QSSC quad CPU to see whether the flexmem bridge has any influence. But I don't think it will do much. You might add a bit of latency, but essentially the R810 is working like a dual CPU with four IMCs just like another (Dual CPU) Nehalem EX server system would.
  • whatever1951 - Tuesday, April 13, 2010 - link

    Thanks for the useful info. R810 then doesn't meet my standard.

    Johan, is there any way you can get your hands on an R910 4 Processor system from Dell and bench the memory bandwidth to see how much that flex mem chip costs in terms of bandwidth?
  • IntelUser2000 - Tuesday, April 13, 2010 - link

    The Uncore of the X7560 runs at 2.4GHz.
  • JohanAnandtech - Wednesday, April 14, 2010 - link

    Do you have a source for that? Must have missed it.
  • Etern205 - Thursday, April 15, 2010 - link

    I think AT needs to fix this "RE:RE:RE...:" problem?
  • amalinov - Wednesday, April 14, 2010 - link

    Great article! I like the way in which you describe the memory subsystem - I have read the Intel datasheets and many news articles about Xeon 7500, but your description is the best so far.

    You say "So each CPU has two memory interfaces that connect to two SMBs that can each drive two channels with two DIMMS. Thus, each CPU supports eight registered DDR3 DIMMs ...", but if I do the math it seems: 2 SMIs x 2 SMBs x 2 channels x 2 DIMMs = 16 DDR3 DIMMs, not 8 as written in the second sentence. Later in the article I think you mention 16 at different places, so it seems it is really 16 and not 8.

    What about an Itanium 9300 review (including general background on the plans of OEMs/Intel for the IA-64 platform)? A comparison of scalability (HT/QPI)/memory/RAS features of Xeon 7500, Itanium 9300 and Opteron 6000 would be welcome. Also I would like to see a performance comparison with appropriate applications for the RISC mainframe market (HPC?) with 4- and 8-socket AMD, Intel Xeon, Intel Itanium, POWER7, newest SPARC.
  • jeha - Thursday, April 15, 2010 - link

    You really should review the IBM 3850 X5 I think?

    They have some interesting solutions when it comes to handling memory expansions etc.
