The Opteron 6276: a closer look
by Johan De Gelas on February 9, 2012 6:00 AM EST
MySQL 5.5.17 "Percona Server"
Many readers asked us why we only tested MySQL in a virtualized environment and not on "native" Linux. Indeed, it has been years since we tested MySQL "natively". The reason is simple: MySQL 5.1 and earlier versions scaled badly beyond 4-8 cores, so there was no incentive to run them on modern dual-socket servers. However, MySQL 5.5 has been available since December 2010, and it should feature much improved scalability. Even better, the people at Percona released their own version of MySQL and the InnoDB storage engine: Percona Server with XtraDB. This MySQL/InnoDB combination is engineered for even better scalability.
To test this, we installed Percona Server 5.5.17-55 (Release 22.1, November 2011) on top of an Ubuntu 11.10 x86-64 Linux installation with the 3.0.0-14 kernel. This kernel was the latest stable version at the time and is "Bulldozer/Interlagos aware".
We migrated the "Nieuws.be" database to MySQL to have a test similar to our SQL Server test. That migration is not perfect, as not all stored procedures were successfully converted, so you should not use the benchmark results below to compare SQL Server and MySQL. However, the profile of the test is the same: it is 99% complex selects that scan large parts of the database. The database is several tens of gigabytes instead of one.
The results are abysmal for the latest Interlagos Opteron. The best Xeon score is 84% better than the best Opteron score. The results indicate what went wrong: the 8-thread Opteron 6220 at 3 GHz scores better than the 16-thread Opteron 6276 at 2.3 GHz. A clock speed advantage of 30% has prevailed over twice as many threads. We can therefore suspect that the scaling problems are not gone, at least in this test.
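The 30% figure follows directly from the two clock speeds quoted above. A minimal sketch of that arithmetic, where the function and variable names are our own illustrative choices and only the clock speeds and thread counts come from the text:

```python
def clock_advantage(fast_ghz: float, slow_ghz: float) -> float:
    """Relative clock speed advantage of the faster part, as a fraction."""
    return fast_ghz / slow_ghz - 1.0


# Figures quoted in the text: Opteron 6220 (8 threads, 3.0 GHz)
# versus Opteron 6276 (16 threads, 2.3 GHz).
opteron_6220_ghz = 3.0
opteron_6276_ghz = 2.3

advantage = clock_advantage(opteron_6220_ghz, opteron_6276_ghz)
thread_ratio = 16 / 8  # the 6276 offers twice as many threads

print(f"Clock speed advantage of the 6220: {advantage:.1%}")
print(f"Thread count advantage of the 6276: {thread_ratio:.0f}x")
```

If the workload scaled well with thread count, a 2x thread advantage should easily outweigh a roughly 30% clock deficit, which is why the result points at a scaling problem.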
Let us take a closer look by running the same test with different numbers of threads and cores. The BIOS of the SuperMicro H8DGU-F allowed us to disable the second integer unit or one or more modules of the new Opterons. (Disabling both at the same time was not possible.) The Asus Z8PS-D12-1U was more flexible: we could disable Hyper-Threading and/or several cores of the Xeon. Here are the scaling results.
First, we focus on the results with few cores and threads. Two Bulldozer modules are capable of slightly outperforming four cores of the Opteron Magny-Cours. The ideas behind Bulldozer are sound: two modules are smaller (157 mm²) and more power efficient than four K10 cores (231 mm²). At the same time, they perform on par with the Xeon X5650, which is clocked higher, with the same number of threads. At eight threads this is still the case, and the gap between the newer and older Opteron widens in favor of the former.
Beyond eight threads, the new Opteron starts to scale badly. Doubling the number of modules from four to eight delivers a very small 5% performance advantage. Double the number of modules again and you end up with negative scaling. To make matters worse, the Xeon doesn't have this problem: from eight to 16 threads we get a 76% performance boost. The end result is that a quad-core Xeon beats the best Opteron by a large margin. Let us investigate the matter further.
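One way to put these percentages side by side is scaling efficiency: the speedup obtained divided by the factor by which the resources grew. The sketch below is only an illustration. The helper name is hypothetical, and the only inputs are the 5% and 76% gains quoted above for a doubling from 8 to 16 threads.

```python
def scaling_efficiency(speedup: float, resource_ratio: float) -> float:
    """Fraction of ideal (linear) scaling that was actually achieved."""
    return speedup / resource_ratio


# Going from 8 to 16 threads doubles the resources (ratio 2.0).
# Gains quoted in the text: +5% for the Opteron, +76% for the Xeon.
opteron_eff = scaling_efficiency(1.05, 2.0)
xeon_eff = scaling_efficiency(1.76, 2.0)

print(f"Opteron, 8 -> 16 threads: {opteron_eff:.1%} of linear scaling")
print(f"Xeon,    8 -> 16 threads: {xeon_eff:.1%} of linear scaling")
```

An efficiency of 50% means the extra threads added nothing at all, so the Opteron's result sits barely above that floor while the Xeon retains most of the ideal gain.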