Original Link: http://www.anandtech.com/show/2326
Low Power Server CPU Redux: Quad-Core Comes to Play
by Jason Clark & Ross Whitehead on September 13, 2007 6:05 AM EST
A couple months ago, we took a look at the low voltage (LV) server CPU market. At the time, we focused on four-way solutions using two dual-core processors, since those represent the largest slice of the server pie. Our conclusion was that while the power savings brought about by using low voltage CPUs were real, processor choice was only one part of the equation. AMD came out ahead overall in performance/watt, not because they were faster or because their CPUs used less power, but rather because their platform as a whole offered competitive performance while using less power.
We discussed previously exactly what's involved in a low voltage part, but of course the picture is far bigger than just talking about power requirements. Take for example Intel's low-voltage Woodcrest parts; they are rated at 40W compared to the regular Woodcrest parts that are rated at 80W. The price premium for upgrading to a low-voltage part varies; in the case of AMD it's typically anywhere from $100 to $300 per CPU, while on the Intel side some low-voltage parts cost more, the same, or even less than the regular parts (e.g., the Xeon 5140 currently sells for about $450 while the low voltage Xeon 5148 only costs $400). Regardless of price, it's difficult to justify low-voltage processors in terms of power bill savings.
An extra 40W of power in a device running 24/7 for an entire year works out to around $35 per year, so at the low-end of the equation you would need a minimum of three years to recoup the investment (at which point it's probably time to upgrade the server). Other factors are usually the driving consideration.
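The arithmetic behind that estimate can be sketched as follows. The electricity rate of $0.10 per kWh is an assumption (the article does not state one); it is simply the rate at which 40W of continuous draw costs about $35 per year. The function names are illustrative only.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours of continuous operation


def annual_energy_cost(extra_watts, usd_per_kwh):
    """Yearly cost in USD of drawing extra_watts around the clock."""
    kwh_per_year = extra_watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * usd_per_kwh


def payback_years(price_premium_usd, extra_watts, usd_per_kwh):
    """Years of power savings needed to recover a CPU price premium."""
    return price_premium_usd / annual_energy_cost(extra_watts, usd_per_kwh)


# Assumed electricity rate in USD per kWh; not a figure from the article.
ASSUMED_RATE = 0.10

cost = annual_energy_cost(40, ASSUMED_RATE)
years = payback_years(100, 40, ASSUMED_RATE)
print(f"40W for one year: ${cost:.2f}")
print(f"Payback on a $100 premium: {years:.1f} years")
```

At the assumed rate, a $100 premium takes just under three years to recover through the power bill alone, and a $300 premium takes roughly three times as long.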
Saving 40W per CPU socket may not save you money directly in terms of power bills, but generally speaking these chips are going into servers that sit in a datacenter. Air conditioning for the datacenter typically has costs directly related to the amount of power being consumed, so every 40W of power you can save could end up saving another 20W-40W of power in air conditioning requirements. That's still not even the primary concern for a lot of companies, though.
Datacenters often run dozens or even hundreds of servers within a single large room, and the real problem is making sure that there's enough power available to run all of the equipment. Building a datacenter is anything but cheap, and if you can pack more processing power into the same amount of space, that is where low-voltage parts can really become useful. Blade servers were specifically created to address this requirement, and if you can reduce the total power use of the servers by 20%, some companies could choose to run up to 25% more servers within the same power budget (1 / 0.8 = 1.25).
Of course, that doesn't mean that every company out there is interested in running a datacenter with hundreds of computers, so individual businesses need to look at what sort of server setup will best fit their needs. After determining that, they need to look at low-voltage CPUs and decide whether or not they would actually be helpful. Assuming low-voltage parts are desired, the good news is that it's extremely easy to get them in most modern servers. Dell, HP, and other large server vendors usually include low-voltage parts as an easy upgrade for a small price premium. And that brings us to our low-voltage CPU update.
Intel Quad G-Stepping
Intel doesn't seem to sit still these days, pushing the power and performance envelope further and further. Recently, Intel announced two new G-stepping quad-core parts. The new parts run at the extreme ends of the power consumption spectrum. The first is a 2.0GHz 1333FSB part that runs at 50W while the second is a 3.0GHz 1333FSB part that runs at 120W. There are two main changes to the G-stepping parts, the first of which is power consumption: G-stepping introduces optimizations for idle state power. The second change involves enhancements to the Virtualization Extensions (VT), which mainly improve interrupt handling in the virtualization of Microsoft Windows 32-bit operating systems.
Of course, we would be remiss if we didn't mention AMD's recently launched Barcelona processor here. AMD expects their new quad-core processor to run within the same power envelope as the previous dual-core Opterons, which means twice as many CPU cores potentially without increasing power requirements, resulting in a potential doubling of performance/watt on the socket level. Low-voltage (HE) Barcelona parts will still be available, but even the regular chips include many new enhancements to help with power requirements. We are doing our best to get some additional Barcelona servers in-house in order to test this aspect of the performance/power equation and we hope to follow up in the near future.
One final item worth mentioning is that Intel's 45nm Harpertown refresh of Clovertown is due out in the very near future, which is one more item we can look forward to testing. Unlike the desktop world, however, acquiring and testing server products often requires a lot more time and effort. Even with the appropriate hardware, the sort of benchmarks we run on servers can often take many hours just to complete a single test, and there are many parameters that can be tuned to improve performance. Since there aren't a lot of early adopters in the server market, though, we should be able to provide you with results before any of the IT departments out there are ready to upgrade. Now let's get on to the testing.
Benchmarking Low Voltage
Since this article is focused on low power parts and that market is mostly focused on Performance/Watt, we decided we would only report results with all power management features enabled. To configure our servers with all power management features on, we perform the following:
Intel: In the BIOS, ensure that Thermal Management is On/Enabled, C1 Enhanced Mode is On/Enabled, and EIST Support is On/Enabled.
AMD: In the BIOS, ensure that PowerNow is On/Enabled. Additionally, you must install the Processor Driver from AMD in your OS.
For both platforms, you must also set the Power Options in Control Panel to "Server Balanced Processor Power and Performance".
For details about the impact that the Power Management features had on a system at idle please see our previous article. With all Power Management features turned on we recorded idle power usage for all three systems.
We see that AMD has the lowest idle power consumption. Intel's Woodcrest system uses approximately 41% more power at idle, and the Clovertown system uses approximately 51% more power at idle. The Clovertown system only uses approximately 4% more power at idle than the Woodcrest system, which is not bad considering it has twice as many CPU cores.
During the testing, we often speculate about where all the power goes. We attempted to find out by measuring power consumption of the entire system at idle, then removing a component and re-measuring the power consumption. The difference in power can be attributed to the removed component. This is not a perfect way to determine component power requirements, but it does provide some general guidance as to where all of the power goes. The results are very interesting:
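The subtraction method described above can be sketched in a few lines. This is a minimal illustration, assuming each component is removed one at a time with everything else installed; the wattages are hypothetical placeholders, not our measurements, and `component_power` is an illustrative name.

```python
def component_power(baseline_watts, readings_without):
    """Attribute idle power to components by subtraction.

    baseline_watts: wall power with every component installed.
    readings_without: maps a component name to the wall power measured
        with only that component removed.
    Returns per-component estimates plus an "Unaccounted For" remainder.
    """
    breakdown = {
        name: baseline_watts - watts
        for name, watts in readings_without.items()
    }
    # Whatever the removed components do not explain (power supply
    # losses, motherboard, chipset, and so on) lands in the remainder.
    breakdown["Unaccounted For"] = baseline_watts - sum(breakdown.values())
    return breakdown


# Hypothetical wall-meter readings in watts, NOT the article's data.
baseline = 200.0
readings = {
    "Case fans": 180.0,
    "SAS RAID card": 188.0,
    "Memory": 185.0,
}

for name, watts in component_power(baseline, readings).items():
    print(f"{name}: {watts:.1f} W ({watts / baseline:.0%} of idle)")
```

Because power supply efficiency changes with load, each difference is only an estimate of what the component draws at the wall, which is one reason the method gives general guidance rather than exact figures.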
In the AMD system we see that the bulk of the power is consumed by the idle CPUs. Overlooking the "Unaccounted For", the next biggest consumer is the five case fans, followed by the SAS RAID Card. The "Unaccounted For" is everything which is not listed, including the inefficiency of the power supply and the motherboard and chipset.
In the Intel Woodcrest system we see that the CPUs require significantly less power than the AMD CPUs, 54% less to be exact. On the other hand the FB-DIMMs require 862% more power than the AMD DIMMs. (Yikes!) Also, the "Unaccounted For" is twice as high on the Intel system as the AMD system. Keep in mind both of these systems have identical power supplies, so the efficiency is roughly the same.
The Intel Clovertown system is identical to the Woodcrest system except for the CPUs, which require approximately 4% more power at idle.
Choosing the contenders
In previous articles, we've been asked to explain why we chose the parts we did for an article. For this article we used the latest low voltage quad-core parts from Intel and the latest low power dual-core parts from AMD. The first question that may come to mind is, "Why are you comparing a quad-core part with a dual-core part?" The answer is quite simple: for Intel the quad-core and dual-core parts we're comparing cost about the same, so while the parts aren't equal in terms of the number of cores, they are equal in terms of price. On the AMD side of things, the CPU price is once again similar, but other than the just-released Barcelona there are no quad-core AMD offerings. Since testing for this article began over a month ago, including Barcelona wasn't an option at the time.
Both of our systems are in an identical chassis, with identical power supplies. The systems differ only in the motherboard/CPU/memory and in their fan setup. Intel Xeon systems typically use a ducted system whereas AMD uses a conventional heatsink w/fan. Our benchmarks consist of the same applications/test suites we used in the previous article, described here.
AMD Opteron Server
Intel Woodcrest/Clovertown Server
The AMD system has two 2.6GHz (2218 HE) processors mounted on a Tyan S3992 main board, with 8x1GB of DDR2-667 OEM memory. Internal cooling consists of five 3.5" fans and two CPU fans. Internal storage is provided by one WD1600YD hard drive, which is where the OS is installed.
The Intel system is configured with either two LV 2.33GHz Woodcrest processors (Xeon 5148) or the latest G-Stepping LV 2.0GHz Clovertown processors (Xeon L5335). The motherboard is a SuperMicro X7DBE+. The Intel system is outfitted with 8x1GB 667 MHz OEM FB-DIMMs. Internal cooling consists of five 3.5" fans, with plastic ducting directing airflow across the CPUs and FB-DIMMs. Internal storage once again comes from one WD1600YD hard drive with the OS installed.
LSI Logic 8480E MegaRaid Controller
Promise VTRAK J300s SAS Chassis
12 x 146GB Fujitsu 15,000 RPM SAS Drives configured in RAID 0
Windows 2003 Enterprise SP2 x64
SQL 2005 Enterprise x64 SP2
For the first four load points, all of the parts perform about the same, but at load points five and six there is no competition: the Clovertown part beats the others by as much as 53%.
There are no real surprises here. The dual-core parts are relatively similar, though the Opteron CPU load is consistently higher. The quad-core Clovertown shows significantly lower CPU utilization and thus headroom beyond load point four.
Looking at the power requirements, we see that the Intel configurations use significantly more power than the AMD configurations. The Woodcrest system uses as much as 46W more power than the AMD system, and the Clovertown system uses as much as 76W more power than the AMD system. Recall however that FB-DIMMs from the Intel system use approximately 60W more than the DIMMs in the AMD system, so the difference is in the platform rather than in the CPUs.
Up to and including load point four Opteron is the clear winner. Beyond load point four Clovertown steals the show with a lead as great as 17% over the Opteron.
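For readers who want to reproduce this kind of comparison, the sketch below assumes performance/watt is simply Transactions/Second divided by total system power at the same load point, with the lead expressed as a percentage of the slower system. That definition is an assumption on our part for illustration, and the figures are hypothetical placeholders rather than our results.

```python
def performance_per_watt(transactions_per_second, system_watts):
    """Transactions per second delivered per watt of system power."""
    return transactions_per_second / system_watts


def lead_percent(leader, trailer):
    """Percentage by which leader exceeds trailer."""
    return (leader / trailer - 1) * 100


# Hypothetical load-point figures, NOT the article's measurements.
system_a = performance_per_watt(transactions_per_second=1200, system_watts=400)
system_b = performance_per_watt(transactions_per_second=900, system_watts=360)

print(f"System A: {system_a:.2f} transactions/second per watt")
print(f"System B: {system_b:.2f} transactions/second per watt")
print(f"System A leads by {lead_percent(system_a, system_b):.0f}%")
```

Note that a system can draw more power and still win on this metric, as long as its throughput grows faster than its power consumption.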
Scalable Hardware - CPU
Once again, we get very similar results with the Clovertown system clearly pulling ahead at the top end of the load points. The dual-core AMD and Intel systems are essentially tied.
Everything is mostly linear here, as we would expect. The Woodcrest, and to a lesser degree the Clovertown, take an unexplained little dip at the fourth load point.
The power results are similar to what we've seen already, with the Opteron being the clear winner with little difference between Woodcrest and Clovertown.
It is a very tight race here until the fourth load point, after which the Clovertown system again pulls away with a lead as great as 44% over the Opteron.
Scalable Hardware - Mixed
In this test the contenders are only comparable for the first two load points; after that the Clovertown system is able to pull ahead by a significant amount and is as much as 81% faster than the Opteron system. The Woodcrest system also leads the Opteron system by a small margin.
We see the Clovertown system needs less CPU utilization to reach the same or greater Transactions/Second performance, which is expected.
Power results continue to be similar to our previous tests. However...
...similar to the Transactions/Second results, the Clovertown system is the clear leader at higher loads. In this graph the lead is as great as 52% over the Opteron system.
Scalable Hardware - Reads
In this test we see the Clovertown system again lead from load point three and beyond, although the lead is only really significant at the two highest load points. Its lead is as great as 55% over the Opteron system in this test.
CPU Utilization is about the same as in previous tests.
Not surprisingly, Power Consumption results continue the patterns seen already.
This graph is the closest so far between the Opteron and Clovertown. It's not until the last load point that the Clovertown system is able to pull ahead in terms of performance/watt.
If you'll pardon the pun, power continues to be a hot topic in the world of computer servers. The costs associated with operating and cooling an average server are certainly not cheap, and these costs continue to rise over time with higher performing and higher power parts being released. Low-voltage processors try to reverse that trend, although they are only truly effective at halting the growth of CPU power requirements. Unfortunately, the CPU is just a piece of the puzzle. Memory, fans, chipset, drives, HBAs, etc. all play a role in power requirements, and in some of those areas (FB-DIMMs in particular) the increase easily overshadows the power savings associated with low-voltage processors.
In our previous article that compared Intel dual-core parts to AMD dual-core parts, AMD came out on top. The main reason for their victory is their power consumption figures. The Intel dual-core Xeons compete on a performance basis, but FB-DIMMs hurt overall performance/watt numbers. In this article we see the tables turn somewhat. With two extra cores the Intel Clovertown parts are able to easily outpace the AMD Opteron, at least when overall load is near saturation. At low to average workloads, there is little difference between any of the parts, in which case server consolidation might be a better solution. Obviously, the quad-core parts are best suited for loaded database servers and their sweet spot is in virtualized environments.
If there was ever any doubt about Intel's decision not to go true quad-core, numbers like these should make it clear that the decision was sound and is paying off. Quad-core processors may not be faster in every situation, but in heavily threaded, CPU-intensive environments the extra CPU cores are easily able to make up for any penalties associated with the dual-die packaging.
This is not the end of the story, however. The next few months should prove interesting for the two processor giants, as AMD's Barcelona should begin to show up in volume and Intel is set to refresh their Xeon line-up with Harpertown. Stay tuned; we'll have thorough coverage of both products in the near future.