The Best Server CPUs part 2: the Intel "Nehalem" Xeon X5570
by Johan De Gelas on March 30, 2009 3:00 PM EST
|Rendering: 3ds Max 2008| |
|---|---|
|Operating System|Windows 2008 Enterprise RTM (64-bit)|
|Software|3ds Max 2008|
|Benchmark software|Built-in timer|
|Typical error margin|1-2%|
Render servers are only a small part of the server market. We used the "architecture" scene included in the SPEC APC 3ds Max test. All tests were done with 3ds Max's default scanline renderer, with SSE enabled, at HD 720p (1280x720) resolution. We measured the time it takes to render 10 frames (frames 20 to 29) and then calculated how many frames a given CPU configuration could render in one hour (3600 seconds * 10 frames / time recorded). Results are reported as rendered images per hour.
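The conversion from measured render time to the reported score can be sketched in a few lines of Python. This is only an illustration of the formula above: the function name `frames_per_hour` and the 450-second timing are made up for the example and are not measurements from this review.

```python
def frames_per_hour(render_seconds: float, frames: int = 10) -> float:
    """Convert the time needed to render `frames` frames into frames per hour.

    Implements the formula from the text: 3600 seconds * frames / time recorded.
    """
    if render_seconds <= 0:
        raise ValueError("render_seconds must be positive")
    return 3600.0 * frames / render_seconds


# Hypothetical timing (not a measured result): 10 frames rendered in 450 seconds.
print(frames_per_hour(450.0))  # 80.0
```

Because the score is inversely proportional to the recorded time, halving the render time doubles the reported frames per hour.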
We used the 32-bit version of 3ds Max 2008 on 64-bit Windows 2008 RTM. The 64-bit version of 3ds Max 2008 is a bit slower (especially when you use the scanline renderer). All CPU configurations are dual-socket, unless we indicate otherwise.
When it comes to floating point and SSE, the performance gains over several CPU generations are a bit smaller. The Xeon X5570 again shatters all records, but it's "only" three times faster than the Xeon 5080. There are two reasons for this. First, the Xeon 5080 is based on the Pentium 4 architecture. Thanks to its high clock speed, it can deliver relatively high FLOPS (Floating Point Operations per Second). The high branch misprediction penalty, the relatively low hit rate of the trace cache, and the very high memory latency all made the Pentium 4 based Xeons very inefficient in integer code, but they are of no real importance when running floating point intensive applications such as 3ds Max.
Second, improvements have been slower in this area. The Xeon 51xx introduced 128-bit SSE units (on the AMD side: Barcelona, Opteron 23xx), and the Harpertown Xeon (Xeon 54xx) added a faster radix-16 divider, which processes 4 bits per cycle. We analyzed this in great detail previously: while the Opterons are still better at divisions, the Xeon 54xx is faster in multiplications, which are much more common. When it comes to floating point, the Xeon 55xx "Nehalem" core is almost identical to the Xeon 54xx "Harpertown", and the AMD "Shanghai" core is identical to the AMD "Barcelona" core. Notice how the Nehalem at 2.93GHz (in reality 3.1GHz) settles between the 3GHz and 3.3GHz Xeon 54xx. This confirms that floating point code hardly sees a difference between a Harpertown and a Nehalem, unless it is limited by the bandwidth available to the core, of course. Nehalem can still beat its older brothers thanks to SMT, once again underlining what a powerful weapon SMT is.
While the Xeon X5570 is only 24% faster than the Xeon 5450, that is good enough to make the current 4-way servers completely useless for rendering. The dual Xeon "Nehalem" offers the same performance at a much lower price, while consuming a lot less power.