Westmere-EP to Sandy Bridge-EP: The Scientist Potential Upgrade
by Ian Cutress on March 4, 2013 9:30 AM EST
Is Sandy Bridge-EP an Upgrade Path?
At the beginning of this review, I referred back to Johan’s article on the behind-the-scenes benefits that Sandy Bridge-EP offers over Westmere-EP, and condensed them into a list of what a non-CS student in a scientific field might have to consider:
- The improved core and µop cache on Sandy Bridge-EP should boost IPC through the roof with calculations that can take advantage, especially advanced trigonometric functions.
- The increase in L3 cache would reduce the number of trips out to main memory for values, although the improved memory bandwidth would also help in this regard.
- More cores are always welcome – Turbo 2.0 would help with pre-release code testing, which often occurs in debug / single thread mode.
- An increase of memory limits would help various simulation scenarios, as well as aid having VMs of different environments.
- The move up to PCIe 3.0 helps any GPGPU simulation that requires lots of memory transfers back and forth across the bus (matrix solving), as long as the GPU supports PCIe 3.0 (K10, K20X, FirePro, not Xeon Phi which uses PCIe 2.0).
Every scenario that an individual faces, either in the office, the laboratory, or generic work place #147 is going to be different – perhaps only slightly, but different nonetheless. We have to weigh up the pros and cons of the specific workload and make relative suggestions.
For the most part, any simulation which has large parts that can be computed in parallel should be looking at GPUs, unless the threads are ‘dense’ (require lots of memory registers for the serial calculation) or are already optimized for SSE4/AVX. Double precision can also be a hurdle to GPU computing, but the NVIDIA GTX Titan makes the cost a lot more palatable on research grants. Lots of researchers will be dealing with Fortran code tens of thousands of lines long and 20 years old, meaning that porting to GPUs is not a reasonable option (unless you encourage the research supervisor to apply for a 3 year grant to convert the code). In these cases, make a note of how much memory the simulation needs per core – if it is under 2.5 MB (20 MB of L3 shared across 8 cores), then load up on as many cores as you can get, as you will still be in L3 cache on the 20 MB L3 processors. For more than that, you will be dealing with accesses out to main memory, and unless you are comfortable dealing with NUMA based code and tools (which your Fortran probably is not geared for), a single fast processor is probably the best bet. Dual processor systems are best suited to MPI based Fortran, or to simulations that require more memory than a single processor can have equipped.
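The rule of thumb in the paragraph above can be written down as a small decision helper. This is only a sketch: the 20 MB L3 and 8 core figures are those of the E5-2690, while the function names, parameter names and returned strings are hypothetical labels chosen for illustration, and a real decision would also weigh memory bandwidth and cost.

```python
# Sketch of the CPU-side rule of thumb from the text. All names here are
# illustrative assumptions, not part of any real tool or library.

def l3_per_core_mb(l3_total_mb: float = 20.0, cores: int = 8) -> float:
    """Share of L3 cache available per core (20 MB over 8 cores by default)."""
    return l3_total_mb / cores


def suggest_cpu_setup(working_set_mb: float,
                      uses_mpi: bool = False,
                      exceeds_single_socket_memory: bool = False,
                      l3_total_mb: float = 20.0,
                      cores: int = 8) -> str:
    """Return a coarse hardware suggestion for a CPU-bound simulation."""
    # Small per-core working sets stay inside L3, so more cores is the win.
    if working_set_mb < l3_per_core_mb(l3_total_mb, cores):
        return "as many cores as possible"
    # MPI codes, or jobs too big for one socket's memory, suit dual sockets.
    if uses_mpi or exceeds_single_socket_memory:
        return "dual processor system"
    # Otherwise avoid NUMA effects with one fast processor.
    return "single fast processor"


if __name__ == "__main__":
    print(l3_per_core_mb())
    print(suggest_cpu_setup(1.0))
    print(suggest_cpu_setup(64.0))
    print(suggest_cpu_setup(64.0, uses_mpi=True))
```

The ordering of the checks mirrors the text: the cache test comes first, and the dual processor suggestion only applies once the working set has already spilled out to main memory.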
In terms of Westmere-EP vs. Sandy Bridge-EP for our benchmark suite, the relative numbers are:
Dual E5-2690 vs. Dual X5690
Price: +25% (before tax and additional seller markup)

| Benchmark | HT On | HT Off | Recommended Setup |
|---|---|---|---|
| 2D Explicit FD | +12.7% | +7.3% | Single Multicore CPU w/High Speed Memory |
| 3D Explicit FD | +7.7% | -10.3% | Single Multicore CPU w/High Speed Memory |
| | | | High Mem Bandwidth |
| | +2.4% | +2.8% | High Single CPU Speed |
| WinRar | +27.4% | +3.4% | High Mem Bandwidth |
| FastStone | +6.5% | +3.2% | High Single CPU Speed |
| x264 Pass 1 | -9.0% | +3.4% | Single CPU |
| x264 Pass 2 | +27% | +24.3% | Multi-CPU |
While we do not get a price equivalent speed up across the board, certain scenarios (Xilisoft, x264 Pass 2) benefit greatly from a dual processor Sandy Bridge-EP system over either Westmere-EP or GPU. When a GPU is not available, the Brownian Motion benchmark also scales strongly with more cores. A limiting factor in many of these benchmarks is memory speed – if you do not need a Xeon, then the latest Intel/AMD processors can handle 2133+ MHz memory, which provides a tangible boost in finite difference simulation and WinRar.
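One way to read "price equivalent speed up" is to divide the relative performance by the relative price. The sketch below does that for the named rows in the table, using the HT On column and the +25% price figure. It assumes each percentage is a throughput gain of the dual E5-2690 over the dual X5690, and the names `value_ratio` and `HT_ON_GAIN` are made up for this example.

```python
# Sketch: does the speed up keep pace with the +25% price increase?
# Assumption: each percentage is a throughput gain over the dual X5690.

PRICE_INCREASE = 0.25  # +25% before tax and seller markup

# HT On results for the rows whose benchmark name appears in the table.
HT_ON_GAIN = {
    "2D Explicit FD": 0.127,
    "3D Explicit FD": 0.077,
    "WinRar": 0.274,
    "FastStone": 0.065,
    "x264 Pass 1": -0.090,
    "x264 Pass 2": 0.27,
}


def value_ratio(gain: float, price_increase: float = PRICE_INCREASE) -> float:
    """Relative performance per unit of relative price; 1.0 means break even."""
    return (1.0 + gain) / (1.0 + price_increase)


def keeps_pace_with_price(gains: dict) -> list:
    """Names of benchmarks whose gain matches or beats the price increase."""
    return [name for name, gain in gains.items() if value_ratio(gain) >= 1.0]


if __name__ == "__main__":
    for name, gain in HT_ON_GAIN.items():
        print(f"{name}: {value_ratio(gain):.3f}")
    print(keeps_pace_with_price(HT_ON_GAIN))
```

This compares like for like between two complete systems. It does not capture the case where an existing Westmere-EP machine is already paid for and the whole cost of the new system has to be absorbed.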
If we come back to the original question, ‘Is moving from Westmere-EP to Sandy Bridge-EP a reasonable upgrade?’, in the majority of our scenarios it probably is not – either other alternatives exist that perform better (single CPU, GPU, memory bandwidth) or the price difference is not worth the jump. Remember that most scenarios will have to absorb the whole cost of a new system, rather than the cost of an upgrade, and factoring that into the cost/benefit analysis is a major part of the equation. However, none of our scenarios need more than 96 GB of memory, PCIe 3.0, or VMs for different environments, nor do they use advanced processor instruction sets, any of which could be vital to your work.
Ivy Bridge-EP is slated for the end of the year, meaning that those on Westmere-EP may want to wait and see what comes out from Intel. If you need a DP system now, then Sandy Bridge-EP is an obvious choice if you want to go down the Intel route, though NUMA related code may benefit more from a quad AMD system. If we get one in for another comparison point, we will let you know.
A final note to give thanks to the Gigabyte server team for loaning us the CPUs and motherboard to make this testing possible.