Windows 7 Application Performance

3dsmax 9

Today's desktop processors are more than fast enough to do professional-level 3D rendering at home. To look at performance under 3dsmax we ran the SPECapc 3dsmax 8 benchmark (only the CPU rendering tests) under 3dsmax 9 SP1. The results reported are the rendering composite scores.

3dsmax r9 - SPECapc 3dsmax 8 CPU Test

Offline 3D rendering applications make some of the best use of CPU cores, but unfortunately our test here doesn't scale all that well. We only see a 7% increase over the 2600K. If we look at a more modern 3D workload however...

Cinebench 11.5

Created by the Cinema 4D folks we have Cinebench, a popular 3D rendering benchmark that gives us both single and multi-threaded 3D rendering results.

Cinebench 11.5 - Single Threaded

Single threaded performance is marginally better than the 2600K thanks to the 3960X's slightly higher max turbo speed. What's more important than the performance here is the fact that the 3960X is able to properly power gate all idle cores and give a single core full rein of the chip's TDP. Turbo is alive and well in SNB-E, just as it was in Sandy Bridge.

Cinebench 11.5 - Multi-Threaded

Here the performance gains are staggering. The 3960X is 53% faster than the 2600K and 19% faster than Intel's previous 6-core flagship, the 990X. The Bulldozer comparison is almost unfair: the 3960X is 75% faster (granted, it is also multiple times the price of the FX-8150).

7-Zip Benchmark

While Cinebench shows us multithreaded floating point performance, the 7-zip benchmark gives us an indication of multithreaded integer performance:

7-zip Benchmark

Here we see huge gains over the 2600K (58%). That is more than the 50% increase in core count alone would account for, indicating that the larger cache and added memory bandwidth help a bit here as well. The advantage over the 990X is only 7%. This gives us a bit of a preview of what we can expect from SNB-EP Xeon server performance.
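As a rough illustration of that reasoning, the short sketch below compares the observed 58% gain with what the jump from four to six cores would give under perfectly linear scaling. The function name is made up for this example, and the model is an assumption: it ignores clock speed and turbo differences between the two chips.

```python
def core_scaling(observed_gain_pct, cores_new, cores_old):
    """Compare an observed gain with the gain expected from core count alone.

    Assumes ideal linear scaling with cores and ignores clock differences.
    """
    core_gain_pct = (cores_new / cores_old - 1) * 100
    surplus_pct_points = observed_gain_pct - core_gain_pct
    return core_gain_pct, surplus_pct_points


# 58% is the 7-zip gain over the 2600K quoted in the text;
# 6 vs. 4 are the core counts of the 3960X and the 2600K.
core_gain, surplus = core_scaling(58, cores_new=6, cores_old=4)
print(f"cores alone: +{core_gain:.0f}%, left over: +{surplus:.0f} points")
```

Whatever is left over after the core count is accounted for has to come from somewhere else, which is where the larger cache and extra memory bandwidth are the likely candidates.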

PAR2 Benchmark

Par2 is an application used for reconstructing downloaded archives. It can generate parity data from a given archive and later use it to recover the archive.

Chuchusoft took the source code of par2cmdline 0.4 and parallelized it using Intel’s Threading Building Blocks 2.1. The result is a version of par2cmdline that can spawn multiple threads to repair par2 archives. For this test we took a 708MB archive, corrupted nearly 60MB of it, and used the multithreaded par2cmdline to recover it. The scores reported are the repair and recover time in seconds.
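To give a feel for how parity data makes recovery possible, here is a deliberately simplified sketch. It uses plain XOR parity, which can rebuild only one missing block; PAR2 itself relies on Reed-Solomon coding and can repair several. The block contents and the `xor_blocks` helper are invented for the example and are not part of par2cmdline.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks of an imaginary archive, plus one parity block.
data = [b"ABCD", b"EFGH", b"IJKL"]
parity = xor_blocks(data)

# Pretend the middle block was corrupted and thrown away.
survivors = [data[0], data[2], parity]

# XOR of the surviving blocks and the parity block yields the lost block.
recovered = xor_blocks(survivors)
print(recovered == data[1])
```

The repair step is pure arithmetic over large buffers, which is why a multithreaded build of the tool can spread the work across cores.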

Par2 - Multi-Threaded par2cmdline 0.4

Here we see a 40% increase in performance over the 2600K and FX-8150.

TrueCrypt Benchmark

TrueCrypt is a very popular encryption package that offers full AES-NI support. The application also features a built-in encryption benchmark that we can use to measure CPU performance:

AES-128 Performance - TrueCrypt 7.1 Benchmark

As both the 990X and 3960X have AES-NI support, both are equally capable at cranking through an AES workload. Per core performance doesn't appear to have changed all that much with the move to Sandy Bridge, so here we have a situation where the 3960X is much faster than the 2600K but no faster than the 990X. I suspect these types of scenarios will be fairly rare.

x264 HD 3.03 Benchmark

Graysky's x264 HD test uses x264 to encode a 4Mbps 720p MPEG-2 source. The focus here is on quality rather than speed, thus the benchmark uses a 2-pass encode and reports the average frame rate in each pass.

x264 HD Benchmark - 1st pass - v3.03

Single threaded performance isn't significantly faster than your run-of-the-mill Sandy Bridge, which means the first x264 HD pass doesn't look all that impressive on SNB-E.

x264 HD Benchmark - 2nd pass - v3.03

The second pass however stresses all six cores far more readily, resulting in a 47.5% increase in performance over the 2600K. Even compared to the 990X there's a 15% increase in performance.

Adobe Photoshop CS4

To measure performance under Photoshop CS4 we turn to the Retouch Artists’ Speed Test. The test does basic photo editing; there are a couple of color space conversions, many layer creations, color curve adjustment, image and canvas size adjustment, unsharp mask, and finally a gaussian blur performed on the entire image.

The whole process is timed and thanks to the use of Intel's X25-M SSD as our test bed hard drive, performance is far more predictable than back when we used to test on mechanical disks.

Time is reported in seconds and the lower numbers mean better performance. The test is multithreaded and can hit all four cores in a quad-core machine.

Adobe Photoshop CS4 - Retouch Artists Speed Test

Our Photoshop test is multithreaded but there are only spikes that use more than four cores. That, combined with the short duration of the benchmark, means there's no real advantage for the 3960X over the 2600K. Sandy Bridge E is faster than Intel's old 6-core solution though.

Compile Chromium Test

You guys asked for it and finally I have something I feel is a good software build test. Using Visual Studio 2008 I'm compiling Chromium. It's a pretty huge project that takes over forty minutes to compile from the command line on the Core i3 2100. But the results are repeatable and the compile process will stress all 12 threads at 100% for almost the entire time on a 980X so it works for me.

Build Chromium Project - Visual Studio 2008

Our compile test is extremely well threaded, so once again the 3960X does well. The gains aren't as big as what we saw in some of our earlier 3D/transcoding tests, but if you're looking to build the fastest development workstation you'll want a Sandy Bridge E.

Excel Monte Carlo

Microsoft Excel 2007 SP1 - Monte Carlo Simulation

Multithreaded compute does well on SNB-E regardless of the type of application. Excel is multithreaded and if you have a beefy enough workload, you'll see huge gains over the 2600K.
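The reason Monte Carlo workloads tend to scale with core count is that the samples are independent of each other. The sketch below is a generic illustration, not the Excel workbook used in this test (the article doesn't describe it): it estimates pi by random sampling, split into chunks that share no state. The function names, chunk count and sample sizes are arbitrary choices for the example.

```python
import random


def count_hits(seed, samples):
    """Count random points in the unit square that fall inside the quarter circle."""
    rng = random.Random(seed)  # each chunk gets its own independent generator
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(chunks, samples_per_chunk):
    """Combine independent chunks into one estimate of pi."""
    # The chunks run one after another here; because they share no state,
    # each one could just as well be handed to a separate core.
    hits = sum(count_hits(seed, samples_per_chunk) for seed in range(chunks))
    return 4 * hits / (chunks * samples_per_chunk)


estimate = estimate_pi(chunks=6, samples_per_chunk=50_000)
print(f"pi is roughly {estimate:.3f}")
```

With enough samples per chunk, the time spent combining results is negligible next to the sampling itself, so extra cores translate almost directly into shorter run times.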

163 Comments

  • SonicIce - Monday, November 14, 2011 - link

    cool good review.
  • wharris1 - Monday, November 14, 2011 - link

    It would be interesting to test the OC'd SBE vs an OC'd SB; I suspect that the 2x advantage of the SBE would fall back in line to around the ~30-40% speed advantage seen in non-OC'd testing (in heavily threaded workloads). I have the feeling that between being defective Xeon CPU parts and lacking more SATA 6Gbps as well as USB 3.0 functionality on the motherboard side, this release is a bit hamstrung. I bet that with the release of Ivy Bridge E parts/motherboards, this combo will be more impressive. Part of the problem is that the regular SB parts are so compelling from a price/performance perspective. As always, nice review.
  • Johnmcl7 - Monday, November 14, 2011 - link

    I thought that odd as well as it almost implies the regular Sandybridge processors are poor overclockers when there are results for the new processor overclocked and Bulldozer overclocked. I guess though it's more it would be interesting to see rather than actually change anything, I currently have an i7 960 and was hoping for an affordable six core processor but it's looking like I'll wait until Ivybridge now
  • Tunnah - Monday, November 14, 2011 - link

    although i can understand the expectation of all 6 ports being sata 3, maybe the reasoning is implementing it would probably be pointless for 99.9% of users - i can't even begin to imagine any non-enterprise usage for 6 SSDs running at max speed!
  • Exodite - Monday, November 14, 2011 - link

    While I personally don't disagree with most people not needing more than two SATA 6Gbps ports you have to keep in mind that 99.9% of all users have no need for the SB-E /platform/ in its entirety.

    Since it's squarely aimed at workstation power users and extreme-end enthusiasts, those last 0.1% of users if you will, offering more SATA 6.0Gbps ports makes sense.
  • Zoomer - Monday, November 14, 2011 - link

    I can't imagine the area difference being an issue. Like, are sata3 controllers really that different once it was already done and validated? Having two types of sata controllers on chip seems redundant to me. It's like PCIe 1.0 vs 2.0; once you have the 2.0 implementation done, there's no reason to have 1.0 only lanes since it is backwards compatible.
  • Jaybus - Tuesday, November 15, 2011 - link

    The reason for keeping SATA 3Gbps and PCIe 1.0 is not a die area issue or lack of reasoning. SATA 6Gbps takes considerably more power than 3Gbps, and PCIe 2.0 likewise consumes more power than 1.0. It's simply the physical reality of higher transfer rates. SB-E is already at 130 W, so there simply isn't room in the power envelope to make every interface the highest speed available.
  • MossySF - Tuesday, November 15, 2011 - link

    We ran into this problem. Our data processing database has 1 slow SSD for a boot drive and 5 x Sandforce SATA3 SSDs in a RAID0 array ... and we can't do even half the speed the SSDs can run at.

    You might say why would a non-enterprise user be using this many SSDs? Uh, why would a non-enterprise user be running this obscenely fast computer? You need this much speed to play Facebook Farmville?
  • ltcommanderdata - Monday, November 14, 2011 - link

    Given Ivy Bridge is coming in a few months, perhaps you could comment whether SB-E is worth it even for power users at this time? Have there been indications that high-end Ivy Bridge will likewise launch much later than mainstream parts? Is LGA 2011 going to be around a while or will it need to be replaced if high-end Ivy Bridge decides to integrate an IGP for QuickSync support and as an OpenCL co-processor?
  • DanNeely - Monday, November 14, 2011 - link

    I don't think Intel's spoken publicly about IB-E yet.

    That said, Intel hasn't done socket changes for any of the other recent die shrinks so I doubt we'll see one for ivy. Incremental gains in clock speed, and possibly pushing more cores down to lower price points ($300 6 core, or $1000 8 core) are the most likely results.

    OTOH if its launch is as delayed as SB-E's was Haswell will be right around the corner and there will again be the risk of the new quad core wiping the floor with the old hex for most workloads.
