Windows 7 Application Performance

3dsmax 9

Today's desktop processors are more than fast enough to do professional-level 3D rendering at home. To look at performance under 3dsmax we ran the SPECapc 3dsmax 8 benchmark (only the CPU rendering tests) under 3dsmax 9 SP1. The results reported are the rendering composite scores.

3dsmax r9 - SPECapc 3dsmax 8 CPU Test

Cinebench 11.5

Created by the Cinema 4D folks, Cinebench is a popular 3D rendering benchmark that gives us both single- and multi-threaded 3D rendering results.

Cinebench 11.5 - Single Threaded

With only a 100MHz clock speed advantage over a 2600K when running in single core turbo mode, the 3820 isn't much faster than the 2600K in our single threaded Cinebench test. The additional L3 cache doesn't have much of an impact here, although I suspect that has more to do with this particular workload rather than a general statement about the 3820. Let's look at multithreaded perf:

Cinebench 11.5 - Multi-Threaded

The performance gap increases to 5% once we ramp up thread count. The extra performance is mostly due to clock speed here, although you'll see later on that there are some applications that definitely appreciate the larger L3 cache.

7-Zip Benchmark

While Cinebench shows us multithreaded floating point performance, the 7-zip benchmark gives us an indication of multithreaded integer performance:

7-zip Benchmark

The 7-zip benchmark gives us a good example of what the SNB-E platform can offer given the right workload. Here we see an 8.6% performance advantage, despite a much smaller clock speed advantage. The added L3 cache helps out a bit here, although obviously there's a huge gap between the 3820 and its hexa-core brethren.
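To make the "performance advantage vs. clock speed advantage" comparison concrete, here is a minimal sketch of the arithmetic. It assumes performance would scale linearly with frequency, which is a simplification, and the two clock values are hypothetical placeholders rather than measured frequencies from this review. The function names are ours, for illustration only.

```python
def percent_gain(new, old):
    """Relative advantage of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


def residual_gain(perf_gain_pct, clock_gain_pct):
    """Percent gain left over after dividing out the clock advantage.

    Assumes (simplistically) that performance scales linearly with clock,
    so whatever remains is attributed to everything else, such as cache.
    """
    perf_ratio = 1.0 + perf_gain_pct / 100.0
    clock_ratio = 1.0 + clock_gain_pct / 100.0
    return (perf_ratio / clock_ratio - 1.0) * 100.0


# Hypothetical clocks in GHz, chosen only to illustrate the calculation.
clock_advantage = percent_gain(3.6, 3.5)

# 8.6 is the 7-zip advantage quoted in the text above.
leftover = residual_gain(8.6, clock_advantage)
print(f"clock advantage: {clock_advantage:.1f}%, not explained by clock: {leftover:.1f}%")
```

If the leftover figure is clearly above zero, something other than frequency (here, plausibly the larger L3 cache) is contributing.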

PAR2 Benchmark

Par2 is an application used for reconstructing downloaded archives. It can generate parity data from a given archive and later use it to recover the archive.

Chuchusoft took the source code of par2cmdline 0.4 and parallelized it using Intel’s Threading Building Blocks 2.1. The result is a version of par2cmdline that can spawn multiple threads to repair par2 archives. For this test we took a 708MB archive, corrupted nearly 60MB of it, and used the multithreaded par2cmdline to recover it. The scores reported are the repair and recover time in seconds.
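The idea of generating parity data and later using it for recovery can be sketched with simple XOR parity. This is a deliberately simplified illustration, not how par2cmdline works internally: PAR2 uses Reed-Solomon coding, which can rebuild several damaged blocks, whereas the XOR scheme below can rebuild only one. The function names and sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(blocks):
    """Generate one parity block from all data blocks."""
    return xor_blocks(blocks)


def recover_block(blocks, parity, missing_index):
    """Rebuild the single block at `missing_index` from the survivors and the parity."""
    survivors = [b for i, b in enumerate(blocks) if i != missing_index]
    return xor_blocks(survivors + [parity])


# Toy "archive" split into equal-sized blocks (illustrative data only).
data = [b"ANAN", b"DTEC", b"H123"]
parity = make_parity(data)

# Simulate losing the middle block, then repair it from the parity data.
damaged = list(data)
damaged[1] = None
restored = recover_block(damaged, parity, 1)
print(restored == data[1])
```

Because the repair step is independent for each byte position, this kind of work splits naturally across threads, which is what makes a multithreaded repair tool a reasonable CPU test.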

Par2 - Multi-Threaded par2cmdline 0.4

In tests that have more of an IO influence, the difference between the 3820 and the 2600K is negligible. It will take higher clock speeds and more cores to really separate SNB-E from the vanilla SNB systems.

TrueCrypt Benchmark

TrueCrypt is a very popular encryption package that offers full AES-NI support. The application also features a built-in encryption benchmark that we can use to measure CPU performance:

AES-128 Performance - TrueCrypt 7.1 Benchmark

Encryption speed once again scales with core count and clock speeds; the additional L3 cache doesn't do much in this benchmark.

x264 HD 3.03 Benchmark

Graysky's x264 HD test uses x264 to encode a 4Mbps 720p MPEG-2 source. The focus here is on quality rather than speed, thus the benchmark uses a 2-pass encode and reports the average frame rate in each pass.

x264 HD Benchmark - 1st pass - v3.03

We see a slight advantage over the 2600K in our x264 HD benchmark; however, video transcoding doesn't benefit all that much from the small gains the 3820 offers. Most client users would be better off with the Quick Sync-enabled 2600K, and the serious video professionals will want to invest in a six-core 3930K at the minimum.

x264 HD Benchmark - 2nd pass - v3.03

Compile Chromium Test

You guys asked for it and finally I have something I feel is a good software build test. Using Visual Studio 2008 I'm compiling Chromium. It's a pretty huge project that takes over forty minutes to compile from the command line on the Core i3 2100. But the results are repeatable and the compile process will stress all 12 threads at 100% for almost the entire time on a 980X so it works for me.

Build Chromium Project - Visual Studio 2008

Again we see a step function improvement when moving from four to six cores in our compile test, but no change between the 2600K and 3820. If you're building a dev workstation you're going to either want to save money and grab a 2600K or move to six cores for better performance. It is worth mentioning however that if you need eight DIMM slots the 3820 might be a better option than the 2600K, allowing you to outfit your workstation with insane amounts of memory.

Excel Monte Carlo

Microsoft Excel 2007 SP1 - Monte Carlo Simulation

Our Monte Carlo simulation test is CPU bound, but the 3820 shows only a marginal improvement over the 2600K.

SYSMark 2007 & 2012

Although not the best indication of overall system performance, the SYSMark suites do give us a good idea of lighter workloads than we're used to testing. SYSMark 2007 is a better indication of low thread count performance, although 2012 isn't tremendously better in that regard.

In 2007 we see mild gains over the 2600K, although 2012 shows a much bigger gap between the 3820 and the 2500K due to the former's support for 8 threads vs. 4.

SYSMark 2007 - Overall

SYSMark 2012 - Overall

83 Comments

  • 14ccKemiskt - Thursday, December 29, 2011

    Exactly. The original line-up for the enthusiast platform (LGA1366) was 920 ($280), 940 ($560), 965 ($999). That has since then gradually transformed (by price level):

    ~$280: 920 > 930 > 950 > 960 > 3820
    ~$560: 940 > 950 > 960 > 970 > 980 > 3930K
    ~$999: 965 > 975 > 980X > 990X > 3960X

    The big "winner" on the enthusiast platform(s) is the $560 part that has gone from being a locked quad-core 2.9 GHz chip to an unlocked hex-core 3.2 GHz one.

    But it is fair that the 920 has got its successor. And if you want a lot of RAM, don't need the internal graphics or want the option to upgrade your CPU, the LGA2011+3820 is as good a choice as the LGA1155+2700K. Remember that we may well see an octa-core IVB-E within a year or so and LGA2011 will be the only platform to put it into.
  • rgallant - Thursday, December 29, 2011

    have an i7-920 and 2 x 580, + 2 x gtx 285's lying around.
    -down the road might want to use a 3rd card, either a 580 or a gtx 285 [PhysX]
    -so sb = x8,x4,x4 pci-e 2.0
    -so ib = x8,x4,x4 pci-e 2.0 - need all 3.0 cards for x8,x4,x4 pci-e 3.0

    -so 40 lanes looks better to me, not = to sb\ib.
    -come Jan.09 hope to see some sb benches with 2 x 7970 and a 3rd card at x8,x4,x4, and at x8+x8, then some 2 x x16 3.0 + x8 on a 79x system.
    -ib will not have a nv200 chip to give more lanes, as it does not do pci-e 3.0.
  • DanNeely - Thursday, December 29, 2011

    I'd be shocked if nVidia doesn't launch a PCIe 3.0 successor to the nv200...
  • dj christian - Thursday, January 05, 2012

    I did not understand a thing you just wrote.
  • tpi2009 - Friday, December 30, 2011

    Hi Anand,

    could you tell us what the latency of the 10 MB L3 cache in the i7 3820 is? In the 3960X review the latencies for the i7 3930K and 3960X were a bit higher compared to Sandy Bridge, given their bigger size, and the main memory access latency was also higher.

    Given that the i7 3820 is not an eight core chip with disabled cores and cache, I was wondering what latency the cache and main memory access have. Close to Sandy Bridge? Close to the i7 3930K and 3960X?

    Thanks!
  • HMTK - Friday, December 30, 2011

    Looks like a nice cheap CPU for a virtualization setup if it has all the necessary hardware activated.
  • SunLord - Friday, December 30, 2011

    Just looking at that transistor count pretty much shows exactly why AMD isn't as good as Intel: they keep failing at trying to do more with less. They'd probably have far better luck trying to do more in the same amount of transistors.
  • Hauk - Friday, December 30, 2011

    Now for the release date..?
  • murray13 - Friday, December 30, 2011

    Your 'niche' posit has one big flaw. Not everyone builds a new system every year or even 18 months. Those of us that only build new systems every 3 to 4 years are looking more at the platform and its longevity than the single generation CPU gains.

    If someone wants to build a system (in the next couple of months) and needs it to last for 3 or 4 years, LGA2011 sure looks a lot better than LGA1155 does, at least with the current Z68 chipset. That may change with the 7x chipset upgrade coming with IVB.

    So for me the real question is do I build when the 3820 comes out or do I wait and build when IVB comes out, assuming IVB brings with it a 7x chipset...

    I'm leaning heavily toward LGA2011 right now. Maybe I'm one of those 'niche' people.
  • descendency - Saturday, December 31, 2011

    Nah. Anand is right. The performance gap between SBE and SB isn't big enough in the vast majority of applications (especially consumer applications, i.e. games). 3-4 years or not.

    You will only see a performance gap increase at the ultra high end of the market, regardless of what year it is. So unless you are predicting that in the next 3-4 years the ultra high end needs of today become the midrange needs of tomorrow (something I would say, from a software engineer's perspective, is far from likely), I'd suggest you buy an SB instead of an SBE.

    I'm running a 3 year old AMD system fine. (speaking of which... might be time to upgrade lol)