SMT Integer Performance With SPEC CPU2006

Next, to test the performance impact of simultaneous multithreading (SMT) on a single core, we test with two threads on the same core. This way we can evaluate how well the core handles SMT. 

Subtest Application type Xeon E5-2690 @ 3.8 Xeon E5-2690 v3 @ 3.5 Xeon E5-2699 v4 @ 3.6 EPYC 7601 @3.2 Xeon 8176 @ 3.8
400.perlbench Spam filter 39.8 43.9 47.2 40.6 55.2
401.bzip2 Compression 32.6 32.3 32.8 33.9 34.8
403.gcc Compiling 40.7 43.8 32.5 41.6 32.1
429.mcf Vehicle scheduling 44.7 51.3 55.8 44.2 56.6
445.gobmk Game AI 36.6 35.9 38.1 36.4 39.4
456.hmmer Protein seq. analyses 32.5 34.1 40.9 34.9 44.3
458.sjeng Chess 36.4 36.9 39.5 36 41.9
462.libquantum Quantum sim 75 73.4 89 89.2 91.7
464.h264ref Video encoding 52.4 58.2 58.5 56.1 75.3
471.omnetpp Network sim 25.4 30.4 48.5 26.6 42.1
473.astar Pathfinding 31.4 33.6 36.6 29 37.5
483.xalancbmk XML processing 43.7 53.7 78.2 37.8 78

Now on a percentage basis versus the single-threaded results, so that we can see how much performance we gained from enabling SMT:

Subtest Application type Xeon E5-2699 v4 @ 3.6 EPYC 7601 @3.2 Xeon 8176 @ 3.8
400.perlbench Spam filter 109% 131% 110%
401.bzip2 Compression 137% 141% 128%
403.gcc Compiling 137% 119% 131%
429.mcf Vehicle scheduling 125% 110% 131%
445.gobmk Game AI 125% 150% 127%
456.hmmer Protein seq. analyses 127% 125% 125%
458.sjeng Chess 120% 151% 125%
462.libquantum Quantum sim 91% 129% 90%
464.h264ref Video encoding 101% 112% 112%
471.omnetpp Network sim 109% 116% 103%
473.astar Pathfinding 140% 149% 137%
483.xalancbmk XML processing 120% 107% 116%

On average, both Xeons pick up about 20% due to SMT (Hyperthreading). The EPYC 7601 improved by even more: it gets a 28% boost on average. There are many possible explanations for this, but two are the most likely. In the situation where AMD's single threaded IPC is very low because it is waiting on the high latency of a further away L3-cache (>8 MB), a second thread makes sure that the CPU resources can be put to better use (like compression, the network sim). Secondly, we saw that AMD core is capable of extracting more memory bandwidth in lightly threaded scenarios. This might help in the benchmarks that stress the DRAM (like video encoding, quantum sim). 

Nevertheless, kudos to the AMD engineers. Their first SMT implementation is very well done and offers a tangible throughput increase. 

Single Threaded Integer Performance: SPEC CPU2006 Multi-core SPEC CPU2006
POST A COMMENT

219 Comments

View All Comments

  • JKflipflop98 - Wednesday, July 12, 2017 - link

    For years I thought you were just really committed to playing the "dumb AMD fanbot" schtick for laughs. It's infinitely more funny now that I know you've actually been *serious* this entire time. Reply
  • ddriver - Wednesday, July 12, 2017 - link

    Whatever helps you feel better about yourself ;) I bet it is funny now, that AT have to carefully devise intel biased benches and lie in its reviews in hopes intel at least saves face. BTW I don't have a single amd CPU running ATM. Reply
  • WinterCharm - Thursday, July 13, 2017 - link

    Uh, what are you smoking? this is a pretty even piece. Reply
  • boozed - Tuesday, July 11, 2017 - link

    You haven't done your job properly unless you've annoyed the fanboys (and perhaps even fangirls) for both sides! Reply
  • JohanAnandtech - Wednesday, July 12, 2017 - link

    Wise words. Indeed :-) Reply
  • Ranger1065 - Wednesday, July 12, 2017 - link

    If you are referring to ddriver, I agree, wise words indeed. Reply
  • ddriver - Wednesday, July 12, 2017 - link

    Well, that assumption rests on the presumption that the point of reviews is to upsed fanboys.

    I'd say that a "review done right" would include different workload scenarios, there is nothing wrong with having one that will show the benefits of intel's approach to doing server chips, but that should be properly denoted, and should be just one of several database tests and should be accompanied by gigabytes of databases which is what we use in real world scenarios.
    Reply
  • CoachAub - Wednesday, July 12, 2017 - link

    It was mentioned more than once that this review was rushed to make a deadline and that the suite of benchmarks were not everything they wanted to run and without optimizations or even the usual tweaks an end-user would make to their system. So, keep that in mind as you argue over the tests and different scenarios, etc. Reply
  • ddriver - Thursday, July 13, 2017 - link

    It doesn't take a lot of time to populate a larger database so that you can make a benchmark that involves an actual real world usage scenario. It wasn't the "rushing" that prompted the choice of database size... Reply
  • mpbello - Friday, July 14, 2017 - link

    If you are rushing, you reduce scope and deliver fewer pieces with high quality instead of insisting on delivering a full set of benchmarks that you are not sure about its quality.
    The article came to a very strong conclusion: Intel is better for database scenarios. Whatever you do, whether you are rushing or not, you cannot state something like that if the benchmarks supporting your conclusion are not well designed.
    So I agree that the design of the DB benchmark was incredibly weak to sustain such an important conclusion that Intel is the best choice for DB applications.
    Reply

Log in

Don't have an account? Sign up now