SPEC CPU - Multi-Threaded Performance

Moving on to the multi-threaded SPEC CPU 2017 results, these are the same workloads as in the single-threaded test (we purposefully avoid the Speed variants of the workloads in ST tests). The key to performance here is not only microarchitecture or core count, but also the overall power efficiency of the system and the level of performance we can fit into the thermal envelope of the device we’re testing.

Note that among the four chips I put into the graph, the i9-11980HK is the only one at a 45W TDP, while the AMD competition lands at 35W and the i7-1185G7 comes in at a lower 28W. The test takes several hours of runtime (6 hours for this TGL-H SKU) under constant full load, so shorter-duration boost mechanisms don’t come into play here.

SPECint2017 Rate-N Estimated Scores

Generally as expected, the 8-core TGL-H chip leaves its 4-core TGL-U sibling in the dust, in many cases showcasing well over double the performance. The i9-11980HK also fares extremely well against the AMD competition in workloads that are more DRAM- or cache-heavy, but falls behind in workloads that are more core-local and bound by execution throughput. That would make for a fairly even battle between the designs, if it weren’t for the fact that the AMD systems are running at 22% lower TDPs (35W versus 45W).

SPECfp2017 Rate-N Estimated Scores

In the floating-point multi-threaded suite, we again see a similar competitive scenario where the TGL-H system battles against the best Cezanne and Renoir chips.

What’s rather odd in these results is that 503.bwaves_r and 549.fotonik3d_r perform far below the numbers we were able to measure on the TGL-U system. I think what’s happening here is that we’re hitting DRAM memory-level parallelism limits, with the smaller TGL-U system and its 8x16b LPDDR4 channel memory configuration allowing for more parallel transactions than the 2x64b DDR4 channels on the TGL-H system.
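To make the channel argument concrete, here is a minimal sketch that compares the two memory configurations named above. It uses only the channel counts and widths from the text; the `DramConfig` class is a hypothetical helper, and treating the number of independent channels as a proxy for memory-level parallelism is a simplifying assumption that ignores banks, ranks, timings and data rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DramConfig:
    """Hypothetical description of a DRAM setup: channel count and width."""
    name: str
    channels: int
    bits_per_channel: int

    @property
    def total_bus_bits(self) -> int:
        # Aggregate data-bus width across all channels.
        return self.channels * self.bits_per_channel


# Figures taken from the article text.
tgl_u = DramConfig("TGL-U LPDDR4", channels=8, bits_per_channel=16)
tgl_h = DramConfig("TGL-H DDR4", channels=2, bits_per_channel=64)

# Assumption: more independent channels allow more transactions in flight.
channel_ratio = tgl_u.channels / tgl_h.channels

for cfg in (tgl_u, tgl_h):
    print(f"{cfg.name}: {cfg.channels} channels, {cfg.total_bus_bits}-bit total bus")
print(f"Independent channel ratio (TGL-U / TGL-H): {channel_ratio:.1f}x")
```

Both configurations add up to the same total bus width, so under these assumptions the difference lies in how many independent channels can service requests at once, not in raw width.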

SPEC2017 Rate-N Estimated Total

In terms of overall performance, the 45W 11980HK actually ends up losing to AMD’s Ryzen 9 5980HS even with 10W more TDP headroom, at least in the integer suite.

We had also initially run the suite in 65W mode, and the results aren’t very good at all, especially compared to the 45W results. For +40-44% TDP, the i9-11980HK in Intel’s reference laptop only performs +9.4% better. This is likely due to the aforementioned heavy thermal throttling the system falls into, with long periods of time in a 35W state, which pulls performance well below the expected figures. I have to be explicit that these 65W results are not representative of the full 65W performance capabilities of the 11980HK, only of this particular thermal solution within this Intel reference design.
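As a rough illustration of how poorly the 65W mode scales, the sketch below works through the arithmetic using the nominal TDP values and the +9.4% gain quoted above. The function name `tdp_scaling` is hypothetical, and the nominal TDP is assumed to stand in for actual power draw, which it does not do precisely here given the throttling described.

```python
def tdp_scaling(base_tdp_w: float, high_tdp_w: float, perf_gain: float) -> dict:
    """Compare a performance gain against the nominal TDP increase behind it.

    perf_gain is a fraction, e.g. 0.094 for +9.4%.
    """
    tdp_ratio = high_tdp_w / base_tdp_w
    perf_ratio = 1.0 + perf_gain
    return {
        "tdp_increase_pct": (tdp_ratio - 1.0) * 100.0,
        # Performance per nominal watt, relative to the base configuration.
        "perf_per_watt_ratio": perf_ratio / tdp_ratio,
    }


# Figures from the article: 45W vs 65W mode, +9.4% performance.
result = tdp_scaling(base_tdp_w=45.0, high_tdp_w=65.0, perf_gain=0.094)
print(f"Nominal TDP increase: +{result['tdp_increase_pct']:.1f}%")
print(f"Relative perf per nominal watt: {result['perf_per_watt_ratio']:.2f}x")
```

Under these assumptions, performance per nominal watt in 65W mode comes out at roughly three quarters of the 45W figure.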

229 Comments

  • Qasar - Monday, May 17, 2021

    " Name a single workload where the spec results line up with application performance"
    post a single link that shows you are right, and Andrei is wrong, as so far, it seems you are just typing FUD.
    personally, im going with Andrei.
  • Spunjji - Tuesday, May 18, 2021

    Why don't you name some where it doesn't, given that you're the one making the extraordinary claim here?
  • Andrei Frumusanu - Monday, May 17, 2021

    I've added in the text to those pages now, and I explain why they would end up like that.

    The TGL-H system has half the memory level parallelism with its 2x64 DDR4 channels versus the 4x16b LPDDR4 channels of the TGL system, and those two workloads are characterised by heavy parallelised memory bandwidth.

    We've seen a 66% performance difference on a 5950X between 2x SR and 4x SR DIMM memory in the MT test, it all depends on the DRAM configuration and what kind of parallelism it allows.

    Our testing is correct and we have the correct understanding of the microarchitectures and workloads.
  • vyor - Monday, May 17, 2021

    Thanks for finally actually giving reasons, should have been there before publishing.

    And no, no it isn't. You don't even publish your compiler settings.
  • Andrei Frumusanu - Monday, May 17, 2021

    The compiler settings are literally on the SPEC page and have been there the whole time, and have been set in stone on the Windows side for over a year now for every article.
  • vyor - Monday, May 17, 2021

    I do not believe those are the actual compiler settings. Because if they are, you fucked up hard.
  • mode_13h - Monday, May 17, 2021

    Outlier results should be investigated and understood. They might be very informative of edge cases. Or, they might indeed expose procedural errors in the testing. Either way, your attitude of dismissing them as erroneous and abusing the testers is not helpful.

    It's fine to call attention to anomalies and ask questions, but abuse is not called for and shouldn't be tolerated.
  • vyor - Monday, May 17, 2021

    Except that he's been getting called out for this for the last year+.
  • Andrei Frumusanu - Monday, May 17, 2021

    And every time I've demolished the unsubstantiated empty argument with data and facts. If you do not have any actual technical argument to make then don't make any.
  • ballsystemlord - Monday, May 17, 2021

    I have to back up Andrei here. You've only given us hyperbole so far.
    Which compiler settings do you have a problem with exactly?
    As a former Gentoo Linux user, I don't see a problem with them. Of course, -Ofast shouldn't be used in a production system -- but he is benchmarking here.
