AMD's Ryzen 9 6900HS Rembrandt Benchmarked: Zen3+ Power and Performance Scalingby Dr. Ian Cutress on March 1, 2022 9:30 AM EST
CPU Tests: SPEC Performance
SPEC2017 is a series of standardized tests used to probe the overall performance between different systems, different architectures, different microarchitectures, and setups. The code has to be compiled, and then the results can be submitted to an online database for comparison. It covers a range of integer and floating point workloads, and can be very optimized for each CPU, so it is important to check how the benchmarks are being compiled and run.
For compilers, we use LLVM both for C/C++ and Fortran tests, and for Fortran we’re using the Flang compiler. The rationale of using LLVM over GCC is better cross-platform comparisons to platforms that have only have LLVM support and future articles where we’ll investigate this aspect more. We’re not considering closed-sourced compilers such as MSVC or ICC.
clang version 10.0.0
clang version 7.0.1 (ssh://email@example.com/flang-compiler/flang-driver.git
-mfma -mavx -mavx2
Our compiler flags are straightforward, with basic –Ofast and relevant ISA switches to allow for AVX2 instructions. We decided to build our SPEC binaries on AVX2, which puts a limit on Haswell as how old we can go before the testing will fall over. This also means we don’t have AVX512 binaries, primarily because in order to get the best performance, the AVX-512 intrinsic should be packed by a proper expert, as with our AVX-512 benchmark. All of the major vendors, AMD, Intel, and Arm, all support the way in which we are testing SPEC.
To note, the requirements for the SPEC licence state that any benchmark results from SPEC have to be labeled ‘estimated’ until they are verified on the SPEC website as a meaningful representation of the expected performance. This is most often done by the big companies and OEMs to showcase performance to customers, however is quite over the top for what we do as reviewers.
In the single threaded test, the jump over the regular Zen 3 Ryzen mobile variant (5980HS) at the same power is quite substantial: +9.6% on integer performance and +14.1% on floating point. The move from DDR4 to DDR5 is quite substantial in that regard, and it’s seen in a lot of our upcoming benchmarks.
We didn’t see any change from 35 W to 45 W to 65 W in our AMD testing as the power consumption of the chip in single threaded workloads did not exceed 24 W, however we did see performance difference in Intel’s Alder Lake going from 45 W to 65 W, showcasing how much power the core can consume.
But if we compared that to Intel’s latest Alder Lake offerings, there’s a deficit in both categories – even though our lowest data here is at 45 W, we can see that the 45 W testing of the previous generation Intel also beats the 6900HS at SPECint (but AMD wins in SPECfp). This is something that carries through to multi-threaded performance.
For Multi-Threaded performance, we only saw the slightest improvement from AMD moving up to 65 W, perhaps showcasing that the hardware is limited in other ways than just power and the uplift from DDR4 to DDR5. In any event, at 35 W, AMD still surpasses what the previous generation Intel i9-11980HK can provide at 65 W.
But if we compare it to Intel’s latest Alder Lake processors, featuring 6 performance cores and 8 efficiency cores, we now have 20 threads up against AMD’s 16 threads. If we compare 45 W to 45 W, Intel has a +14.0% lead in integer and a +13.3% lead in floating point, despite the 20% increase in threads. With Intel introducing this dual tier performance with hybrid SoCs, multi-threaded performance is going to be a combination of fast+slow and it all comes down to how the system can divide up the work.