The Intel 12th Gen Core i9-12900K Review: Hybrid Performance Brings Hybrid Complexity
by Dr. Ian Cutress & Andrei Frumusanu on November 4, 2021 9:00 AM EST
CPU Tests: SPEC MT Performance - DDR5 Advantage
Multi-threaded performance is where things become very interesting for Alder Lake, where the chip can now combine its 8 P-cores with its 8 E-cores. As we saw, the 8 E-cores are nothing to sneeze at, but another larger consideration for MT performance is DDR5. While in the ST results we didn’t see much change in the performance of the cores, in MT scenarios when all cores are hammering the memory, having double the memory channels as well as 50% more bandwidth is going to be extremely beneficial for Alder Lake.
As we noted, the DDR5 vs DDR4 results showcase a very large performance gap between the two memory technologies in MT scenarios. Running a total of 24 threads, 16 for the SMT-enabled P-cores, and 8 for the E-cores, Alder Lake is able to take the performance crown in quite a lot of the workloads. There are still cases where AMD’s 16-core setup with larger cores is able to perform better, undoubtedly also partly attributable to its 64MB of on-chip cache.
Compared to the 11900K, the new 12900K showcases giant leaps, especially when paired with DDR5.
In the FP suite, the DDR5 advantage in some workloads is even larger, as the results scale beyond that of the pure theoretical +50% bandwidth improvement. What’s important for performance is not just the theoretical bandwidth, but the actual utilised bandwidth, and again, the doubled up memory channels of DDR5 here are seemingly contributing to extremely large increases, if the workload can take advantage of it.
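To make the "theoretical +50%" figure concrete, here is a minimal back-of-the-envelope sketch. It assumes JEDEC DDR4-3200 and DDR5-4800 in a standard desktop configuration with a 128-bit total data bus (2x 64-bit channels for DDR4, 4x 32-bit sub-channels for DDR5); those speeds and the function name are assumptions for illustration, not figures stated in the text above.

```python
def peak_bandwidth_gbs(transfer_rate_mts: int, total_bus_width_bits: int = 128) -> float:
    """Theoretical peak bandwidth in GB/s for a given transfer rate (MT/s).

    Assumes every transfer moves the full bus width; real, utilised
    bandwidth is always lower than this figure.
    """
    bytes_per_transfer = total_bus_width_bits // 8
    return transfer_rate_mts * bytes_per_transfer / 1000


# Assumed memory speeds (not stated in the text): DDR4-3200 and DDR5-4800.
ddr4 = peak_bandwidth_gbs(3200)  # 2 channels x 64 bit
ddr5 = peak_bandwidth_gbs(4800)  # 4 sub-channels x 32 bit, same 128-bit total
uplift = ddr5 / ddr4 - 1

print(f"DDR4-3200: {ddr4:.1f} GB/s")
print(f"DDR5-4800: {ddr5:.1f} GB/s")
print(f"Theoretical uplift: {uplift:.0%}")
```

Under these assumptions the total bus width is unchanged, so the theoretical gain comes purely from the higher transfer rate. Any result scaling beyond that points to better utilisation, for example from the larger number of independent channels, rather than to raw peak bandwidth.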
In the aggregate results, there are very clearly two conclusions, depending on whether you use the chip with DDR5 or DDR4.
With DDR4, Alder Lake, and the 12900K in particular, is able to showcase very good and solid increases in performance, thanks to the IPC gains on the Golden Cove core, but most importantly, also thanks to the extra 8 Gracemont cores, which do carry their own weight. The 12900K falls behind AMD’s 5900X with DDR4, which is fair given that the pricing of the two chips is generally in line with each other.
With DDR5, the 12900K is able to fully stretch its multi-threaded performance legs. In less memory dependent workloads, the chip battles it out with AMD’s 16-core 5950X, winning some workloads, losing some others. In more memory dependent workloads, the DDR5 advantage is extremely clear, and the 12900K is able to blow past any competition, even slightly edging out the latest Apple M1 Max, released a few weeks ago, and notable for its memory bandwidth.
474 Comments
mode_13h - Monday, November 15, 2021 - link
Do you know, for a fact, that the new scheduling policies override the priority-boost you mentioned? I wouldn't assume so, but I'm not saying they don't.
Maybe I'm optimistic, but I think MS is smart enough to know there are realtime services that don't necessarily have focus and wouldn't break that usage model.
ZioTom - Monday, November 29, 2021 - link
Windows 11 scheduler fails to allocate workloads...
I noticed that the scheduler parks the cores if the application isn't full screen.
I did a test on a 12700k with Handbrake: as long as the program window remains in the foreground, all the Pcore and Ecore are allocated at 100%. If I open a browser and use it while the movie is being compressed, the kernel takes the load off the Pcore and runs the video compression only on the Ecores. Absurd behavior, absolutely useless!
alpha754293 - Wednesday, January 12, 2022 - link
I have had my 12900K for a little less than a month now and here's what I've found from the testing that I've done with the CPU:
(Hardware notes/specs: Asus Z690 Prime-P D4 motherboard, 4x Crucial 32 GB DDR4-3200 unbuffered, non-ECC RAM (128 GB total), running CentOS 7.7.1908 with the 5.14.15 kernel)
IF your workload CAN be multithreaded and it can run on BOTH the P cores AND the E cores simultaneously, then there is a potential that you can have better performance than the 5950X. BUT if you CAN'T run your application on both the P cores and the E cores at the same time (which is the case for a number of distributed parallel applications that rely on MPI), then you WON'T be able to realise the performance advantages that having both said P cores and E cores would give you (based on what the benchmark results show).
And if your program, further, cannot use HyperThreading (which some HPC/CAE programs will actually lock you out of using), then you can be anywhere between 63-81% SLOWER than the 5950X (because on the 5950X, even with SMT disabled, you can still run the programme on all 16 physical cores, vs. the 8 P cores on the 12900K).
Please take note.
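For anyone on Linux who wants to reproduce the P-core-only case described above, here's a rough sketch of pinning a process with CPU affinity. It ASSUMES logical CPUs 0-15 are the SMT threads of the 8 P cores and 16-23 are the 8 E cores; that numbering is a guess, so check your own topology (e.g. with lscpu) before trusting it. The helper names are made up for illustration.

```python
import os

# ASSUMPTION: logical CPU numbering on a 12900K under Linux.
# Verify on your own machine; the kernel/firmware may enumerate differently.
P_CORE_CPUS = set(range(0, 16))   # 8 P cores x 2 SMT threads (assumed IDs)
E_CORE_CPUS = set(range(16, 24))  # 8 E cores, no SMT (assumed IDs)


def pick_cpus(wanted, available):
    """Return the wanted CPUs that actually exist on this machine.

    Falls back to every available CPU if there is no overlap, so the
    process is never left with an empty affinity mask.
    """
    chosen = set(wanted) & set(available)
    return chosen or set(available)


def pin_current_process(wanted):
    """Restrict the calling process to the wanted CPUs (Linux only)."""
    available = os.sched_getaffinity(0)
    os.sched_setaffinity(0, pick_cpus(wanted, available))
    return os.sched_getaffinity(0)


if __name__ == "__main__" and hasattr(os, "sched_setaffinity"):
    # Pin this process (and any children it launches later) to the P cores.
    print("Now running on CPUs:", sorted(pin_current_process(P_CORE_CPUS)))
```

Child processes inherit the affinity mask, so a launcher script that pins itself this way and then starts the real workload keeps that workload off the E cores. It does not disable HyperThreading; for one thread per P core you would have to pick one logical CPU per physical core.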
alceryes - Wednesday, August 24, 2022 - link
Question.
Did you use 'affinities' for all the different core tests (P-core only, P+E-core tests)?