Section by Dr. Ian Cutress (Original article)

Windows Optimizations

One of the key pain points for non-Intel processors running Windows has been the optimizations and scheduler arrangements in the operating system. We’ve seen in the past how Windows has not been kind to non-Intel microarchitecture layouts, such as AMD’s previous module design in Bulldozer, the Qualcomm hybrid CPU strategy with Windows on Snapdragon, and more recently the multi-die arrangements on Threadripper that introduce different memory latency domains into consumer computing.

Obviously AMD has a close relationship with Microsoft when it comes to identifying a non-regular core topology within a processor, and the two companies work towards ensuring that thread and memory assignments, absent program-driven direction, make the most of the system. With the Windows 10 May 2019 Update, some additional features have been put in place to get the most out of the upcoming Zen 2 microarchitecture and Ryzen 3000 silicon layouts.

The optimizations come on two fronts, both of which are reasonably easy to explain.

Thread Grouping

The first is thread allocation. When a processor has different ‘groups’ of CPU cores, there are different ways in which threads are allocated, all of which have pros and cons. The two extremes for thread allocation come down to thread grouping and thread expansion.

Thread grouping means that as new threads are spawned, they are allocated onto cores directly next to cores that already have threads. This keeps the threads close together for thread-to-thread communication, but it can create regions of high power density, especially when there are many cores on the processor but only a couple are active.

Thread expansion means that threads are placed on cores as far away from each other as possible. In AMD’s case, this would mean a second thread spawning on a different chiplet, or a different core complex/CCX, as far away as possible. This allows the CPU to maintain high performance by avoiding regions of high power density, typically providing the best turbo performance across multiple threads.

The danger of thread expansion comes when a program spawns two threads that end up on different sides of the CPU. In Threadripper, this could even mean that the second thread landed on a part of the CPU with a long memory latency, causing an imbalance in the potential performance between the two threads, even though the cores those threads were on would have been at the higher turbo frequency.

Because modern software, and video games in particular, now spawns multiple threads rather than relying on a single thread, and those threads need to talk to each other, AMD is moving from a hybrid thread expansion technique to a thread grouping technique. This means that one CCX will fill up with threads before another CCX is even accessed. AMD believes that despite the potential for high power density within one chiplet while the other sits inactive, the change is still worth it for overall performance.
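To make the difference between the two extremes concrete, here is a minimal toy sketch in Python. It is not how the Windows scheduler is implemented: the function names, the four-CCX, three-cores-per-CCX layout, and the tie-breaking rule are all assumptions chosen purely for illustration. It places threads on CCXs under each policy and counts how many thread pairs would have to communicate across CCX boundaries.

```python
from typing import List


def place_threads(n_threads: int, ccx_count: int, cores_per_ccx: int,
                  policy: str) -> List[int]:
    """Return the CCX index chosen for each thread, in spawn order.

    Toy model only: 'grouping' fills one CCX before touching the next,
    'expansion' always picks the least-loaded CCX (lowest index on ties).
    """
    if n_threads > ccx_count * cores_per_ccx:
        raise ValueError("more threads than cores in this toy model")
    load = [0] * ccx_count          # threads currently placed on each CCX
    placement: List[int] = []
    for _ in range(n_threads):
        if policy == "grouping":
            # First CCX that still has a free core.
            ccx = next(i for i in range(ccx_count) if load[i] < cores_per_ccx)
        elif policy == "expansion":
            # Spread out: least-loaded CCX wins.
            ccx = min(range(ccx_count), key=lambda i: load[i])
        else:
            raise ValueError(f"unknown policy: {policy}")
        load[ccx] += 1
        placement.append(ccx)
    return placement


def cross_ccx_pairs(placement: List[int]) -> int:
    """Count thread pairs that sit on different CCXs."""
    n = len(placement)
    return sum(1 for a in range(n) for b in range(a + 1, n)
               if placement[a] != placement[b])


if __name__ == "__main__":
    # Hypothetical layout: 4 CCXs with 3 active cores each, 4 threads spawned.
    for policy in ("grouping", "expansion"):
        p = place_threads(4, 4, 3, policy)
        print(policy, p, "cross-CCX pairs:", cross_ccx_pairs(p))
```

In this toy layout, grouping keeps three of the four threads on one CCX, so only half of the thread pairs cross a CCX boundary, while expansion puts every pair on different CCXs. The sketch ignores power density and turbo behavior entirely, which is the other side of the trade-off described above.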

For Matisse, this should afford a nice improvement for limited thread scenarios, and on the face of it, gaming. It will be interesting to see how much of an effect this has on the upcoming EPYC Rome CPUs or future Threadripper designs. The single benchmark AMD provided in its explanation was Rocket League at 1080p Low, which reported a +15% frame rate gain.

Clock Ramping

Readers familiar with our Skylake microarchitecture deep dive may remember that Intel introduced a new feature called Speed Shift that enabled the processor to adjust between different P-states more freely, as well as ramp from idle to load very quickly: from 100 ms down to 40 ms in the first version in Skylake, then down to 15 ms with Kaby Lake. It did this by handing P-state control back from the OS to the processor, which reacted based on instruction throughput and request. With Zen 2, AMD is now enabling the same feature.

AMD already has significantly finer granularity in its frequency adjustments than Intel, allowing for 25 MHz steps rather than 100 MHz steps. Enabling a faster ramp-to-load frequency jump, however, is going to help AMD in very burst-driven workloads, such as WebXPRT (Intel’s favorite for this sort of demonstration). According to AMD, the way this has been implemented with Zen 2 will require BIOS updates as well as moving to the Windows 10 May 2019 Update, but it will reduce frequency ramping from ~30 milliseconds on Zen to ~1-2 milliseconds on Zen 2. It should be noted that this is much faster than the numbers Intel tends to provide.
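To see why ramp time matters for bursty workloads, here is a rough back-of-the-envelope sketch in Python. It assumes a purely linear ramp from an idle clock to a boost clock, which is a simplification and not a description of how either vendor's hardware actually steps through frequencies. The function name, the 2000 MHz idle clock, the 4000 MHz boost clock, and the 100-million-cycle burst are all hypothetical; only the ~30 ms and ~2 ms ramp times come from AMD's figures quoted above.

```python
import math


def burst_time_ms(work_mcycles: float, f_idle_mhz: float,
                  f_boost_mhz: float, ramp_ms: float) -> float:
    """Milliseconds needed to retire `work_mcycles` million cycles.

    Assumed model: the clock rises linearly from f_idle_mhz to f_boost_mhz
    over ramp_ms, then stays at f_boost_mhz.
    """
    if not (0 < f_idle_mhz < f_boost_mhz):
        raise ValueError("expected 0 < f_idle_mhz < f_boost_mhz")
    if ramp_ms < 0 or work_mcycles < 0:
        raise ValueError("ramp_ms and work_mcycles must be non-negative")

    # 1 MHz is 0.001 million cycles per millisecond.
    idle = f_idle_mhz / 1000.0
    boost = f_boost_mhz / 1000.0
    if ramp_ms == 0:
        return work_mcycles / boost

    # Cycles retired during the whole ramp: area under the linear ramp.
    ramp_work = 0.5 * (idle + boost) * ramp_ms
    if work_mcycles >= ramp_work:
        return ramp_ms + (work_mcycles - ramp_work) / boost

    # Burst finishes mid-ramp: solve 0.5*slope*t^2 + idle*t = work for t.
    slope = (boost - idle) / ramp_ms
    return (-idle + math.sqrt(idle * idle + 2.0 * slope * work_mcycles)) / slope


if __name__ == "__main__":
    # Hypothetical burst of 100 million cycles, idle 2000 MHz, boost 4000 MHz.
    for ramp in (30.0, 2.0, 0.0):
        t = burst_time_ms(100.0, 2000.0, 4000.0, ramp)
        print(f"ramp {ramp:4.1f} ms -> burst finishes in {t:.1f} ms")
```

Under these made-up numbers, the burst takes about 32.5 ms with a 30 ms ramp and about 25.5 ms with a 2 ms ramp, against 25 ms for an instantaneous jump. The shorter the burst, the larger the share of its runtime spent below the boost clock, which is why short, bursty tests are the ones that respond to this change.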

The technical name for AMD’s implementation is CPPC2, or Collaborative Power Performance Control 2, and AMD’s metrics state that this can improve performance in burst workloads and also in application loading. AMD cites a +6% performance gain in application launch times using PCMark10’s app launch sub-test.

Hardened Security for Zen 2

Another aspect of Zen 2 is AMD’s approach to the heightened security requirements of modern processors. As has been reported, a good number of the recent array of side channel exploits do not affect AMD processors, primarily because of how AMD manages its TLBs, which have always required additional security checks, even before most of this became an issue. Nonetheless, for the issues to which AMD is vulnerable, it has implemented a full hardware-based security platform.

The change here is for Speculative Store Bypass, known as Spectre v4, for which AMD now has additional hardware that works in conjunction with the OS or virtual machine managers such as hypervisors in order to control it. AMD doesn’t expect any performance change from these updates. Newer issues such as Foreshadow and Zombieload do not affect AMD processors.


449 Comments


  • PProchnow - Friday, July 12, 2019

    Here's jus' a good ol' boy trying out. No OC, stock multi, but 3333MHz RAM.
    #1
    https://browser.geekbench.com/v4/cpu/13863634

    Rather a new rig and it is X470 up to the A.A BIOS and it is MSI Gaming Plus.
    OK link #2 is here and I stroked the DDR4 up to 3333MHz. I also stroked the fan
    to stay sub 70C. Wild OCs will take water at least "in The Home" versus LiqN2 Lab.

    https://browser.geekbench.com/v4/cpu/13865361

    BTW where is the Bragging Thread? My MOBO is the MSI X470 Gaming Plus, BIOS A.A makes Ryzen 9 go BTW.
    I have yet to up the MULTI in case you want to know. I wonder what good OCers will get with the right stuff.

    Single-Core Performance
    Memory Score: 6431
    Floating Point Score: 5409
    Integer Score: 5190
    Crypto Score: 6888
    Single-Core Score: 5589

    You understand that RAM set at 1672 is 1/2 the commonly referred to speed. 3344MHz is the common nomenclature.

    Single-Core Score: 5589
    Multi-Core Score: 47755
    Geekbench 4.3.4 Tryout for Windows x86 (64-bit)
    Result Information
    Upload Date: July 12 2019 08:16 PM
    Views: 2
    System Information
    Operating System: Microsoft Windows 10 Pro (64-bit)
    Model: Micro-Star International Co., Ltd. MS-7B79
    Motherboard: Micro-Star International Co., Ltd. X470 GAMING PLUS (MS-7B79)
    Memory: 32768 MB DDR4 SDRAM 1672MHz
    Northbridge: AMD Ryzen SOC 00
    Southbridge: AMD X470 51
    BIOS: American Megatrends Inc. A.A0
    Processor Information
    Name: AMD Ryzen 9 3900X
    Topology: 1 Processor, 12 Cores, 24 Threads
    Identifier: AuthenticAMD Family 23 Model 113 Stepping 0
    Base Frequency: 3.80 GHz
    Maximum Frequency: 4.53 GHz
  • Maxiking - Tuesday, July 23, 2019

    Why would anyone brag about something if

    You can't reach 5.0GHz+
    You can't reach even the boost frequency on a single core
    You can't consistently beat the competitor's older 14nm CPU architecture, which has been on the market since 2016...
    You can't beat RAM OC'ing records either, because above 3733MHz the IF actually gets downclocked, and due to that "faster" RAM performs worse unless you OC to 7400MHz, which is not possible even with liquid nitrogen.
  • Meteor2 - Monday, July 15, 2019

    Nice!
  • willis936 - Wednesday, July 17, 2019

    The editor's choice awards are a bit strange to me. Zen 1 didn't receive one even though it was the largest CPU performance increase from a company this century. The i7-4950HQ received an editor's choice silver award even though it had little importance to the industry. And the 3700X, which offers comparable SP performance to competing Intel products at a huge discount and smaller power budget, gets the same editor's choice level as the i7-4950HQ?
  • willis936 - Wednesday, July 17, 2019

    I know it was a different editor at the time, but the selective excitement is a bit of a bummer. eDRAM was exciting to see at the time and then nothing ever came of it. The enthusiasm for chiplets under the new editor comes through much less. That too is fine. However, if the rating system is what it is, then I don't think it's much to argue that chiplets are much more disruptive than eDRAM and are already making much larger waves.
  • Maxiking - Monday, July 22, 2019

    AMD fraud finally getting the attention it deserves

    https://www.youtube.com/watch?v=x03FyPQ3a3E

    check at 05m25s
  • Korguz - Monday, July 22, 2019

    Maxiking, and HOW LONG till Intel gets the SAME treatment?? Saying a processor uses x watts, but in reality it uses 50 to 100 watts MORE, isn't FRAUD??? Hell, you confine Intel's CPUs to the watts they state, and their performance goes DOWN THE TOILET!!! Again, you KEEP saying AMD is a fraud, but you STILL refuse to admit that Intel is a fraud as well.

    Does this guy even acknowledge the issue with Intel and the amount of power they "say" their CPUs use, and how much power they REALLY use??
  • Korguz - Monday, July 22, 2019

    Further, Intel doesn't do any marketing, cause they DON'T want the general average user to know the CPU they bought uses MORE power than has been stated. THAT also is false advertising. Come on Maxiking, go after Intel as well, for the same things you are accusing AMD of...
  • Maxiking - Tuesday, July 23, 2019

    You are uneducated. TDP doesn't mean power consumption but the amount of heat dissipated; it tells you how much heat the cooler must be able to dissipate in order to keep the CPU cool enough to run.

    Get it? The 1700X TDP was 95W, yet there were tasks where it managed to consume 120 or even 140W on stock settings. Like, do you even watch reviews? It was the same with the 2700X.

    but mimimimimimi AMD good mimimimimi Intel bad
