Feeding the Beast

When frequency was all that mattered for CPUs, the main problems were efficiency, thermal performance, and yields: the higher the frequency was pushed, the more voltage was needed, the further outside its peak efficiency window the CPU ran, and the more power it consumed per unit of work. For the CPU that was to sit at the top of the product stack as the performance halo part, this didn’t particularly matter – until the chip hit 90C+ on a regular basis.

Now with the Core Wars, the challenges are different. When there was only one core, making data available to that core through caches and DRAM was a relatively easy task. With 6, 8, 10, 12 and 16 cores, a major bottleneck becomes making sure each core has enough data to work continuously, rather than sitting idle while it waits for data to arrive. This is not an easy task: each core now needs a fast way of communicating with every other core, and with main memory. This is known within the industry as feeding the beast.

Top Trumps: 60 PCIe Lanes vs 44 PCIe lanes

After playing the underdog for so long, AMD has been pushing the specifications of its new processors as one of the big selling points (among others). Whereas Ryzen 7 only had 16 PCIe lanes, competing in part against CPUs from Intel that had 28/44 PCIe lanes, Threadripper will have access to 60 lanes for PCIe add-in cards. In some places this might be referred to as 64 lanes; however, four of those lanes are reserved for the X399 chipset. At $799 and $999, the Threadripper parts compete against the 44 PCIe lanes on Intel’s Core i9-7900X at $999.

The goal of having so many PCIe lanes is to support the sort of market these processors are addressing: high-performance prosumers. These are users that run multiple GPUs and multiple PCIe storage devices, and that need high-end networking, high-end storage, and as many other features as can be fit through PCIe. The end result is that we are likely to see motherboards earmark 32 or 48 of these lanes for PCIe slots (x16/x16, x8/x8/x8/x8, x16/x16/x16, x16/x8/x16/x8), followed by two or three PCIe 3.0 x4 links for storage via U.2 or M.2 drives, then faster Ethernet (5 Gbit, 10 Gbit). AMD allows each of the PCIe root complexes on the CPU, which are x16 each, to be bifurcated down to x1 as needed, for a maximum of 7 devices. The chipset, fed by its 4 PCIe lanes, will also provide several PCIe 3.0 and PCIe 2.0 lanes for SATA or USB controllers.

Intel’s strategy is different, allowing the 44 lanes to be split into x16/x16/x8 (40 lanes), x16/x8/x16/x8 (48 lanes would not fit, so this runs as 40 lanes per Intel's listing), or x16/x16 to x8/x8/x8/x8 (32 lanes), with 4-12 lanes left over for PCIe storage, faster Ethernet controllers, or Thunderbolt 3. The Skylake-X chipset then has an additional 24 PCIe lanes for SATA controllers, gigabit Ethernet controllers and USB controllers.
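To make the lane budgets above concrete, here is a minimal Python sketch that subtracts a slot layout from each platform's CPU lane count. The lane totals (60 usable on Threadripper, 44 on the Core i9-7900X) come from the text; the function name and the particular layouts checked are illustrative assumptions, not a description of any specific motherboard.

```python
# Lane totals are taken from the article; names and layouts are illustrative.
def leftover_lanes(cpu_lanes, slot_widths):
    """Return the CPU lanes left over after wiring the given PCIe slot widths."""
    used = sum(slot_widths)
    if used > cpu_lanes:
        raise ValueError(f"layout needs {used} lanes, only {cpu_lanes} available")
    return cpu_lanes - used

THREADRIPPER_LANES = 64 - 4  # 4 of the 64 lanes are reserved for the chipset
SKYLAKE_X_LANES = 44         # Core i9-7900X

layouts = {
    "x16/x16": [16, 16],
    "x16/x16/x8": [16, 16, 8],
    "x16/x16/x16": [16, 16, 16],
}

for name, widths in layouts.items():
    for platform, lanes in (("Threadripper", THREADRIPPER_LANES),
                            ("Skylake-X", SKYLAKE_X_LANES)):
        try:
            spare = leftover_lanes(lanes, widths)
            print(f"{platform:12s} {name:12s} -> {spare} lanes spare")
        except ValueError as err:
            print(f"{platform:12s} {name:12s} -> not possible ({err})")
```

Under these assumptions, a three-way x16 layout fits on Threadripper with lanes to spare for x4 storage, while the 44-lane part has to drop at least one slot to x8.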

Top Trumps: DRAM and ECC

One of Intel’s common product segmentations is that if a customer wants a high core count processor with ECC memory, they have to buy a Xeon. Typically Xeons will support a fixed memory speed depending on the number of DIMMs populated per channel (1 DIMM per channel at DDR4-2666, 2 DIMMs per channel at DDR4-2400), as well as ECC and RDIMM technologies. However, the consumer HEDT platforms for Broadwell-E and Skylake-X do not support these and use non-ECC UDIMMs only.

AMD is supporting ECC on its Threadripper processors, giving customers sixteen cores with ECC. The memory has to be UDIMMs, but the platform does support DRAM overclocking, which also boosts the speed of the internal Infinity Fabric. AMD has officially stated that the Threadripper CPUs can support up to 1 TB of DRAM, although on close inspection this requires 128GB UDIMMs, and UDIMMs currently max out at 16GB. Intel currently lists a 128GB limit for Skylake-X, based on 16GB UDIMMs.
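The capacity figures follow from simple slot arithmetic. The sketch below assumes eight DIMM slots (four channels at two DIMMs per channel, matching the quad-channel 2DPC configuration described in this section); the function name is illustrative.

```python
# Assumption: four memory channels, two DIMMs per channel = 8 slots.
DIMM_SLOTS = 4 * 2

def max_capacity_gb(dimm_gb, slots=DIMM_SLOTS):
    """Maximum DRAM capacity in GB when every slot holds a DIMM of dimm_gb."""
    return dimm_gb * slots

print(max_capacity_gb(128))  # 1024 GB, i.e. the 1 TB headline figure
print(max_capacity_gb(16))   # 128 GB with 16GB UDIMMs
```

In other words, the 1 TB figure is a platform ceiling, while the practical limit on both platforms today is set by the largest UDIMM on sale.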

Both processors run quad-channel memory at DDR4-2666 (1DPC) and DDR4-2400 (2DPC).
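For a sense of what those speeds mean, theoretical peak bandwidth can be estimated as channels × bytes per transfer × transfers per second. This sketch assumes the standard 64-bit (8-byte) DDR4 channel and uses the nominal transfer rates; real-world throughput will be lower.

```python
# Assumption: each DDR4 channel is 64 bits (8 bytes) wide.
BYTES_PER_TRANSFER = 8
CHANNELS = 4  # quad-channel, per the text

def peak_bandwidth_gbs(mega_transfers_per_s, channels=CHANNELS):
    """Theoretical peak bandwidth in GB/s (decimal) for a given DDR4 speed."""
    return channels * BYTES_PER_TRANSFER * mega_transfers_per_s / 1000

for speed in (2666, 2400):
    print(f"DDR4-{speed}: {peak_bandwidth_gbs(speed):.1f} GB/s peak")
```

That works out to roughly 85 GB/s at DDR4-2666 and 77 GB/s at DDR4-2400 across four channels.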

Top Trumps: Cache

Both AMD and Intel use private L2 caches for each core, then have a victim L3 cache before leading to main memory. A victim cache is a cache that is filled with data as it is evicted from the cache underneath it, and cannot pre-fetch data. However, the sizes of those caches and how AMD and Intel have the cores interact with them are different.

AMD uses 512 KB of L2 cache per core, backed by 8 MB of L3 victim cache per core complex of four cores. In a 16-core Threadripper, there are four core complexes, leading to a total of 32 MB of L3 cache; however, each core can directly access only the data found in its local L3. Accessing the L3 of a different complex requires additional time and snooping. As a result, latency differs depending on whether the data sits in the local L3 or in the L3 of another complex.

Intel’s Skylake-X uses 1MB of L2 cache per core, leading to a higher hit-rate in the L2, and uses 1.375MB of L3 victim cache per core. This L3 cache has associated tags, and the mesh topology used to communicate between the cores means that, like AMD, there is still time and latency associated with snooping other caches, although the latency is somewhat homogenized by the design. Nonetheless, this is different to the Broadwell-E cache structure, which had 256 KB of L2 and 2.5 MB of L3 per core, both inclusive caches.
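The per-core figures above add up as follows. This is a small sketch using the 16-core Threadripper from the text and, as an assumption for illustration, 10-core parts for both Skylake-X and Broadwell-E; the function name and dictionary keys are illustrative.

```python
# Cache sizes in MB, taken from the text. Core counts for the Intel parts
# are an assumption (10-core chips chosen for a like-for-like comparison).
def cache_totals(cores, l2_per_core_mb, l3_total_mb):
    """Return (total L2, total L3) in MB for a chip."""
    return cores * l2_per_core_mb, l3_total_mb

chips = {
    # AMD: 8 MB of L3 per four-core complex, four complexes in a 16-core chip
    "Threadripper (16C)": cache_totals(16, 0.5, 4 * 8),
    # Intel: L3 is specified per core
    "Skylake-X (10C)": cache_totals(10, 1.0, 10 * 1.375),
    "Broadwell-E (10C)": cache_totals(10, 0.25, 10 * 2.5),
}

for name, (l2, l3) in chips.items():
    print(f"{name:20s} L2 = {l2:5.2f} MB   L3 = {l3:5.2f} MB")
```

Note that the Threadripper L3 total is split into four 8 MB pools, so no single core sees 32 MB as local cache.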

The AMD Ryzen Threadripper 1950X and 1920X Review Silicon, Glue, & NUMA Too

347 Comments

  • nitin213 - Thursday, August 10, 2017

    Thanks for your reply. Hopefully the test suite can be expanded as Intel's CPUs probably also move to higher core counts and IO ranges in future.
    And I completely understand the frustration of trying to get a 3rd party to change their defaults. Cheers
  • deathBOB - Thursday, August 10, 2017

    It's clear to me . . . Ian is playing both sides and making out like a bandit! /s
  • FreckledTrout - Thursday, August 10, 2017

    Ian, can we get an updated comments section so we can +/- people, and after x number of minuses they won't show by default? I'm saying this because some of these comments (the one in this chain included) are not meaningful responses. The comments section is by far the weakest link on AnandTech.

    Nice review btw.
  • mapesdhs - Thursday, August 10, 2017

    toms has that, indeed it's kinda handy for blanking out the trolls. Whether it's any useful indicator of "valid" opinion though, well, that kinda varies. :D (there's nowt to stop the trolls from voting everything under the sun, though one option would be to auto-suspend someone's ability to vote if their own posts get hidden from down voting too often, a hands-off way of slapping the trolls)

    Given the choice, I'd much rather just be able to *edit* what I've posted than up/down-vote what others have written. I still smile recalling a guy who posted a followup to apologise for the typos in his o.p., but the followup had typos as well, after which he posted aaaaagh. :D

    Ian.
  • Johan Steyn - Thursday, August 10, 2017

    Ian thanks for at least responding, I appreciate it. Please compare your review to sites like PCPer and many others. They have no problem to also point out the weak points of TR, yet clearly understand for what TR was mostly designed and focus properly on it and even though they did not test the 64 PCI lanes as an example, mention that they are planning a follow-up to do it, since it is an important point. You do mention these as well, but could have said more than just mention it by the way.

    Look at your review, most of it is about games. Are you serious?

    I have to give you credit to at least mention the problems with Sysmark.

    Let me give you an example of slanted journalism. When you do the rendering benchmarks, where AMD is known to shine, you only mention at each benchmark what they do etc, and fail to mention that AMD clearly beats Intel, even though other sites focus more on these benchmarks. In the one benchmark where Intel gets a decent score, you take time to mention that:

    "Though it's interesting just how close the 10-core Core i9-7900X gets in the CPU (C++) test despite a significant core count disadvantage, likely due to a combination of higher IPC and clockspeeds."

    Not in one of the rendering benchmarks do you give credit to AMD, yet you found it fitting to end the section off with:

    "Intel recently announced that its new 18-core chip scores 3200 on Cinebench R15. That would be an extra 6.7% performance over the Threadripper 1950X for 2x the cost."

    Not slanted journalism? At least you mention "2x the cost," but for most this will not deter them from buying the monopoly.

    After focussing so much time on game performance, I am not sure you understand TR at all. AMD still has a long way to go in many areas. Why? Because corrupt Intel basically drove them to bankruptcy, but that is a discussion for another day. I lived through those days and experienced it myself.

    Maybe I missed it, but where did you discuss the issue of memory speed? You mention in the beginning of memory overclock. Did you test the system running at 3200 or 2666? It is important to note. If you ran at 2666, then you are missing a very important point. Ryzen is known to gain a huge amount with memory speed. You should not regard 3200 as an overclock, since that is what that memory is made for, even if 2666 is standard spec. Most other sites I checked, used it like that. If you did use 3200, don't you think you should mention it?

    Why is it that your review ends up meh about TR and leaves you rather wanting an i9 in almost all respects, yet most of the other sites give admiration where deserved, even though they have criticism as well? Ian, I see that you clearly are disappointed with TR, which is OK, maybe you just like playing games and that is why you are so.

    It was clear how much you admire Intel in your previous article. You say that I gave no examples of slanted journalism, maybe you should read my post again. "Most Powerful, Most scalable." It is well known that people don't read the fine print. This was intentional. If not, you are a very unlucky guy for having so many unintended mishaps. Then I truly need to say I am sorry.

    For once, please be a bit excited that there is some competition against the monopoly of Intel, or maybe you are also deluded that they became so without any underhanded ways.

    By the way, sorry that I called you Anand. I actually wanted to type Anandtech, but left it like it. This site still carries his name and he should still take responsibility. After I posted, I realised I should have just checked the author, so sorry about that.
  • vanilla_gorilla - Thursday, August 10, 2017

    "Intel recently announced that its new 18-core chip scores 3200 on Cinebench R15. That would be an extra 6.7% performance over the Threadripper 1950X for 2x the cost."

    How do you not understand that is a dig at Intel? He's saying you have to pay twice as much for only a 6.7% improvement.
  • smilingcrow - Thursday, August 10, 2017

    The memory speed approach taken was clearly explained in the test and was stated as being consistent with how they always test.
    I don't take issue with testing at stock speeds at launch day as running memory out of spec for the system can be evaluated in depth later on.
  • Johan Steyn - Friday, August 11, 2017

    That is just rubbish. Threadripper has no problem with 3200 memory and other sites have no problem running it at that speed. 3200 memory is designed to run at 3200, why run it at 2666? There is just no excuse except being paid by Intel.

    Maybe then you can accuse other sites of being unscientific?
  • fanofanand - Tuesday, August 15, 2017

    Anandtech always tests at JEDEC, regardless of the brand.
  • Manch - Friday, August 11, 2017

    ""Intel recently announced that its new 18-core chip scores 3200 on Cinebench R15. That would be an extra 6.7% performance over the Threadripper 1950X for 2x the cost."

    Not slanted journalism? At least you mention "2x the cost," but for most this will not deter them from buying the monopoly."

    You call Intel the monopoly and call him out for not wording the sentence to dissuade people from buying Intel. Who has the bias here? If he was actively promoting Intel over AMD or vice versa, you'd be OK with the latter, but he does neither, and he's an Intel shill? Come on. That's unfair. HOW should he have written it so it would satisfy you?

    FYI Anand is gone. He's NOT responsible for anything at Anandtech. Are you going to hold Wozniak's feet to the fire for the lack of ports on a Mac too?
