Memory Subsystem: Bandwidth

As we have reported before, measuring the full bandwidth potential with John McCalpin's STREAM benchmark has become a matter of extreme tuning, requiring a very deep understanding of the platform.

With our previous binaries, neither the first- nor the second-generation EPYC could get past 200-210 GB/s, which gave the impression of running into a "bandwidth wall" despite the move to eight channels of DDR4-3200. So instead we report the results that Intel's and AMD's best binaries produce, using AVX-512 (Intel) and AVX2 (AMD).

The results are expressed in gigabytes per second.

Stream Triad
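
For readers unfamiliar with the benchmark, the Triad kernel itself is trivially simple; all of the "extreme tuning" lives in compiler flags, streaming (non-temporal) stores, thread pinning, and NUMA-aware allocation rather than in the loop. Below is a minimal sketch of a Triad-style measurement, not McCalpin's reference code: the array size, timing harness, and build flags are our own assumptions.

```c
/* Minimal STREAM-Triad-style sketch (not McCalpin's reference code).
 * Build with something like: gcc -O3 -march=native -fopenmp triad.c */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N 80000000L              /* ~640 MB per array, far larger than any LLC */

int main(void)
{
    double *a = malloc(N * sizeof(double));
    double *b = malloc(N * sizeof(double));
    double *c = malloc(N * sizeof(double));
    const double scalar = 3.0;

    /* First-touch initialization so pages land on each thread's local NUMA node */
    #pragma omp parallel for
    for (long i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; c[i] = 0.5; }

    double t0 = omp_get_wtime();
    #pragma omp parallel for
    for (long i = 0; i < N; i++)
        a[i] = b[i] + scalar * c[i];          /* Triad: two reads, one write */
    double t1 = omp_get_wtime();

    /* STREAM convention: 3 arrays x 8 bytes per element cross the memory bus */
    double gbytes = 3.0 * sizeof(double) * (double)N / 1e9;
    printf("Triad: %.1f GB/s\n", gbytes / (t1 - t0));

    free(a); free(b); free(c);
    return 0;
}
```

A plain build like this is roughly what runs into the 200-210 GB/s wall described above; the vendor-tuned binaries typically add AVX2/AVX-512 code paths, non-temporal stores, and careful core pinning on top of the same loop.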

AMD can reach even higher numbers with the "NUMA nodes per socket" (NPS) setting at 4. With four nodes per socket, AMD reports up to 353 GB/s. NPS4 causes each CCX to access only the memory controllers on the central I/O die that offer the lowest latency.
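
As a quick sanity check on these figures: the theoretical peak of an eight-channel DDR4-3200 memory interface works out to 204.8 GB/s per socket, so a number like 353 GB/s only makes sense for a dual-socket system, where it amounts to roughly 86% of the theoretical 409.6 GB/s. A back-of-the-envelope sketch of that arithmetic (the channel count and transfer rate come from the platform described above; the dual-socket assumption is ours):

```c
/* Back-of-the-envelope DDR4 peak bandwidth calculation.
 * Assumes the quoted 353 GB/s figure is for a two-socket system. */
#include <stdio.h>

int main(void)
{
    const double channels  = 8.0;      /* memory channels per socket */
    const double transfers = 3200e6;   /* DDR4-3200: 3200 MT/s */
    const double bus_bytes = 8.0;      /* 64-bit channel = 8 bytes per transfer */

    double per_socket = channels * transfers * bus_bytes / 1e9;   /* GB/s */
    double two_socket = 2.0 * per_socket;

    printf("Theoretical peak: %.1f GB/s per socket, %.1f GB/s for 2P\n",
           per_socket, two_socket);
    printf("353 GB/s is %.0f%% of the 2P peak\n", 100.0 * 353.0 / two_socket);
    return 0;
}
```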

Those numbers only matter to the small niche of carefully AVX2/AVX-512 optimized HPC applications. AMD claims a 45% bandwidth advantage over the best (28-core) Intel SKUs, and we have every reason to believe them, but again, it is only relevant to that niche.

For the rest of the enterprise world (probably 95+%), memory latency has a much larger impact than peak bandwidth.

Comments

  • close - Thursday, August 8, 2019 - link

    VMware licenses per socket. I'm not sure what kind of niche market one would have to be in (maybe HPC on Windows with the HPC Pack?) to run Win server bare metal on this thing. So I'm pretty sure the average cores/VM for Windows servers is relatively low and no reason for concern.
  • schujj07 - Thursday, August 8, 2019 - link

    @deltaFx2 Most people purchase more cores than they currently need so that they can grow. In the long run it is cheaper to purchase a higher SKU right now than purchase a second host a year down the road.
    @close There are companies that are Windows-only, so they would install Hyper-V on this host to use as their hypervisor. However, even under VMware, if you want to license Windows as a VM you have to pay the per-core licensing for every CPU core on each VM. I looked into getting volume licensing for Server 2016 for the company I work for; we have 2 hosts with dual 24-core Epyc 7401s, and we would need to get 16 dual-core license packs for each instance of Server 2016. It ended up that we couldn't afford it because it would have cost us $5k per instance of Server 2016.
  • DigitalFreak - Thursday, August 8, 2019 - link

    @schujj07 Just buy a Windows Server Datacenter license for each host and you don't have to worry about licensing each VM.
  • schujj07 - Thursday, August 8, 2019 - link

    AFAIK it doesn't work that way when you are running VMware. With VMware you will still have to license each one.
  • wolrah - Thursday, August 8, 2019 - link

    @schujj07 nope. Windows Server licensing is the same no matter which hypervisor you're using. Datacenter licenses allow unlimited VMs on any licensed host.
  • diehardmacfan - Thursday, August 8, 2019 - link

    This is correct. You do need to buy the licenses to match the core count of the hypervisor, however.
  • Dug - Friday, August 9, 2019 - link

    You still have to pay for cores on Datacenter. Each Datacenter license covers 2 cores, with a minimum purchase of 8. Beyond that minimum you are buying more licenses. 64 cores is about $25k.
  • MDD1963 - Friday, August 9, 2019 - link

    A Windows license (Standard or Datacenter) covers 2 *sockets*, for a total of 16 cores; if you have more than 2 sockets, you need more licenses; if you have 2 sockets filled with 8-core CPUs, you are good with one Standard license. If you have 20 total cores, you need a Standard license and a pair of '2-core' add-ons. If you have 32 cores, you need 2 full Standard licenses.
  • MDD1963 - Friday, August 9, 2019 - link

    Datacenter is still licensed for 16 cores, with small 2-core pack increments available; in the case of a 64-core CPU, effectively 4 Datacenter licenses would be required ($6k per 16 cores, or roughly $24k).
  • deltaFx2 - Friday, August 9, 2019 - link

    @schujj07: Of course I get that. The OP @Pancakes implied that Rome was going to hurt the wallets of buyers using Windows Server, the implication being that this would not happen if they bought Intel. I was questioning those assumptions. How can Rome cost more money for Windows licenses unless Rome needs more cores to get the same job done, or enterprises overprovision Rome (in terms of total cores) vs. Intel? That would make sense if the per-thread performance were worse, but it's not.
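
To make the licensing arithmetic in the comments above concrete, here is a small sketch using the commenters' own figures (Datacenter at roughly $6k per 16 cores, sold in 2-core packs with a 16-core-per-host minimum); these are not list prices, and the host core counts are arbitrary examples:

```c
/* Rough per-host Windows Server Datacenter cost estimate, based on the
 * ~$6k-per-16-cores figure quoted in the comments; actual pricing varies. */
#include <stdio.h>

int main(void)
{
    const int    min_cores  = 16;      /* minimum licensed cores per host */
    const int    pack_cores = 2;       /* add-on licenses come in 2-core packs */
    const double usd_per_16 = 6000.0;  /* commenter's figure, not list price */

    int host_cores[] = { 16, 32, 48, 64, 128 };
    for (int i = 0; i < (int)(sizeof(host_cores) / sizeof(host_cores[0])); i++) {
        int cores    = host_cores[i];
        int licensed = cores < min_cores ? min_cores : cores;
        /* round up to a whole number of 2-core packs */
        licensed = ((licensed + pack_cores - 1) / pack_cores) * pack_cores;
        double cost = usd_per_16 * licensed / 16.0;
        printf("%3d-core host -> ~$%.0fk Datacenter\n", cores, cost / 1000.0);
    }
    return 0;
}
```

With those assumptions, a 64-core host lands at about $24k and a dual-64-core (128-core) host at about $48k, which lines up with the ~$24-25k figures quoted above for 64 cores.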
