Closing Thoughts

First of all, we have to emphasize that we were only able to spend about a week on the AMD server, and about two weeks on the Intel system. With the complexity of both server hardware and especially server software, that is very little time. There is still a lot to test and tune, but the general picture is clear.

We can continue to talk about Intel's excellent mesh topology and AMD's strong new Zen architecture, but at the end of the day, the "how" will not matter to infrastructure professionals. Depending on your situation, performance, performance-per-watt, and/or performance-per-dollar are what matter.

The current Intel pricing draws the first line. If performance-per-dollar matters to you, AMD's EPYC pricing is very competitive for a wide range of software applications. With the exception of database software and vectorizable HPC code, AMD's EPYC 7601 ($4200) offers slightly lower or slightly higher performance than Intel's Xeon 8176 ($8000+). However, the real competitor is probably the Xeon 8160, which has 4 fewer cores (-14%) and slightly lower turbo clocks (-100 or -200 MHz). We expect that this CPU will offer about 15% lower performance, and yet it still costs about $500 more ($4700) than the best EPYC. Of course, everything will depend on the final server system price, but it looks like AMD's new EPYC will put some serious performance-per-dollar pressure on the Intel line.
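To make the performance-per-dollar argument concrete, here is a minimal sketch of the arithmetic. The prices are the list prices quoted above ($8000 is used as a lower bound for the Xeon 8176). The relative performance values are assumptions made for illustration, not measurements: the EPYC 7601 and the Xeon 8176 are treated as equal (1.0), and the Xeon 8160 is placed at 0.85, following our expectation of roughly 15% lower performance. The `Cpu` class and `perf_per_kilodollar` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    name: str
    price_usd: float      # list price quoted in the article
    relative_perf: float  # ASSUMPTION: normalized so the EPYC 7601 = 1.0


def perf_per_kilodollar(cpu: Cpu) -> float:
    """Relative performance delivered per $1000 of CPU list price."""
    return cpu.relative_perf / cpu.price_usd * 1000


cpus = [
    Cpu("EPYC 7601", 4200, 1.00),
    Cpu("Xeon 8176", 8000, 1.00),  # assumed parity; $8000 is a lower bound
    Cpu("Xeon 8160", 4700, 0.85),  # assumed ~15% below the Xeon 8176
]

# Rank the CPUs from best to worst performance-per-dollar.
ranked = sorted(cpus, key=perf_per_kilodollar, reverse=True)
for cpu in ranked:
    print(f"{cpu.name}: {perf_per_kilodollar(cpu):.3f} perf units per $1000")
```

Under these assumptions, the EPYC 7601 delivers roughly 30% more performance per CPU dollar than the Xeon 8160 and about 1.9x that of the Xeon 8176. The picture shifts once full system prices and software licensing are included.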

The Intel chip is indeed able to scale up to 8-socket systems, but frankly that market is shrinking fast, and dual-socket buyers could not care less.

Meanwhile, although we have yet to test it, AMD's single-socket offering looks even more attractive. We estimate that a single EPYC 7551P would outperform many of the dual Silver Xeon solutions. Overall, the single-socket EPYC gives you about 8 more cores at similar clockspeeds than the 2P Intel, and AMD doesn't require explicit cross-socket communication, so the server board gets simpler and thus cheaper. For price-conscious server buyers, this is an excellent option.

However, if your software is expensive, everything changes. In that case, you care less about the heavy price tags of the Platinum Xeons. For those scenarios, Intel's Skylake-EP Xeons deliver the highest single-threaded performance (courtesy of the 3.8 GHz turbo clock) and high throughput without much (hardware) tuning, and server managers get the reassurance of Intel's reliable track record. And if you use expensive HPC software, you will probably get the benefits of Intel's beefy AVX 2.0 and/or AVX-512 implementations.

The second consideration is the type of buyer. It is clear that you have to tune more and work harder to get the best performance out of AMD EPYC CPUs. In many ways it is basically a "virtual octal socket" solution. For enterprises with a small infrastructure crew and server hardware on premises, spending time on hardware tuning is not an option most of the time. For the cloud vendors, the knowledge will be available and tuning for EPYC will be a one-time investment. Microsoft is already deploying AMD's EPYC in their Azure Cloud Datacenters.

Looking Towards the Future

Intel has the better topology to add more cores in future CPU generations. However, AMD's newest core is a formidable opponent. Scalar floating point operations are clearly faster on the AMD core, and integer performance is, at the same clock, on par with Intel's best. The dual CCX layout and quad die setup leave quite a bit of performance on the table, so it will be interesting to see how much AMD has learned from this when they launch the 7 nm "Rome" successor. AMD's SKU line-up is still very limited.

All in all, it must be said that AMD executed very well and delivered a new server CPU that can offer competitive performance for a lower price point in some key markets. Server customers with non-scalar sparse matrix HPC and Big Data applications should especially take notice.

As for Intel, the company has delivered a very attractive and well-scaling product. But some of the technological advances in Skylake-SP are overshadowed by the heavy price tags and somewhat "over the top" market segmentation.


217 Comments


  • TheOriginalTyan - Tuesday, July 11, 2017

    Another nicely written article. This is going to be a very interesting next couple of months.
  • coder543 - Tuesday, July 11, 2017

    I'm curious about the database benchmarks. It sounds like the database is tiny enough to fit into L3? That seems like a... poor benchmark. Real world databases are gigabytes _at best_, and AMD's higher DRAM bandwidth would likely play to their favor in that scenario. It would be interesting to see different sizes of transactional databases tested, as well as some NoSQL databases.
  • psychobriggsy - Tuesday, July 11, 2017

    I wrote stuff about the active part of a larger database, but someone's put a terrible spam blocker on the comments system.

    Regardless, if you're buying 64C systems to run a DB on, you likely will have a dataset larger than L3, likely using a lot of the actual RAM in the system.
  • roybotnik - Wednesday, July 12, 2017

    Yea... we use about 120GB of RAM on the production DB that runs our primary user-facing app. The benchmark here is useless.
  • haplo602 - Thursday, July 13, 2017

    I do hope they elaborate on the DB benchmarks a bit more or do a separate article on it. Since this is a CPU article, I can see the point of using a small DB to fit into the cache, however that is useless as an actual DB test. It's more an int/IO test.

    I'd love to see a larger DB tested that can fit into the DRAM but is larger than available caches (32GB maybe ?).
  • ddriver - Tuesday, July 11, 2017

    We don't care about real world workloads here. We care about making intel look good. Well... at this point it is pretty much damage control. So let's lie to people that intel is at least better in one thing.

    Let me guess, the database size was carefully chosen to NOT fit in a ryzen module's cache, but small enough to fit in intel's monolithic die cache?

    Brought to you by the self proclaimed "Most Trusted in Tech Since 1997" LOL
  • Ian Cutress - Tuesday, July 11, 2017

    I'm getting tweets saying this is a severely pro AMD piece. You are saying it's anti-AMD. ¯\_(ツ)_/¯
  • ddriver - Tuesday, July 11, 2017

    Well, it is hard to please intel fanboys regardless of how much bias you give intel, considering the numbers.

    I did not see you deny my guess on the database size, so presumably it is correct then?
  • ddriver - Tuesday, July 11, 2017

    In the multicore 464.h264ref test we have 2670 vs 2680 for the xeon and epyc respectively. Considering that the epyc score is mathematically higher, how does it yield a negative zero?

    Granted, the difference is a mere 0.3% advantage for epyc, but it is still a positive number.
