CPU Performance: Intel's Own Claims

Before we get into the new AI benchmarks, let’s take a quick look at the usual CPU benchmarks and performance claims made available by Intel. 

For this comparison we'll focus on the second row. The first row compares the extravagantly priced 400W dual-die Xeon Platinum 9282 with the far more reasonable and widely available Xeon Platinum 8180. The second row tells it all: a small clock speed bump and slightly faster RAM yield a 3% (integer) to 5% (FP) performance increase over the first-generation Xeon Scalable parts. The larger gain in floating point performance is probably because Intel's second-generation parts can use faster DDR4-2933 DIMMs and hence offer more bandwidth to the cores.
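To put the DDR4-2933 remark in numbers, here is a rough back-of-the-envelope sketch of theoretical peak memory bandwidth per socket. The six-channel configuration, the 8-byte channel width, and DDR4-2666 as the first-generation memory speed are our assumptions for illustration and are not taken from Intel's slides.

```python
# Theoretical peak DRAM bandwidth per socket:
#   channels * bytes per transfer * transfers per second.
# Assumptions (not from Intel's slides): six DDR4 channels per socket,
# 64-bit (8-byte) channels, DDR4-2666 as the first-generation speed.

def peak_bandwidth_gbs(mega_transfers_per_s: int,
                       channels: int = 6,
                       bytes_per_transfer: int = 8) -> float:
    """Return theoretical peak bandwidth in GB/s (10^9 bytes per second)."""
    return mega_transfers_per_s * 1e6 * channels * bytes_per_transfer / 1e9

first_gen = peak_bandwidth_gbs(2666)   # assumed first-generation DIMM speed
second_gen = peak_bandwidth_gbs(2933)  # DDR4-2933, as mentioned in the text
uplift_pct = (second_gen / first_gen - 1) * 100

print(f"DDR4-2666: {first_gen:.1f} GB/s")
print(f"DDR4-2933: {second_gen:.1f} GB/s")
print(f"Uplift:    {uplift_pct:.1f}%")
```

Under these assumptions the theoretical bandwidth rises by roughly 10%, so a 5% floating point gain is plausible for bandwidth-sensitive code. Real-world bandwidth is always lower than this theoretical peak.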

The midrange SKUs get a bigger boost, as some of the x2xx Xeon Scalable parts get more cores and more L3 cache than the previous x1xx parts. For example, the 6252 has 24 cores and 35.75 MB L3, while the 6152 had 22 cores and 30.25 MB L3.

 

The comparison with AMD's EPYC 7601, however, deserves our attention, as there's some interesting data here. Again, comparing a 400W, $50k chiplet CPU with a 180W, $4k one makes no sense whatsoever, so we ignore the first row.

The Linpack numbers are not surprising: the more expensive Skylake SKUs add a 512-bit FMAC to the already existing dual 256-bit FMACs, offering up to 4 times more AVX throughput than AMD's EPYC. AMD's next generation will be a lot more competitive in this area, as each FP unit is now capable of doing 256-bit AVX instead of 128-bit.
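The "up to 4 times" figure can be sanity-checked with a simple per-core, per-cycle estimate. This is a minimal sketch under stated assumptions: we model the Intel core as two 512-bit FMA pipes (the fused dual 256-bit units plus the added 512-bit unit) and assume two FMA-capable FP units per EPYC core. The unit counts and the helper name `dp_flops_per_cycle` are ours, for illustration only.

```python
# Peak double-precision FLOPs per core per cycle:
#   FMA units * 64-bit lanes per vector * 2 (an FMA is a multiply plus an add).
# Assumptions (ours): two 512-bit FMA pipes on the high-end Intel SKUs and
# two FMA-capable FP units per EPYC core. Clock speed and core count are ignored.

def dp_flops_per_cycle(fma_units: int, vector_bits: int) -> int:
    """Return peak double-precision FLOPs per cycle for one core."""
    lanes = vector_bits // 64  # number of 64-bit doubles per vector
    return fma_units * lanes * 2

intel_avx512 = dp_flops_per_cycle(fma_units=2, vector_bits=512)
epyc_gen1 = dp_flops_per_cycle(fma_units=2, vector_bits=128)
epyc_gen2 = dp_flops_per_cycle(fma_units=2, vector_bits=256)

print(f"Intel AVX-512 core: {intel_avx512} FLOPs/cycle")
print(f"EPYC, 128-bit FP units: {epyc_gen1} FLOPs/cycle")
print(f"EPYC, 256-bit FP units: {epyc_gen2} FLOPs/cycle")
print(f"Intel vs. 128-bit EPYC: {intel_avx512 / epyc_gen1:.0f}x")
print(f"Intel vs. 256-bit EPYC: {intel_avx512 / epyc_gen2:.0f}x")
```

With these assumptions the per-core gap is 4x against 128-bit units and shrinks to 2x once each FP unit handles 256-bit AVX. Actual Linpack results also depend on clock speeds under AVX load, core counts, and memory bandwidth.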

The image classification results clearly show that Intel is trying to convince people that some AI applications should simply run on a CPU, no GPU needed. Well, at least for now… 

Intel's claim that database performance is a lot better than on the EPYC is quite interesting, as we've previously pointed out that AMD's use of four NUMA dies on a chip does have drawbacks. Quoting our Xeon Skylake vs EPYC review:

Out of the box, the EPYC CPU is a rather mediocre transactional database CPU ... transactional databases will remain Intel territory for now.

In databases, cache (coherency) latency plays an important role. It will be interesting to see how well AMD has addressed this weakness in the second generation EPYC server chips. 


56 Comments


  • ballsystemlord - Saturday, August 3, 2019

    Spelling and grammar errors:

    "But it will have a impact on total energy consumption, which we will discuss."
    "An" not "a":
    "But it will have an impact on total energy consumption, which we will discuss."

    "We our newest servers into virtual clusters to make better use of all those core."
    Missing "s" and missing word. I guessed "combine".
    "We combine our newest servers into virtual clusters to make better use of all those cores."

    "For reasons unknown to us, we could get our 2.7 GHz 8280 to perform much better than the 2.1 GHz Xeon 8176."
    The 8280 is only slightly faster in the table than the 8176. It is the 8180 that is missing from the table.

    "However, since my group is mostly using TensorFlow as a deep learning framework, we tend to with stick with it."
    Excess "with":
    "However, since my group is mostly using TensorFlow as a deep learning framework, we tend to stick with it."

    "It has been observed that using a larger batch can causes significant degradation in the quality of the model,..."
    Remove plural form:
    "It has been observed that using a larger batch can cause significant degradation in the quality of the model,..."

    "...but in many applications a loss of even a few percent is a significant."
    Excess "a":
    "...but in many applications a loss of even a few percent is significant."

    "LSTM however come with the disadvantage that they are a lot more bandwidth intensive."
    Add an "s":
    "LSTMs however come with the disadvantage that they are a lot more bandwidth intensive."

    "LSTMs exhibit quite inefficient memory access pattern when executed on mobile GPUs due to the redundant data movements and limited off-chip bandwidth."
    "pattern" should be plural because "LSTMs" is plural, I choose an "s":
    "LSTMs exhibit quite inefficient memory access patterns when executed on mobile GPUs due to the redundant data movements and limited off-chip bandwidth."

    "Of course, you have the make the most of the available AVX/AVX2/AVX512 SIMD power."
    "to" not "the":
    "Of course, you have to make the most of the available AVX/AVX2/AVX512 SIMD power."

    "Also, this another data point that proves that CNNs might be one of the best use cases for GPUs."
    Missing "is":
    "Also, this is another data point that proves that CNNs might be one of the best use cases for GPUs."

    "From a high-level workflow perfspective,..."
    A joke, or a misspelling?

    "... it's not enough if the new chips have to go head-to-head with a GPU in a task the latter doesn't completely suck at."
    Traditionally, AT has had no language.
    "... it's not enough if the new chips have to go head-to-head with a GPU in a task the latter is good at."

    "It is been going on for a while,..."
    "has" not "is":
    "It has been going on for a while,..."
  • ballsystemlord - Saturday, August 3, 2019

    Thanks for the cool article!
  • tmnvnbl - Tuesday, August 6, 2019

    Great read, especially liked the background and perspective next to the benchmark details
  • dusk007 - Tuesday, August 6, 2019

    Great Article.
    I wouldn't call Apache Arrow a database, though. It is a data format, more akin to a file format like CSV or Parquet. It is not something that stores data for you and hands it back. It defines how to store data in memory, just as CSV or Parquet define how to store data in files: more efficient, with less redundancy and less overhead when accessed from different runtimes (TensorFlow, Spark, Pandas, ...).

    Love the article, I hope we get more of those. I also like that huge performance optimizations are possible in this field just in software. Often, though, renting compute in the cloud is cheaper than the man-hours required to optimize.
  • Emrickjack - Thursday, August 8, 2019

    Johan's new piece in 14 months! Looking forward to your Rome review
