Intel’s View on AI: Do What NV Doesn't

On the whole, Intel has a good point that there is "a wide range of AI applications" – there is AI life beyond CNNs. In many real-life scenarios, traditional machine learning techniques outperform CNNs, and not all deep learning is done with the ultra-scalable CNNs. In other real-world cases, having massive amounts of RAM is another big performance advantage, both while training a model and while using it to infer on new data.

So despite NVIDIA’s massive advantage in running CNNs, high-end Xeons can offer a credible alternative in the data analytics market. To be sure, nobody expects the new Cascade Lake Xeons to outperform NVIDIA GPUs in CNN training, but there are lots of cases where Intel might be able to convince customers to invest in a more potent Xeon instead of an expensive Tesla accelerator:

  • Inference of AI models that require a lot of memory
  • "Light" AI models that do not require long training times
  • Data architectures where the batch or stream processing time is more important than the model training time
  • AI models that depend on traditional “non-neural network” statistical models

As a result, there might be an opportunity for Intel to keep NVIDIA at bay until it has a reasonable alternative for NVIDIA’s GPUs in CNN workloads. Intel has been feverishly adding features to the Xeon Scalable family and optimizing its software stack to combat NVIDIA's AI hegemony. Optimized AI software – Intel’s own distribution for Python, the Intel Math Kernel Library for Deep Learning, and even the Intel Data Analytics Acceleration Library (mostly for traditional machine learning) – is a key part of that effort.

All told then, for the second generation of Intel’s Xeon Scalable processors, the company has added new AI hardware features under the Deep Learning (DL) Boost name. This primarily includes the Vector Neural Network Instructions (VNNI), which can do in one instruction what would have previously taken three. Even farther down the line, Cooper Lake, the third-generation Xeon Scalable processor, will add support for bfloat16, further improving training performance.
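To make those two hardware features concrete, here is a rough sketch in Python (a software emulation, not Intel's microcode) of what one 32-bit lane of a VNNI INT8 dot-product instruction computes – the fused multiply-accumulate that previously required three separate AVX-512 instructions – and of how bfloat16 is simply a float32 with the lower 16 mantissa bits truncated away. The function names are our own illustrative choices:

```python
import struct

def vnni_dot_lane(acc: int, a: list, b: list) -> int:
    """Emulate one 32-bit lane of a VNNI INT8 dot product:
    multiply four 8-bit values from `a` with four 8-bit values
    from `b`, sum the four products, and add the result to the
    running 32-bit accumulator. Pre-VNNI, this took a separate
    multiply, widen, and add instruction."""
    assert len(a) == len(b) == 4
    return acc + sum(x * y for x, y in zip(a, b))

def to_bfloat16(x: float) -> float:
    """Round-trip a float through bfloat16 by truncation: keep the
    sign bit, the full 8-bit exponent, and only the top 7 mantissa
    bits of the float32 encoding, then widen back to float."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]
```

Because bfloat16 keeps the full float32 exponent range, it trades precision (7 mantissa bits) for the dynamic range that training workloads need – which is why it is attractive for Cooper Lake's training story.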

In summary, Intel is trying to recapture the market for “lighter” AI workloads while making a firm stand in the rest of the data analytics market, all the while adding very specialized hardware (FPGAs, ASICs) to its portfolio. This is of critical importance to Intel's competitiveness in the IT market: Intel has repeatedly said that the data center group (DCG), or “enterprise part,” is expected to be the company's main growth engine in the years ahead.


56 Comments


  • ballsystemlord - Saturday, August 3, 2019 - link

    Spelling and grammar errors:

    "But it will have a impact on total energy consumption, which we will discuss."
    "An" not "a":
    "But it will have an impact on total energy consumption, which we will discuss."

    "We our newest servers into virtual clusters to make better use of all those core."
    Missing "s" and missing word. I guessed "combine".
    "We combine our newest servers into virtual clusters to make better use of all those cores."

    "For reasons unknown to us, we could get our 2.7 GHz 8280 to perform much better than the 2.1 GHz Xeon 8176."
    The 8280 is only slightly faster in the table than the 8176. It is the 8180 that is missing from the table.

    "However, since my group is mostly using TensorFlow as a deep learning framework, we tend to with stick with it."
    Excess "with":
    "However, since my group is mostly using TensorFlow as a deep learning framework, we tend to stick with it."

    "It has been observed that using a larger batch can causes significant degradation in the quality of the model,..."
    Remove plural form:
    "It has been observed that using a larger batch can cause significant degradation in the quality of the model,..."

    "...but in many applications a loss of even a few percent is a significant."
    Excess "a":
    "...but in many applications a loss of even a few percent is significant."

    "LSTM however come with the disadvantage that they are a lot more bandwidth intensive."
    Add an "s":
    "LSTMs however come with the disadvantage that they are a lot more bandwidth intensive."

    "LSTMs exhibit quite inefficient memory access pattern when executed on mobile GPUs due to the redundant data movements and limited off-chip bandwidth."
    "pattern" should be plural because "LSTMs" is plural, so I chose to add an "s":
    "LSTMs exhibit quite inefficient memory access patterns when executed on mobile GPUs due to the redundant data movements and limited off-chip bandwidth."

    "Of course, you have the make the most of the available AVX/AVX2/AVX512 SIMD power."
    "to" not "the":
    "Of course, you have to make the most of the available AVX/AVX2/AVX512 SIMD power."

    "Also, this another data point that proves that CNNs might be one of the best use cases for GPUs."
    Missing "is":
    "Also, this is another data point that proves that CNNs might be one of the best use cases for GPUs."

    "From a high-level workflow perfspective,..."
    A joke, or a misspelling?

    "... it's not enough if the new chips have to go head-to-head with a GPU in a task the latter doesn't completely suck at."
    Traditionally, AT has had no language.
    "... it's not enough if the new chips have to go head-to-head with a GPU in a task the latter is good at."

    "It is been going on for a while,..."
    "has" not "is":
    "It has been going on for a while,..."
  • ballsystemlord - Saturday, August 3, 2019 - link

    Thanks for the cool article!
  • tmnvnbl - Tuesday, August 6, 2019 - link

    Great read, especially liked the background and perspective next to the benchmark details
  • dusk007 - Tuesday, August 6, 2019 - link

    Great Article.
    I wouldn't call Apache Arrow a database though. It is a data format more akin to a file format like csv or parquet. It is not something that stores data for you and gives it to you. It is the how to store data in memory. Like CSV or Parquet are a "how to" store data in Files. More efficient less redundancy less overhead when access from different runtimes (Tensorflow, Spark, Pandas,..).

    Love the article, I hope we get more of those. It is also remarkable that huge performance optimizations are possible in this field just in software. Often renting compute in the cloud is cheaper than the man hours required to optimize, though.
  • Emrickjack - Thursday, August 8, 2019 - link

    Johan's first new piece in 14 months! Looking forward to your Rome review
