More 14+++, No 10nm in Sight

For readers who haven’t followed Intel’s manufacturing story of late, we are desperately awaiting the arrival of Intel’s 10nm process node in the desktop market in a big way. Intel has historically been at the forefront of process node development since the start of the century, and it first started discussing its 10nm node back in 2010 (when it was called ‘11nm’), slowly working it through its process technology cadence. Initially promised for 2015, Intel declared that it had shipped some 10nm products in late December 2017, although we didn’t see anything built on 10nm in the market until mid-to-late 2018.

In 2019, Intel’s 10nm products have started to appear in portable form factors, such as high-end laptops. This hardware is far from ubiquitous, but at least it isn’t vaporware any more. We even tested the reference system earlier this year before it went on sale, and the results were fairly good by comparison. Despite this, however, we have yet to see 10nm on the desktop. Intel has promised 10nm Ice Lake Xeons for the enterprise (production ramp in H2 2020), and has stated that 10nm ‘will come to the desktop’, but it isn’t quite there yet.

To that end, we get more 14nm products. Officially, Intel doesn’t like to mention whether a product is built on its 14nm process, 14+, 14++, or anything beyond that – partly because doing so only further underlines that it isn’t 10nm, but also because it wants to focus its messaging on the product regardless of the process node. At a recent tour of its fabs by the European press, one of Intel’s VPs stated, in effect, that ‘consumers don’t care about process nodes, so you shouldn’t either’. Take from that what you will.

In the high-end desktop market, as in the enterprise market, we expect a slower cadence compared to the bleeding edge used in the mainstream and notebook markets. Even with that in mind, today’s launch is Intel’s third line of HEDT processors on 14nm, following Skylake-X with the 7980XE family and the Skylake-X Refresh with the 9980XE family. The new family, called ‘Cascade Lake-X’, promises support for more memory (up from 128 GB to 256 GB), more PCIe lanes (44 to 48), and more frequency (+100 MHz), at a lower cost ($979 for 18 cores, rather than $1929), along with hardware-hardened mitigations for the first round of Spectre/Meltdown vulnerabilities.

The issue Intel has, by not executing on its 10nm plans, is that the competition has caught up with and surpassed it. By using TSMC’s 7nm process, AMD has leveraged its chiplet strategy to drive higher core counts on a more efficient node: smaller chips allow for a better binning strategy and yield better than large monolithic dies with the same defect rate.

So where Intel offers 18 cores with AVX-512, AMD offers 16 cores with better IPC and higher frequencies, at a lower price. Intel’s platform is HEDT, so it does come with more memory capacity and more PCIe lanes; users wanting that on AMD’s latest platform will have to jump up another 40% in cost, but will get 24 cores instead.

Benchmark-wise, our results show that the 10980XE sits pretty much where the 9980XE did, albeit at half the price. Where the 10980XE works well is for users who want a high-end desktop platform around $1000 with more memory and more PCIe lanes: they can either use Intel’s latest solution or an older AMD one. AMD has priced its latest high-end desktop parts out of this bracket ($1399+), betting that users at this price point don’t need high memory capacity or high PCIe lane counts. So in an unusual turn of events, after having previously charged a sizable premium even within its HEDT lineup for extra PCIe lanes, it is now Intel that offers the best deal for peripheral I/O.

Intel’s product fits in nicely with what the competition has to offer, but it no longer holds the crown. Intel loves that halo spot, but it’s going to be a tough climb to get it back. We might have to wait until we see a consumer 10nm HEDT part for that, and the roadmap doesn’t look too great from where we’re standing. If Ice Lake Xeons are the priority in 2H 2020, that puts any 10nm part for the $500-$1000 market in 2021.

Comments

  • Thanny - Wednesday, November 27, 2019 - link

    Zen does not support AVX-512 instructions. At all.

    AVX-512 is not simply AVX-256 (AKA AVX2) scaled up.

    Something to consider is that AVX-512 forces Intel chips to run at much slower clock speeds, so if you're mixing workloads, using AVX-512 instructions could easily cause overall performance to drop. It's only in an artificial benchmark situation where it has such a huge advantage.
  • Everett F Sargent - Monday, November 25, 2019 - link

    Obviously, AMD has just caught up with Intel’s 256-bit AVX2; prior to Zen 2, AMD executed 256-bit AVX2 operations as two 128-bit µops AFAIK. It was the only reason I bought into a cheap Ryzen 3700X desktop (under $600 US, complete and prebuilt): to get the same level of AVX support, bitwise.

    I've been using Intel's Fortran compiler since 1983 (back then it was on a DEC VAX).

    So I have only done math modeling at 64 bits, like, forever (going back to 1975), so I am very excited that AVX-512 is now under $1K US. An immediate 2X speed boost over AVX2 (at least for the stuff I'm doing now).
  • rahvin - Monday, November 25, 2019 - link

    I'd be curious how much AVX-512 is used by people. It seems to be highly tailored for big math operations, which kind of limits its practical usage to science/engineering. In addition, the power use of the AVX-512 unit was massive in the last article I read, to the point that the main CPU throttled when AVX-512 was engaged for more than a few seconds.

    I'd be really curious what percentage of people buying HEDT are using it, or if it's just a niche feature for science/engineering.
  • TEAMSWITCHER - Tuesday, November 26, 2019 - link

    If you don't need AVX512 you probably don't need or even want a desktop computer. Not when you can get an 8-core/16-thread MacBook Pro. Desktops are mostly built for show and playing games. Most real work is getting done on laptops.
  • Everett F Sargent - Tuesday, November 26, 2019 - link

    LOL, that's so 2019.
    Where I am from it's smartwatches all the way down.
    Cue Four Yorkshiremen.
  • AIV - Tuesday, November 26, 2019 - link

    Video processing and image processing can also benefit from AVX-512, and many AI algorithms can benefit as well. The problem for Intel is that in many cases where AVX-512 gives a good speedup, a GPU would be an even better choice. Software support for AVX-512 is also lacking.
  • Everett F Sargent - Tuesday, November 26, 2019 - link

    Not so!
    https://software.intel.com/en-us/parallel-studio-x...
    It compiles and runs on both Intel and AMD. Full AVX-512 support on AVX-512 hardware.
    You have to go full Volta to get true FP64; otherwise, desktop GPUs are real FP64 dogs!
  • AIV - Wednesday, November 27, 2019 - link

    There are tools and compilers for software developers, but not much end-user software actually uses them. FP64 is mostly required only in the science/engineering category; image/video/AI processing is usually fine with lower precision. I'd also add that GPUs have only small amounts of RAM (≤32 GB), while Intel/AMD CPUs can address hundreds of GB or more, so some datasets do not fit into a GPU. AVX-512 still has its niche, but it's getting smaller.
  • thetrashcanisfull - Monday, November 25, 2019 - link

    I asked about this a couple of months ago. Apparently the 3DPM2 code uses a lot of 64-bit integer multiplies; the AVX2 instruction set doesn't include packed 64-bit integer multiply instructions – those were added with AVX-512, along with some other integer and bit-manipulation operations. This means that any CPU without AVX-512 is stuck using scalar 64-bit multiplies, which on modern microarchitectures only have a throughput of 1/clock. IIRC the Skylake-X core and its derivatives have two pipes capable of packed 64-bit multiplies, for a total throughput of 16/clock.

    I do wish AnandTech would make this a little clearer in their articles, though; it is not at all obvious that 3DPM2 is more of a mixed FP/integer workload, which is not something I would normally expect from a scientific simulation.

    I also think that the testing methodology on this benchmark is a little odd – each algorithm is run for 20 seconds, with a 10-second pause in between? I would expect simulations to run quite a bit longer than that, and the nature of turbo on CPUs means that steady-state and burst performance might diverge significantly.
  • Dolda2000 - Monday, November 25, 2019 - link

    Thanks a lot, that does explain much.
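
To illustrate the packed 64-bit multiply point thetrashcanisfull raises above, here is a minimal sketch in C intrinsics. This is not the actual 3DPM2 code (which isn't public), and the function names are hypothetical; it assumes a compiler with AVX-512F/DQ support (e.g. gcc -O2 -mavx512dq). AVX2 has no packed 64-bit integer multiply, so an AVX2-only build is stuck with the scalar loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <immintrin.h>

/* Scalar baseline: one 64-bit multiply per iteration, roughly 1/clock
   on modern cores. This is the fallback for any CPU without AVX-512DQ. */
static void mul64_scalar(const uint64_t *a, const uint64_t *b,
                         uint64_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * b[i];
}

/* AVX-512DQ path: vpmullq (_mm512_mullo_epi64) computes eight 64-bit
   products per instruction. */
static void mul64_avx512(const uint64_t *a, const uint64_t *b,
                         uint64_t *out, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m512i va = _mm512_loadu_si512((const void *)(a + i));
        __m512i vb = _mm512_loadu_si512((const void *)(b + i));
        _mm512_storeu_si512((void *)(out + i),
                            _mm512_mullo_epi64(va, vb));
    }
    for (; i < n; i++)   /* remainder elements */
        out[i] = a[i] * b[i];
}
```

Each _mm512_mullo_epi64 retires eight 64-bit products, so with two capable pipes on Skylake-X that lines up with the 16/clock figure quoted in the comment above, versus roughly 1/clock for the scalar loop.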
