Announcement Four: AVX-512 & Favored Core

To complete the set, there are a couple of other points worth discussing. First up is the AVX-512 support coming to Skylake-X. Intel has implemented AVX-512 (or at least a variant of it) in the latest generation of Xeon Phi processors, Knights Landing, but this will be the first implementation in a consumer/enterprise core.

Intel hasn’t given many details on AVX-512 yet, such as whether there are one or two units per CPU, or whether support is more granular and allocated per core. We expect it to be enabled on day one, although I have a suspicion there may be a BIOS flag that needs enabling in order to use it.

As with AVX and AVX2, the goal here is to provide a powerful set of hardware to solve vector calculations. The silicon that does this is dense, so sustained calculations run hot: we’ve seen processors that support AVX and AVX2 drop to lower operating frequencies when these instructions come along, and AVX-512 will be no different. Intel has not clarified what frequency the AVX-512 instructions will run at, although if each core can support AVX-512 independently, we suspect that the reduced frequency will only affect that core.
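
For readers unfamiliar with what these instructions look like in practice, here is a minimal sketch of a fused multiply-add loop written with AVX-512F intrinsics (the function and array names are ours, purely for illustration; compile with -mavx512f):

```c
#include <immintrin.h>
#include <stddef.h>

/* Minimal sketch: y[i] = a*x[i] + y[i], 16 single-precision floats per step.
   One 512-bit FMA handles 16 lanes at a time, which is why dense, sustained
   AVX-512 work drives power and temperature up. */
void saxpy_avx512(float a, const float *x, float *y, size_t n)
{
    __m512 va = _mm512_set1_ps(a);              /* broadcast the scalar a */
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m512 vx = _mm512_loadu_ps(x + i);     /* load 16 floats from x  */
        __m512 vy = _mm512_loadu_ps(y + i);     /* load 16 floats from y  */
        vy = _mm512_fmadd_ps(va, vx, vy);       /* vy = va*vx + vy (FMA)  */
        _mm512_storeu_ps(y + i, vy);            /* store the 16 results   */
    }
    for (; i < n; ++i)                          /* scalar tail for n % 16 */
        y[i] = a * x[i] + y[i];
}
```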

With the support of AVX-512, Intel is calling the Core i9-7980X ‘the first TeraFLOP CPU’. I’ve asked for details as to how this figure is calculated (measured in software, or theoretical), but it does mark a milestone in processor design. We are muddying the waters a bit here though: an AVX unit does vector calculations, as does a GPU. We’re talking about parallel compute processes completed by dedicated hardware, and the line between a general purpose CPU and anything else is getting blurred.
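
If the figure is theoretical, the usual back-of-the-envelope method is peak FLOPS = cores × FLOPs per cycle per core × clock speed. The sketch below shows that arithmetic; the per-core throughput, core count and sustained AVX-512 clock are assumptions on our part, since Intel has confirmed none of them:

```c
#include <stdio.h>

/* Back-of-the-envelope peak: cores x FLOPs/cycle/core x clock.
   The per-core figure assumes two 512-bit FMA units per core
   (2 units x 8 FP64 lanes x 2 ops per FMA = 32 FP64 FLOPs/cycle);
   the core count and sustained AVX-512 clock are also assumptions. */
int main(void)
{
    const int    cores           = 18;     /* assumed top-SKU core count      */
    const double flops_per_cycle = 32.0;   /* assumed FP64 FLOPs/cycle/core   */
    const double avx512_ghz      = 1.8;    /* assumed sustained AVX-512 clock */

    double peak_gflops = cores * flops_per_cycle * avx512_ghz;
    printf("Theoretical peak: %.0f GFLOPS (%.2f TFLOPS)\n",
           peak_gflops, peak_gflops / 1000.0);
    return 0;
}
```

With those placeholder inputs the result lands just over 1 TFLOPS of double-precision throughput, which is presumably the kind of math behind the marketing claim.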

Favored Core

For Broadwell-E, the previous generation of Intel’s HEDT platform, we were introduced to the term ‘Favored Core’, marketed under the title of Turbo Boost Max 3.0. The idea here is that each piece of silicon that comes off the production line is different (and is binned to match a SKU accordingly), but within a piece of silicon the cores themselves will have different frequency and voltage characteristics. The one core determined to be the best is called the ‘Favored Core’, and when Intel’s Windows 10 driver and software were in place, single-threaded workloads were moved to this favored core to run faster.

In theory, it was good: a step above the generic Turbo Boost 2.0, offering an extra 100-200 MHz for single-threaded applications. In practice, it was flawed: motherboard manufacturers didn’t support it, or they had it disabled in the BIOS by default. Users also had to install the drivers and software, and without all of these pieces in place, the favored core feature didn’t work at all.

Intel is changing the feature for Skylake-X, both upgrading it and making it easier to use. The driver and software are now distributed as part of Windows updates, so users will get them automatically (if you don’t want it, you have to disable it manually). With Skylake-X, instead of one core being the favored core, there are two favored cores per chip. As a result, two applications can run at the higher frequency, or a single application that needs two cores can take advantage of both.
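
To give a flavor of what ‘moving a workload to a specific core’ means at the OS level, here is a minimal sketch using the Windows thread-affinity API. This is not Intel’s driver, just an illustration of pinning the current thread to a chosen core; the core index is a placeholder, since in the real feature the driver identifies the favored core(s) for you:

```c
#include <windows.h>
#include <stdio.h>

/* Minimal sketch: pin the current thread to one chosen core. Intel's
   Turbo Boost Max 3.0 driver does the equivalent automatically for the
   favored core(s); here the core index is just a hard-coded placeholder. */
int main(void)
{
    const DWORD favored_core = 3;                   /* placeholder index    */
    DWORD_PTR mask = (DWORD_PTR)1 << favored_core;  /* single-core affinity */

    if (SetThreadAffinityMask(GetCurrentThread(), mask) == 0) {
        fprintf(stderr, "SetThreadAffinityMask failed: %lu\n", GetLastError());
        return 1;
    }

    /* ... run the single-threaded hot loop here; the scheduler keeps it on
       the chosen core, which can then hold its highest turbo bin. */
    return 0;
}
```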

Availability

Last but not least, let's talk about availability. Intel will likely announce availability during its Computex keynote, which is taking place as this news post goes live. The launch date should be sooner rather than later for the LCC parts, although timing for the HCC parts is unknown. But no matter what, I think it's safe to say that by the end of this summer we should expect a showdown over the best HEDT processor around.


203 Comments

  • ddriver - Friday, June 2, 2017 - link

    "I would be willing to bet that between 2-4 of those can replace your entire farm and still give you better FLOP/$."

    Not really. Aside from the 3770Ks running at 4.4 GHz, most of the performance actually comes from GPU compute. You can't pack those tiny stock rackmount systems with GPUs. Not that 256 cores @ 2.9 GHz would come anywhere near 256 cores @ 4.4 GHz, even if they had the I/O to accommodate the GPUs.

    And no, Intel is NO LONGER better at FLOPS/$. Actually, it may not have ever been, considering how cheap AMD processors are. AMD was simply too slow and too power inefficient for me until now.

    And since the launch of Ryzen, AMD offers 50-100% better FLOPS/$, so it is a no-brainer, especially when performance is not only so affordable but actually ample.

    Your whole post narrative basically says "intel fanboy in disguise". I guess it is back to the drawing board for you.
  • Meteor2 - Saturday, June 3, 2017 - link

    Ddriver is our friendly local troll; best ignored and not fed.
  • trivor - Saturday, June 3, 2017 - link

    Whether you're a large corporation with a billion-dollar IT budget and dedicated IT staff, or a SOHO (Small Office Home Office) user with a very limited budget, everyone is looking for bang for the buck. While most people on this site are enthusiasts, we all have some kind of budget to keep. We're looking for the sweet spot for gaming (the intersection of CPU/GPU for the resolution we want), and more and more, having a fairly quiet system (even more so for an HTPC) is important. While some corporations might be tied to certain vendors (Microsoft, Dell, Lenovo, etc.), they don't necessarily care what components are inside, because it is the vendor that warranties the system. For pure home users, not all of these systems are for us. Ryzen 5/7, i5/i7, and maybe i9 are the CPUs and SoCs for us. Anything more than that will not help our gaming performance or even other tasks (video editing/encoding), because even multi-core-aware programs (Handbrake) can't necessarily use 16-20 cores. The absolute sweet spot right now is the CPUs around $200 (Ryzen 5 1600/1600X, Core i5), because you can get a very nice system in the $600 range. That will give you good performance in most games and other home-user tasks.
  • swkerr - Wednesday, May 31, 2017 - link

    There may be brand loyalty on the retail side, but it does not exist in the corporate world. Data center managers will look at total cost of ownership. Performance per watt will be key, as well as the cost of the CPU and motherboard. What the corporate world is loyal to is the brand of server, and if Dell/HP etc. make AMD-based servers, then they will add them if the total cost of ownership looks good.

  • Namisecond - Wednesday, May 31, 2017 - link

    Actually, even for the consumer retail side, there isn't brand loyalty at the CPU level (excepting a very vocal subset of the small "enthusiast" community). Brand loyalty is at the PC manufacturer level: Apple, Dell, HP, Lenovo, etc.
  • bcronce - Tuesday, May 30, 2017 - link

    "But at that core count you are already limited by thermal design. So if you have more cores, they will be clocked lower. So it kind of defeats the purpose."

    TDP scales with the square of the voltage: reduce the voltage by 25% and you reduce the TDP by almost 50%. Voltage scales non-linearly with frequency. Near the high end of the stock frequency range, you're gaining 10% clock for a 30% increase in power consumption because of the large increase in voltage needed to keep the clock rate stable. (A rough worked sketch of this square law appears below, after the comments.)
  • ddriver - Tuesday, May 30, 2017 - link

    The paragraph next to the one you quoted explicitly states that lower clocks are where you hit the peak of the power/performance ratio curve. Even to an average AT reader it should be implicit that lowered clocks come with lowered voltage.

    There is no "magic formula" like, for example, the quadratic rate of intensity decay for a point light source. TDP vs voltage vs clocks is a function of process scale, maturity, leakage and operating environment. It is however true that the more you push above the optimal spot, the less performance you will get for every extra watt.
  • boeush - Tuesday, May 30, 2017 - link

    "More cores would be beneficial for servers, where the chips are clocked significantly lower, around 2.5 Ghz, allowing to hit the best power/performance ratio by running defacto underclocked cores.

    But that won't do much good in a HEDT scenario."

    I work on software development projects where one frequently compiles/links huge numbers of files into a very large application. For such workloads, you can never have enough cores.

    Similarly, I imagine any sort of high-resolution (4k, 8k, 16k) raytracing or video processing workloads would benefit tremendously from many-core CPUs.

    Ditto for complex modelling tasks, such as running fluid dynamics, heat transfer, or finite element stress/deformation analysis.

    Ditto for quantum/molecular simulations.

    And so on, and on. Point being, servers are not the only type of system to benefit from high core counts. There are many easily-parallelizable problems in the engineering, research, and general R&D spheres that can benefit hugely.
  • ddriver - Tuesday, May 30, 2017 - link

    The problem is that the industry wants to push HEDT as gaming hardware. They could lower clocks and voltages and add more cores, which would be beneficial to pretty much anything time-consuming like compilation, rendering, encoding or simulations, as all of those lend themselves very well to multithreading and scale up nicely.

    But that would be too detrimental to gaming performance, so they would lose gamers as potential customers for HEDT. They'd go for the significantly cheaper, lower core count, higher clocked CPU, and a higher-margin market would be lost.
  • Netmsm - Thursday, June 1, 2017 - link

    "AMD will not and doesn't need to launch anything other than 16 core. Intel is simply playing the core count game, much like it played the Mhz game back in the days of pentium4."
    Exactly ^_^ That's it.
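
As a rough illustration of the square-law claim made in one of the comments above, here is a minimal sketch using the classic dynamic-power approximation P ≈ C·V²·f; leakage and other static effects, which the replies rightly note also matter, are ignored:

```c
#include <stdio.h>

/* Rough sketch of the dynamic-power approximation P ~ C * V^2 * f,
   ignoring static/leakage power. A 25% voltage cut alone gives
   0.75^2 = 0.5625x the power, i.e. roughly "almost 50%" less;
   dropping frequency as well compounds the saving. */
int main(void)
{
    const double v_scale = 0.75;   /* 25% lower voltage       */
    const double f_scale = 1.00;   /* frequency held constant */

    double p_scale = v_scale * v_scale * f_scale;
    printf("Relative dynamic power: %.2f (%.0f%% reduction)\n",
           p_scale, (1.0 - p_scale) * 100.0);
    return 0;
}
```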
