The Basin Falls Platform: X299, SKL-X, & KBL-X

For most practical definitions of the Basin Falls platform, the X299 chipset is the heart. X299 supports the new processors, and like the Z170 and Z270 counterparts on the mainstream consumer line, is basically a big PCIe switch. One of the issues with the older X99 chipset was its limited capabilities, and inability to drive many PCIe devices – this changes with the big switch mentality on X299. Behind the DMI 3.0 link going into the chipset (basically a PCIe 3.0 x4 link), the chipset has access to up to 24 PCIe 3.0 lanes for network controllers, RAID controllers, USB 3.1 controllers, Thunderbolt controllers, SATA controllers, 10GbE controllers, audio cards, more PCIe slot support, special controllers, accelerators, and anything else that requires PCIe lanes in an x4, x2 or x1 link. The total uplink bandwidth is limited by the DMI 3.0 link, but there will be very few situations where this is saturated. There are a few limits to what support is available (some ports are restricted in what they can handle), and only three PCIe 3.0 x4 drives can use the in-built PCIe RAID, but this should satiate all but the most hardcore enthusiasts.
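
To put the uplink limit in numbers, here is a rough back-of-the-envelope sketch. It assumes the standard PCIe 3.0 figures of 8 GT/s per lane with 128b/130b encoding and ignores packet and protocol overhead, so real throughput will be somewhat lower. The function and variable names are ours, purely for illustration.

```python
PCIE3_GT_PER_S = 8.0    # PCIe 3.0 raw signalling rate per lane, in GT/s
ENCODING = 128 / 130    # 128b/130b line encoding used by PCIe 3.0


def pcie3_gbytes_per_s(lanes: int) -> float:
    """Peak one-direction bandwidth in GB/s, before protocol overhead."""
    return PCIE3_GT_PER_S * ENCODING * lanes / 8


dmi_uplink = pcie3_gbytes_per_s(4)       # DMI 3.0 is treated as PCIe 3.0 x4
chipset_lanes = pcie3_gbytes_per_s(24)   # all 24 downstream chipset lanes
raid_drives = 3 * pcie3_gbytes_per_s(4)  # three x4 drives on chipset RAID
oversubscription = chipset_lanes / dmi_uplink

print(f"DMI 3.0 uplink (x4):          {dmi_uplink:.2f} GB/s")
print(f"24 chipset lanes combined:    {chipset_lanes:.2f} GB/s")
print(f"Three x4 drives, link limit:  {raid_drives:.2f} GB/s")
print(f"Oversubscription ratio:       {oversubscription:.1f}x")
```

The 6:1 ratio only matters when many chipset devices are busy at once, which is why saturation should be rare in practice.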

The Skylake-X family of processors for Basin Falls comes in two stages, based on the way the processors are developed. Normally HEDT processors are cut down versions of enterprise processors, usually through restricting certain functions, but the enterprise processors are typically derived from three different silicon layouts during manufacturing. Internally Intel calls these three layouts the LCC (low core-count), HCC (high core-count) and XCC (extreme core-count), based on the maximum number of cores they support. Nominally Intel does not disclose which silicon layout it uses for which processors, though it is usually straightforward to work them out as long as Intel has disclosed what the configurations of the LCC/HCC/XCC dies are. In this case, Intel has officially left everyone guessing, but the point here is that historically Intel only uses the LCC silicon from the enterprise line for its consumer desktop processors.

In previous generations, this meant a 6, 8 or 10-core processor at the top of the stack for consumers, with lower core count models being provided by binning/salvaging imperfect CPUs. Each year we expected one of three things: the top-end SKU gets more frequency, less power, or more cores, and as such the march of progress has been predictable. If you had asked us two months ago, we would have fully expected Skylake-X to top out with LCC silicon at 10 or 12 cores, depending on how Intel was planning the manufacturing part.

So the first element of Intel’s launch is the LCC processors, running up to 10 cores. We previously published that the LCC silicon was 12 cores, but we can now show it is 10 – more on that later. The three Skylake-X CPUs launching today are using LCC silicon with 6, 8 or 10 cores as the Core i7-7800X, Core i7-7820X and Core i9-7900X respectively. Intel is further separating these parts by adjusting the level of officially supported DRAM frequency, as well as the number of PCIe lanes. We’ll go into a bit more detail later in the review.

The second element to the Skylake-X launch is the one that has somewhat surprised most of the industry: the launch will contain four processors based on the HCC silicon. Technically these processors will not be out until Q4 this year (one SKU coming out in August), and the fact that Intel did not have frequency numbers to share when announcing these parts shows that they are not finalized, calling into question when they were added to the roadmap (and if they were a direct response to AMD announcing a 16-core part for this summer). We’ve written a detailed analysis on this in our launch coverage, and we’ll cover some of the topics in this review. But Intel is set to launch 12, 14, 16 and 18-core consumer level processors later this year, with the top part running a tray price (when you buy 1k CPUs at a time) of $1999, so we expect the retail to be nearer $2099.

It should be noted that due to a number of factors, the Skylake-X cores and the communication pathways therein are built slightly differently to the consumer version of Skylake-S, which is something discussed and analyzed in this review.

The final element to the Basin Falls launch is Kaby Lake-X. This is also an aspect of the Basin Falls platform that deviates from the previous generations. Intel’s HEDT line has historically been one generation behind the mainstream consumer platform due to enterprise life cycles as well as the added difficulty of producing these larger chips. As a result, the enterprise and HEDT parts have never had the peak processing efficiency (IPC, instructions per clock) of the latest designs and have sat in the wings, waiting. Bringing the Kaby Lake microarchitecture to HEDT changes the scene, albeit slightly. Rather than bringing a new big core featuring the latest microarchitecture, Intel is repurposing the Kaby Lake-S mainstream consumer silicon, binning it to slightly more stringent requirements for frequency and power, disabling the integrated graphics, and then putting it in a package for the high-end desktop platform. There are still some significant limitations, such as having only 16 PCIe 3.0 lanes and dual channel memory, which might exclude it from the traditional designation of being a true HEDT processor. However, Intel has stated that these parts fill a request from customers to have the latest microarchitecture on the HEDT platform. They also overclock quite well, which is worth noting.

The Kaby Lake-X parts will consist of a Core i7 and Core i5, both of which are quad core parts, with the i7 supporting hyperthreading. We have a parallel Kaby Lake-X review alongside our Skylake-X coverage, with some numbers from a stable 5 GHz overclock.


264 Comments


  • Tephereth - Tuesday, June 20, 2017 - link

    "For each of the GPUs in our testing, these games (at each resolution/setting combination) are run four times each, with outliers discarded. Average frame rates, 99th percentiles and 'Time Under x FPS' data is sorted, and the raw data is archived."

    So... where the hell are the games benchmarks in this review?
  • beck2050 - Tuesday, June 20, 2017 - link

    The possibility of the 18 core beast in the upcoming Mac Pro is really exciting for music pros.
    That is a tremendous and long overdue leap for power users.
  • drajitshnew - Tuesday, June 20, 2017 - link

    "... and only three PCIe 3.0 x4 drives can use the in-built PCIe RAID"
    I would like to know which raid level you would use. I can't see 3 m2 drives in raid 1, and raid 5 would require access to the cpu for parity calculations. Then raid 0 it is. Now, which drives will you use for raid 0, which do not saturate the DMI link for sequential reads? And if your workload does not have predominantly sequential reads, then why are you putting the drives in raid.
  • PeterCordes - Tuesday, June 20, 2017 - link

    Standard motherboard RAID controllers are software raid anyway, where the OS drivers queue up writes to each drive separately, instead of sending the data once over the PCIe bus to a hardware RAID controller which queues writes to two drives.

    What makes it a "raid controller" is that you can boot from it, thanks to BIOS support. Otherwise it's not much different from Linux or Windows pure-software RAID.

    If the drivers choose to implement RAID5, that can give you redundancy on 3 drives with the capacity of 2.

    However, RAID5 on 3 disks is not the most efficient way. A RAID implementation can get the same redundancy by just storing two copies of every block, instead of generating parity. That avoids a ton of RAID5 performance problems, and saves CPU time. Linux md software RAID implements this as RAID10. e.g. RAID10f2 stores 2 copies of every block, striped across as many disks as you have. It works very well with 3 disks. See for example https://serverfault.com/questions/139022/explain-m...

    IDK if Intel's mobo RAID controllers support anything like that or not. I don't use the BIOS to configure my RAID; I just put a boot partition on each disk separately and manage everything from within Linux. IDK if other OSes have soft-raid that supports anything similar either.

    > And if your workload does not have predominantly sequential reads, then why are you putting the drives in raid.

    That's a silly question. RAID0, RAID1, and RAID5 over 3 disks should all have 3x the random read throughput of a single disk, at least for high queue depths, since each disk will only see about 1/3rd of the reads. RAID0 similarly has 3x random write throughput.

    RAID10n2 of 3 disks can have better random write throughput than a single disk, but RAID5 is much worse. RAID1 of course mirrors all the writes to all the disks, so it's a wash for writes. (But can still gain for mixed read and write workloads, since the reads can be distributed among the disks).
  • Lieutenant Tofu - Tuesday, June 20, 2017 - link

    I wonder why 1600X outperforms 1800X here on WebXPRT. It's not a huge difference, but I don't see why it's happening. 6-core vs. 8-core, 3.6 GHz base, 4.0 GHz turbo. This presumably runs in just one thread, so performance should be nearly identical. The only reason I can think of is less contention across the IF on the 1600X due to fewer enabled cores, but I don't see that having a major effect on a single-threaded test like this one.

    Maybe 1600X can XFR to a little higher than the 1800X.
  • Eyered - Tuesday, June 20, 2017 - link

    Did they have any issues with heat at all?
  • mat9v - Tuesday, June 20, 2017 - link

    If that were so everyone would be using HEDT instead of 4c/8t CPUs
  • mat9v - Tuesday, June 20, 2017 - link

    Then again, why doesn't every workstation consist of dual-CPU Xeons? If the expense is so insignificant compared to how much a faster machine will earn...
  • mat9v - Tuesday, June 20, 2017 - link

    I'm just wondering how the 7900X managed to stay within the 140W bracket during Prime95 tests when in other reviews it easily reached 250W or more. Is it some internal throttling mechanism that keeps the CPU constantly and dynamically underclocked to stay within the power envelope? How does that compare to a forced 4 GHz CPU clock?
  • mat9v - Tuesday, June 20, 2017 - link

    And yet in conclusion you say to play it safe and get 7900X ?
    How does that work together?
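
The three-drive RAID trade-offs raised in the comments above can be checked with a short sketch. It models a two-copy layout in the style of the "near" arrangement documented for Linux md RAID10, where replicas of each chunk occupy adjacent slots across the disks; the far (f2) layout mentioned in the thread arranges copies differently and is not modelled here. All function names are hypothetical, and nothing here describes what Intel's chipset RAID actually supports.

```python
def near_layout(disks, copies, chunks):
    """Rows of chunk labels; the column index is the disk number.

    Replicas of each chunk are written to adjacent slots, wrapping
    across disks, which is the assumption this sketch relies on.
    """
    slots = [f"c{c}" for c in range(chunks) for _ in range(copies)]
    return [slots[i:i + disks] for i in range(0, len(slots), disks)]


def survives_any_single_failure(rows):
    """True if every chunk keeps a replica after losing any one disk."""
    disks = len(rows[0])
    homes = {}
    for row in rows:
        for disk, chunk in enumerate(row):
            homes.setdefault(chunk, set()).add(disk)
    return all(homes[chunk] - {failed}
               for failed in range(disks) for chunk in homes)


def usable_capacity(disks):
    """Usable capacity in units of one disk, for equal-sized disks."""
    return {"RAID0": disks, "RAID1": 1,
            "RAID5": disks - 1, "RAID10, 2 copies": disks / 2}


layout = near_layout(disks=3, copies=2, chunks=6)
for row in layout:
    print("  ".join(row))
print("Survives one failed disk:", survives_any_single_failure(layout))
print("Usable capacity on 3 disks:", usable_capacity(3))
```

On three disks the two-copy layout tolerates one failure just as RAID5 does, but it yields 1.5 disks of usable space rather than 2, which is the price paid for skipping parity calculations.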