Explaining the Jump to Using HCC Silicon

When Intel makes its enterprise processors, it has historically produced three silicon designs:

  • LCC: Low Core Count
  • HCC: High Core Count (sometimes called MCC)
  • XCC: Extreme Core Count (sometimes called HCC, to confuse)

The idea is that moving from LCC to XCC, the silicon will contain more cores (sometimes more features), and it becomes cost effective to have three different designs rather than one big one and disable parts to meet the range. The size of the LCC silicon is significantly smaller than the XCC silicon, allowing Intel to extract a better production cost per silicon die.

Skylake-SP Die Sizes (from chip-architect.com)
  Arrangement Dimensions
(mm)
Die Area
(mm2)
LCC 3x4 (10-core) 14.3 x 22.4 322 mm2
HCC 4x5 (18-core) 21.6 x 22.4 484 mm2
XCC 5x6 (28-core) 21.6 x 32.3 698 mm2

In the enterprise space, Intel has each of the three designs throughout its Xeon processor stack, ranging from four-core parts (usually cut down versions of the LCC silicon) all the way up to 28 core parts (using XCC) for this generation. The enterprise platform has more memory channels, support for error correcting and high-density memory, the ability to communicate to multiple processors, and several other RAS (reliability, accessibility, serviceability) features that are prominent for these markets. These are typically disabled for the prosumer platform.

In the past, Intel has only translated the LCC silicon into the prosumer platform. This was driven for a number of reasons.

  • Cost: if users needed XCC, they had to pay the extra and Intel would not lose high-end sales.
  • Software: Enterprise software is highly optimized for the core count, and systems are built especially for the customer. Prosumer software has to work on all platforms, and is typically not so multi-threaded.
  • Performance: Large, multi-core silicon often runs at a low frequency to compensate. This can be suitable for an enterprise environment, but a prosumer environment requires responsiveness and users expect a good interactive experience.
  • Platform Integration: Some large silicon might have additional design rules above and beyond the smaller silicon support, typically with power or features. In order to support this, a prosumer platform would require additional engineering/cost or lose flexibility.

So what changed at Intel in order to bring HCC silicon to the HEDT prosumer platform?

The short and shrift answer that many point to is AMD. This year AMD launched its own high-end desktop platform, based on its Ryzen Threadripper processors. With their new high performance core, putting up to 16 of them in a processor for $999 was somewhat unexpected, especially with the processor beating Intel’s top prosumer processors in some (not all) of the key industry benchmarks. The cynical might suggest that Intel had to move to the HCC strategy in order to stay at the top, even if their best processor will cost twice that of AMD.

Of course, transitioning a processor from the enterprise stack to the prosumer platform is not an overnight process, and many analysts have noted that it is likely that Intel has considered this option for several generations: testing it internally at least and looking at the market to decide when (or if) it is a good time to do so. The same analysts point to Intel’s initial lack of specifications aside from core count when these processors were first announced several months ago: specifications that would have historically been narrowed down at that point in previous designs if they were in the original plans. It is likely that the feasibly of introducing the HCC silicon was already in process, but actually moving that silicon to retail was a late addition to counter a threat to Intel’s top spot. That being said, to say Intel had never considered it would perhaps be a jump too far.

The question now becomes if the four areas listed above would all be suitable for prosumers and HEDT users:

  • Cost: Moving the 18-core part into the $1999 is unprecedented for a consumer processor, so it will be interesting to see what the uptake will be. This does cut into Intel’s professional product line, where the equivalent processor is nearer $3500, but there are enough ‘cuts’ on the prosumer part for Intel to justify the difference: memory channels (4 vs 6), multi-processor support (1 vs 4), and ECC/RDIMM support (no vs yes). What the consumer platform does get in kind is overclocking support, which the enterprise platform does not.
  • Software: Intel introduced its concept of ‘mega-tasking’ with the last generation HEDT platform, designed to encompass users and prosumers that use multiple software packages at once: encoding, streaming, content creation, emulation etc. Its argument now is that even if software cannot fully scale beyond a few cores, a user can either run multiple instances or several different software packages simultaneously without any slow-down. So the solution to this is rather a redefinition of the problem rather than anything else, which could have applied previously as well.
  • Performance: Unlike enterprise processors, Intel is pushing the frequency on the new HCC parts for consumers. This translates into a slightly lower base frequency but a much higher turbo frequency, along with support for Turbo Max. In essence, software that requires responsiveness can still take advantage of the high frequency turbo modes, as long as the software is running solo. The disadvantage is going to be in power consumption, which is a topic later in the review.
  • Platform Integration: Intel ‘solved’ this by creating one consumer platform suitable for nine processors with three different designs (Kaby Lake-X, Skylake-X LCC and Skylake-X HCC). Both the Kaby Lake-X and Skylake-X parts have different power delivery methods, support different numbers of memory channels, and different numbers of PCIe lanes / IO. When this was first announced, there was substantial commentary that this was making the platform overly complex, and would lead to confusion (it lead to at least one broken processor in our testing).

Each of these areas has either been marked as solved, or redefined out of being issue (even if a user agrees with the redefinition or not). 

New Features in Skylake-X: Cache, Mesh, and AVX-512 Opinion: Why Counting ‘Platform’ PCIe Lanes (and using it in Marketing) Is Absurd
POST A COMMENT

151 Comments

View All Comments

  • mapesdhs - Monday, September 25, 2017 - link

    Just curious mmrezaie, why do you say "unofficially"? ECC support is included on specs pages for X399 boards. Reply
  • frowertr - Tuesday, September 26, 2017 - link

    Run Unbound on a Pi or other Linux VM and block all thise adverts at the DNS level for all the devices on your LAN. I havent seen a site add anywhere in years from my home. Reply
  • Notmyusualid - Thursday, September 28, 2017 - link

    @frowertr

    Interesting - But that won't work for me - I'm a frequent traveller, and thus on different LANs all the time.

    But what works for me, is PeerBlock, then iblocklist.com for the Ad-server & Malicious lists and others, add Microsoft and any other entity I don't want my packets broadcast to (my Antivirus alerts me when I need updates anyway - and thus I temporarily allow http through the firewall for that type of occasion).
    Reply
  • realistz - Monday, September 25, 2017 - link

    This is why the "core wars" won't be a good thing for consumers. Focus on better single thread perf instead quantity. Reply
  • sonichedgehog360@yahoo.com - Monday, September 25, 2017 - link

    On the contrary, single-threaded performance is largely a dead end until we hit quantum computing due to instability inherent to extremely high clock speeds. The core wars is exactly what we need to incentivize developers to improve multi-core scaling and performance: it represents the future of computing. Reply
  • extide - Monday, September 25, 2017 - link

    Some things just can't be split up into multiple threads -- it's not a developer skill level or laziness issue, it's just the way it is. Single threaded speed will always be important. Reply
  • PixyMisa - Monday, September 25, 2017 - link

    Maybe, but it's still a dead end. It's not going to improve much, ever. Reply
  • HStewart - Monday, September 25, 2017 - link

    As a developer for 30 years this is absolutely correct - especially with the user interface logic which includes graphics. Until technology is a truly able to multi-thread the display logic and display hardware - it very important to have single thread performance. I would think this is critically important for games since they deal a lot with screen. Intel has also done something very wise and I believe they realize this important - by allowing some cores to go faster than others. Multi-core is basically hardware assisted multi-threaded applications which is very dependent on application design - most of time threads are used for background tasks. Another critical error is database logic - unless the database core logic is designed to be multithread, you will need single point of entry and in some cases - they database must be on screen thread. Of course with advancement is possible hardware to handle threading and such, it might be possible to over come these limitations. But in NO WAY this is laziness of developer - keep in mind a lot of software has years of development and to completely rewrite the technology is a major and costly effort. Reply
  • lilmoe - Monday, September 25, 2017 - link

    There are lots of instances where I'd need summation and other complex algorithm results from millions of records in certain tables. If I'm going the traditional sql route, it would take ages for the computation to return the desired values. I instead divide the load one multiple threads to get a smaller set in which I would perform some cleanup and final arithmetic. Lots of extra work? Yup. More ram per transaction total? Oh yea. Faster? Yes, dramatically faster.

    WPF was the first attempt by Microsoft to distribute UI load across multiple cores in addition to the gpu, it was so slow in its early days due to lots out inefficiencies and premature multi-core hardware. It's alot better now, but much more work than WinForms as you'd guess. UWP UI is also completely multithreaded.

    Android is inching closer to completely have it's UI multithreaded and separate from the main worker thread. We're getting there.

    Both you and sonich are correct, but it's also a fact that developers are taking their sweet time to get familiar with and/or use these technologies. Some don't want to that route simply because of technology bias and lock-in.
    Reply
  • HStewart - Monday, September 25, 2017 - link

    "Both you and sonich are correct, but it's also a fact that developers are taking their sweet time to get familiar with and/or use these technologies. Some don't want to that route simply because of technology bias and lock-in."

    That is not exactly what I was saying - it completely understandable to use threads to handle calculation - but I am saying that the designed of hardware with a single screen element makes it hard for true multi-threading. Often the critical sections must be lock - especially in a multi-processor system.

    The best use of multi-threading and mult-cpu systems is actually in 3D rendering, this is where multiple threads can be use to distribute the load. In been a while since I work with Lightwave 3D and Vue, but in those days I would create a render farm - one of reason, I purchase a Dual Xeon 5160 ten years ago. But now a days processors like these processors here could do the work or 10 or normal machines on my farm ( Xeon was significantly more power then the P4's - pretty much could do the work of 4 or more P4's back then )
    Reply

Log in

Don't have an account? Sign up now