A New Optimized 14nm Process: 14nm+
As originally reported in our Kaby Lake-Y/U launch coverage

One of the mysteries with the launch of Kaby Lake is the optimized 14nm+ process that Intel is promoting as one of the key components for the performance uptick in Kaby Lake. It’s worth noting that Intel has said nothing about caches, latencies or bandwidths. We are being told that the underlying microarchitecture for Kaby Lake is the same as Skylake, and that the frequency adjustments from the new process, along with features such as Speed Shift v2 and the new fixed function media codecs, account for the performance improvements as well as battery life increases when dealing with 4K content.

For users that speak in pure IPC, this may or may not be a shock. Without further detail, Intel is implying that Kaby Lake will have the same IPC as Skylake (which we can confirm in our reviews), but that it will operate with better power efficiency (the same frequency at lower power, or a higher frequency at the same power), and that media consumption will leave more idle CPU cycles with lower power drain. The latter makes sense for mobile devices such as tablets, 2-in-1s and notebooks, or for power-conscious users, but it paints a static picture for the desktop platform in January if users only get another 200-400 MHz in base frequency.
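As a first-order reminder of why a process tweak can trade frequency against power (a textbook approximation, not an Intel-disclosed model), dynamic CPU power scales as:

```latex
P_{\text{dyn}} \approx \alpha \, C \, V^2 \, f
```

where $\alpha$ is the switching activity factor, $C$ the switched capacitance, $V$ the supply voltage and $f$ the clock frequency. Because voltage enters quadratically, a process that reaches a given frequency at slightly lower voltage delivers "the same frequency at lower power", while the extra headroom can instead be spent on "higher frequency at the same power".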

However, I digress with conjecture – the story not being told is how Intel has changed its 14nm+ process. We’ve only been given two pieces of information: taller fins and a wider gate pitch.


Intel 14nm Circa Broadwell

When Intel launched Broadwell on 14nm, we were given an exposé into Intel’s latest and greatest semiconductor manufacturing node. Intel at its core is a manufacturing company rather than a processor company, and developing a mature and robust process node allows it to gain performance advantages over the other big players: TSMC, Samsung and GlobalFoundries. When 14nm was launched, we had details on Intel's next generation of FinFET technology, discussions about the issues that faced 14nm as it was being developed, and fundamental dimensional data on how transistors/gates were proportioned. Something at the back of my brain says we’ll get something similar for 10nm when we are closer to launch.

But as expected, 14nm+ was given little extra detail. What would interest me is the scale of the results, or the problems posed by the two changes in the process we know about. Taller fins mean less driving current is needed and leakage becomes less of an issue. A wider gate pitch, however, is typically associated with a decrease in transistor density, requiring higher voltages but making the manufacturing process easier with fewer defects. There is also the argument that a wider pitch allows the heat generated by each transistor to spread further before affecting its neighbors, allowing a bit more wiggle room for frequency – this is at least how Intel puts it.
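For context on the fin-height change, the standard textbook geometry for a FinFET's effective channel width (not Intel-disclosed dimensions) is:

```latex
W_{\text{eff}} = 2 H_{\text{fin}} + T_{\text{fin}}, \qquad I_{D,\text{sat}} \propto \frac{W_{\text{eff}}}{L_g}
```

where $H_{\text{fin}}$ is the fin height, $T_{\text{fin}}$ the fin thickness and $L_g$ the gate length. A taller fin raises $W_{\text{eff}}$, and with it the current the transistor can deliver at a given voltage, which is consistent with the lower drive requirements and better leakage behavior described above.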

The combination of the two allows for a wider voltage range and higher frequencies, although it may come at the expense of die size. We are told that transistor density has not changed, but unless there was a lot of spare unused silicon in the Skylake die design for the wider pitch to spread into, that seems questionable. It also depends on which part of the metal stack is being adjusted. It’s worth noting that Intel has again not released die size information, and transistor counts are not being disclosed as a metric, similar to Skylake.

Finally, there's some question over what it takes at a fab level to produce 14nm+. Though certainly not on the scale of the original jump to 14nm, Intel has been tight-lipped on whether any retooling is required. At a minimum, as this is a new process (in terms of design specifications), I think it's reasonable to expect that some minor retooling is required to move a line over to 14nm+. That in turn raises the question of which Intel fabs can currently produce chips on the new process. One of the D1 fabs in Oregon is virtually guaranteed; whether Arizona or Ireland is also among them is not.

I bring this up because of the parallels between the Broadwell and Kaby Lake launches. Both are bottom-up launches, starting with the low wattage processors followed by the bigger parts a few months later. In Broadwell's case, 14nm yields - and consequently total volume - were a bottleneck to start with. Depending on the amount of retooling required and which fabs have been upgraded, I'm wondering whether the bottom-up launch of Kaby Lake is for similar reasons. Intel's yields should be great even with a retooling, but if it's just a D1 fab producing 14nm+, then it could be that Intel is volume constrained at launch and wanted to focus on producing a larger number of small die 2+2 processors to start with, ramping up for larger dies like 4+2 and 4+4e later on.


43 Comments


  • Lolimaster - Tuesday, January 3, 2017 - link

    Considering that the minimum number of cores you get per module is 4, I see AMD selling a 3c/6t CPU for $99 a few months later.

    They will make a tweak for the Raven Ridge APUs, since the core count for those is 4c max.
  • jjj - Friday, January 6, 2017 - link

    Every segment they don't cover (and they don't have Zen APUs yet) is business left on the table - the budget segment is big enough and in regions they care about.

    Maybe they should go to $49 with quads and disable HT and some cache, but it is likely that if they don't do that, most would make an effort to get the $99 quad. Just hope they don't get too greedy and start way higher; Intel can make quads without a GPU too, and it won't take too long. AMD needs to exploit this window of opportunity and gain not just revenue, but hearts and minds.
  • name99 - Tuesday, January 3, 2017 - link

    "We still have not received an official word if Intel is working closely with Apple to bring the feature to macOS, or even if it will be promoted if it ever makes the transition"

    Could some more-or-less unexpected interaction between Speed Shift 2 and the rest of MacOS be the reason for the apparently random dramatic swings in the battery lifetime of the new MacBook Pros? We hardly know enough to point fingers at either Apple or Intel, but I could certainly imagine that each side has a certain mental model of what the other side is/"should" be doing, and the mismatch between those models means that the CPU is randomly being told to run at maximum speed when the OS actually wants it to dramatically slow down.
    I agree that this sounds kinda dumb on the surface, but I could imagine that there are enough layers between UI/framework code, the power driver, the core OS, and EFI that something gets confused along the way – including, perhaps, exposing a bug (again, either on the Apple side or the Intel side) that just didn't get triggered (or at least not very often) on previous x86 CPUs or on Linux/Windows.
  • rodmunch69 - Tuesday, January 3, 2017 - link

    My 5-year-old 3930K can still basically keep up with Intel's latest and greatest with a stock-voltage OC. Hum... I used to buy new stuff every year, or every two years at most, because there was normally a good gain to be had. It's legit been 5 years now, and my PC, with a little work, is just as fast in multi-core tests as anything out there. That's pitiful on Intel's behalf. They've gotten fat and lazy and the consumer is paying for it. Trump needs to tell AMD to put the A back into their chips and actually put out some high-end products that push Intel to be great again.
  • Laststop311 - Wednesday, January 4, 2017 - link

    Is it really worth saving 60 dollars to get the unlocked i3 vs the unlocked i5? I really can't see any situation where 60 dollars is the difference between being able to afford a new PC or not. DX12 HIGHLY benefits from having 4 cores (really, 6 cores is the optimum, with 8 only slightly improving). Being stuck with 2 cores in this day severely cripples the lifespan of the PC. You will waste GPU power and be constrained by the 2 cores, all in the name of saving 60 dollars. Nah, it's not worth it.

    Kaby Lake in general is not worth it. Everyone with a quad-core Sandy Bridge or above is going to see very minimal gains from another quad-core CPU. You really need to go to 6 cores to get any real performance increase, and you also need to be playing in DX12 mode. Your best bet is to wait for the 2019 tock of 10nm Coffee Lake. Intel will be moving to PCIe 4.0, which doubles the bandwidth, so x8 PCIe 4.0 is the same as x16 PCIe 3.0. Since GPUs only lose a few percentage points of performance on x8 PCIe 3.0, x8 PCIe 4.0 will give them all the breathing room they need. This leaves 16 of the 24 lanes to use for M.2 storage devices or capture cards without having to use the higher-latency PCH PCIe lanes.

    Or with multi-GPU you still have x8 CPU PCIe lanes per card, and you only need x2 PCIe 4.0 lanes to give you 4 GB/s (32 Gbps), so you can fit dual GPUs and 4 PCIe storage devices all connected to the CPU directly, and both GPUs will get 16 GB/s (128 Gbps) of bandwidth. This gives you massive future-proofing. With Intel Optane maturing, you can go single GPU at x16 PCIe 4.0 lanes with 32 GB/s (256 Gbps) of bandwidth, stick an Optane drive on x4 lanes giving you a massive 8 GB/s (64 Gbps), and put 2 M.2 NVMe SSDs on x2 lanes at 4 GB/s (32 Gbps) each, with all devices connected directly to the CPU for the lowest latency, leaving all the PCH lanes free for external ports like TB3, USB 3.1 Gen 2, etc.

    By waiting till 2019 you get a real upgrade instead of a sidegrade. PCIe 4.0 will unlock the true potential of Intel Optane, as I expect by then the Optane drives will be maxing out the x4 PCIe 3.0 lanes at 4 GB/s, and PCIe 4.0 will allow Optane to really shine and most likely hit 7 GB/s or more. With that kind of storage speed you can transfer an entire Blu-ray disc image in about 7 seconds.

    Now by all means, if you are still on the Q-series quad cores then Kaby Lake is a compelling upgrade and isn't a total waste of money. But even in that circumstance I would say try to stick it out another year so you can have a 6-core Coffee Lake, as 6 cores is incredibly useful in DX12.
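The lane arithmetic in the comment above can be sanity-checked with a short sketch (the per-lane figures use the published PCIe line rates and 128b/130b encoding; the commenter's round 2 GB/s-per-lane PCIe 4.0 figure is a close approximation):

```python
# Per-lane throughput after 128b/130b encoding overhead, in GB/s.
# PCIe 3.0: 8 GT/s  -> ~0.985 GB/s per lane
# PCIe 4.0: 16 GT/s -> ~1.969 GB/s per lane
PER_LANE_GBPS = {"3.0": 8 * 128 / 130 / 8, "4.0": 16 * 128 / 130 / 8}

def link_bandwidth(gen: str, lanes: int) -> float:
    """Approximate one-direction link bandwidth in GB/s."""
    return PER_LANE_GBPS[gen] * lanes

print(round(link_bandwidth("4.0", 8), 1))   # x8  PCIe 4.0 ~ 15.8 GB/s ...
print(round(link_bandwidth("3.0", 16), 1))  # ... matching x16 PCIe 3.0 ~ 15.8 GB/s
print(round(link_bandwidth("4.0", 2), 1))   # x2  PCIe 4.0 ~ 3.9 GB/s (the "4 GB/s" figure)

# Transfer time for a 50 GB Blu-ray image at a hypothetical 7 GB/s drive:
print(round(50 / 7, 1))  # roughly 7 seconds, as the comment estimates
```

The x8 PCIe 4.0 and x16 PCIe 3.0 links come out identical, which is the commenter's core point about freeing CPU lanes for storage.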
  • Lolimaster - Wednesday, January 4, 2017 - link

    You mean upgrade to the 8c/16t Ryzen or wait 2018-2019 for the 7nm Zen+?
  • gopher1369 - Wednesday, January 4, 2017 - link

    The only thing that occurs to me is game emulators. Dolphin and PCSX2 require high clock speeds and high IPC, not more cores. It's quite niche, but if you're building an emulator box then the unlocked Anniversary Edition Haswell Pentium is currently the go-to processor; the new i3 should be even better.
  • Laststop311 - Wednesday, January 4, 2017 - link

    What applications use AVX instructions? I wonder how much it will hurt performance for some applications to decrease AVX to 4.0 GHz so you can hit 5.0 GHz on everything else. The highest overclock I've seen talked about is 5.1 GHz on the i7-7700K using the Corsair 115i.
  • johnp_ - Wednesday, January 4, 2017 - link

    (3) Embedded DisplayPort* (eDP) 1.4 and PSR2 under evaluation

    I seriously didn't expect that! This means that they actually changed the display pipeline slightly :)
    Now, hopefully laptop vendors will make use of PSR2 to further improve battery life.

    On a side-note: Does anyone know how to overclock the 7820HK when there's no mobile chipset that supports overclocking? Will laptop vendors have to include the Z270 desktop chipset on their platform?
  • keeepcool - Friday, January 6, 2017 - link

    You open Intel XTU and press the arrows till it BSODs.
    Laptop chipsets are "different" in a lot of senses.
