A quick news piece on information coming out of Intel’s annual Investor Day in California. As Intel confirmed to Ashraf Eassa at the event, Intel’s 8th Generation Core microarchitecture will remain on the 14nm node. This is an interesting development, given that Intel’s recently launched 7th Generation Core products were touted as the ‘Optimization’ stage of the new ‘Process-Architecture-Optimization’ three-stage cadence that replaced the old ‘tick-tock’ cadence. With Intel stringing out 14nm (or at least an improved variant of it, as we saw with 7th Gen) for another generation, it makes us wonder where exactly Intel can find future performance or efficiency gains unless it starts implementing microarchitecture changes.

Despite this, if supposed ‘leaked’ roadmaps are to be believed (we have yet to confirm them with a second source), the 8th Generation product, ‘Cannon Lake’, is geared more towards the Y and U parts of Intel’s roadmap. This would ring true with the mobile-first strategy Intel has followed in recent generations, whereby the smaller, low-power chips come off the production line first for a new product; however, we’d also expect 10nm to appear in those smaller chips first (as demonstrated at CES). Where Cannon Lake will end up in the desktop or enterprise segments, however, remains to be seen. To put something a bit more solid into this, Ashraf also relayed words from Dr. Venkata ‘Murthy’ Renduchintala, VP and GM of Client and IoT:

‘Murthy referred to it at the event: process tech use will be “fluid” based on segment.’

If one were to read into this, we may start seeing a blend of process nodes serving different segments of the market at the same time. We already have that to some extent between the mainstream CPUs and the HEDT/Xeon families, but this phrasing suggests we might see another split within consumer products, or between consumer and enterprise. We may reach a point where Intel’s ‘Gen’ naming scheme for its CPUs covers two or more process node variants.

Speaking of the enterprise segment, another bit of information has also surfaced, coming from a slide shown during a talk by Diane Bryant (EVP/GM of Data Center) and posted online by Ashraf. The slide contains the words ‘Data center first for next process node’.

‘Process node’ can refer either to the number itself (14nm/10nm/7nm) or to variants within a process (high power, high efficiency). One might suspect this means Intel is moving hard and fast with 10nm for Xeons and big computing projects, despite having shown off 10nm consumer silicon at CES earlier this year. That being said, it’s important to remember that the data center market is large and includes high-density systems with many smaller cores, such as Atom cores, and Intel did recently open up its 10nm foundry business to ARM Artisan IP projects. So while the slide does say ‘Data center first’, it might be referring to data center projects based on ARM IP in that segment rather than big 4-24+ core Xeons. At this stage of the game it is hard to tell.

On top of all this, Intel still has extreme confidence in its foundry business. An image posted by Dick James of Siliconics from the livestream shows Intel expects to have a three-year process node advantage when its competitors (Samsung, TSMC) start launching 10nm:

I’ve been brief with this news for a reason: at this point there are a lot of balls in the air and many different ways to take this information, and the Investor Day is winding down on talks, finishing with smaller 1-on-1 meetings. We may get further clarification on this news as the day goes on.

Update 1: Speaking with Diane Bryant, we learned that ‘data center gets new nodes first’ will be achieved by using multiple small dies on a single package. Rather than a traditional multi-chip package as in previous multi-core products, Intel will be using an MCP/2.5D interposer-like design with an Embedded Multi-Die Interconnect Bridge (EMIB), as demonstrated at ISSCC.


An Intel Slide from ISSCC, via PC Watch

Initially EMIB was thought of as a technology related to Intel's acquisition of Altera and potential future embedded FPGA designs, but given the slide above and comments made at the Investor Day, it seems there are other plans for this technology too. The benefit of using multiple smaller dies over a large monolithic 600mm² die typically comes down to cost and yield; however, the EMIB technology also has to be up to par, and there may be a latency or compatibility trade-off.
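To illustrate the yield argument, here is a minimal sketch using a simple Poisson defect-density model. The defect density and die sizes are illustrative guesses on our part, not Intel's figures, and real yield models (and EMIB assembly costs) are considerably more involved:

```python
import math

def poisson_yield(area_mm2, d0=0.001):
    """Fraction of defect-free dies under a simple Poisson model
    (d0 = defects per mm^2; 0.001 is an illustrative guess)."""
    return math.exp(-d0 * area_mm2)

# A 600 mm^2 product built two ways: one monolithic die, or four
# 150 mm^2 dies stitched together with an interconnect bridge.
y_mono = poisson_yield(600)
y_small = poisson_yield(150)

# Silicon consumed per *good* product. A defect in a small die only
# scraps 150 mm^2; a defect in the monolithic die scraps all 600 mm^2.
cost_mono = 600 / y_mono
cost_multi = 4 * 150 / y_small

print(f"monolithic: {cost_mono:.0f} mm^2 of wafer per good product")
print(f"4 small dies: {cost_multi:.0f} mm^2 of wafer per good product")
```

Under these toy numbers the monolithic approach burns roughly half again as much wafer area per sellable part, which is the basic economic case for multi-die packages.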

Source: Intel, @TMFChipFool


  • name99 - Friday, February 10, 2017 - link

    "That all said, process nodes have ceased bringing the kind of frequency gains that they once did. "
    That's not true.
    Compare Apple's performance
    A7: 1.3GHz 28nm
    A8: 1.4GHz 20nm
    A9: 1.8GHz 16nm FF
    A10: 2.3GHz 16nm+ FF

    A9 and A10 show respectable frequency improvements, enabled in part by process improvements. A8 does not look like a great frequency improvement, but it delivered an overall 25% performance improvement (enabled in part by the higher density allowing a better microarchitecture), and it took longer to hit throttling temperatures than the A7.
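    For reference, the per-generation frequency gains implied by the numbers in this comment work out as follows (a quick sketch; the frequencies are as listed above):

```python
# Frequencies (GHz) for Apple's A-series SoCs, as listed in the comment.
freqs = {"A7": 1.3, "A8": 1.4, "A9": 1.8, "A10": 2.3}

chips = list(freqs)
gains = {cur: freqs[cur] / freqs[prev] - 1 for prev, cur in zip(chips, chips[1:])}
for chip, gain in gains.items():
    print(f"{chip}: {gain:+.1%} over its predecessor")
```

    The A8's modest ~8% frequency bump versus the ~28% jumps for A9 and A10 is exactly the pattern the comment describes.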

    I'd say the issue is not that process doesn't allow for improved single thread performance; it's that different actors have different goals and constraints.
    Intel's biggest constraint is that it has SUCH a long lead time between the start of a design and when it ships that they have zero agility. So they can start down a path (one lasting, say, eight years) where it looks like the right track is
    - keep reducing power for mobile and desktop, but don't care much about performance because there are no competitors
    - optimize the design for the server market and minimally port that down to mobile/desktop (again because there are no competitors)

    But when the world changes (
    a] Apple showing that single-threaded performance isn't yet dead
    b] long delays and slippages in process rollout)
    they're screwed because they can't deviate from that path. So they've built up a marketing message around an expected performance cadence, and didn't have a backup marketing message prepared.

    They're also (to be fair) providing constantly improving performance through increasing turbo frequencies, and being able to maintain turbo for longer BUT once again, their marketing is so fscked up that they're unable to present that message to the world. They have not built up a corpus of benchmarks that show the real-world value and performance of turbo, and scrambling to do so today would look fake.

    So I would not say the fault is that process improvements no longer deliver performance improvements. I'd say that Intel is a dysfunctional organization that has optimized its process nodes and its designs for the wrong things, is unable to change its direction very fast, and is unable even to inform the public about (and thus sell) the improvements it has been capable of delivering.
    Reply
  • fanofanand - Friday, February 10, 2017 - link

    So you are suggesting that Apple, in its 4th or 5th year of making CPUs, would have the same amount of low-hanging fruit as Intel in its 40th year (or whatever)? I think you WAY oversimplified this. Reply
  • name99 - Friday, February 10, 2017 - link

    I'm suggesting that Intel has bad incentives in place, and terrible strategic planning.
    Apple already matches Intel performance at the lower power levels, and is likely to extend that all the way up to the non-K CPUs with the A10X. And that, as you say, after just a few years of CPU design.

    Doesn't that suggest that one of these companies is being pretty badly run? I've explained in great detail why it is Intel --- how they planned to exploit the gains of better process was dumb, and then they had no backup plan when even those process gains became unavailable.

    You're claiming what? That CPU design has reached its pinnacle, and that the ONLY way to further exploit improved process is through more cores and slightly lower energy/operation?
    What would it take to change your mind? To change my mind, I'd have to see Apple's performance increases start to tail off once they're matching Intel. Since they're already at Intel levels, that means their performance increase has to start tailing off to <10% or so every year starting with the A11. Do YOU think that's going to happen?

    IBM, to take another company, has likewise not hit any sort of performance wall as process has improved. They've kept increasing their single-threaded performance for POWER even as they've done the other usual server things like add more cores. They've not increased frequency much since 65nm, but they have done a reasonable job (much better than Intel) of increasing IPC.

    Once again, you can spin this as "IBM was behind Intel, so they still have room to grow" and once again, that might be true --- none of us knows the future. But the pattern of the recent past is clear: it is INTEL that has had performance largely frozen even as process has improved, not everyone else. It has not yet been demonstrated that everyone else will slow to a crawl once they exceed Intel's single-threaded performance levels.
    Reply
  • Nagorak - Monday, February 13, 2017 - link

    By stalling so much since Sandy Bridge, Intel really gave its competitors a lot of chance to catch up. Plenty of people who have a Sandy Bridge processor see no reason to upgrade, and it's six years later! Reply
  • tygrus - Friday, February 10, 2017 - link

    "6T SRAM which is 0.0806 mm² vs competitor A's 0.0588 mm²"
    Isn't 0.0806 > 0.0588?
    Competitor A's cell is actually smaller, therefore denser, with possibly smaller features. The other reasons why overall cache area can increase are: smaller lines (and more of them), higher n-way associativity and thus more complex selection & load circuitry, more bits for error correction, and more predecode and sharing/coherency bits. What really matters is how they perform with real workloads.
    Reply
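    As a quick sanity check on the arithmetic in the comment above (cell areas as quoted in the thread; only the ratio matters, so the units are left as given):

```python
# Bit-cell areas as quoted in the comment.
cell_first = 0.0806   # the first cell size quoted
cell_comp_a = 0.0588  # "competitor A"

ratio = cell_first / cell_comp_a
print(f"the first bit cell is {ratio:.2f}x the area of competitor A's")
```

    So competitor A's cell is indeed about 37% denser, as the comment argues.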
  • Laststop311 - Friday, February 10, 2017 - link

    If Intel can get 15% better single-threaded performance on the top 6-core/12-thread mainstream i7 vs the 7700K Kaby Lake, then I don't care if it's on the same node. That's a good jump for Intel. A 5GHz OC 6-core/12-thread mainstream CPU at the $350 mark is what Intel needs to not look completely terrible in multi-threaded apps vs Ryzen at this price point. AMD's pressure is already pushing Intel into gear. Thank you AMD for making Intel release its better technology. Reply
  • Laststop311 - Friday, February 10, 2017 - link

    I will still be holding out with my 4.13GHz 6-core/12-thread i7-980X (32nm, LGA 1366, Gulftown) platform, with 12GB of 2400MHz CAS 9 DDR3 triple-channel RAM and a recently replaced GPU: a used Sapphire Tri-X OC edition with some custom work (repasted with Coollaboratory Liquid Metal Ultra TIM, plus a custom larger 3-slot aluminum heat sink with copper heat pipes, larger heat sinks for the VRMs, better contact with the RAM, and 2x Noctua NF-P12 120mm fans cooling the GPU open-air style), with all but 64 stream processors software-unlocked (all but 1 CU fully unlocked), giving it 4032 of 4096 stream processors running at 1140MHz with the HBM at the stock 500MHz. The storage has been upgraded with a Samsung 850 Pro 1TB SATA III SSD in addition to the 4TB HGST 7200 RPM drive.

    It's crazy to think that, with just a few minor upgrades and a good deal on a custom GPU made by a friend (since he upgraded), this almost exactly 7-year-old PC can still run everything amazingly with max details at my 1920x1080 monitor resolution. I never turn down any settings for any games and I get smooth gameplay, and this system is freaking 7 years old, with only upgrades to the storage (adding an SSD) and the graphics card, since I got a really cool fancy Sapphire Fury Tri-X OC with custom work done on it that keeps the temps mainly in the 50s while gaming, sometimes the low 60s when really being stressed. I can actually run a stable overclock at 1190MHz, but then the temps hit the mid-to-high 70s, so I just keep it at the mild 1140MHz overclock and enjoy silent, fluid gaming on a 7-year-old desktop. <---- This is why PC sales are slowing down. It's so bad that my PCIe 2.0 system is not being replaced until PCIe 4.0 comes out in 2-3 years. CPU improvements are so bad I was able to skip an entire PCIe generation with a PC that'll be 10 years old by then and still play games smoothly.

    And yes, I actually prefer the 1080p resolution, because I prefer vertical-alignment (VA) panels with their native 3000:1 contrast ratio and none of the corner glow on a black screen that IPS has with its middling 1000:1 contrast ratio. I like the 144Hz FreeSync Samsung curved monitor with 1080p resolution and quantum-dot tech, which allows more colors to be shown and less banding while keeping the 3000:1 static contrast ratio, faster pixel and overall response times with its 144Hz panel, and no tearing with FreeSync. The deeper blacks make everything else look brighter and crisper, and I think this is a bigger benefit than a 1440p IPS monitor.

    The only things I miss on my PC are native SATA III, native USB 3.0, M.2 slots, SATA Express, NVMe storage, Type-C USB ports that can deliver 100 watts of power and 10Gbps of speed, USB 3.1 Gen 2 in general, Thunderbolt, DDR4, a DMI 3.0-connected chipset, and maybe the 5 and 10Gbps Ethernet that is starting to come on higher-end Z270 boards. It's about time 10Gbps Ethernet started coming standard in the mainstream, with the Ethernet port auto-detecting 10/100/1000/10000 as needed. It's a shame when you can get 3x3 or 4x4 5GHz wireless-ac transmitters and receivers that pump out more bandwidth than 1Gbps Ethernet. They need to quit holding back 10Gbps wired networks, especially now with fast SSDs capable of using that 10Gbps local speed to transfer files on your local network.

    But I can't bring myself to buy a PC when PCIe 4.0 is around the corner. It's a milestone in computing, as it will also bring a DMI 4.0 upgrade for the chipset, for even faster reads and writes to peripherals not connected directly to the CPU. And it will allow all these M.2 devices that aren't saturating PCIe 3.0 x4 to be used on 4.0 x2 lanes instead of four, packing even more expansion onto the board, to the point that all our storage is connected to the motherboard and literally the only thing away from the mobo will be the PSU, with everything else fitting right on the board for some efficient builds.
    Reply
  • Breit - Friday, February 10, 2017 - link

    You used liquid metal TIM with an aluminium heat sink on the GPU? Good luck with that... oO Reply
  • Gothmoth - Friday, February 10, 2017 - link

    get a job... really, who do you think cares about your outdated rig?
    you have way too much time to write such a sermon.
    Reply
  • Achaios - Friday, February 10, 2017 - link

    Omg Gothmog, I rofl'd. Reply
