Sensible Scaling: OoO Atom Remains Dual-Issue

The architectural progression from Apple, ARM and Qualcomm has been towards wider, out-of-order cores, to varying degrees. With Swift and Krait, Apple and Qualcomm both went wider. From Cortex A8 to A9 ARM went OoO, and then from A9 to A15 ARM introduced a significantly wider architecture. Intel bucks the trend a bit by keeping the overall machine width unchanged with Silvermont. This is still a 2-wide architecture.

At the risk of oversimplifying the decision here, Intel had to weigh die area, power consumption and the risk of making Atom too good when it decided to keep Silvermont’s design width the same as Bonnell’s. A wider front end would require a wider execution engine, and Intel believed it didn’t need to go that far (yet) in order to deliver really good performance.

Keeping in mind that Intel’s Bonnell core is already faster than ARM’s Cortex A9 and Qualcomm’s Krait 200, if Intel could get significant gains out of Silvermont without going wider - why not? And that’s exactly what’s happened here.

If I had to describe Intel’s design philosophy with Silvermont it would be sensible scaling. We’ve seen this from Apple with Swift, and from Qualcomm with the Krait 200 to Krait 300 transition. Remember the design rule put in place back with the original Atom: for every 2% increase in performance, the Atom architects could at most increase power by 1%. In other words, performance can go up, but performance per watt cannot go down. Silvermont maintains that design philosophy, and I think I have some idea of how.
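As a rough illustration of the design rule quoted above, the arithmetic can be sketched in a few lines of Python. This is only a toy check, not anything Intel has published: the function names and the sample percentages are hypothetical, and the 2:1 ratio is simply the figure stated in the paragraph above.

```python
def passes_design_rule(perf_gain_pct, power_gain_pct, ratio=2.0):
    """Toy check of the rule as quoted in the article: a feature must deliver
    at least `ratio` percent more performance per 1 percent of added power.
    (Hypothetical helper; the 2:1 default comes from the text above.)"""
    if power_gain_pct <= 0:
        # A feature that costs no extra power passes as long as it
        # does not reduce performance.
        return perf_gain_pct >= 0
    return perf_gain_pct >= ratio * power_gain_pct


def perf_per_watt_change(perf_gain_pct, power_gain_pct):
    """Relative change in performance per watt, returned as a fraction."""
    return (1 + perf_gain_pct / 100) / (1 + power_gain_pct / 100) - 1


if __name__ == "__main__":
    # Made-up feature proposals: (performance gain %, power gain %).
    for perf, power in [(10, 5), (5, 5), (1, 2)]:
        print(perf, power,
              passes_design_rule(perf, power),
              round(perf_per_watt_change(perf, power), 4))
```

Any feature that clears the 2:1 bar necessarily improves performance per watt, which is why the rule can be summarized as "performance per watt cannot go down". The reverse is not true: a feature can improve performance per watt slightly and still fail the stricter 2:1 test.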

Previous versions of Atom used Hyper-Threading to get good utilization of execution resources. Hyper-Threading had a power penalty associated with it, but the performance uplift was enough to justify it. At 22nm, Intel had enough die area (thanks to transistor scaling) to just add more cores rather than rely on HT for better threaded performance, so Hyper-Threading was out. The power savings Intel got from getting rid of Hyper-Threading were then allocated to making Silvermont an out-of-order design, which in turn helped drive up efficient use of the execution resources without HT. It turns out that at 22nm the die area Intel would’ve spent on enabling HT was roughly the same as Silvermont’s re-order buffer and OoO logic, so there wasn’t even an area penalty for the move.

The Original Atom microarchitecture

Calling Silvermont a 2-wide architecture is a bit misleading, as the combination of the x86 ISA and the treatment of many x86 ops as single operations down the pipe makes Atom effectively wider than its block diagram would otherwise lead you to believe. Remember that with the first version of Atom, Intel enabled the treatment of load-op-store and load-op-execute instructions as single operations post decode. Instead of these instruction combinations decoding into multiple micro-ops, they are handled like single operations throughout the entire pipeline. This continues to be true in Silvermont, so the advantage remains (it also helps explain why Intel’s 2-wide architecture can deliver comparable IPC to ARM’s 3-wide Cortex A15).
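To make the width argument concrete, here is a minimal sketch that counts issue slots for the same instruction sequence under two decode policies. It is a toy model under stated assumptions: the instruction categories, the per-category micro-op counts in `SPLIT_UOPS`, and the sample sequence are all hypothetical illustrations, not measured Silvermont or Cortex A15 behavior.

```python
import math

# Hypothetical micro-op counts for a design that splits memory-operand
# instructions into separate load / op / store micro-ops.
SPLIT_UOPS = {"simple": 1, "load-op": 2, "load-op-store": 3}


def issue_slots(instrs, fused):
    """Issue slots consumed by the sequence.
    fused=True models the Atom-style policy described above, where each
    instruction travels down the pipeline as a single operation."""
    return sum(1 if fused else SPLIT_UOPS[kind] for kind in instrs)


def min_issue_cycles(instrs, width=2, fused=True):
    """Lower bound on issue cycles for a machine `width` slots wide,
    ignoring dependencies, cache misses and every other real-world stall."""
    return math.ceil(issue_slots(instrs, fused) / width)


if __name__ == "__main__":
    seq = ["load-op-store", "simple", "load-op", "simple"]
    print("fused slots:", issue_slots(seq, fused=True))
    print("split slots:", issue_slots(seq, fused=False))
    print("fused cycles, 2-wide:", min_issue_cycles(seq, 2, fused=True))
    print("split cycles, 2-wide:", min_issue_cycles(seq, 2, fused=False))
    print("split cycles, 3-wide:", min_issue_cycles(seq, 3, fused=False))
```

In this made-up sequence, the 2-wide machine that keeps memory-operand instructions whole needs fewer issue cycles than a 3-wide machine that splits them, which is the intuition behind the claim above. Real code mixes vary widely, so the size of the effect in practice is not something this sketch can tell you.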

Silvermont still only has two x86 decoders at the front end of the pipeline, but the decoders are more capable. Many x86 instructions decode directly into a single micro-op, but some more complex instructions require microcode assist and can’t go through the simple decode paths. With Silvermont, Intel beefed up the simple decoders so they can handle more (not all) of the instructions that previously required microcode.

Silvermont includes a loop stream buffer that can be used to clock gate fetch and decode logic in the event that the processor detects it’s executing the same instructions in a loop.
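The idea behind a loop buffer can be sketched with a small simulation. This is a simplified model, not Intel’s implementation: the article does not give the buffer’s size or its detection logic, so the `capacity` default and the "address was decoded recently and is still buffered" test are assumptions made purely for illustration.

```python
def gated_fraction(trace, capacity=32):
    """Fraction of instructions in `trace` (a list of instruction addresses)
    that could be replayed from a small buffer of recently decoded
    instructions, letting fetch and decode sit idle for those slots.
    `capacity` is a hypothetical size; the article does not state one."""
    buffered = []  # recently decoded instruction addresses, oldest first
    hits = 0
    for pc in trace:
        if pc in buffered:
            # Already decoded and still buffered: fetch/decode could be gated.
            hits += 1
        else:
            # Fetch and decode normally, then remember the instruction.
            buffered.append(pc)
            if len(buffered) > capacity:
                buffered.pop(0)  # evict the oldest entry
    return hits / len(trace) if trace else 0.0


if __name__ == "__main__":
    small_loop = [0, 1, 2, 3] * 10       # 4-instruction loop body, 10 iterations
    big_loop = list(range(8)) * 10       # 8-instruction loop body, 10 iterations
    print(gated_fraction(small_loop, capacity=4))
    print(gated_fraction(big_loop, capacity=4))
```

With these toy traces, the loop that fits in the buffer is served from it on every iteration after the first, while the loop that is larger than the buffer never hits at all. That matches the general expectation that this kind of structure mainly pays off on tight loops.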

Execution

Silvermont’s execution core looks similar to Bonnell’s before it, but the design now supports out-of-order execution. Silvermont’s execution units have been redesigned for lower latency. Some FP operations are now quicker, as are integer multiplies.

Loads can execute out of order. Don’t be fooled by the block diagram: Silvermont can issue one load and one store in parallel.


174 Comments


  • althaz - Monday, May 6, 2013 - link

    I don't think you fully grasp the situation. Whilst Intel definitely can (and realistically should) take a strong leadership position in the mobile sector, companies like Qualcomm aren't going anywhere - Intel still won't (be able to?) compete on price, which means even if they take the lions-share of the market, there will be enough left for others to survive (they'll be a lot better off than AMD who sells more-expensive-to-manufacture chips for cheaper that perform worse and use more power).

    Although I wouldn't be too confident about nVidia, as they are yet to show they can compete with the likes of Qualcomm, let alone Intel.
  • R0H1T - Tuesday, May 7, 2013 - link

    They most certainly will not "take the lions-share of the market" because that belongs to the ultra thin margin chipmakers like Mediatek/Allwinner that deliver quad core ARM v7 based SoC in that 10~20$ range where Intel will not & cannot compete because of their relatively high(er) cost structure !
  • Khato - Tuesday, May 7, 2013 - link

    This is an argument that never makes sense to me. Yes, Intel won't go into a market unless the margins make it worthwhile... but do you not realize how cheap it is for Intel to make value processors on a deprecated node? Remember, Allwinner and Mediatek may operate on ultra thin margins, but that's in large part because the majority of the margins on their product go to the foundry they use. aka, when all the high end products are using Airmont cores Intel can keep making use of their 22nm capacity for awhile churning out 'old' Silvermont based products for the value market and simply get closer to the 'operating point' margin for that node.
  • R0H1T - Wednesday, May 8, 2013 - link

    I can't say how much TSMC charges for those chips but from what I know the single biggest cost of operations for Intel, outside of their R&D spending & foundry equipment upgrades, must be manpower & the difference between a Chinese/Taiwanese firm vs Intel in this particular dept would be a major one ! This is the real cost advantage that most smaller firms enjoy vis-a-vis Intel & for the foreseeable future they'll continue with this advantage.
  • xTRICKYxx - Tuesday, May 7, 2013 - link

    I didn't feel like this article is Intel PR crap. I read it all and I looked at all the improvements that are inbound; and I couldn't help but feel excited about Silvermont just like Anand.

    I cannot wait to see some benchmarks in the next few months.
  • Silma - Tuesday, May 7, 2013 - link

    On lack of AMD's comparison: there is nothing to compare and while one should tread cautiously with Intel's slides one should not tread at all with AMD's slides because AMD has a huge legacy of promises not held - how many times did we hear it would catch up in notebook or desktops, in performance or performance/watt. While Intel disappoints from time to time (Pentium 4) AMD disappoints most of the time, its last interesting product was the Opteron. Like most companies without vision it ends up doing stupid mergers instead of concentrating on core business.

    On Intel vs ARM. Silvermont looks promising but Intel needs to accelerate its roadmap. At the end of the year it probably won't compete against a 28nm A15. Qualcomm will not sleep for a year. Also it will have to invest heavily into marketing and OEM incentives if it seriously wants a share of the mobile pile. Will shareholders
  • ET - Tuesday, May 7, 2013 - link

    I'm excited. A 7-8" full Windows tablet with decent performance would be very neat. I'll wait to see what performance this gets in games. I don't need much, just enough to run adventure games and such.
  • R0H1T - Tuesday, May 7, 2013 - link

    Then get ready to shell out upwards of 500$ /:
  • pensive69 - Tuesday, May 7, 2013 - link

    can't stand getting a partially functioning market focused 'hack' on a cellphone.
    if the 22nm drill provides a full computer in a smaller form then factor me in!
    i don't care which firm does it...like those kids in the commercial
    we just want more we want more :).
    love it.
  • Laststop311 - Tuesday, May 7, 2013 - link

    this chip will have to pull off a miracle to drive full windows 8 and the everyday apps people use. Seems like it's going to average maybe slightly over 2x performance. That seems like a lot but when you see how poor current atoms are double that performance still is not enough. Does have potential in android phones/tablets and windows 8 phones/tablets as long as it's windows rt on the tablet. Atom still is not good enough for full windows 8
