The Silvermont Module and Caches

Like AMD’s Bobcat and Jaguar designs, Silvermont is modular. The default Silvermont building block is a two-core/two-thread design. Each core is equally capable and there’s no shared execution hardware. Silvermont supports up to 8-core configurations by placing multiple modules in an SoC.

 

Each module features a shared 1MB L2 cache, a 2x increase over the core:cache ratio of existing Atom based processors. Despite the larger L2, access latency is reduced by 2 clocks. The default module size gives you a clear indication of where Intel saw Silvermont being most useful. At the time of its inception, I doubt Intel anticipated such a quick shift to quad-core smartphones; otherwise it might’ve considered a larger default module size.

L1 cache sizes/latencies haven’t changed. Each Silvermont core features a 32KB L1 instruction cache and 24KB L1 data cache.

Silvermont Supports Independent Core Frequencies: Vindication for Qualcomm?

In all Intel Core based microprocessors, all cores are tied to the same frequency - those that aren’t in use are simply shut off (power gated) to save power. Qualcomm’s multi-core architecture has always supported independent frequency planes for all CPUs in the SoC, something that Intel has always insisted was a bad idea. In a strange turn of events, Intel joins Qualcomm in offering the ability to run each core in a Silvermont module at its own independent frequency. You could have one Silvermont core running at 2.4GHz and another one running at 1.2GHz. Unlike Qualcomm’s implementation, Silvermont’s independent frequency planes are optional. In a split frequency case, the shared L2 cache always runs at the higher of the two frequencies. Intel believes the flexibility might be useful in some low cost Silvermont implementations where the OS actively uses core pinning to keep threads parked on specific cores. I doubt we’ll see this on most tablet or smartphone implementations of the design.
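The split frequency rule is simple enough to sketch in a few lines. The Python below is only an illustration of the behavior described above, not Intel's power management logic; the class and field names are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class SilvermontModule:
    """Hypothetical model of one two-core module (names are illustrative)."""
    core0_mhz: int
    core1_mhz: int

    @property
    def l2_mhz(self) -> int:
        # In a split frequency case the shared L2 runs at the higher
        # of the two core frequencies.
        return max(self.core0_mhz, self.core1_mhz)


# The 2.4GHz / 1.2GHz example from the text.
module = SilvermontModule(core0_mhz=2400, core1_mhz=1200)
print(module.l2_mhz)
```

With the example frequencies from the text, the shared L2 ends up clocked with the 2.4GHz core.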

From FSB to IDI

Atom and all of its derivatives have a nasty secret: they never really got any latency benefits from integrating a memory controller on die. The first implementation of Atom was a 3-chip solution, with the memory controller contained within the North Bridge. The CPU talked to the North Bridge via a low power Front Side Bus implementation. This setup should sound familiar to anyone who remembers Intel architectures from the late 90s up to the mid 2000s. In pursuit of integration, Intel eventually brought the memory controller and graphics onto the same die as the CPU. Historically, bringing the memory controller onto the same die as the CPU came with a nice reduction in access latency - unfortunately Atom never enjoyed this. The reason? Atom never ditched the FSB interface.

Even though Atom integrated a memory controller, the design logically looked like it did before. Integration only saved Intel space and power; it never delivered any performance benefit. I suspect Intel did this to keep costs down. I noticed the problem years ago but completely forgot about it since it’s been so long. Thankfully, with Silvermont the FSB interface is completely gone.

Silvermont instead integrates the same in-die interconnect (IDI) that is used in the big Core based processors. Intel’s IDI is a lightweight point-to-point interface with far lower overhead than the old FSB architecture. The move to IDI and the changes to the system fabric are enough to improve single threaded performance by a low double-digit percentage. The gains are even bigger in heavily threaded scenarios.

Another benefit of moving away from a very old FSB to IDI is increased flexibility in how Silvermont can clock up/down. Previously there were fixed FSB:CPU ratios that had to be maintained at all times, which meant the FSB had to be lowered significantly when the CPU was running at very low frequencies. In Silvermont, the IDI and CPU frequencies are largely decoupled - enabling good bandwidth out of the cores even at low frequency levels.
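To make the difference concrete, here is a rough Python sketch. The ratio and clock values are hypothetical numbers chosen for easy arithmetic, not figures from Intel, and the IDI case is treated as fully independent for simplicity even though the real design is only "largely" decoupled.

```python
def fsb_mhz_fixed_ratio(cpu_mhz: float, cpu_to_fsb_ratio: float) -> float:
    """Bus clock when it is locked to the CPU clock by a fixed ratio."""
    return cpu_mhz / cpu_to_fsb_ratio


def idi_mhz_decoupled(cpu_mhz: float, fabric_mhz: float) -> float:
    """Interconnect clock when it does not follow the CPU clock.

    cpu_mhz is deliberately unused: that is the point of decoupling.
    """
    return fabric_mhz


# Hypothetical values: a 4:1 CPU:FSB ratio and a 400MHz fabric clock.
for cpu_mhz in (1600, 600):
    old_bus = fsb_mhz_fixed_ratio(cpu_mhz, cpu_to_fsb_ratio=4)
    new_bus = idi_mhz_decoupled(cpu_mhz, fabric_mhz=400)
    print(f"CPU {cpu_mhz}MHz: fixed-ratio bus {old_bus}MHz, decoupled bus {new_bus}MHz")
```

Under the fixed ratio the bus slows down along with the CPU, while the decoupled interconnect keeps its clock (and therefore its bandwidth) when the core drops to a low frequency.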

The System Agent

Silvermont gains an updated system agent (read: North Bridge) that’s much better at allowing access to main memory. In all previous generation Atom architectures, virtually all memory accesses had to happen in-order (Clover Trail had some minor OoO improvements here). Silvermont’s system agent now allows reordering of memory requests coming in from all consumers/producers (e.g. CPU cores and the GPU) to optimize for performance and quality of service (e.g. ensuring graphics demands on memory can regularly pre-empt CPU requests when necessary).
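A toy model helps show what in-order versus reordered servicing means. The Python sketch below is an assumption-laden illustration, not Intel's arbitration policy: the `Request` type, the numeric priority classes and both scheduling functions are invented for the example, and a real memory scheduler would also weigh things like DRAM page locality.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Request:
    priority: int   # hypothetical QoS class, lower value is served first
    arrival: int    # tie-breaker, keeps arrival order within a class
    source: str = field(compare=False)


def serve_in_order(requests):
    """Older-Atom style: requests are serviced strictly as they arrive."""
    return sorted(requests, key=lambda r: r.arrival)


def serve_reordered(requests):
    """Reordering agent: urgent requests may jump ahead of earlier ones."""
    heap = list(requests)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


pending = [
    Request(priority=1, arrival=0, source="cpu0"),
    Request(priority=1, arrival=1, source="cpu1"),
    Request(priority=0, arrival=2, source="gpu"),  # latency-sensitive graphics fetch
]

print([r.source for r in serve_in_order(pending)])
print([r.source for r in serve_reordered(pending)])
```

In the reordered case the GPU request is serviced first even though it arrived last, which is the pre-emption behavior described above.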


174 Comments


  • althaz - Monday, May 6, 2013

    I don't think you fully grasp the situation. Whilst Intel definitely can (and realistically should) take a strong leadership position in the mobile sector, companies like Qualcomm aren't going anywhere - Intel still won't (be able to?) compete on price, which means even if they take the lions-share of the market, there will be enough left for others to survive (they'll be a lot better off than AMD who sells more-expensive-to-manufacture chips for cheaper that perform worse and use more power).

    Although I wouldn't be too confident about nVidia, as they are yet to show they can compete with the likes of Qualcomm, let alone Intel.
  • R0H1T - Tuesday, May 7, 2013

    They most certainly will not "take the lions-share of the market" because that belongs to the ultra thin margin chipmakers like Mediatek/Allwinner that deliver quad core ARM v7 based SoC in that 10~20$ range where Intel will not & cannot compete because of their relatively high(er) cost structure !
  • Khato - Tuesday, May 7, 2013

    This is an argument that never makes sense to me. Yes, Intel won't go into a market unless the margins make it worthwhile... but do you not realize how cheap it is for Intel to make value processors on a deprecated node? Remember, Allwinner and Mediatek may operate on ultra thin margins, but that's in large part because the majority of the margins on their product go to the foundry they use. aka, when all the high end products are using Airmont cores Intel can keep making use of their 22nm capacity for awhile churning out 'old' Silvermont based products for the value market and simply get closer to the 'operating point' margin for that node.
  • R0H1T - Wednesday, May 8, 2013

    I can't say how much TSMC charges for those chips but from what I know the single biggest cost of operations for Intel, outside of their R&D spending & foundry equipment upgrades, must be manpower & the difference between a Chinese/Taiwanese firm vs Intel in this particular dept would be a major one ! This is the real cost advantage that most smaller firms enjoy vis-a-vis Intel & for the foreseeable future they'll continue with this advantage.
  • xTRICKYxx - Tuesday, May 7, 2013

    I didn't feel like this article is Intel PR crap. I read it all and I looked at all the improvements that are inbound; and I couldn't help but feel excited about Silvermont just like Anand.

    I cannot wait to see some benchmarks in the next few months.
  • Silma - Tuesday, May 7, 2013

    On lack of AMD's comparison: there is nothing to compare and while one should tread cautiously with Intel's slides one should not tread at all with AMD's slides because AMD has a huge legacy of promises not held - how many time did we hear it would catch up in notebook or desktops, in performance or performance/watt. While Intel disappoints from time to time (Pentium 4) AMD disappoints most of the time, its last interesting product was the Opteron. Like most companies without vision it ends up doing stupid mergers instead of concentrating on core business.

    On Intel vs ARM. Silvermont looks promising but Intel needs to accelerate its roadmap. At the end of the year it probably won't compete against a 28nm A15. Qualcomm will not sleep for a year. Also it will have to invest heavily into marketing and OEM incentives if it seriously wants a share of the mobile pile. Will shareholders
  • ET - Tuesday, May 7, 2013

    I'm excited. A 7-8" full Windows tablet with decent performance would be very neat. I'll wait to see what performance this gets in games. I don't need much, just enough to run adventure games and such.
  • R0H1T - Tuesday, May 7, 2013

    Then get ready to shell out upwards of 500$ /:
  • pensive69 - Tuesday, May 7, 2013

    can't stand getting a partially functioning market focused 'hack' on a cellphone.
    if the 22nm drill provides a full computer in a smaller form then factor me in!
    i don't care which firm does it...like those kids in the commercial
    we just want more we want more :).
    love it.
  • Laststop311 - Tuesday, May 7, 2013

    this chip will have to pull off a miracle to drive full windows 8 and the everyday apps people use. Seems like it's going to average maybe slightly over 2x performance. That seems like a lot but when you see how poor current atoms are double that performance still is not enough. Does have potential in android phones/tablets and windows 8 phones/tablets as long as it's windows rt on the tablet. Atom still is not good enough for full windows 8
