Intel’s Silvermont Architecture Revealed: Getting Serious About Mobile

Author: Anand Lal Shimpi

by Anand Lal Shimpi on May 6, 2013 1:00 PM EST

Posted in: CPUs, Intel, Silvermont, SoCs


The Silvermont Module and Caches

Like AMD’s Bobcat and Jaguar designs, Silvermont is modular. The default Silvermont building block is a two-core/two-thread design. Each core is equally capable and there’s no shared execution hardware. Silvermont supports up to 8-core configurations by placing multiple modules in an SoC.

Each module features a shared 1MB L2 cache, a 2x increase over the core:cache ratio of existing Atom based processors. Despite the larger L2, access latency is reduced by 2 clocks. The default module size gives a clear indication of where Intel saw Silvermont being most useful. At the time of its inception, I doubt Intel anticipated such a quick shift to quad-core smartphones; otherwise it might’ve considered a larger default module size.

L1 cache sizes/latencies haven’t changed. Each Silvermont core features a 24KB L1 data cache and 32KB L1 instruction cache.

Silvermont Supports Independent Core Frequencies: Vindication for Qualcomm?

In all Intel Core-based microprocessors, all cores are tied to the same frequency - those that aren’t in use are simply shut off (power gated) to save power. Qualcomm’s multi-core architecture has always supported independent frequency planes for all CPUs in the SoC, something that Intel has always insisted was a bad idea. In a strange turn of events, Intel joins Qualcomm in offering the ability to run each core in a Silvermont module at its own independent frequency. You could have one Silvermont core running at 2.4GHz and another one running at 1.2GHz. Unlike Qualcomm’s implementation, Silvermont’s independent frequency planes are optional. In a split-frequency case, the shared L2 cache always runs at the higher of the two frequencies. Intel believes the flexibility might be useful in some low-cost Silvermont implementations where the OS actively uses core pinning to keep threads parked on specific cores. I doubt we’ll see this on most tablet or smartphone implementations of the design.
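As a rough illustration of the split-frequency rule, here is a minimal Python sketch of one two-core module. The class and field names are hypothetical and not anything Intel has published; the only behavior taken from the description above is that the shared L2 runs at the higher of the two core frequencies.

```python
from dataclasses import dataclass


@dataclass
class SilvermontModule:
    """Toy model of one two-core module (names are illustrative only)."""

    core0_mhz: int
    core1_mhz: int

    def l2_mhz(self) -> int:
        # In a split-frequency case the shared L2 cache runs at the
        # higher of the two core frequencies.
        return max(self.core0_mhz, self.core1_mhz)


# The 2.4GHz / 1.2GHz split used as an example in the text.
module = SilvermontModule(core0_mhz=2400, core1_mhz=1200)
print(module.l2_mhz())  # 2400
```

When both cores run at the same frequency the rule collapses to the ordinary shared-clock case, which is consistent with the independent planes being optional.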

From FSB to IDI

Atom and all of its derivatives have a nasty secret: they never really got any latency benefits from integrating a memory controller on die. The first implementation of Atom was a 3-chip solution, with the memory controller contained within the North Bridge. The CPU talked to the North Bridge via a low power Front Side Bus implementation. This setup should sound familiar to anyone who remembers Intel architectures from the late 90s up to the mid 2000s. In pursuit of integration, Intel eventually brought the memory controller and graphics onto the same die as the CPU. Historically, bringing the memory controller onto the same die as the CPU came with a nice reduction in access latency - unfortunately Atom never enjoyed this. The reason? Atom never ditched the FSB interface.

Even though Atom integrated a memory controller, the design logically looked like it did before. Integration only saved Intel space and power; it never delivered any performance gain. I suspect Intel did this to keep costs down. I noticed the problem years ago but completely forgot about it since it’s been so long. Thankfully, with Silvermont the FSB interface is completely gone.

Silvermont instead integrates the same in-die interconnect (IDI) that is used in the big Core based processors. Intel’s IDI is a lightweight point-to-point interface with far lower overhead than the old FSB architecture. The move to IDI and the changes to the system fabric are enough to improve single threaded performance by a low double-digit percentage. The gains are even bigger in heavily threaded scenarios.

Another benefit of moving away from a very old FSB to IDI is increased flexibility in how Silvermont can clock up/down. Previously there were fixed FSB:CPU ratios that had to be maintained at all times, which meant the FSB had to be lowered significantly when the CPU was running at very low frequencies. In Silvermont, the IDI and CPU frequencies are largely decoupled - enabling good bandwidth out of the cores even at low frequency levels.
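To make the difference concrete, here is a small Python sketch contrasting the two clocking schemes. The ratio and frequencies are made-up numbers chosen only for illustration, the function names are hypothetical, and treating the IDI clock as fully independent is a simplification of "largely decoupled".

```python
def fsb_mhz_fixed_ratio(cpu_mhz: float, cpu_to_fsb_ratio: float) -> float:
    """Old scheme: the bus clock is locked to the CPU clock by a fixed ratio."""
    return cpu_mhz / cpu_to_fsb_ratio


def idi_mhz_decoupled(cpu_mhz: float, idi_mhz: float) -> float:
    """Simplified new scheme: the interconnect clock is set independently,
    so it does not have to fall when the core clock falls."""
    return idi_mhz


RATIO = 4.0     # hypothetical CPU:FSB ratio
IDI_MHZ = 500.0  # hypothetical interconnect clock

for cpu_mhz in (2000.0, 500.0):
    fsb = fsb_mhz_fixed_ratio(cpu_mhz, RATIO)
    idi = idi_mhz_decoupled(cpu_mhz, IDI_MHZ)
    print(f"CPU {cpu_mhz:.0f} MHz -> FSB {fsb:.0f} MHz, IDI {idi:.0f} MHz")
```

With these assumed numbers, dropping the core to a quarter of its clock also cuts the fixed-ratio bus to a quarter, while the decoupled interconnect stays where it was.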

The System Agent

Silvermont gains an updated system agent (read: North Bridge) that’s much better at allowing access to main memory. In all previous-generation Atom architectures, virtually all memory accesses had to happen in-order (Clover Trail had some minor OoO improvements here). Silvermont’s system agent now allows reordering of memory requests coming in from all consumers/producers (e.g. CPU cores, GPU, etc.) to optimize for performance and quality of service (e.g. ensuring graphics demands on memory can regularly pre-empt CPU requests when necessary).
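The idea of reordering requests for quality of service can be sketched with a toy priority queue in Python. This is not Intel's arbitration algorithm, which the source does not describe; the class names, the two-level priority table and the "GPU always outranks CPU" policy are assumptions made purely to show an out-of-order service order.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count
from typing import List


@dataclass(order=True)
class MemRequest:
    priority: int                       # lower value is served first
    seq: int                            # arrival order, breaks ties FIFO
    source: str = field(compare=False)
    address: int = field(compare=False)


class ReorderingAgent:
    """Toy arbiter: serves requests by priority, then by arrival order."""

    # Hypothetical QoS policy: graphics traffic outranks CPU traffic.
    PRIORITY = {"gpu": 0, "cpu": 1}

    def __init__(self) -> None:
        self._queue: List[MemRequest] = []
        self._seq = count()

    def submit(self, source: str, address: int) -> None:
        req = MemRequest(self.PRIORITY[source], next(self._seq), source, address)
        heapq.heappush(self._queue, req)

    def drain(self) -> List[str]:
        """Return the sources in the order their requests are served."""
        order = []
        while self._queue:
            order.append(heapq.heappop(self._queue).source)
        return order


agent = ReorderingAgent()
for source, address in [("cpu", 0x1000), ("cpu", 0x2000), ("gpu", 0x8000)]:
    agent.submit(source, address)
print(agent.drain())  # ['gpu', 'cpu', 'cpu']
```

An in-order agent would have served the two CPU requests first; here the later GPU request jumps the queue, which is the kind of pre-emption the paragraph above describes.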


174 Comments


PolarisOrbit - Monday, May 6, 2013 - link
Re: FSB
Intel tried to get rid of the FSB several years ago, but it was seen as anti-competitive because they simultaneously locked out 3rd parties like Nvidia Ion. One lawsuit later, Intel was bound to keep the FSB in their low power architectures until 2013 for 3rd party support. Basically Intel wasn't playing fair and Nvidia burned their ship.
DanNeely - Tuesday, May 7, 2013 - link
There was no usable FSB in anything beyond the first series of Atom chips. The rest still had it within the die to connect the CPU with the internal northbridge, but the only external interface it offered was 4 PCIe2(?) lanes. ION2 connected to them, not to the FSB.
Kevin G - Tuesday, May 7, 2013 - link
Actually, Intel is required by that antitrust suit to keep PCI-e on their chips until 2016. This allows 3rd party IP, like nVidia's ION, to work with Intel's SoC designs.
tipoo - Monday, May 6, 2013 - link
This makes me wonder if companies that make in-house SoCs (I guess Apple in specific, since Samsung also sells them to others while Apple just does it for themselves) will ever switch mobile devices to Intel if they just can't match the performance per watt of this and future Atom cores.
tipoo - Monday, May 6, 2013 - link
Also won't the much anticipated SGX 600 series/Rogue be out by around then? That's the GPU that's supposed to take these mobile SoCs to the 200Gflop territory which the 360/PS3 GPUs are around.
xTRICKYxx - Tuesday, May 7, 2013 - link
I would think Apple (or any company) would want all of their software running on the same architecture/platform if they could.
R0H1T - Tuesday, May 7, 2013 - link
And kill what a billion or so iDevices sold with incompatibility ? Me thinks you dunno what you're talking about !
CajunArson - Monday, May 6, 2013 - link
Did somebody pay you to post that reply? Because if so, they aren't getting their money's worth.

Silvermont Atoms are targeted at smartphones in 2-core configurations and tablets in the 4-core Baytrail configurations. Their power consumption is in a completely different league than even the low-end Temash parts. Let me reiterate: a Temash with a 4 watt TDP is going to have substantially higher real-world power consumption than even a beefy Baytrail and will likely only compete with the microserver Atom parts where Intel intentionally targets a higher power envelope.

I'm sure you can't wait to post benchmarks of a Kabini netbook with a higher power draw than Haswell managing to beat a smartphone Atom as proof that AMD has "won" something, but for those of us on planet earth, these Silvermont parts are very interesting and we appreciate hard technical information on the architecture.
nunomoreira10 - Tuesday, May 7, 2013 - link
Jaguar will be available in fanless designs while Haswell won't, so you can't really compare them.
The fact is Intel still doesn't have a good enough CPU for a good experience on a legacy Windows 8 fanless design; there is a big hole in the market that AMD is trying to fill.
raghu78 - Monday, May 6, 2013 - link
Intel Silvermont is the start of the Intelization of the mobile world. Within the next 2-3 years Intel should have bagged Apple, Google or Samsung. With the world's best manufacturing process, which is at least 2-3 years ahead of other foundries, and Intel's relentless tick-tock chip development cadence, the ARM crowd is going to be beaten to a pulp. Qualcomm might survive the Intel juggernaut but Nvidia will not.
