The CPU

Medfield is the platform, Penwell is the SoC and the CPU inside Penwell is codenamed Saltwell. It's honestly not much different than the Bonnell core used in the original Atom, although it does have some tweaks for both power and performance.

Almost five years ago I wrote a piece on the architecture of Intel's Atom. Luckily (for me, not Intel), Atom's architecture hasn't really changed over the years so you can still look back at that article and have a good idea of what is at the core of Medfield/Penwell. Atom is still a dual-issue, in-order architecture with Hyper Threading support. The integer pipeline is sixteen stages long, significantly deeper than the Cortex A9's. The longer pipeline was introduced to help reduce Atom's power consumption by lengthening some of the decode stages and increasing cache latency to avoid burning through the core's power budget. Atom's architects, similar to those who worked on Nehalem, had the same 2:1 mandate: every new feature added to the processor's design had to deliver at least a 2% increase in performance for every 1% increase in power consumption.
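
As a rough illustration, the 2:1 mandate boils down to a one-line comparison. The sketch below is not Intel's actual evaluation process; the function name and the sample figures are hypothetical and only show how the rule filters candidate features.

```python
def passes_two_to_one(perf_gain_pct: float, power_cost_pct: float) -> bool:
    """Check a proposed feature against the 2:1 mandate.

    A feature qualifies if it delivers at least a 2% performance
    increase for every 1% increase in power consumption.
    Both arguments are percentages (hypothetical inputs).
    """
    if power_cost_pct <= 0:
        # Assumption: a feature that costs no extra power passes
        # as long as it does not reduce performance.
        return perf_gain_pct >= 0
    return perf_gain_pct >= 2.0 * power_cost_pct


# Made-up example features, purely for illustration.
print(passes_two_to_one(6.0, 2.5))  # 6% faster for 2.5% more power
print(passes_two_to_one(3.0, 2.0))  # 3% faster for 2% more power
```

Under this rule the first hypothetical feature would be accepted and the second rejected, since 3% of performance doesn't cover twice its 2% power cost.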

Atom is a very narrow core as the diagram below will show:

 

There are no dedicated integer multiply or divide units; that's all shared with the FP hardware. Intel duplicated some resources (e.g. register files, queues) to enable Hyper Threading support, but stopped short of increasing execution hardware to drive up efficiency. The tradeoff seems to have worked because Intel is able to deliver performance better than a dual-core Cortex A9 from a single HT-enabled core. Intel also lucks out because while Android is very well threaded, not all tasks will continually peg both cores in a dual-core A9 machine. At higher clock speeds (1.5GHz+) and with heavy multi-threaded workloads, it's possible that a dual-core Cortex A9 could outperform (or at least equal) Medfield but I don't believe that's a realistic scenario.

Architecturally the Cortex A9 doesn't look very different from Atom:

 

Here we see a dedicated integer multiply unit (shared with one of the ALU ports) but only a single port for FP/NEON. It's clear that the difference between Atom and the Cortex A9 isn't as obvious at the high level. Instead it's the lower level architectural decisions that give Intel a performance advantage.

Where Intel is in trouble is against the Cortex A15:

 

The A15 is a far more modern design, also out of order but much wider than A9. I fully expect that something A15-class can outperform Medfield, especially if the former is in a dual-core configuration. Krait falls under the A15-class umbrella so I believe Medfield has the potential to lose its CPU performance advantage within a couple of quarters.

Enhancements in Saltwell

Although the CPU core is mated to a 512KB L2 cache, there's a separate 256KB low power SRAM that runs on its own voltage plane. This ULP SRAM holds CPU state and data from the L2 cache when the CPU is power gated in the deepest sleep state. The reasoning for the separate voltage plane is simple. Intel's architects found that the minimum voltage for the core was limited by Vmin for the ULP SRAM. By putting the two on separate voltage planes it allowed Intel to bring the CPU core down to a lower minimum power state as Vmin for the L2 is higher than it is for the CPU core itself. The downside to multiple power islands is an increase in die area. Since Medfield is built on Intel's 32nm LP process while the company transitions to 22nm, spending a little more on die area to build more power efficient SoCs isn't such a big deal. Furthermore, Intel is used to building much larger chips, making Medfield's size a relative nonissue for the company.

The die size is actually very telling as it's a larger SoC than a Tegra 2 with two Cortex A9s despite only featuring a single core. Granted the rest of the blocks around the core are different, but it goes to show you that the CPU core itself (or number of cores) isn't the only determinant of the die size of an SoC.

The performance tweaks come from the usual learnings that take place over the course of any architecture's lifespan. Some instruction scheduling restrictions have been lifted, memory copy performance is up, branch predictor size increased and some microcode flows run faster on Saltwell now.

Clock Speeds & Turbo

Medfield's CPU core supports several different operating frequencies and power modes. At the lowest level is its C6 state. Here the core and L2 cache are both power gated, with their state saved off in a lower power on-die SRAM. Total power consumption of the processor island in C6 is effectively zero. This isn't anything new; Intel has implemented similar technologies in desktops since 2008 (Nehalem) and notebooks since 2010 (Arrandale).

When the CPU is actually awake and doing something, however, it has a range of available frequencies: 100MHz all the way up to 1.6GHz in 100MHz increments.

The 1.6GHz state is a burst state and shouldn't be sustained for long periods of time, similar to how Turbo Boost works on Sandy Bridge desktop/notebook CPUs. The default maximum clock speed is 1.3GHz, although just as is the case with Turbo enabled desktop chips, you can expect to see frequencies greater than 1.3GHz on a fairly regular basis.
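
To make the frequency ladder concrete, here's a minimal sketch that enumerates the operating points described above. The constant and function names are made up for illustration, and treating every step above the 1.3GHz default maximum as a burst state is an assumption; the only state explicitly described as burst is 1.6GHz.

```python
MIN_MHZ = 100           # lowest active frequency
STEP_MHZ = 100          # size of each frequency increment
DEFAULT_MAX_MHZ = 1300  # default (non-burst) maximum clock
BURST_MAX_MHZ = 1600    # top burst frequency


def frequency_states():
    """Return (frequency_mhz, is_burst) pairs for every operating point.

    Assumption: any step above the default maximum counts as burst.
    """
    return [
        (mhz, mhz > DEFAULT_MAX_MHZ)
        for mhz in range(MIN_MHZ, BURST_MAX_MHZ + STEP_MHZ, STEP_MHZ)
    ]


states = frequency_states()
burst = [mhz for mhz, is_burst in states if is_burst]
print(len(states), "operating points; burst steps (MHz):", burst)
```

With 100MHz steps from 100MHz to 1.6GHz that works out to 16 operating points, three of which sit above the default maximum.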

Power consumption along the curve is all very reasonable:

Medfield CPU Frequency vs. Power
CPU Frequency            100MHz   600MHz   1.3GHz   1.6GHz
SoC Power Consumption    ~50mW    ~175mW   ~500mW   ~750mW

Since most ARM based SoCs draw somewhere below 1W under full load, these numbers seem to put Medfield in line with its ARM competitors - at least on the CPU side.
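
A quick way to read the frequency vs. power figures above is to normalize them per MHz. The sketch below does that arithmetic using only the four approximate data points quoted; the function names are hypothetical, and nothing here models voltage or workload, so treat it as back-of-the-envelope math rather than a power model.

```python
# Approximate SoC power (mW) at each quoted CPU frequency (MHz).
POWER_MW = {100: 50, 600: 175, 1300: 500, 1600: 750}


def mw_per_mhz(freq_mhz):
    """Power cost per MHz at one of the quoted operating points."""
    return POWER_MW[freq_mhz] / freq_mhz


def burst_cost(base_mhz=1300, burst_mhz=1600):
    """Percent frequency gain and percent power increase going from base to burst."""
    freq_gain = (burst_mhz - base_mhz) / base_mhz * 100
    power_gain = (POWER_MW[burst_mhz] - POWER_MW[base_mhz]) / POWER_MW[base_mhz] * 100
    return freq_gain, power_gain


for freq in sorted(POWER_MW):
    print(f"{freq:>5} MHz: {mw_per_mhz(freq):.2f} mW/MHz")

freq_gain, power_gain = burst_cost()
print(f"1.3GHz -> 1.6GHz: +{freq_gain:.0f}% clock for +{power_gain:.0f}% power")
```

By these numbers the per-MHz cost is lowest at 600MHz, and the jump from 1.3GHz to 1.6GHz buys roughly 23% more clock for about 50% more power, which fits with 1.6GHz being a short-lived burst state rather than a sustained one.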

It's important to pay attention to the fact that we're dealing with similar clock frequencies to what other Cortex A9 vendors are currently shipping. Any performance advantages will either be due to Medfield boosting up to 1.6GHz for short periods of time, inherently higher IPC and/or a superior cache/memory interface.

164 Comments

  • Dribble - Thursday, January 12, 2012

    I see fudzilla managed to get a BenchmarkPi score:
    The HTC Thunderbolt (Snapdragon 1GHz): 888ms
    Lenovo K800 (1.6Ghz Atom): 743ms
    LG Optimus 2X (Tegra 2): 550ms
  • french toast - Thursday, January 12, 2012

    Yeah, when you get past the Intel marketing and start digging you find it's not really that special when compared to last year's designs. Here's some more: Intel Medfield, 3791 in Quadrant; Samsung Galaxy Note @ 1.4GHz, 4300+.

    http://www.youtube.com/watch?v=k2SzV_bl76k

    If you level the clock speed and use the same software on the ARMs you would get better than this in CaffeineMark:

    http://androidandme.com/2012/01/news/intel-medfiel...

    Add that to the other links I posted earlier, and do some multithreaded tests, and the Atom doesn't look that impressive compared to dual-core A9s on 40nm...let alone quad-core Kraits on 28nm...
  • dwade123 - Thursday, January 12, 2012

    Give it a few years and we'll see Intel dominating this market.
  • Targon - Thursday, January 12, 2012

    This is a single-core chip....in an environment that is already going to be dominated by dual-core chips by the time it is released. What is Intel trying to do, emulate Palm, who would announce something that sounds great, then a year later when product is actually shipping, seems pretty weak? Palm died as a result(even though it was under the HP umbrella at the end), and Intel is just following that example of what NOT to do.

    Intel may have process advantages, but Intel doesn't do much when it comes to real innovation.
  • happycamperjack - Thursday, January 12, 2012

    Judging from BrowserMark and SunSpider, Medfield has Tegra 3 beat by about 10% to 30% in more single-threaded applications. But in more threaded applications such as photo editing apps, some games and also multitasking, Tegra 3 would come out on top. Not to mention Tegra 3 would probably do a lot better in battery life and 3D games as well.

    But backward compatibility for lower end Windows 8 tablets? Yes please!
  • Lucian Armasu - Friday, January 13, 2012

    A 10% performance difference shouldn't be surprising, considering Intel's Atom is running at 1.6GHz and Tegra 3's first core is running at 1.4GHz. This only means that a Cortex A9 core is about as powerful as Atom at the same clock speed. And by the time it's out it will have to compete with Cortex A15, which is twice as powerful as Cortex A9 at the same clock speed. Plus it will be dual core vs. the single-core Atom. Krait chips should be in the same ballpark as Cortex A15, perhaps a bit weaker, but still much more powerful than Atom.

    As for the compatibility with Windows 8. I don't understand what's the benefit of that? To use programs that are not optimized for touch? Why? If that was such a big deal, you could already use Windows 7 tablets. Whether Microsoft is pushing for ARM tablets, or x86 tablets, they still have to start from scratch, because they need apps that are fully optimized for touch, and not for the mouse. So in this case x86 has no advantage over ARM, at least not more than it already had in the Windows7-era. And if Microsoft were smart, they'd actually push the ARM tablets instead to compete on battery life.
  • happycamperjack - Friday, January 13, 2012

    You don't understand the benefit of backward compatibility?? Are you serious?? How about instant access to the biggest library of applications ever while Windows 8 apps have time to mature.

    As for the performance of the chip, I was disappointed with Intel's SoC until I realized that it's actually running Android 2.3. So it would be fairer to compare its performance against another Android 2.3 device, the Galaxy S II, which benchmarked at half the speed of Intel! But its GPU is definitely garbage.
  • thunng8 - Friday, January 13, 2012

    The Motorola RAZR is also running 2.3.
  • french toast - Friday, January 13, 2012

    What has been misleading about the Intel-pushed benchmarks in this article is that although the Medfield runs Gingerbread, it runs a heavily updated variant, 2.3.7, which according to the boys over at XDA has been optimised to near-ICS levels.
    Note that the phones benchmarked against it run stock Gingerbread, which can be noticeably slower on older versions.

    Another thing to note: the phones benchmarked against it also have heavy custom UI skins over the top (aka Sense/TouchWiz), which sap power; that's why users prefer to root their phones, for that very performance-enhancing reason.
    Whereas the Medfield reference phone does not.

    If you level all the software, I very much doubt the Medfield would have a lead in any benchmark, and in some cases it would likely lose, such as graphics, multithreaded, and battery use scenarios that stress the CPU.

    That is against phones that will have been on the market 18 months or so by the time Medfield ships AND are lower-clocked A9s.
  • CUEngineer - Friday, January 13, 2012

    You guys are hilarious... Obviously there will be an optimized OS version that Google and Intel worked on; since it's using a different ISA than ARM, they need to optimize the binaries to do things such as take advantage of instructions Intel adds for performance, which no ARM IP licensee company is allowed to do... Any good company will optimize software to run on their hardware to give better results, and that is valid...
    Intel has been doing high-performance designs for many years now. ARM just designs their IP to work simply and without consuming much power, so it wouldn't be hard to think that Intel analyzes certain performance features differently, such as handling hits under misses and taking multiple miss requests without bottlenecking the system... An out-of-order CPU could make this impact less since other instructions might be able to be scheduled while waiting for the miss to be completed...
    Either way, all you folks should worry about is how close those power numbers are, because once Intel gets in this space it is going to dominate, and will have attractive offerings since everyone else is basically using the same IP from ARM with different wrappers...
