Fighting Power Consumption...with a Longer Pipeline?

Atom's pipeline is a fairly deep 16 stages, with a 13 stage mispredict penalty. Note that this is longer than the Core 2 Duo's 14 stage pipeline, which is surprising given the low power focus the design team had for Atom.


A 16-stage pipeline complete with 3 instruction fetch and 3 instruction decode stages, more than expected

Longer pipelines are generally associated with greater power consumption especially as of late due to the Pentium 4's tenure. Intel gave us three reasons for the long pipeline:

1) Caches
2) Decoder
3) SMT

When faced with a decision between trading off latency for power, the Austin design team always favored keeping power low, even if it meant increasing latencies. Atom doesn't fire the large banks of its caches unless the cache controller knows there's a true hit in the cache, unfortunately this increases the access latency of the cache. In order to keep clock speeds high, these cache accesses have to be further pipelined. The benefit is that power is kept low; Atom keeps things as physically tagged caches to avoid the power burden of a virtually tagged cache.

The same sort of latency tradeoff is made in the decoding stages. Remember the slow vs. fast paths through the decoders? The slow path is higher latency but is guaranteed to properly decode an instruction, the added latency forces Atom to have three decoding stages instead of two.

Finally there are some algorithms in which SMT added a stage or two to the pipeline, the end result being a fairly lengthy pipeline for such a simple CPU. The reasoning however makes sense; there is no NetBurst nonsense here, all of these decisions were made to keep power consumption as low as possible while hitting the right frequency targets. As a fairly simple two-issue core, Atom needs clock speed in order to give us the sort of performance we are expecting of it.

It Does Multiple Threads Though: The Case for SMT An Unbalanced L1 Cache: We Know Why
POST A COMMENT

46 Comments

View All Comments

  • adntaylor - Tuesday, April 08, 2008 - link

    On that chart with price / power, you need to be clearer...

    For price, you show the combined price for CPU + Chipset. For power, you say just the CPU... so 0.65W for the CPU... but you're conveniently ignoring the >2W figure for the chipset!!! This absolutely flatters Intel wherever possible.

    AMD are just as misleading - they describe the Geode LX as "1W" which excludes the non-CPU core parts of the chip (which is an integrated CPU + GMCH)

    Just please be honest - the figures are out there in the Intel datasheets... it takes 10 minutes to check.
    Reply
  • Clauzii - Friday, April 04, 2008 - link

    I still have a PowerVR 4MB addon card, runnung in tandem with a Rage128Pro. Quite a combination w. 15 FPS in Tombraider. Constant(!) 15FPS, that is..

    Amazing what they actually achieved back in 95!
    Reply
  • Clauzii - Friday, April 04, 2008 - link

    Ooops!

    Totally misplaced that. Sorry.
    Reply
  • wimaxltepro - Friday, April 04, 2008 - link

    The Atom represents a shift in processor architecture that is the most dramatic departure for Intel since introduction of x86 processors... the philosophy of how computing itself occurs from centralized processors to distributed processing based on an extension of the popular x86 instruction set.

    The Atom is not about the immediate prospects for the Atom or Nehalem products: we will likely see members of Intel's new product family be used in embedded applications in consumer products and in areas where specialized communications processors are more the rule. While not optimized for use in specific networking applications, the products capitalize on the wide range of support available in IT/Networking to develop common functions that leverage the low cost, low power/processing capability to be used as a common denominator for a wide range of applications.

    Intel has been built on the 'Wintel' architecture: massively integrated chips needed to handle the massively integrated operating systems and applications of Windows (and Apple) environments. The Atom allows migration and broadening out from that architectural motif to a very highly distributed architecture. So, the increased parallelism found in the internal chip architecture is enabling of changes in external system architectures and device applications that go well beyond the typical domain of Intel.. and right into the domain of 'personal wireless broadband' and SDWN, Smart Distributed Wireless broadband Network.

    The decisions about in-order vs. out of-order instruction streams, memory architecture, I/O architecture have been made in light of the broad vision for how computing, networking and, out of hand, how wireless enabled broadband networking including WiMAX will occur. This should be understood for what it represents as a shift in direction for Intel both in response to broad industry shifts and as a trend setting development.
    Reply
  • jtleon - Friday, April 04, 2008 - link

    Thanks to all the flash player ads, etc., a mobile web device will continuously avoid switching to low power states. Thus one could argue that advertising will be carbon footprint enemy of the internet's future. This is already becoming the case for desktop/laptop machines.

    Without such continuous (arguably wasted) consumption of CPU power, then Intel's engineered power management might have a significant impact on the value of the Atom.

    Regards,
    jtleon
    Reply
  • 0WaxMan0 - Friday, April 04, 2008 - link

    I am definatly much impressed and enthused by intels work here, the future looks interesting esp for those of us who like low power cross compatible computing products.

    However I have to point out that a low power modern x86 cpu has allready been done infact 4 years ago with AMD's Geode. While technically much weaker than the Atom and with out any where near the scalability (single core design etc.) the Geode has been available in the same TDP ranges for a good long while. Take a look here http://www.amdboard.com/geode.html">http://www.amdboard.com/geode.html for some old stuff.

    I do hope that the Intel name and hype makes more of an impact than AMD managed.
    Reply
  • whycode - Thursday, April 03, 2008 - link

    Does the TDP quoted include the chipset? Or is that CPU only? Reply
  • IntelUser2000 - Thursday, April 03, 2008 - link

    Anand, the Pentium M does not feature Macro Ops Fusion. Its Core 2 Duo that started Macro Ops Fusion. Reply
  • Anand Lal Shimpi - Thursday, April 03, 2008 - link

    You're correct, I was referencing micro-op fusion. I've made the appropriate correction :)

    Take care,
    Anand
    Reply
  • squito - Wednesday, April 02, 2008 - link

    Am I the only one shocked to see that Poulsbo is a 130nm part... Reply

Log in

Don't have an account? Sign up now