A Crash Course in CPU Architecture

It’s been years since I’ve gone through the life of an instruction, and when I last did, it was for a very high-end desktop processor. Not everyone interested in what’s powering the iPhone 3GS or Palm Pre will have been taken down this path, so I thought some of that background might be useful here.

Applications spawn threads, threads are made up of instructions, and instructions are what a CPU “processes”. The actual processing of an instruction is pretty simple: the CPU must fetch the instruction from memory, decode or somehow understand what the instruction is telling it to do (e.g. add two numbers), grab any data that is required by the instruction (e.g. find the numbers to be added), actually execute the instruction, and finally write the result of the operation to either a register or memory.


Our basic microprocessor with a 5-stage pipeline

Based on the example above, executing an instruction requires five distinct stages. In a pipelined microprocessor, a different instruction can be active at each stage of the execution pipeline. For example, you can be grabbing data for one instruction while decoding another and fetching yet another. All modern-day processors work this way.
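To make the overlap concrete, here is a minimal Python sketch of which instruction sits in which stage on each cycle. It assumes an idealized pipeline with no stalls, one instruction entering per cycle, and the five stage names from the example above; the function name is made up for illustration.

```python
STAGES = ["Fetch", "Decode", "Fetch Operands", "Execute", "Write Output"]

def pipeline_occupancy(num_instructions, stages=STAGES):
    """For each cycle, list the instruction index in each stage (None = empty).

    Idealized model: a new instruction enters every cycle and nothing stalls.
    """
    depth = len(stages)
    total_cycles = num_instructions + depth - 1
    timeline = []
    for cycle in range(total_cycles):
        row = []
        for stage_index in range(depth):
            # Instruction i entered the pipe on cycle i, so on this cycle
            # stage `stage_index` holds instruction (cycle - stage_index).
            instr = cycle - stage_index
            row.append(instr if 0 <= instr < num_instructions else None)
        timeline.append(row)
    return timeline

# Print a small occupancy chart for 7 instructions.
for cycle, row in enumerate(pipeline_occupancy(7)):
    cells = ["I%d" % i if i is not None else "--" for i in row]
    print("cycle %2d: %s" % (cycle, "  ".join(cells)))
```

From the fifth cycle onward every stage is busy, which is the "full pipe" condition the rest of this page keeps coming back to.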


Multiple instructions can exist in the pipeline at once, but only one instruction may be active at any given stage

For the processor to work efficiently, each of these stages should take the same amount of time; the time required by the slowest stage determines the maximum clock speed of the CPU. If the most complex stage in my example above is the decode stage and it requires 3ns to complete, then my CPU can run no faster than 333MHz (1 / 3ns).
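The arithmetic is simple enough to sketch in a few lines of Python. Only the 3ns decode figure comes from the example; the latencies for the other four stages are assumptions picked for illustration.

```python
# Per-stage latencies in nanoseconds. Decode = 3ns is from the example;
# the other values are assumed for illustration.
FIVE_STAGE_NS = {
    "Fetch": 1.0,
    "Decode": 3.0,
    "Fetch Operands": 1.0,
    "Execute": 2.0,
    "Write Output": 1.0,
}

def max_clock_mhz(stage_latencies_ns):
    """The clock period can be no shorter than the slowest stage."""
    slowest_ns = max(stage_latencies_ns.values())
    # 1 / (period in ns) gives GHz; multiply by 1000 for MHz.
    return 1000.0 / slowest_ns

print("%.0f MHz" % max_clock_mhz(FIVE_STAGE_NS))  # prints "333 MHz"
```

Note that making the already-fast stages faster changes nothing here; only the slowest stage matters.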

To reach faster frequencies, we need to speed up each stage of the pipeline. You can speed up a stage by implementing some sweet new algorithms, or simply by splitting up complicated stages into simpler ones and increasing the number of stages in your pipeline.

In our previous example, the decode stage required 3ns to complete, but if we split decode into three separate stages, each requiring 1ns, then we remove that bottleneck. Let’s say we do that, but now some of our other stages become the bottleneck and need splitting too; with a target of a 1ns clock period (1ns spent per stage) we go from five stages to eight:

Fetch
Decode 1
Decode 2
Decode 3
Fetch Operands
Execute 1
Execute 2
Write Output

Now, with each stage running at 1ns, our maximum clock speed goes up from 333MHz to 1000MHz (1GHz). Sweet. Right?
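Here is a rough Python sketch of that splitting step. It assumes a stage's work divides evenly into sub-stages with no extra overhead per split, which real designs do not get for free, and it assumes a 2ns execute stage and 1ns for the remaining stages so that the result matches the eight-stage list above.

```python
import math

# (name, latency in ns). Decode = 3ns is from the example; the rest are assumed.
FIVE_STAGE = [
    ("Fetch", 1.0),
    ("Decode", 3.0),
    ("Fetch Operands", 1.0),
    ("Execute", 2.0),
    ("Write Output", 1.0),
]

def split_pipeline(stages, target_period_ns):
    """Split every stage that is slower than the target clock period."""
    new_stages = []
    for name, latency_ns in stages:
        pieces = math.ceil(latency_ns / target_period_ns)
        if pieces <= 1:
            new_stages.append(name)
        else:
            # Number the sub-stages: "Decode 1", "Decode 2", ...
            new_stages.extend("%s %d" % (name, i + 1) for i in range(pieces))
    return new_stages

eight_stage = split_pipeline(FIVE_STAGE, target_period_ns=1.0)
print(len(eight_stage), eight_stage)
```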

With less work being done in each stage, we reach a higher clock speed, but we also depend on each stage being full in order to operate at peak efficiency.


5-stage pipeline (top) vs 8-stage pipeline (bottom). The 8-stage pipe is more desirable, but it also requires more instructions to fill.

In the first CPU example we had a 5-stage pipeline, which meant that we needed to have the pipe full of 5 instructions at any given time to be operating at peak efficiency of 1 instruction completed every cycle. The second example has a ginormous 8-stage pipeline, which requires 8 instructions in the pipe for peak efficiency. In both cases you can only get one instruction out of the pipe every cycle, but because its cycles are shorter (1ns vs 3ns), the second chip can give us more completed instructions in, say, 10 seconds.
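A quick Python sketch of the comparison, assuming both pipes stay full and never stall: the first instruction needs one cycle per stage to get through, and each one after it completes a cycle later. The function name is hypothetical.

```python
def total_time_ns(num_instructions, depth, clock_period_ns):
    """Time to push a batch of instructions through an ideal, stall-free pipe."""
    # `depth` cycles for the first instruction, one more cycle for each that follows.
    cycles = depth + num_instructions - 1
    return cycles * clock_period_ns

for n in (1, 10, 1000):
    five = total_time_ns(n, depth=5, clock_period_ns=3)
    eight = total_time_ns(n, depth=8, clock_period_ns=1)
    print("%4d instructions: 5-stage %5d ns, 8-stage %5d ns" % (n, five, eight))
```

Under these assumptions the fill cost (5 vs 8 cycles) quickly becomes noise, and the 8-stage design approaches three times the throughput.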

Now think for a moment about the time periods we’re talking about here. The first CPU had a clock period of 3ns, where each stage took 3ns to complete. The second CPU had a clock period of 1ns. A single trip to main memory can easily take 60ns for a CPU with a very fast on-die memory controller, or over 100ns otherwise. For the sake of argument let’s say that we’re talking about a 100ns trip to main memory. Remember the Fetch Operands stage? Well if those operands are located in main memory that stage won’t take 3ns to complete, but rather 103ns since it has to get the operands from main memory.
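To put those latencies in terms the pipeline cares about, this small Python sketch converts a memory trip into clock cycles. The 60ns and 100ns figures are the ones quoted above; rounding up to whole cycles is an assumption of the sketch.

```python
import math

def stall_cycles(memory_latency_ns, clock_period_ns):
    """Whole clock cycles that elapse while waiting on main memory."""
    return math.ceil(memory_latency_ns / clock_period_ns)

for latency_ns in (60, 100):
    for period_ns in (3, 1):
        print("%3d ns trip at a %d ns clock: %3d cycles"
              % (latency_ns, period_ns, stall_cycles(latency_ns, period_ns)))
```

The faster-clocked chip waits just as long in nanoseconds, but each trip costs it roughly three times as many cycles in which an instruction could otherwise have completed.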

Processors can switch contexts when a memory access has to go all the way out to main memory, to avoid stalling the pipeline for such an absurd length of time. The contents of the pipeline get flushed and filled with another thread while the data request goes off to main memory. Once the data is ready, the processor switches contexts once more and continues on its execution path. Here’s the problem: it takes time to refill the pipeline, and the longer the pipeline, the longer it takes to refill it. This is a bad but regular occurrence in a microprocessor. While the pipe refills, our instruction throughput drops from its peak of 1 instruction per clock to 0; not good.
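As a rough illustration of the refill cost, the Python sketch below estimates average instructions per clock when the pipe is flushed at regular intervals. The interval of 20 instructions between flushes is an invented number, and the model counts only the refill bubbles, not the wait on memory itself.

```python
def average_ipc(instructions_between_flushes, depth):
    """Average instructions per clock when the pipe is refilled after each flush."""
    # After a flush the first new instruction needs `depth` cycles to come out
    # the far end; every following instruction completes one cycle later.
    cycles = depth + instructions_between_flushes - 1
    return instructions_between_flushes / cycles

for depth in (5, 8):
    print("%d-stage pipe: %.2f instructions per clock" % (depth, average_ipc(20, depth)))
```

The deeper pipe loses more of its peak rate to each refill, which is the penalty described above; whether its higher clock speed makes up for that depends on how often the flushes happen.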

Other scenarios can create interruptions in the normal flow of things within our microprocessor. Some instructions may take multiple cycles at a single stage to complete. More complex arithmetic, for example, may spend significantly longer at the execute stage while the operation works out. With an in-order microprocessor, all of the instructions behind the slow one must wait.
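The knock-on effect of one slow instruction can be sketched with a small in-order pipeline model in Python. This is a simplified illustration, not how any particular chip is built: it assumes the 5-stage pipe from earlier, one cycle per stage except execute, and that an instruction can only advance once the stage ahead of it has been vacated.

```python
def completion_cycles(execute_latencies, depth=5, execute_stage=3):
    """Cycle count at which each instruction leaves an in-order pipeline.

    execute_latencies[i] is the number of cycles instruction i spends in the
    execute stage; every other stage is assumed to take one cycle.
    """
    start = []  # start[i][s] = cycle on which instruction i enters stage s
    done = []   # done[i] = cycle count at which instruction i has left the pipe
    for i, exec_lat in enumerate(execute_latencies):
        row = [0] * depth
        for s in range(depth):
            # When has this instruction finished its work in the previous stage?
            if s == 0:
                ready = 0
            else:
                prev_lat = exec_lat if s - 1 == execute_stage else 1
                ready = row[s - 1] + prev_lat
            # When has the instruction ahead of it vacated this stage?
            if i == 0:
                free = 0
            elif s + 1 < depth:
                free = start[i - 1][s + 1]
            else:
                free = done[i - 1]
            row[s] = max(ready, free)
        start.append(row)
        last_lat = exec_lat if execute_stage == depth - 1 else 1
        done.append(row[depth - 1] + last_lat)
    return done

print(completion_cycles([1, 1, 1, 1]))  # [5, 6, 7, 8]
print(completion_cycles([1, 4, 1, 1]))  # [5, 9, 10, 11]
```

In the second run the 4-cycle execute delays not just its own instruction but every instruction queued up behind it by the same three cycles.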

Again, the more stages in your pipeline, the bigger the penalty for a stall. But when the pipeline is full, a deeper pipeline will give us a higher clock speed and better overall performance; we just need to worry about keeping the pipeline full (which takes a great many additional transistors). And yes, there is an upper limit to how deep you can pipeline your processor before you start running into diminishing returns in both performance and power; this was ultimately the downfall of the Pentium 4’s architecture.


60 Comments

  • shank2001 - Wednesday, July 08, 2009 - link

    Wow, talk about fanboyism. Let me get this straight, just because it is an article on the iPhone you feel there is no knowledge to be gleaned from an in depth article like this one? You would rather that no one wasted YOUR time with more articles about the iPhone, is that right?

    Well I, for one, learned a lot about smartphones in general, and not just the iPhone, but I especially loved this article getting into the nitty gritty of the iPhone.

    The iPhone happens to be the best smartphone out right now... it might not be that way forever.... probably not. But I really like the direction Apple has been taking the smartphone market since they introduced the iPhone.

    After years of using so called "smartphones" running windows mobile, and Pocket PC OS, etc. the iPhone is a breath of fresh air, and is the best smartphone I have ever owned.

    It took Apple to force the market into making a true smartphone. I am glad that Palm has woken up from their stupor and come out with the Pre, I hope it is enough to turn around their fortunes.

    Although, for me, it does not come close to the iPhone, but then again, most of the things that people seem to dislike, I actually LOVE... like the touchscreen typing for example. I am way faster typing on my iPhone than I ever was on my old smartphones with the chicklet keyboards. You just have to trust the autocorrection... once you learn to trust it, it is amazingly speedy. I no longer think twice about typing lengthy comments, such as this one, on my mobile phone any more! I just do it.

    I have not yet heard a single viable reason to hate the iPhone, I think I will coin the term "Hateboy" to describe these boys, they certainly act like little kids filled with illogical hate. No matter if you love the iPhone or hate it though, you have to admit that the iPhone DID revolutionize how smartphones will work from now on. Kudos to Palm for recognizing this fact, hopefully others will as well. The consumer is who wins. And good for Apple for finally getting some recognition for their amazing products that have led the industry from the very beginning in so many ways.
  • Myrandex - Wednesday, July 08, 2009 - link

    I will agree that the iPhone revolutionized the smartphone market, and it has done some pretty amazing things pretty darn well, but I still have some key complaints that prevent me from EVER getting one, unless I see a fundamental change in Apple, which never happens.

    I will not buy a phone without a standard connection interface. Just because the iPhone is so popular doesn't mean that its proprietary Apple dock connector is standard. Use standard mini USB or micro USB, or I won't touch it.

    The touchscreen keyboard is good, and it does have a nice autocorrection, however I still prefer tactile feedback. As good as that keyboard is, there is no way that I can type without looking at the keyboard like I can on my current HTC smartphone. The keyboard on my phone is very comfortable, and I'd imagine that I am faster on that than some people on their computer keyboards. I've written 4 page emails from my phone with minimal effort. Some words I do not want autocorrected either. Acronyms many times are autocorrected by a phone, or names too, but many times I type these accurate and want them to stay that way, which on a software keyboard it slows you down because of verifying every key. Passwords also are like this, much slower when I'm using an iPhone compared to my smartphone (my fiancee has had both the 1st Gen iPhone and the iPhone 3G).

    iTunes. I hate it. If I have to use it to use a phone, I will not use the phone. I want USB Mass Storage device access, I want to copy and paste music to my phone with my organization that I have already decided upon (whether or not tags are filled out correctly), and I don't want to have to sync it, just copy and paste what I want ONTO it and OFF OF it. I want to be able to copy music from my desktop onto my phone, plug my phone into my laptop, and copy it off of there. No profiles, no syncing, just pure file access. Thanks Microsoft, no thanks Apple.

    Apple controlling applications. This is corrected with jailbreaking, however I choose to vote with my wallet. I do not want to support a company that enforces this. Any application that I find that I want written for my phone, I will install on my phone. If it offends someone at Apple, then more power to me to want to install it on my phone. They will not make a dollar off of me for their strong-arm tactics.

    There are probably other small things as well, but those are the major ones.

    Jaosn
  • christinme7890 - Thursday, July 09, 2009 - link

    if you are using your keyboard without looking you are most likely driving...a large portion of the accidents in cars happen because idiots are using their phones while driving. If you are doing this please stop or you could kill someone...no joke. I had a friend whose mom died because some idiot was texting while driving.

    Second the itunes store is great for most people. Sure it doesn't have what you are looking for which is essentially a hdd but if it did then installing apps would suck. Why do you think the APP store is doing so well. Because it is a one stop shop. If I want a app, i go to the APP store and search for it and then click buy/install then sync and finished.

    If I were using a WinMo device I would have to first find a list of all the devs that offer the app i am looking for and visit each and every webpage and sit through all the trash that they claim their software does. Pay attention to the finding devs that produce the app I am looking for...this takes longer than you think, especially for a n00b consumer. I also have to pay attention to which WinMo OS the app supports. Many times you need the newest and greatest OS in order for it to work and when I had a PPC, verizon didn't let me have the latest upgrade to the OS. I had to hack the PPC to allow me to use the updated OS. Then once I find the right dev, the software is usually a lot more expensive because there is no immediate competition. So i end up paying a ton of money. Sure you can find free software that does similar but it is not backed by a good support system...merely a live forum. Then once I find the software I have to give them my credit card information and email address. I will more than likely end up getting a email daily from the company about their new crappy software. Now that I have spent 20 minutes entering in my underwear size and preferred deodorant brand I download my app. After downloading it I have to go through their custom install procedure. Then I have to hope that it installed correctly. Then if I want it to sync with my desktop I have to buy another piece of software and install it and then figure out how to get it to sync. So much hassle and running around and time wasted. I can install 1 APP store app in less than a minute and it could be a game that is 100mb installed over wifi.

    And about the music. Easy, you create what we call a playlist and drag and drop your music to the playlist and then sync your phone. You then play your playlist. Playlists are the same as folders. Not sure what is so difficult about that. Yes it would be nice to be able to connect my iphone to multiple computers to copy music but then you get what APPLE, along with every other business wants to avoid, and that is illegal sharing...duh. Just because people don't let you do anything you want doesn't mean you have to spew hate.

    "If it offends someone at Apple, then more power to me to want to install it on my phone." This attitude is selfish and you are probably one of those people that loves to steal and pirate software all the time. You care little for the developer and only about yourself.
  • shank2001 - Wednesday, July 08, 2009 - link

    That was a good rebuttal comment. I agree with some of what you say, especially the having to look at the keyboard. It is a definite necessity. I do miss that about an actual physical keyboard.

  • iwodo - Wednesday, July 08, 2009 - link

    This is in no way fanaticism, but general enthusiasm for great technology improvement. It was a GREAT article from Anand. ( Yes i read it all, give me lollipops :D ) Maybe you are too young to figure this out.

    What drives you to see more SSD stuff? I want to see more SSD reviews too. Why? Simply because the HDD is the SINGLE bottleneck living inside our current computers, be it PC or Mac.
    Upgrading from a Core2Duo to Core2Quad or even Core i7, Double Channel to Tri Channel, DDR2 to DDR3, 2GB to 4GB Memory, Geforce 9500 to GTX 290.... If you are not a gamer, any of these upgrades, or even ALL of these upgrades, won't land you even a 10% overall performance increase in 90+% of your day to day computer usage. And even if they do show more than 10% in benchmarks, there is a very small chance the difference is even human / user perceivable.
    You will probably feel your system being faster if you reinstall Windows rather than upgrading your hardware.
    That is why SSDs are so important and many people want one. They actually bring a significant, perceivable speed improvement that has not been seen FOR MANY YEARS.

    The last time we saw any improvement like that was in the pre-Pentium 4 days...

    iPhones, or Internet mobile devices, are in exactly the same period of technology growth that PCs were in during the 486 and Pentium era. 700% increase in graphics, 100% increase in CPU speed? When was the last time you saw any of these in a PC?

    The next technological advances are in the mobile / phone space. They are the new PC. Just like how x86 managed to use its desktop strength to gain market share in the server space, maybe ARM could finally dethrone x86. ( At least i hope so )
  • iGo - Tuesday, July 07, 2009 - link

    Absolutely agree with cdrsft, never mind that guy. He probably read just the first and last page.

    It's not often you come across an article that provides a lot of information, on and off topic... and it's not often that the extra information in an article is actually useful and helps you learn. Not to mention, all this is written in an absolutely enjoyable manner. :)

    Thank you, Mr. Shimpi, for such a wonderful article... and the many before this.

  • cdrsft - Tuesday, July 07, 2009 - link

    never mind that guy - your article was great and very helpful for those of us who want to understand more....... thank you!
  • bowtech - Monday, November 08, 2010 - link

    can u explain why cortex a8 did not beat arm 11 in almost any of these tests then. http://www.pengutronix.de/development/kernel/arm-b...
  • MassiveTurboLag - Tuesday, April 30, 2013 - link

    By the look of that video screenshot Anand drives a Porsche Cayenne. I hope he didn't see Jeremy Clarkson's video on it.
