Superscalar to the Rescue

If deepening the pipeline gives us higher clock speeds and more instructions being worked on at a time, but at the expense of lower performance when things aren’t working optimally, what other options do we have for increasing performance?

Instead of going deeper, what about making our chip wider? In our previous example only a single instruction could be active at any given stage in the pipeline - what if we removed that limitation?

A superscalar processor is one that allows multiple instructions to be active at any given stage in the pipeline. Through some duplication of resources you can now have two or more instructions at the same stage at the same time. The simplest superscalar implementation is dual-issue, where two instructions can go down the pipe in parallel. Today’s Core 2 and Core i7 processors are four-issue (four instructions go down the pipe in parallel); the high end hasn’t been dual-issue since the days of the original Pentium processor.

The benefits of a superscalar chip are obvious: a dual-issue design can potentially double the number of instructions completed per clock cycle. Combine that with a reasonably pipelined, high clock speed architecture and you have the makings of a high performance processor.

The drawbacks are also obvious: enabling a multi-issue architecture requires more transistors, which drive up die size (cost) and power (heat). Only recently have superscalar designs made their way into mobile devices, thanks to smaller and cooler switching transistors (e.g. 45nm). You also have to worry even more about keeping the CPU fed with instructions, which means larger caches, faster memory buses and clever architectural tricks to extract as much instruction level parallelism as possible. A dual-issue chip is a waste if you can’t keep it fed consistently.
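To make the "keeping it fed" point concrete, here is a minimal sketch of a toy in-order issue model. It is not how any real core schedules instructions; the function name `cycles_to_issue`, the dependency encoding, and the rule that an instruction cannot issue in the same cycle as one it depends on are all simplifying assumptions made for illustration.

```python
def cycles_to_issue(deps, width):
    """Count cycles for a toy in-order machine issuing up to `width` instructions per cycle.

    deps[i] is the set of earlier instruction indices that instruction i reads from.
    Assumed rule: an instruction cannot issue in the same cycle as one it depends on.
    """
    cycles = 0
    i = 0
    n = len(deps)
    while i < n:
        issued_this_cycle = set()
        while i < n and len(issued_this_cycle) < width:
            if deps[i] & issued_this_cycle:
                break  # depends on something issuing right now: wait a cycle
            issued_this_cycle.add(i)
            i += 1
        cycles += 1
    return cycles


# Eight fully independent instructions (hypothetical workload).
independent = [set() for _ in range(8)]

# Eight instructions where each one depends on the one before it.
chain = [set()] + [{k - 1} for k in range(1, 8)]

for name, stream in (("independent", independent), ("dependent chain", chain)):
    single = cycles_to_issue(stream, width=1)
    dual = cycles_to_issue(stream, width=2)
    print(f"{name}: single-issue {single} cycles, dual-issue {dual} cycles")
```

With independent instructions the dual-issue machine in this model needs half the cycles; with a strict dependency chain the second issue slot sits idle and the two machines take the same number of cycles.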

Raw Clock Speed

The previous two examples of architectural enhancements are major improvements in design. To design a modern day CPU with more pipeline stages or to go from a single to dual-issue design takes a team years to implement; these are not trivial improvements.

A simpler path to improving performance is to just increase the clock speed of the CPU. In the first example I provided, our CPU could only run as fast as the most complex pipeline stage allowed it. In the real world however, there are other limitations to clock speed.

Manufacturing issues alone can severely limit clock speed. Even though an architecture may be capable of running at 1GHz, the transistors used in making the chip may only be yielding well at 600MHz. Power is also a major concern. A transistor usually has a range of switching speeds. Our hypothetical 45nm process may be able to run at 300MHz at 0.95V or 600MHz at 1.3V; higher frequencies generally mean higher voltage, which results in higher power consumption - a big issue for mobile devices.
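The cost of that voltage increase can be estimated with the usual first-order rule that dynamic (switching) power scales with capacitance times voltage squared times frequency. The sketch below applies it to the two hypothetical operating points above. It assumes the switched capacitance is the same at both points and ignores leakage, so treat the result as a rough ratio rather than a measurement; the function name is made up for this example.

```python
def relative_dynamic_power(voltage, freq_mhz):
    """First-order dynamic power, up to a constant factor: P ~ C * V^2 * f.

    The capacitance term is dropped because only ratios are compared here.
    """
    return voltage ** 2 * freq_mhz


# The two hypothetical 45nm operating points from the text.
low = relative_dynamic_power(0.95, 300)
high = relative_dynamic_power(1.30, 600)

power_ratio = high / low   # how much more power the fast point draws
speed_ratio = 600 / 300    # how much faster it clocks

print(f"clock speed: {speed_ratio:.1f}x, dynamic power: {power_ratio:.1f}x")
print(f"relative performance per watt: {speed_ratio / power_ratio:.2f}")
```

Under these assumptions, doubling the clock costs roughly 3.7 times the dynamic power, so performance per watt drops to a little over half of what it was at the slower operating point.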

The iPhone’s processor is based on a SoC that can operate at up to 600MHz; for power (and battery life) reasons, Apple/Samsung limit the CPU core to running at 412MHz. The architecture can clearly handle more, but the balance of power and battery life gates us. In general, increasing clock speed alone isn’t a desirable option to improve performance in a mobile device like a smartphone because your performance per watt doesn’t improve tremendously, if at all.

In terms of sheer performance, however, just increasing clock speed is preferred to deepening your pipeline and increasing clock speed. With no increase in pipeline depth you don’t have to worry about keeping any more stages full; everything just works faster if you increase your clock speed.

The key takeaway here is that you can’t just look at clock speed when it comes to processors. We learned this a long time ago in the desktop space, but it seems that it’s getting glossed over in the smartphone market. A 400MHz dual-issue core is going to be a better performer than a 500MHz single-issue core with a deeper pipeline, and the 528MHz processor in the iPod Touch is nowhere near as fast as the 600MHz processor in the iPhone 3GS.
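The 400MHz versus 500MHz comparison comes down to simple arithmetic: instructions completed per second is clock speed multiplied by the average instructions completed per clock (IPC). The sketch below uses made-up IPC figures purely for illustration; the article gives no measured IPC for either core, so 1.2 and 0.7 are assumptions, as are the names used in the code.

```python
def mips(clock_mhz, ipc):
    """Millions of instructions per second = clock (MHz) * average instructions per clock."""
    return clock_mhz * ipc


# Hypothetical average IPC values, chosen only to illustrate the arithmetic.
DUAL_ISSUE_IPC = 1.2     # assumed: a dual-issue core rarely sustains its peak of 2.0
SINGLE_ISSUE_IPC = 0.7   # assumed: a deeper pipeline loses more cycles to stalls

dual = mips(400, DUAL_ISSUE_IPC)
single = mips(500, SINGLE_ISSUE_IPC)

print(f"400MHz dual-issue:   {dual:.0f} MIPS")
print(f"500MHz single-issue: {single:.0f} MIPS")

# IPC the 400MHz core needs just to match the 500MHz core.
break_even_ipc = 500 * SINGLE_ISSUE_IPC / 400
print(f"break-even IPC for the 400MHz core: {break_even_ipc:.3f}")
```

With these assumed numbers the slower-clocked dual-issue core comes out ahead, and it only needs to average more than about 0.875 instructions per clock to do so. Different IPC assumptions would move the break-even point, which is exactly why clock speed alone tells you so little.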


60 Comments


  • MrJim - Wednesday, July 8, 2009

    Why no mention of the heat issues?
  • ViRGE - Wednesday, July 8, 2009

    Anand, if you haven't already, jailbreak the 3GS and grab SysInfoPlus from Cydia. It may be able to tell you the clock speed of the 3GS's ARM, although to what extent I'm not sure since it hasn't been specifically programmed for the A8.
  • ltcommanderdata - Wednesday, July 8, 2009

    I don't suppose that program can also tell the GPU clock speed too?

    I always thought that the MBX work at bus speed, ie. 103MHz for the iPhone/3G and 133MHz for the 2nd Gen iPod Touch instead of the 60Mhz that Anand has speculated. Assuming the iPhone 3G S has a 150MHz bus speed, the SGX could run at 150MHz which is a reasonable compromise between Anand's 100MHz and 200MHz estimates.
  • fyleow - Wednesday, July 8, 2009

    How useful is the new GPU? The iPhone's performance has come a long way from the first generation but I don't see developers taking full advantage of the jump. If you bump up the graphics of your game it might run smoothly on the 3GS but end up lagging on the 1st gen iPhone.

    The increase in load times and battery life is much welcomed, but when do we get to see some apps that take advantage of the upgraded hardware in other more interesting ways? I can see a resolution increase as being one way to do that. The game would look better on a higher resolution screen but performance wouldn't suffer on the older models because the lower resolution would place less demand on the hardware.

    2010 will be an interesting year. There should be a bigger upgrade to the iPhone, most likely a resolution bump and a significantly modified OS that supports background tasks. Apple has been keeping all the devices on the iPhone platform on feature parity so far with the OS upgrades (minus obvious limitations due to hardware differences). It would be interesting to see how they handle the switch and the resulting two classes of phones that come from it (i.e. old "legacy" iPhones/Touch vs new iPhones/Touch).
  • ltcommanderdata - Wednesday, July 8, 2009

    You're right that it's difficult to take full advantage of the SGX without writing a separate dedicated code path for it and one for the MBX. However, there are simpler ways to take advantage of the iPhone 3G S power without writing 2 separate code paths. For example, you can scale draw distance based on hardware. Firemint demonstrated the iPhone 3G S accelerating 40 cars in Real Racing compared to 6 in the iPhone 3G, so the potential for better scalable AI is there. For a RPG, perhaps having more NPCs walking around to make the environment more lifelike. This can all be done using existing OpenGL ES 1.1 code playable on all iPhones/Touches, optimizing for each device, without making older iPhone users feel like they are playing some Lite version of the game as implementing shaders and HDR using OpenGL ES 2.0 in the iPhone 3G S might do.

    I believe the reluctance of Apple to change the resolution is that it could break the interface layout for existing apps and/or make things ugly if apps haven't used vector graphics. It would have been nice if they had enforced resolution independence early on, but I don't believe they did. Resolution independence is also what is needed for Apple to introduce an iPhone nano with a smaller screen and presumably smaller resolution.
  • smallpot - Wednesday, July 8, 2009

    Thanks for the article Anand. Your long-form articles are the reason Anandtech is my number one tech website. I'm thinking of articles such as this, your articles on SSD performance, and the long-form story behind the RV770. After reading such articles, I really feel like I've learned something, rather than just had performance metrics thrown at me without context.
  • Baron Fel - Tuesday, July 7, 2009

    Interesting article.

    As far as portable gaming goes, the Ipod Touch/iPhone/Zune HD dont have a chance against the DS or even the PSP. The software support just isnt there.

    PSP hardware runs circles around the DS, so why is the DS killing it in sales? Good games.

    and are we getting more SSD articles anytime soon? I think thats what we want to see :D
  • ltcommanderdata - Tuesday, July 7, 2009

    Given all the media attention about discoloration and possible heat issues with the iPhone 3G S, I was wondering if you could comment on your experience in this area. Do you think it's a real concern or just stories popularized to generate page hits as Apple related stories tend to do? The latest reports on discoloration indicate that it might actually be from a reaction with some third-party cases that may be reversed by cleaning with alcohol.

    Similarly, there have been lower-key reports of build quality issues with the Palm Pre having a wobbly screen from its slide-out keyboard. Has this been a major issue for you and do you think it'll be an issue over time?
  • Anand Lal Shimpi - Tuesday, July 7, 2009

    I haven't seen anything to indicate heat as being a bigger concern with the 3GS. It's a new processor so there's bound to be some bad chips out there, but I wouldn't be too concerned.

    The build quality on the Pre did bother me. It's something that I think bothered me more because of my experience with the iPhone. The screen was a bit wobbly and overall the device just didn't feel as well put together. Part of it is because of the slide out keyboard, but part of it has to be cost/experience related. I think you'd get used to it over time, but if you then held an iPhone you'd quickly grow tired of the build quality issues once again :)

    Take care,
    Anand
  • tomoyo - Tuesday, July 7, 2009

    Btw Anand, the chart for number of stages in the cpus shows the Iphone 3GS as 8 stage instead of 13.
