Three very interesting things happened over the past couple of weeks here at AnandTech:
  1. Intel’s Spring IDF 2005 turned out to be a multi-core CPU festival, with Intel more open than ever before about future plans for its multi-core microprocessor architectures. Intel has over 10 multi-core CPU designs in the works, and made that very clear at IDF.
  2. At GDC 2005, AGEIA announced that they had developed a Physics Processing Unit (PPU) that could be used to enable extremely realistic physics and artificial intelligence models.
  3. Earlier this week, Johan De Gelas went one step further in his quest for more processing power and found that there’s quite a lot of potential for multi-core CPUs in the gaming market, at the expense of increased development times.
So, what do these three things have in common? Taken together, the three basically summarize what we’ve come to know as the Cell microprocessor: a multi-core CPU, part of which is designed for parallel physics/AI processing, and one that will be quite difficult to program.

Cell, at a high level, isn’t too difficult to understand; it’s how the designers got there that is most intriguing. It’s the design decisions and building blocks of Cell that we’ll focus on in this article, with the end goal of understanding why Cell was designed the way it was.

A joint venture between IBM, Sony and Toshiba, the Cell microprocessor is the heart and soul of Sony’s upcoming PlayStation 3. This time around, however, Sony and Toshiba are planning to use Cell (or parts of it) in everything from consumer electronics to servers and workstations. If you haven’t already gotten the impression, Cell has publicly been given some very high aspirations for a microprocessor, especially a non-x86 microprocessor.


64 Comments

  • microbrew - Thursday, March 17, 2005

    "System on a Chip (SoC)"

    What will make or break the Cell is the tools available, especially the operating system and libraries.

    I would like to see what they're doing in terms of marketing the chip to consumer electronics, telecom, military and other embedded applications. I could see the Cell as a viable alternative to the usual mixtures of PowerPCs, ARMs and DSPs.

    I also agree with Final Words; I don't see the Cell breaking into the consumer PC market any time soon either.
  • Locut0s - Thursday, March 17, 2005

    #17 Yeah that was a bit too harsh I agree.
  • Eug - Thursday, March 17, 2005

    I'm just wondering how well a dual-core PPE-based 4+ GHz chip would do in general purpose (desktop) code.

    And I also wonder how cool/hot such a chip would be. The Xbox 2's CPU is probably a 3-core PPE, but it runs at 3 GHz, and we don't have power specs for it anyway.
  • Filibuster - Thursday, March 17, 2005

    #11 (well, everyone should if they haven't before) read the Ars Technica article on PS2 vs PC - static applications vs dynamic media. Cell is taking it to the next level.

    http://arstechnica.com/articles/paedia/cpu/ps2vspc...

    Very nice article Anand!
  • Googer - Thursday, March 17, 2005

    Besides a release date, is there any news or knowledge of a Linux Kit for PlayStation 3 like there was for PS2? Does anyone know of either?
  • Illissius - Thursday, March 17, 2005

    Damn. Awesome article. If I hadn't known the site and author beforehand, I would've guessed Ars and Hannibal. Seems he isn't the only one with a talent for these kinds of articles ;)
    You should do more of them.
  • scrotemaninov - Thursday, March 17, 2005

    #22: This is just a guess so don't rely on this. The POWER5 has 2-way SMT. Each cycle it fetches 8 instructions from the L1I cache. All instructions fetched per cycle are for the same thread, so it alternates between threads (round robin). It also has capabilities for setting the thread priority so that you effectively run with 1 thread and it just fetches 8 instructions per cycle for the one running thread.

    I would expect the PPE to be similar to this, fetching 2 instructions for the same thread each cycle. The POWER5 has load balancing stuff in there too - if one thread keeps missing in L2 then the other thread gets more instructions decoded in order to keep the CPU functional unit utilisation up. I've no idea whether this kind of stuff has made it over into the PPE. I'd be a little surprised if it has, especially seeing as this is in-order anyway, so it's not like you're going to be aiming for high utilisation rates.
  • scrotemaninov - Thursday, March 17, 2005

    #23: True, but I believe that when the SPEs access the outside memory they go through the cache. Sure, it's a lower coherency than we're used to, but it's not much worse.
  • Houdani - Thursday, March 17, 2005

    18: Top Drawer Post.
    20: Thanks for the links!
  • fitten - Thursday, March 17, 2005

    "Given the speed of the interconnect and the fact that it is cache-coherent,"

    Only the PPC core has cache. The individual SPEs don't have cache - they have scratchpad RAM.

    #22: I believe the PPC core is a dual issue core that just happens to be 2xSMT.