I've repeatedly called the PS3's Cell the more powerful of the two processors, the other being the 3-core PPC chip in the 360. I've also said that the real world performance gap between the two chips may look very different from the on-paper one.

The strength of Cell is truly derived from its SPE array. When it comes to 3D graphics and gaming, we've long known that two things yield the best performance: lots of bandwidth and specialized hardware. All of the previous generation consoles implemented (in one way or another) these fundamental principles of making stuff fast. At the same time, PCs always caught up, first by embracing the GPU and then by simply increasing general purpose CPU speed by leaps and bounds from one year to the next.

The 3-core PPC processor in the Xbox 360 is no slouch either. Remember that any one of these cores on its own, regardless of clock speed, isn't exactly the most powerful core on the market. But because they are relatively narrow 2-issue cores, sticking a few of them together gets you something fairly powerful - especially if the applications you're running on them are properly multithreaded.

The main difference between these two CPUs is the general purpose vs. specialized hardware approach. If the goal of either of these consoles were a machine that could run any application well, then the 360 would have the upper hand. You don't really see people running MS Office on their MPEG-2 decoder chips. But if you're talking about tons of physics calculations, 3D math and other complex floating point work, similar to what's required in video decoding as well as 3D gaming, then specialized hardware will always give you better performance. To use the MPEG-2 decoder example, there's a reason why video decode and encode assist was pulled off of general purpose CPUs in PCs as quickly as possible - some things can simply be done better with specialized silicon. We saw another example of this in the move to the GPU and away from CPU based software rendering of games. Ageia's announcement of the PhysX PPU also echoed the need for specialized hardware when dealing with the complex physics and AI modeling that must be done for the next generation of 3D games. It is because of the Cell's extensive use of specialized hardware that I refer to it as the more powerful processor, on paper.
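
To make the general purpose vs. specialized hardware argument concrete, here is a minimal C++ sketch of the kind of data-parallel floating point work games lean on: updating a batch of particle positions. The SSE intrinsics are x86 and only stand in for the sort of 4-wide vector hardware the SPEs and VMX units are built around; the function names and data layout are purely illustrative, not anyone's actual engine code.

    #include <xmmintrin.h>  // x86 SSE intrinsics

    // Scalar version: one multiply-add per loop iteration on a general purpose core.
    void update_scalar(float* pos, const float* vel, float dt, int count) {
        for (int i = 0; i < count; ++i)
            pos[i] += vel[i] * dt;
    }

    // SIMD version: four multiply-adds per iteration - the kind of work vector
    // units are specialized for. Assumes count is a multiple of 4.
    void update_simd(float* pos, const float* vel, float dt, int count) {
        __m128 vdt = _mm_set1_ps(dt);             // broadcast dt into all four lanes
        for (int i = 0; i < count; i += 4) {
            __m128 p = _mm_loadu_ps(pos + i);     // load four positions
            __m128 v = _mm_loadu_ps(vel + i);     // load four velocities
            p = _mm_add_ps(p, _mm_mul_ps(v, vdt));
            _mm_storeu_ps(pos + i, p);            // store four updated positions
        }
    }

    int main() {
        float pos[8] = {0};
        float vel[8] = {1, 1, 1, 1, 2, 2, 2, 2};
        update_simd(pos, vel, 0.016f, 8);         // one ~60fps timestep
        return 0;
    }

The scalar loop touches one float per iteration; the SIMD version moves four at a time, and a chip like Cell multiplies that again by giving you an entire array of such vector units, each working out of its own local memory.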

The distinction "on paper" is particularly important because a lot of the performance debate will really come down to two things: 1) how much processing power will be needed for the next generation of games, and 2) how much of it will be taken advantage of on Cell.

Tim Sweeney made it a point to mention that their Unreal 3 tech demo (which was rendered in real time) only took two months of work on the PS3 hardware they received. The sheer number and quality of the demos shown off at the press event leads me to believe that the PS3 isn't impossible to program for (given that all developers should have had similar amounts of time with the dev kits). But the question isn't whether the PS3 will be impossible to develop for; it's how much of its power will actually be used.

The first hurdle is obviously getting game developers to multithread their engines. This is a much bigger hurdle than optimizing for Cell or the 360's 3-core PPC processor. I have a feeling that it may take a while before we see properly multithreaded game engines running on consoles (the current estimate is year-end 2006 for multithreaded game engines to appear on the PC), so the first generation of games for the 360 and PS3 may end up being more of a competition of GPU horsepower. From what I've seen thus far, the demos being showcased aren't really focusing on the physics or AI aspects of what these next-generation consoles can do; instead, they mostly focus on the fact that we finally have consoles with GPUs powerful enough to render scenes at 720p or 1080p resolutions.
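
As a rough illustration of what "properly multithreaded" means here, the C++ sketch below splits a frame's physics and AI work onto worker threads while the main thread keeps the GPU fed. The subsystem names are hypothetical stubs, and a real engine would use persistent worker threads or a job system rather than spawning threads every frame; this is only the shape of the idea.

    #include <thread>

    // Hypothetical per-frame subsystems; the bodies are empty stubs for illustration.
    void update_physics(float /*dt*/) { /* integrate rigid bodies, rag dolls, ... */ }
    void update_ai(float /*dt*/)      { /* path finding, decision making, ... */ }
    void render_frame()               { /* build and submit draw calls */ }

    // One frame of a naively multithreaded game loop: physics and AI run on
    // worker threads, rendering stays on the main thread, and everything is
    // joined before the next frame begins.
    void run_frame(float dt) {
        std::thread physics(update_physics, dt);
        std::thread ai(update_ai, dt);

        render_frame();

        physics.join();
        ai.join();
    }

    int main() {
        for (int frame = 0; frame < 60; ++frame)
            run_frame(1.0f / 60.0f);
        return 0;
    }

The hard part isn't spawning the threads; it's restructuring the engine so that physics, AI and rendering aren't constantly waiting on each other's data - which is why multithreaded engines are taking so long to arrive on both the PC and the consoles.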

Some of the PS3 demos did show off rag doll physics, but nothing appeared to be any more complex than what we've already seen in Half-Life 2.

If that is the case, and the first generation titles aren't really well multithreaded, then the performance argument for Cell begins to fall apart. The question then becomes whether or not its performance potential will truly be realized during the lifetime of the console. I have a feeling it will be, but I'm not much of a fortune teller.

So when will PCs catch up? The console vs. PC debate has always followed a cycle: consoles debut more powerful than PCs, then PCs catch up and surpass them during the consoles' ~5 year lifespan. The difference this time around is that the desktop CPU industry is going through a bit of a transitional period, so it may take a little longer than usual for desktop CPUs to outclass (in all areas) their console counterparts. As far as GPUs go, by the end of this year I'd expect to see 360 and PS3 class (or faster) GPUs offered for high end PCs. By the time the PS3 is released, I would say that upper mid-range GPUs will offer similar (or very close) performance.

The truly limiting factor will be the transition to 65nm on the desktop; the faster that can happen, the quicker the PC will regain its power advantage. But despite any power advantage, this next generation of consoles will definitely be powerful enough to tempt away some PC gamers...at least for a while.

32 Comments

  • Anonymous - Thursday, June 2, 2005 - link

    AndyKH knows what he's talking about. Listen to him.
  • Michael2k - Saturday, May 21, 2005 - link

    Personally I can't wait to see what this implies for future PowerMacs; multicore and 3GHz+
  • AndyKH - Saturday, May 21, 2005 - link

    #26
    I think we are arguing in slightly different directions.
    IIRC, the original MIPS R2000 had an OK C compiler made for it, and the compiler was able to take advantage of the branch delay slot. So while I appreciate the difficulties in making a good compiler, I don't agree with you that it will be that difficult to achieve respectable performance in branch-heavy code. You might end up with an empty branch delay slot at times, but finding instructions that will be executed regardless of whether the branch is taken is fairly easy for the compiler, compared to grouping instructions for concurrent execution on the very wide Itanium architecture.
    However, if the part you wrote about abstracting away from the architecture means abstracting *completely* away from all the peculiarities of the Cell processor (mainly the fact that the SPEs can’t handle general purpose code), then I agree with you. Then again, the PS2 also has a “few” peculiarities of its own, and the dev houses still manage to write well performing code.
    I think the big hurdle will be to develop a multithreaded game engine, and then “just” adapt the engine for “the other” architecture. AFAIK, the truly impressive PS2 games have not been achieved by just writing the code once and compiling to multiple architectures.
    All that said, I still believe the XBOX CPU will be easier to program for, because all the threads will be executing on the same architecture. But… it all depends on the tools available, and Sony might have some fairly good ones ready by now.
  • Ritesh - Friday, May 20, 2005 - link

    Anand,

    I would like to point you to a blog entry made by Major Nelson. http://www.majornelson.com/2005/05/20/xbox-360-vs-... It's a four part piece comparing the two consoles in detail. Although posted by someone who's in the Microsoft camp, I would like to hear your comments on the figures he posted. They definitely seem to be detailed and reliable, but I would like to hear an unbiased opinion on these figures from someone who isn't from either MS or Sony.
  • Reflex - Friday, May 20, 2005 - link

    The G5s are not able to be used fully; some of the components in them are designed for operations the 360 CPU will not have. So developers are not able to harness the full power of the two G5s - the main reason for the dual CPU setup is multithreading practice anyway.

    The 360 CPU, btw, is 3 cores @ 3.2GHz. So a dual CPU 2.5GHz dev kit would be roughly 1/2 the performance, assuming everything about the CPUs was equal - which it is not. So the estimate is not off base.
  • Anonymous - Friday, May 20, 2005 - link

    "It takes two Apple G5s to power a 30% capacity Xbox 360 demo?"

    So the MS claim is this:
    I'm assuming Dual 2.5GHz G5s - so that's four of them - four 2.5GHz G5s from IBM running three times slower than a single, yet to be seen, 3.2GHz CPU from IBM.

    Either IBM has made an unheard of leap in CPU performance, or the Macs were running slow code, or we won't see anything like this kind of jump in performance.

    What's the betting?
  • Reflex - Friday, May 20, 2005 - link

    #25: The only way I see that being the case is if:

    a) coders optimize every application like hell for it, and are intimately familiar with how it works in order to avoid poor coding decisions. The drawback to this is that their code would be very difficult to write and would not be very portable.

    b) A very, very advanced compiler is developed that takes all the work out of coding for the architecture. Unfortunately, compilers take years to write and refine; people are still finding refinements for C compilers 25 years after they were created. I wouldn't hold your breath for this one.

    I'm not saying it isn't possible, but it would only happen if Cell is a permanent new processor line that will be used for generations and refined/advanced over time. I don't see that happening any more than it did with the Emotion Engine; Sony will use Cell for 5 years and then do something different for the PS4. Without platform stability there is no real purpose to learning the ins and outs of an architecture well enough to take something that was created to be specialized and make it general purpose.
  • AndyKH - Friday, May 20, 2005 - link

    #17 & #19
    It might just be that it won't even suck that much at branch-heavy code. All the articles I have read about the subject speculate that the pipeline will be very short, and that is the logical thing to assume when you remove all of the out-of-order logic from the core. One could speculate that it might be entirely possible to do the branch or jump target calculation as early as the second pipeline stage, as is the case in the wonderful MIPS R2000 processor, which (for all I know) is THE preferred architecture when teaching university students about computer architecture. If you couple that with a very advanced compiler that probably does a very good job of filling the branch delay slot(s) with usable instructions, then you could potentially have very respectable performance in branch-heavy code as well. Of course, problems might arise when executing very tight loops with very few instructions per iteration, but I really don’t think it will be as big a problem as you make it out to be.

    Anand, what is your take on this?
  • oysteini - Friday, May 20, 2005 - link

    I think the PC's biggest problem is the fact that it will take a long time until most normal users have a multi-processor (or multi-core) system, which makes it a bit risky for developers to rely on that kind of architecture. And what will be the standard? 2, 4, 8 processors? A physics PPU? Standards are the main advantage consoles have, and this time it might just take a bit longer for PCs to catch up, because of the new multi-processor architecture found in next gen consoles.
  • Chris - Thursday, May 19, 2005 - link

    Can't wait to see what the PC world counters with.

    3.2 GHz Athlon FX X4 with 4 cores?

    That would eat both of these CPUs alive.
