More Registers Equals More Performance?
As mentioned earlier, Intel's SSE implementation has essentially double the register space of AMD's 3DNow implementation. This means that register-to-register operations can be done much more efficiently, without constantly repacking data into the registers. Take, for example, the task of transforming a list of vertices using a 4 x 4 master matrix. Storing the master matrix alone exhausts 3DNow's registers, so every other register manipulation has to use registers that already hold part of the matrix. The matrix then has to be repacked all over again, which is time consuming. Luckily, according to Tim Sweeney, lead programmer at Epic Games,
"Since register-memory instructions are as fast as [register-register] instructions, I don't usually need to use more than 4 registers."
However, Sweeney goes on to state that:
"The register limit will probably hurt other types of apps (signal processing, sound) more than 3d transformations."
This is perhaps a technical reason why many application developers, such as Adobe, may prefer SSE over 3DNow.
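The register budget behind the 4 x 4 transform example can be sketched in portable C. This is only an illustration under stated assumptions, not code from either vendor: the `vec4` and `mat4` types and the `regs_needed`, `transform`, and `transform_list` helpers are hypothetical names, and plain C arrays stand in for the actual SIMD registers. It assumes a 3DNow register holds two floats and an SSE register holds four, so a 16-float matrix fills all eight 3DNow registers but only four of the eight SSE registers.

```c
#include <stddef.h>

/* Hypothetical 4-wide type standing in for one 128-bit SSE register.
   A 3DNow (MMX) register is 64 bits wide and holds only two floats. */
typedef struct { float lane[4]; } vec4;

/* Column-major 4 x 4 master matrix: four vec4 columns = 16 floats. */
typedef struct { vec4 col[4]; } mat4;

/* Registers needed to hold `floats` values when each register holds
   `floats_per_reg` of them (rounded up). */
static size_t regs_needed(size_t floats, size_t floats_per_reg)
{
    return (floats + floats_per_reg - 1) / floats_per_reg;
}

/* out = m * v, accumulated one matrix column at a time, which is the
   shape a SIMD version would take (broadcast v[c], multiply, add). */
static vec4 transform(const mat4 *m, vec4 v)
{
    vec4 out = {{0.0f, 0.0f, 0.0f, 0.0f}};
    for (int c = 0; c < 4; ++c)
        for (int r = 0; r < 4; ++r)
            out.lane[r] += m->col[c].lane[r] * v.lane[c];
    return out;
}

/* Transform a list of vertices in place with one master matrix. */
static void transform_list(const mat4 *m, vec4 *verts, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        verts[i] = transform(m, verts[i]);
}
```

Under these assumptions, `regs_needed(16, 2)` gives 8 for 3DNow and `regs_needed(16, 4)` gives 4 for SSE, which is why the 3DNow version has nothing left over for the vertex and intermediate results unless it reads the matrix from memory instead.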
Since AMD's 3DNow instructions use the same register set as MMX, programmers can use MMX instructions to "hack" the 3DNow registers. Register hacking is used for bit shifts and problem-specific optimizations. Some were skeptical that this would be possible with SSE, since SSE uses an entirely new set of registers. Luckily, Intel provided a significant number of "bit twiddling" instructions which allow programmers to do virtually everything that is possible by mixing in MMX instructions. (This is one of the reasons why SSE is composed of 70 instructions, as opposed to 3DNow's 21.)
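As a rough illustration of what this kind of register hacking looks like, the sketch below applies integer bit operations to the raw bits of a float. It is not taken from any particular game or library: the helper names are hypothetical, the code assumes IEEE 754 single-precision floats, and scalar C stands in for the packed instructions. On 3DNow the same tricks would typically be done with MMX logical instructions such as PXOR and PAND, while SSE offers its own equivalents such as XORPS and ANDPS.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float's bits as a 32-bit integer and back.
   memcpy keeps this well-defined in standard C. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Negate by flipping the sign bit (the XOR trick). */
static float negate_by_xor(float f)
{
    return bits_float(float_bits(f) ^ 0x80000000u);
}

/* Absolute value by clearing the sign bit (the AND trick). */
static float abs_by_and(float f)
{
    return bits_float(float_bits(f) & 0x7FFFFFFFu);
}
```

The point of the sketch is that neither helper performs any floating-point arithmetic; both work purely on the bit pattern, which is the kind of operation that needs either shared registers (3DNow with MMX) or dedicated logical instructions (SSE).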
Last but not least, both of these great SIMD instruction sets are useless without software support. Currently, 3DNow has the advantage in this area because it already has an established user base of millions of PCs, while SSE's user base is still in its infancy. Even so, 3DNow's current user base may not be enough to compete with Intel's dominance of the market, as Anandtech concluded in the Pentium III review. Intel is promoting SSE aggressively and intends to unveil the Pentium III with a load of apps either available or due for release around March 1st. A more complete list is available here. Only time will tell whether SSE can catch up to 3DNow's 9-10 month head start.