Final Words

Finding good SSE3 benchmarks wasn't as easy as we would have liked. Other encoding suites react the same way that DivX and AutoGK do, showing little change with SSE3 enabled. This seems to indicate that the K8 architecture is simply resilient when it comes to unaligned 128-bit loads. In the case of Intel's NetBurst, the lddqu instruction may have more impact.
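For readers who have not seen lddqu in code, here is a minimal sketch of the two ways to issue an unaligned 128-bit load from C. It is not taken from any of the encoders we tested. The helper names (load_movdqu, load_lddqu, loads_agree) are our own, and the target attribute assumes GCC or Clang on x86-64. Both loads return the same bytes; any difference is in how the hardware handles a load that crosses a cache line.

```c
#include <pmmintrin.h>  /* SSE3 intrinsics, including _mm_lddqu_si128 */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: unaligned 16-byte load using the SSE2 movdqu form. */
__attribute__((target("sse3")))
static void load_movdqu(const uint8_t *src, uint8_t out[16]) {
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    _mm_storeu_si128((__m128i *)out, v);
}

/* Hypothetical helper: the same load using the SSE3 lddqu form. */
__attribute__((target("sse3")))
static void load_lddqu(const uint8_t *src, uint8_t out[16]) {
    __m128i v = _mm_lddqu_si128((const __m128i *)src);
    _mm_storeu_si128((__m128i *)out, v);
}

/* Returns 1 if both loads yield the source bytes at every offset 0..15. */
static int loads_agree(void) {
    uint8_t buf[48];
    for (int i = 0; i < 48; i++) {
        buf[i] = (uint8_t)(i * 7 + 3);
    }
    for (int off = 0; off < 16; off++) {
        uint8_t a[16], b[16];
        load_movdqu(buf + off, a);
        load_lddqu(buf + off, b);
        if (memcmp(a, b, 16) != 0 || memcmp(a, buf + off, 16) != 0) {
            return 0;
        }
    }
    return 1;
}
```

Because the results are identical, swapping one intrinsic for the other is a one-line change in an encoder's motion search loop, which is why the instruction is easy to adopt even where the gain is small.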

As far as physics and graphics go, the added instructions show potential in our synthetic test. For DCC, CAD, scientific, and other workstation software, the E4 stepping could offer a bit of a performance boost.

In the consumer space, Athlon 64 may not see as much benefit from SSE3, especially since our encoding tests turned up so little performance impact. SSE3 can be used in games, but its impact will likely be minimal: as long as most games remain graphics limited, CPU-side improvements will have a hard time shining through. Of course, for those who like to use lower cost Athlon 64 processors in cheaper workstations, there could be some advantage.

When we take a look at the Opteron 252 in a workstation environment, we will be able to get a better view of what the total package has to offer. As our workstation tests will be in a dual-processor (DP) environment, we'll be able to see how the higher bandwidth helps the Opteron shine.

We would like to have tested more applications in this report on SSE3 performance under the new AMD core. Of interest to us are LINPACK, FLOPS, STREAM, and various other tests that we would need to recompile with proper SSE3 support. As the Intel compiler is designed to optimize for Intel processors, we haven't had a viable source for high-quality SSE3 compilation. Hand optimizing these benchmarks for SSE3 on Opteron would take more time than this short investigation allows. We may look into using GCC for this purpose in future tests. As for real world tests using SSE3, we haven't been able to find many suitable candidates beyond video encoders.

Current SSE3 optimized code paths will likely not show their strengths on Opteron/Athlon until the processors have been in developers' hands for a while. The Intel compiler is also head and shoulders above any resource AMD has up its sleeve. But since SSE3 offers more choices for optimization and code simplification, compilers may have an easier time generating efficient code. Hand optimized code is still important for tight loops in critical sections of performance oriented code. In this case, more powerful and simple options implemented in hardware will help programmers better optimize their own code.
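As a rough illustration of the code simplification SSE3 allows, the sketch below computes a four-element dot product twice: once with the shuffle sequence that SSE-only code needs for a horizontal sum, and once with the SSE3 haddps instruction. The function names dot4_sse and dot4_sse3 are our own, and the target attribute assumes GCC or Clang on x86-64. This says nothing about which version is faster on a given core; it only shows that the SSE3 form is shorter.

```c
#include <pmmintrin.h>  /* SSE3 intrinsics, including _mm_hadd_ps */

/* Hypothetical helper: dot product using only SSE shuffles and adds. */
static float dot4_sse(const float *a, const float *b) {
    __m128 p = _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b));
    /* Swap neighbors: (p1, p0, p3, p2). */
    __m128 shuf = _mm_shuffle_ps(p, p, _MM_SHUFFLE(2, 3, 0, 1));
    /* Pairwise sums: (p0+p1, p0+p1, p2+p3, p2+p3). */
    __m128 sums = _mm_add_ps(p, shuf);
    /* Move the high pair sum down to the low element. */
    shuf = _mm_movehl_ps(shuf, sums);
    /* Low element now holds p0+p1+p2+p3. */
    sums = _mm_add_ss(sums, shuf);
    return _mm_cvtss_f32(sums);
}

/* Hypothetical helper: the same dot product using SSE3 haddps. */
__attribute__((target("sse3")))
static float dot4_sse3(const float *a, const float *b) {
    __m128 p = _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b));
    p = _mm_hadd_ps(p, p);  /* (p0+p1, p2+p3, p0+p1, p2+p3) */
    p = _mm_hadd_ps(p, p);  /* every element holds the full sum */
    return _mm_cvtss_f32(p);
}
```

Both functions take pointers to four floats and need no particular alignment, since they use unaligned loads. Production code would normally check for SSE3 support at run time before calling the second version.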

48 Comments

  • SkAiN - Thursday, February 17, 2005 - link

  • bigpow - Thursday, February 17, 2005 - link

    Funny.
    I work at one of the largest high tech companies today and I can't find any of these Opteron servers. My friends also notice the same trend.
    Large corporations are sticking with Intel, enough said.

    Nice step forward for AMD, still far away to catch Intel.

    For my PC, I use AMD AthlonXP (soon-to-be A64). I wouldn't go with Intel for my use. But then again, I wouldn't go with Opteron too.

    Who's buying this Opteron again?
  • DerekWilson - Thursday, February 17, 2005 - link

    Unfortunately, the platforms I have available to test the Opteron on (nforce 3 pro and nforce pro 2200) only offer overclocking in the form of nTune. And these platforms do not like being pushed out of spec.

    We also have many more tests to run on these processors and platforms and don't wish to see an unfortunate lab accident consume our samples before we squeeze all the data out of them that we are looking for.

    If we finish all our planned tests with Opteron 252, we may look into overclocking. But that will sit on the back burner for some time either way.
  • dannybin1742 - Thursday, February 17, 2005 - link

    isn't this rev supposed to use strained silicon too?
  • ozzimark - Thursday, February 17, 2005 - link

    i know these are opterons, but are we going to get an overclocking article on the new core soon?
  • skiboysteve - Thursday, February 17, 2005 - link

    its funny how intel comes out with SSE, SSE2, SSE3... to compensate for weak x87 FP and a long pipe, but because of marketing AMD has to adopt these instructions as well on a very resilient cpu that doesnt have such pickiness about code... so slap on SSE2 sticker and the performance is no better.

    you could almost blame the kick ass FP performance?

    im not trying to be biased, but i mean, look at the numbers, its the truth. it takes a lot of work to make a long pipe work great in all areas.
  • Fricardo - Thursday, February 17, 2005 - link

    Do you guys have any word on when the revision E stepping comes out for the Athlon 64's? I wonder how long of a gap AMD wants to leave before releasing their desktop parts.
  • jimmy43 - Thursday, February 17, 2005 - link

    In any case, AMD is slowly catching up to Intel in the media encoding segment..Hey more features, im not complaining!
