...and why SMT can be impressive!

If you want to know what is going to happen in the future, it is always a good idea to look at the big iron. After all, many of the techniques that are now popular in low-budget x86 CPUs originated there: SIMD (Cray-1, ILLIAC IV), 64-bit (MIPS R4000) and CMP (IBM Power 4) are just a few examples.

The IBM Power 5 is a very good example of a CPU that is really made for SMT instead of just having it glued on. Up to 8 instructions can be executed in parallel on one of the two cores, while 5 instructions per thread can be fetched and retired. That means that with one thread, you can have up to 5 instructions in parallel, and with two threads running, up to 8 instructions in parallel. Combine this with massive buffers, a decently large L1 (64 KB instructions, 32 KB data) and huge amounts of memory bandwidth, and the SMT capability can really show its potential. IBM reports a performance boost of 40%, while SMT increased the die size by 24%.
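To make the issue-width arithmetic concrete, here is a minimal sketch in Python. It only models the two limits quoted above (5 instructions per thread, 8 per core). The function name and the idea of describing each thread's available parallelism with a single number are illustrative assumptions, not a description of how the Power 5 actually schedules instructions.

```python
# Toy model of the two limits quoted in the text, not of the real Power 5 pipeline.
PER_THREAD_LIMIT = 5   # instructions one thread can fetch and retire per cycle
CORE_LIMIT = 8         # instructions one core can execute in parallel


def peak_instructions_per_cycle(ilp_per_thread):
    """Upper bound on instructions per cycle for one core.

    ilp_per_thread is a hypothetical list holding, for each running thread,
    the number of independent instructions it could supply in a cycle.
    """
    # Each thread is capped by the per-thread limit...
    per_thread = [min(ilp, PER_THREAD_LIMIT) for ilp in ilp_per_thread]
    # ...and the total is capped by what the core can execute.
    return min(sum(per_thread), CORE_LIMIT)


if __name__ == "__main__":
    print(peak_instructions_per_cycle([6]))     # one thread: per-thread limit applies
    print(peak_instructions_per_cycle([5, 5]))  # two threads: core limit applies
    print(peak_instructions_per_cycle([2, 3]))  # two low-ILP threads share the slots
```

The third call is the interesting case for SMT: neither thread can fill the core on its own, but together they use slots that would otherwise sit idle.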

This form of SMT makes much more effective use of processor resources than multi-core. If only one thread is running and there is a lot of instruction level parallelism, it has all the execution resources at its disposal and the CPU acts as a massively parallel superscalar CPU. If two or more threads are running, they can make optimum use of the available execution slots. For each percent that the die size increases, SMT gives you more than one percent of performance back. In contrast, a second core doubles the die size, but rarely improves performance by more than 70%. SMT can be a superb feature to boost the performance of a multi-core CPU without increasing the die size too much.
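The comparison can be worked through with the numbers given in this section. The sketch below is only that arithmetic in Python. The helper name is made up for illustration, the SMT figures are IBM's reported 40% and 24%, and the second-core figures take the 70% mentioned above as an optimistic case against a 100% larger die.

```python
def gain_per_area(perf_gain_pct, die_growth_pct):
    """Percent of extra performance obtained per percent of extra die area."""
    return perf_gain_pct / die_growth_pct


# Figures quoted in the text; the second-core case uses the optimistic 70%.
smt = gain_per_area(perf_gain_pct=40, die_growth_pct=24)
second_core = gain_per_area(perf_gain_pct=70, die_growth_pct=100)

if __name__ == "__main__":
    print(f"SMT:         {smt:.2f}% performance per 1% die area")
    print(f"Second core: {second_core:.2f}% performance per 1% die area")
```

Under these assumptions SMT returns well over one percent of performance per percent of die area, while the second core returns less than one.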

Bringing it all together...

Intel and AMD are playing different trump cards while getting their next generation of quad core designs ready for the server market. It is clear, however, that clock speed will only increase slowly, and will no longer be the most important performance indicator.

Intel can leverage their experience with the power saving features of the P-m to design quad core CPUs with remarkably low TDP. SMT might well be one of Intel’s most important weapons to enable relatively high IPC per core. The fact that the current implementation, called Hyper-Threading, offers only mediocre performance improvements is not a reason to believe that SMT will not have a bright future. SMT added to a high IPC core might even give Intel the edge in the server market. The shared L2-cache in the next generation multi-core CPUs (Merom, Conroe, Woodcrest, Whitefield) should also eliminate Intel’s high cache-to-cache latency.

AMD’s current dual core architecture is vastly superior to Intel’s. Its cache-to-cache communication, which is more than twice as fast, does not pay off in all multithreaded applications, but it should give AMD a scaling advantage in OLTP and some rendering and HPC applications. It will be very easy for AMD to make communication between the cores even faster by attaching a shared L2-cache to the SRQ. AMD can also leverage their knowledge of and experience with the on-die northbridge to lower the latency and increase the bandwidth of the memory subsystem.

I would like to express my thanks to the following people who helped to make this article possible:



28 Comments


  • Houdani - Wednesday, May 18, 2005

    'Splain to me what you believe are the alleged "false assumptions."

    The only outright assumption I observed was located in the comments section. Specifically number two.
  • Ahkorishaan - Wednesday, May 18, 2005

    Intel is by no means panicking, they're riding out a storm, and things will be dicey starting about 2/3 through 2006. AMD has the advantage now, but I honestly don't know if they can hold up against the R&D budget Intel has at its fingertips.

    When P-m features get integrated into Intel's lineups AMD will be faced with the hotter, hungrier chip, and though they have more experience with the on-die Memory controller, and a nice head of steam, that might not be enough.

    I'm a fan of AMD and I applaud their foresight, but they need to keep on the ball if they expect to stay ahead for another year.
  • allanw - Wednesday, May 18, 2005

    All this talk of databases and no mention of PostgreSQL? Cmon..
  • flatblastard - Wednesday, May 18, 2005

    Oh great....more fuel for the "Intel panics" thread fire.
  • Rand - Wednesday, May 18, 2005

    I haven't finished the article yet, but would you care to clarify your objections, Questar?

    At least through the third page I haven't come across any assumptions or even real solid opinions he's put forth as yet.
    Thus far it's merely a technically oriented analysis of their respective offerings, nothing that I've read is particularly new or debatable/controversial.

  • Rapsven - Wednesday, May 18, 2005

    Holy ****, Questar. That's all I'm going to say for you.

    Very informative. Though a lot of the more technical parts of the article flew right by me.
  • Questar - Wednesday, May 18, 2005

    Wow, another AMD fanboy opinion piece based upon false assumptions. Go Anandtech!
  • sprockkets - Wednesday, May 18, 2005

    not this time...

    nice pic on the last page, but I have no idea of the scale
