The K8 is here to stay

One of the most interesting points we took away from our discussion of future AMD architectures was Weber's stance that the K8 execution core is as wide as AMD is going to go for quite some time.  Remember that the K8 execution core was carried over from the K7, so the execution core originally introduced in the first Athlon looks set to be with us even beyond the Athlon 64. 

What's even more interesting is that Intel's strategy appears to confirm that AMD's decision was indeed the right one. After all, it looks like the Pentium M architecture is eventually going to be adapted for the desktop in the coming years.  Based on the P6 execution core, the Pentium M is inherently quite similar (although also inferior) to the K7/K8 execution core that AMD has developed.  Given that Intel has been slowly but surely implementing architectural features that AMD introduced over the past few years, we wouldn't be too shocked to see an updated Pentium M execution core that is more competitive with the K7/K8 by the time the Pentium M hits the desktop. 

Fred went on to say that for future microprocessors, he's not sure that the K8 core necessarily disappears; in the long run, future microprocessors could feature one or more K8 cores complemented by other cores.  Weber's comments outline a fundamental shift in the way that microprocessor generations are looked at.  In the past, the advent of a new microprocessor architecture meant that the outgoing architecture was retired - but now it looks as if outgoing architectures will be incorporated and complemented rather than put out to pasture.  The reason for this reuse-instead-of-retire approach is simple - with less of a focus on increasing ILP, the importance of optimizing the individual core decreases, and the problems become questions like: how many cores can you fit on a die, and what sort of resources do they share? 

In the past, new microprocessor architectures were somewhat decoupled from new manufacturing processes.  You'd generally see a new architecture debut on whatever manufacturing process was current at the time and eventually scale down to smaller and smaller processes, allowing for more features (i.e. cache) and higher clock speeds.  In the era of multi-core, it's the manufacturing process that really determines how many cores you can fit on a die, and thus, the introduction of "new architectures" is very tightly coupled with smaller manufacturing processes.  We put "new architectures" in quotes because often, the architectures won't be all that different on an individual core basis, but as an aggregate, we may see significant changes. 
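The ILP-to-TLP shift described above is ultimately a software story as well: instead of one wide core extracting parallelism from a single instruction stream, the program hands independent chunks of work to multiple threads, one per core. As a rough illustration (our sketch, not anything from AMD), here is how a simple summation might be split across worker threads in Go:

```go
package main

import (
	"fmt"
	"sync"
)

// sumParallel splits the array across `workers` goroutines - a software
// analogue of the thread-level parallelism (TLP) that multi-core designs
// target: each worker sums its own slice, then the partial sums are combined.
func sumParallel(data []int, workers int) int {
	var wg sync.WaitGroup
	partial := make([]int, workers) // one slot per worker, no sharing
	chunk := (len(data) + workers - 1) / workers
	for w := 0; w < workers; w++ {
		lo := w * chunk
		hi := lo + chunk
		if lo > len(data) {
			lo = len(data)
		}
		if hi > len(data) {
			hi = len(data)
		}
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			for _, v := range data[lo:hi] {
				partial[w] += v
			}
		}(w, lo, hi)
	}
	wg.Wait()
	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	data := make([]int, 1000)
	for i := range data {
		data[i] = i
	}
	fmt.Println(sumParallel(data, 4))
}
```

The point of the sketch is that the speedup here comes from how many independent workers the hardware can actually run at once - exactly the property that scales with core count rather than with a wider individual core.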

How about a Hyper Threaded Athlon?

When Intel announced Hyper Threading, AMD wasn't (publicly) paying any attention at all to TLP as a means to increase overall performance.  But now that AMD is much more interested and more public about their TLP direction, we wondered if there was any room for SMT a la Hyper Threading in future AMD processors, potentially working within multi-core designs. 

Fred's response to this question was thankfully straightforward; he isn't a fan of Intel's Hyper Threading in the sense that the entire pipeline is shared between multiple threads. In Fred's words, "it's a misuse of resources."  However, Weber did mention that there's interest in sharing parts of multiple cores, such as two cores sharing a FPU to improve efficiency and reduce design complexity.  But things like sharing simple units just didn't make sense in Weber's world, and given the architecture with which he's working, we tend to agree. 

Comments

  • stephenbrooks - Thursday, March 31, 2005 - link

    I'm a bit confused by the terminology in places. Doesn't ILP mean "Instruction-Level Parallelism", i.e. that it applies to distributing instructions between different execution units, and perhaps other tricks like out-of-order execution, branch prediction, etc.? But it certainly does NOT include "frequency", as seems to be implied by the first page! Unless it means that the longer pipeline will be interpreted as more parallelism (which is true). But that's not the only way to increase clock speed... a lot comes from the process technology itself.
  • MrEMan - Thursday, March 31, 2005 - link

    I just realized that the link to "IDF Spring 2005 - Predicting Future CPU Architecture Trends" requires that you go to the next page, and not the one the link points to, and it is there where ILP/TLP is explained.
  • MrEMan - Thursday, March 31, 2005 - link

    What exactly is ILP/TLP ?
  • sphinx - Thursday, March 31, 2005 - link

    #12 PeteRoy

    I would have to agree with you.
  • Son of a N00b - Thursday, March 31, 2005 - link

    Great article Anand! I feel better informed and this was something that filled up my little spot of curiosity I had saved for the future of processors.


    It seems as if AMD will continue to keep up the great work. I will be a customer for a long time.
  • hectorsm - Thursday, March 31, 2005 - link

    blckgrffn, I did not see your post until now. Your explanation seems to make a lot of sense. I guess it is now a matter of opinion how much that "30%" is worth in terms of heat and transistors.

    thanks.
  • hectorsm - Thursday, March 31, 2005 - link

    Thanks Filibuster. The article confirms the up to 30% gain in processing power under certain multithreaded scenarios. But I am still confused as to why this is a waste of resources, especially when HT was designed for multi-threaded use.

  • blckgrffn - Thursday, March 31, 2005 - link

    The point of hyperthreading being a waste of resources is that it costs A LOT to put features like that into hardware, and the die space and transistors used to do HT could probably have been used in a better way to create a more consistent performance gain, or could have been left out altogether, reducing the complexity, size, and power use/heat output of the processor and putting a little bit more profit per chip sold at the same price into Intel's pocket. That is why it is a misuse of resources.

    Nat
  • BLHealthy4life - Thursday, March 31, 2005 - link

    Just release the FX57 already...
  • hectorsm - Thursday, March 31, 2005 - link

    "not sure what you mean by "processing efficiency". all HT does is virtually separate the processor into two threads. maybe I'm missing something, but I can't figure out why everyone associates HT with performance gain. "

    There are supposedly fewer mispredictions in the pipeline since there are two threads sharing the same pipes. Even though each thread gets only part of the total processing power, the sum of the two appears to be greater with HT. Increases of up to 30% in total output have been reported when running two instances of folding@home with HT.

    So I am still wondering why Fred is calling it a "misuse of resources". Maybe he knows something we don't. It would be interesting to know more about this. Maybe someone at AnandTech could get a clarification from Fred?
