Final Thoughts

In the last few years, AMD hasn't really been able to compete with Intel in the high-end CPU market. Pretty much since the release of the Nehalem microarchitecture in late 2008, Intel has held the crown for the fastest CPUs, and AMD has been the best option only for budget builds. Bulldozer has suffered from delays, and AMD recently delayed it again because performance didn't meet the company's expectations. However, Bulldozer has the potential to shake Intel's position outside the budget CPU market as well.

According to leaked product positioning slides, Zambezi is aimed at Intel's Core i5 and i7 lineups. Zambezi will feature up to eight cores, which is twice as many as the i7-2600(K)'s four cores. AMD said that it won't join the Hyper-Threading club and that it will deliver as many physical cores as Intel delivers physical and virtual cores combined. It looks like AMD is keeping its word, though it is only delivering half as many "FP/SSE cores". Intel will probably still provide the best single-threaded performance, but AMD's aggressive approach with many physical cores may bring it the trophy for best multi-threaded performance. We shall hopefully see this very soon.
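The core-count comparison above can be written out as a bit of arithmetic. The following is only a rough sketch: the class and field names are made up for illustration, and it assumes the top Zambezi part pairs two integer cores with one shared FP unit per module, while the i7-2600K runs two Hyper-Threading threads on each of four cores, each with its own FP unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    """Hypothetical summary of a CPU's thread and FP resources."""
    name: str
    physical_cores: int       # integer cores exposed to the OS
    threads_per_core: int     # 2 with Hyper-Threading, otherwise 1
    cores_per_fp_unit: int    # assumption: 2 for a Bulldozer module, 1 otherwise

    @property
    def hardware_threads(self) -> int:
        # Threads the OS can schedule at once.
        return self.physical_cores * self.threads_per_core

    @property
    def fp_units(self) -> int:
        # FP/SSE units shared among the integer cores.
        return self.physical_cores // self.cores_per_fp_unit


# Figures assumed from the text: eight-core Zambezi vs. four-core i7-2600K.
zambezi = Chip("Zambezi (8-core)", physical_cores=8, threads_per_core=1, cores_per_fp_unit=2)
i7_2600k = Chip("Core i7-2600K", physical_cores=4, threads_per_core=2, cores_per_fp_unit=1)

for chip in (zambezi, i7_2600k):
    print(f"{chip.name}: {chip.hardware_threads} threads, {chip.fp_units} FP units")
```

Under these assumptions both chips expose eight hardware threads backed by four FP units; the difference is that Zambezi backs every thread with its own integer core. None of this says anything about per-core speed.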

In the server market, AMD's role is a lot more complex. For some HPC applications, AMD offers the best performance at a much lower price. In the midrange, AMD-based servers offer more cores (quad-socket) and, in most cases, higher performance for a relatively small price premium over the typical dual-socket Xeon servers. At the same time, if your applications cannot make good use of all those cores, dual-socket Xeon servers can offer a better performance/watt ratio and lower response times. In the high end, Intel's Xeon E7 completely dominates, and AMD has left this market for now. In the low power market, Intel's low power Xeons offer better performance/watt, and AMD can only compete when every dollar counts. In most cases, however, the price of the server CPU is a minor factor in the grand TCO scheme.

In other words, AMD really needs a server CPU with a much higher performance per core and a better performance/watt ratio. TDP Power Cap or configurable TDP helps AMD's server CPUs keep the electricity bill down by avoiding "bursty" power usage. At the same time, with their implementation, TDP Power Cap should have little effect on the real world (not pure throughput benchmarking) performance if you do not lower the TDP too much. We won't be sure until we have measured it, but it looks like a big step in the right direction: lower TCO and more predictable power usage without a (large) performance penalty.

AMD's Future Plans

Second Generation AMD Fusion lineup

Codename      Krishna and Wichita   Trinity        Komodo         Sepang         Terramar
Architecture  Enhanced Bobcat       NG Bulldozer   NG Bulldozer   NG Bulldozer   NG Bulldozer
SOI           28nm                  32nm           32nm           32nm           32nm
Core count    1-4                   2-4            6-10           Up to 10       Up to 20
DX11 IGP      Yes                   Yes            No             No             No
Socket        N/A                   N/A            N/A            C2012          G2012

Bulldozer will make its way to mainstream CPUs in 2012. Llano's successor, Trinity, will feature up to four next-generation Bulldozer cores. Next-generation (NG) in this context appears to mean architectural tweaks rather than a die shrink, because the CPUs will still be manufactured using 32nm SOI. Zambezi's successor, Komodo, will again increase the core count, bringing it up to 10.

As for the server market, AMD's approach will be a bit more aggressive. AMD will again increase the number of cores, to as many as 20 NG Bulldozer cores. Valencia's successor will be the 10-core Sepang, and Interlagos' successor will be the 20-core Terramar. The server CPUs will also feature PCIe 3.0 support.

Krishna and Wichita will also replace AMD's current Ontario and Zacate APUs. There will be a die shrink from 40nm to 28nm, so at this point Krishna and Wichita look like the most interesting parts of the 2nd gen Fusion lineup. Doubling the cores should yield a nice performance boost in heavily threaded scenarios, though single-threaded performance is still a sore spot for Bobcat compared to other architectures.


59 Comments


  • ltcommanderdata - Friday, July 15, 2011 - link

    "According to leaked product positioning slides, Zambezi is aimed to fight against Intel's Core i5 and i7 lineups. Zambezi will feature up to eight cores, which is twice as many as i7-2600(K)'s four cores. AMD said that they won't join the Hyper-Threading club and they will deliver as many physical cores as Intel delivers physical and virtual cores combined. It looks like AMD is keeping their word, though they're only delivering half as many "FP/SSE cores". "

    With hyperthreading and now Bulldozer's double integer core/shared FPU design, core counts are becoming an increasingly difficult metric to compare. It's important to note that while Bulldozer has doubled the number of integer cores compared to Istanbul, each integer core is actually weaker since Bulldozer only uses 2 non-symmetric ALUs and 2 AGUs compared to 3 symmetric ALUs and 3 AGUs in Istanbul. Perhaps other architectural efficiencies can make up the difference, but I wouldn't be surprised if clock-for-clock each of Bulldozer's integer cores is slightly slower than Istanbul's. I believe Sandy Bridge's integer performance is clock-for-clock better than Istanbul's, so Bulldozer likely needs very well threaded code for its doubled integer cores to shine.

    FPU resources look to be beefed up from 3 units in Istanbul to 4 units in Bulldozer. Compared to Bulldozer, Sandy Bridge's big advantage is native 256-bit AVX units, whereas Bulldozer only has 128-bit FP/SSE resources and needs to split 256-bit AVX instructions, halving performance. So if Intel can convince developers to quickly adopt 256-bit AVX, Sandy Bridge should have a pretty large SIMD advantage.
  • duploxxx - Friday, July 15, 2011 - link

    dude, you just sound like a horrified Intel fanboy. "convince developers to adopt 256bit AVX". Then what about FMA3 and FMA4 which intel doesn't even have.....

    A single BD Module can handle a 256bit AVX or can decide to split into 2 x 128 for each core. It is a decision from AMD to go that way just like intel decides to have a full 256bit for a PH + HT core..... 2 x 256 logic would just need more die space without usage, just like the choice to go for 2 ALU/AGU while the usage of 3 gives almost no gain in server loads besides benchmarking....

    While the FPU 128+128 might be a bit slower we are talking here about perhaps 2-3% since all other parts like cache and memory are shared for a single module, a very negligible difference unless you are a fanboy which is obvious.
  • ltcommanderdata - Friday, July 15, 2011 - link

    "Then what about FMA3 and FMA4 which intel doesn't even have....."

    I believe Bulldozer supports FMA4, but not FMA3, due to Intel flip-flopping at the last minute on which one they'll support, breaking commonality. While FMA4 is a great capability to have, the fact that Intel doesn't have it, as you point out, is exactly the concern. AVX could see faster adoption because it's supported by both Bulldozer and Sandy Bridge.

    "While the FPU 128+128 might be a bit slower we are talking here about perhaps 2-3% since all other parts like cache and memory are shared for a single module and very neglictable difference unless you are a fanboy which is obvious."

    I mention AVX performance because I'm under the impression that Bulldozer gangs its two 128-bit FMACs together to do 1 AVX per module per cycle while Sandy Bridge has 3x256-bit AVX units per physical core. Sandy Bridge's AVX units are non-symmetric and there are no doubt other factors that will impact performance, so it won't be a 3x performance difference, but I'd think it'd be more than 2-3% given the big difference in raw processing resources.
  • duploxxx - Friday, July 15, 2011 - link

    my 2-3% was only the difference between a single 256 vs 2 x 128, not against the intel part... let's see first how much AVX will really be used and how much will end up being 128 bit... doesn't mean something which is 256bit is always better than 128bit.
  • silverblue - Friday, July 15, 2011 - link

    I believe I heard once that Intel's implementation can execute either one 128-bit or one 256-bit instruction per clock. Bulldozer's fused implementation may give up on AVX throughput, but only AVX.
  • rnssr71 - Friday, July 15, 2011 - link

    'It's important to note that while Bulldozer has doubled the number of integer cores compared to Istanbul, each integer core is actually weaker since Bulldozer only uses 2 non-symmetric ALUs and 2 AGUs compared to 3 symmetric ALUs and 3 AGUs in Istanbul.'

    why does everyone get hung up on this? yes, phenom had 3 ALUs and 3 AGUs. big deal! it could only complete 3 instructions per clock- any combination of ALU and AGU instructions but no more than 3. so how often could it process 3 ALU instructions consecutively?
    AMD has said that removing the 3rd AGU won't hurt performance, and core 2, nehalem, and sandy bridge all have 2 AGUs.
    Bulldozer can complete 4 instructions per clock- same as core 2, nehalem and sandy bridge. granted, they all have 3 ALUs available, but how often is the extra one used?
  • SanX - Friday, July 15, 2011 - link

    Got the kids a Phenom II X6 1055T based PC for their games like GTA and just for fun ran some scientific FP-oriented tests on it - parallel algebra codes and some single-core ones.
    Was shocked that at its 2.8GHz stock clock it is twice as fast as my Intel processors overclocked to 4GHz. Is this what you guys get too? Kind of contradicts all these game- and office-oriented benchmarks where Intel is always on top.

    So i'm waiting for these 8-core 32nm chips in the hope to drive them to 4.5 GHZ and get additional factor of 2

    Anyone wants to repeat them ?
  • cosminmcm - Friday, July 15, 2011 - link

    You mean compared to your Intel Pentium 4 @ 4 GHz?
  • GaMEChld - Friday, July 15, 2011 - link

    I too am curious as to what Intel chip was used in that comparison.
  • beginner99 - Friday, July 15, 2011 - link

    Most certainly a dual core with 1/3 of the cores or one of the slowest Core 2 Quads. Sure not a nehalem or sb Quad
