  • kenyee - Tuesday, February 09, 2010 - link

    Intel has some cool features like virtual port assignment, etc. that AMD currently doesn't have (so databases and gigabit network cards go faster on Intel VT hardware in the upper end Xeons).
    Does the new AMD core have any improvements in this?
  • mino - Saturday, February 13, 2010 - link

    AMD does not produce NIC chips. Go ask Broadcom about that. It's a pure NIC feature.
  • haplo602 - Tuesday, February 09, 2010 - link

    I think I recall that AMD expects the GPU to take most of the FPU load in the future, so maybe the APU chip will be an FP op monster when the GPU part is not used much. Pair that with an external GPU and a theoretically unused on-die GPU in AM3, and there is actually no problem for AMD to make the chip AM3 compatible.

    Anyway, I don't give a damn about DirectCompute, whatever version it is. I want good OpenCL performance and finally decent OpenGL drivers from AMD. And finally, a working and usable Linux driver would be nice.
  • mjw25a - Wednesday, February 10, 2010 - link

    Correct, worth taking a look at.

    If there's one thing interesting about the two next-generation architectures (the Bulldozer and Bobcat modules), it's that they both appear to have been designed with far less floating point power than previous AMD CPUs.

    This tends to indicate that both next gen modules will likely have a GPU on the same die to shunt floating point operations to.

    Traditionally, GPUs completely smash the x86 architecture when it comes to floating point performance, so this will be a good move.

    Llano looks to be the use of the current-gen Propus (Athlon II) core with an APU to massively boost floating point performance.

    Not a big deal, as it looks like they're positioning Llano as their mainstream product, whereas Bulldozer will fit into the same niche the i7 is currently in. Bobcat is their Atom equivalent, which should beat that handily.

    I'll be keenly watching the renewed CPU wars next year. This integration of the GPU has the potential for AMD to leapfrog Intel in all performance segments if they can pull it off. Intel's graphics chips so far have been abysmal.

    Larrabee will not help them in this respect, as it's essentially a large grid of dumbed-down x86 cores. As I've previously mentioned, floating point operations have always been x86's weakness.

    Next year will be interesting indeed.
  • hwhacker - Tuesday, February 09, 2010 - link

    We pretty much already know from the die shot the 'APU' is 480sp, which implies 8 ROP and 24 TMU. A little stronger than 5670 architecturally speaking.

    GF has said 28nm bulk should allow for a 40% increase in clock speed compared to the same size die and TDP of a 40nm bulk chip, and 28nm is a 10% linear shrink of 32nm. Of course SOI should have better characteristics than bulk, but that gives you an idea of what to expect. Because of the die size of the core minus L2 given in the article, we now know Fusion is 13x13mm (169mm2), or exactly the same size as Propus (Athlon II X4), so THIS DOES APPLY.

    Imagine a very possible scenario where the GPU is clocked at 1/4 of the CPU standard. It's very plausible this could start at 3.2-3.6GHz (800-900MHz GPU) and creep up to 4GHz with a 1GHz GPU clock, contained in a 95W TDP. Wouldn't be surprised by a ~875MHz set clock either.

    Add to that the probability of Sideport going to two chips instead of one. When the 900-series chipset(s) launch, GDDR5 will be in 2Gb form. This means a likely 512MB of decent on-board memory, perhaps of the 7Gbps variety. That's 56 GB/s on a 64-bit bus.
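
    If it helps, that sideport bandwidth figure checks out as bytes, not bits; a quick sketch (the 7 Gbps per-pin rate and 64-bit bus width are the poster's speculation, not announced specs):

```python
# Peak sideport bandwidth from per-pin data rate and bus width.
# Both inputs are speculative figures from the comment above.
per_pin_gbps = 7              # GDDR5 effective data rate per pin, Gbit/s
bus_width_bits = 64           # hypothetical sideport bus width

total_gbps = per_pin_gbps * bus_width_bits   # aggregate: 448 Gbit/s
total_gb_per_s = total_gbps / 8              # 56 GB/s
```

    So the "56" is gigabytes per second, not gigabits.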

    What we'll likely end up with is a GPU faster than an 8800GT (sometimes by a lot) but slower than a 5750/GTS250, likely questioning the usefulness of a 128sp, 64-bit, 28nm Fermi for either platform, and perhaps a Crossfire partner to the smallest Northern Islands part. This also brings the de facto standard for gaming up to this level, which is GREAT news, because lots of people own old 8800GTs or similar-performing cards as hand-me-downs.

    On the CPU side, simply compare the Athlon X4 to Clarkdale. I would imagine the 2C version of Sandy Bridge is essentially Clarkdale with the GPU on die, with similar CPU clock speeds, TDP, and die size to Fusion. That would mesh with the 4C die shot that's been on the net for 6 months (citing 3-3.8GHz clocks and a 1-1.4GHz GPU). The thing is, AMD will have a CAPABLE GPU.

    Personally, I think it's going to be have-your-cake-and-eat-it for anyone gaming below 1680x1050 at that point - say a 720/768p HTPC, or a good casual all-rounder with GPGPU perks. Plus, if you look at a 5670, there's plenty of stuff it can run at a decent rez... this could be up to a third faster in some cases, if such speculation pans out.
  • tcube - Saturday, March 20, 2010 - link

    Your assumption that the GPU will not have the same clock as the CPU is kind of odd, to say the least... It's the same (exactly the same) process as the CPU, so why on earth would you want it artificially downclocked? Since a) SOI is much more efficient and dense, and b) you're working on a smaller node with much better technology, much less heat dissipation, and so on... The two things aren't even two dies like in the Intel solution... they are on the same friggin' silicon. If it features 480 SPs @ 3+ GHz it would behave like a ~1600-SP bulk GPU (erm... remember the HD5870...), so what's the deal? Plus think of all the voltage/frequency synchronization you'd need to do between the two cores to keep them in line... Why even bother with all that if you can put a GPU+CPU on one piece of silicon that can outrun any current-generation laptop solution at far less power draw?
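
    For what it's worth, the 480-SPs-at-3-GHz comparison can be sanity-checked with the shader-count × clock product, a crude proxy for shader throughput (every number here is a hypothetical from the post, not an announced spec):

```python
# Crude shader-throughput proxy: stream processors × clock (SP·GHz).
# All figures are the poster's hypotheticals, not announced specs.
fusion_sp, fusion_ghz = 480, 3.0        # speculated on-die GPU at CPU clock
discrete_sp, discrete_ghz = 1600, 0.85  # HD 5870-class bulk part (~850 MHz)

fusion_throughput = fusion_sp * fusion_ghz        # 1440 SP·GHz
discrete_throughput = discrete_sp * discrete_ghz  # 1360 SP·GHz
```

    On that crude metric the two come out roughly even, which is the claim - though in practice ROPs, TMUs and memory bandwidth would matter just as much.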
  • LeftSide - Wednesday, September 01, 2010 - link

    I don't think you understand the difference between GPUs and CPUs. Because of the way they are designed, CPUs clock higher; this has to do with the depth of the pipeline in a CPU. Ever wonder why the highest-clocked Pentium 4 still clocks higher than the fastest i7?
  • iamezza - Tuesday, February 09, 2010 - link

    No way the GPU would be THAT fast. It would completely destroy ATI sales.
  • Am1R - Thursday, February 11, 2010 - link

    Well, by that time ATI discrete GPUs will be twice as fast.
  • knowom - Tuesday, February 09, 2010 - link

    I hate new socket boards; hopefully AMD will make some 32nm CPUs for their current AM2/AM3 motherboards.
  • Kiijibari - Tuesday, February 09, 2010 - link

    Sure they do... ever heard of the Bulldozer core? Take 8 of them, put them on a die together with a memory controller, caches and HyperTransport, and say hello to Zambezi - fitting into AM3.

    Much better than an old K10 @32nm :)
  • GaMEChld - Wednesday, February 10, 2010 - link

    I suspect that the Zambezi will be 4 Bulldozer modules, each showing up to the OS as 2 cores. I think they will market it as 8, but it will be 4 Bulldozer modules. Not that I'd mind that or anything (My current 4 Cores / 4 Threads aren't exactly being strained). Just want to make sure they aren't trying to slip one past us.

    The main thing I want is for Bulldozer to be much faster clock for clock than STARS.
  • DigitalFreak - Monday, February 08, 2010 - link

    Essentially a quad core Athlon II vs Intel's next gen technology? Gee, I wonder how that will work out.
  • JKflipflop98 - Wednesday, February 10, 2010 - link

    :) I don't think it's going to work out so well for the little guy.
  • Calin - Tuesday, February 09, 2010 - link

    They will hit a lower price bracket than Intel is doing right now.
    Same computing power at lower cost? Bring it on!
  • GaiaHunter - Monday, February 08, 2010 - link

    This is just their test run on APUs.

    They will pit (or hope to) Bulldozer vs higher-end Intel.
  • Blessedman - Monday, February 08, 2010 - link

    So with the integration of CPU/GPU, who cuts through the muck and produces a CPU that is more like a GPU as far as parallel processing goes? To clarify: it seems GPUs are innovating faster and becoming quicker at doing the all-purpose functions of mainstream CPUs. So when do they ditch the idea of 2-6 thick-core CISC chips in favor of thin mass-multi-core RISC chips that can emulate CISC commands quicker than native CISC CPUs? Or am I just way, way off base?
  • tcube - Saturday, March 20, 2010 - link

    You see, basically you're right... practically you're not. To create an emulator you need extra power. To create generic wiring for a jack-of-all-trades CPU you need technology that's, in my opinion, at least 10-20 years out. Besides the fact that you need to manage the entire wiring of small unified processing units, you need to translate the input, and then you need to make this entire thing happen quicker than command -> dedicated wiring -> output (per cycle). That means your jack-of-all-trades needs to be so insanely efficiently engineered that it doesn't waste extra energy or cost you extra clock cycles, and so on and so forth. And this, my friend, is not on anybody's drawing board at the moment. Even AMD sensed that this is the future back when they decided to go ahead with Fusion (and only now has Intel gotten the idea with Larrabee)... but they just didn't realise how complex the task is... luckily for us Intel did, and gave us some good CPUs while AMD was in the s(t)inkhole... now they seem to be coming back. Let's welcome this advancement and hope for the best competition 2010+.
  • Calin - Tuesday, February 09, 2010 - link

    "that can emulate CISC commands quicker then native CISC CPU's"

    We already did. The last x86 processor that was entirely CISC (I too may be off base) was the 386. The 486 started using a pipeline (I think), the Pentium added superscalar execution, and by the time of the Pentium M micro-op fusion appeared (CISC instructions are broken into RISC-like micro operations, and the Pentium M could "fuse" together micro-ops to execute them in a single clock cycle - even if the micro-ops were from different x86 instructions).

    As for emulating x86 with a different internal architecture, you should look at the Transmeta processors (which were very wide and RISC-like internally, and ran x86 code through a kind of interpreter).
  • qwertymac93 - Monday, February 08, 2010 - link

    With DX11 support comes support for the latest DirectCompute model. This support is meaningless for games, as these "APU"s will be too weak to play games making use of it, but applications like video rendering might use it and accelerate encoding dramatically; we are already seeing this with the Avivo video encoder and 3rd-party apps that use ATI Stream (and CUDA). The support of DX11 tells me this APU will be much more powerful than the 3300/4200. With the launch of the 5450, we saw that adding DX11 support took up a very large amount of space, which is why the 5450 has the same number of stream units despite using the much smaller 40nm process. I think DX11 (and OpenCL) will play a bigger part in these new CPUs, and an IGP with, say, 120 SPUs would accelerate encoding (and general FP performance, I bet) dramatically. All of this makes sense, as Bulldozer will have half as many FP units as integer units; the IGP will have to pick up the slack.
  • Griswold - Tuesday, February 09, 2010 - link

    It's not meaningless per se, because that GPU could be used to calculate physics via DirectCompute and leave the horsepower of the discrete card to the rendering of graphics.
  • ExarKun333 - Monday, February 08, 2010 - link

    Let's see some benchmarks, otherwise this is just PR stuff I could get from Reply
  • Basilisk - Monday, February 08, 2010 - link

    It's not clear to me why a new socket is necessitated. I'm not doubting you, and I can think of ways that on-chip graphics might benefit from a few pins, but I'd appreciate a clarification.
  • Jellodyne - Tuesday, February 09, 2010 - link

    I dunno why not - the completed frames could be sent out on the HT bus. A relatively simple HT-to-video-out device, either standalone or in the southbridge, could unpack them to DVI, or to a DAC if you're going to a VGA out. And the MB would still work with a CPU without onboard video - the vid outs just wouldn't function.
  • Calin - Tuesday, February 09, 2010 - link

    I don't know how much of the integrated graphics are on the microprocessor. However, to output a high definition image (1920 by 1080 at 60 Hz, 16 millions colours) one needs to send out about 480 MBps of data (8 MBytes for a frame, times 60 frames a second).
    Now, the processor has quite a bit of memory bandwidth, and quite a bit of HyperTransport bandwidth - but it will be eaten into by the need to use some of it for GPU memory access (remember that current generation graphic cards use tens of GBps of memory bandwidth, and are in some cases limited by memory bandwidth). Contrasting this, CPUs hardly go over 10 GBps memory bandwidth.
    That being said, I think the processor will have direct video output, and for this reason it needs some more dedicated pins (nine for VGA, maybe another ten for DVI, and some more of those pins for HDMI, DisplayPort, whatever else). This will make the mainboards cheaper and easier to build, and would allow the use of existing north/south bridges and so on.
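
    The ~480 MBps scan-out figure above, step by step (the small gap to my result comes down to decimal vs binary megabytes):

```python
# Raw scan-out bandwidth for a 1920x1080 @ 60 Hz, 32-bit display.
width, height = 1920, 1080
bytes_per_pixel = 4            # 24-bit colour padded to 32 bits
refresh_hz = 60

frame_bytes = width * height * bytes_per_pixel       # 8,294,400 bytes (~8 MB)
stream_mib_per_s = frame_bytes * refresh_hz / 2**20  # ~475 MiB/s
```

    So roughly half a gigabyte per second just to push the frames out - tiny next to GPU memory bandwidth, but not free.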
  • JKflipflop98 - Wednesday, February 10, 2010 - link

    We try to keep as few pins on the processor as possible. I would imagine there are only 8-10 video-out pins that will need to be run through a conversion chip of some sort to produce monitor output. The CPU die itself is incapable of handling the +/- 5V needed to drive your DVI connection.
  • FaaR - Tuesday, February 09, 2010 - link

    No reason to add piles and piles of extra pins to the CPU to support the current mess of analog and digital interconnects, or to waste die space on DACs, TMDS transmitters and so on, when all of that belongs much more in the chipset.

    Consider AMD's current infatuation with multi-monitor support. Just how MANY pins would you like AMD to add to support 3+ monitors? It'd require dozens. You pack all that shit into the chipset instead and voila - you can support virtually as many monitors as you like. The HyperTransport bus will be virtually unused with on-die graphics (save for some intermittent disk and network I/O, which is marginal compared to the total bandwidth of the interface), so it could support an almost "unlimited" number of outputs, or at least as many as the system has room for on the backplate. :)

    Not sure how you figure monitor outputs coming from the CPU would be "cheaper" or "easier to build" than them coming from the chipset. Seems about the same from my perspective.
  • GeorgeH - Monday, February 08, 2010 - link

    That threw me as well; Intel doesn't need a new socket for P55 vs H55, so why would AMD?
  • qwertymac93 - Monday, February 08, 2010 - link

    They did need a new socket - what did you think Socket 1156 was?
  • GeorgeH - Monday, February 08, 2010 - link

    Market segmentation?

    AM3 was only released ~6 months before 1156 (and not in 2006), so it doesn't seem entirely out of the question that AMD could have planned for video out just as easily as Intel did.
  • stlbearboy - Monday, February 08, 2010 - link

    They could have, but then Socket AM3 processors would not fit into Socket AM2+ boards. AM3 was designed to be backwards compatible with AM2. The question is whether AMD will at some point stop including an IGP on their north bridges after the 800-series chipsets.
  • Kiijibari - Monday, February 08, 2010 - link

    Because Intel already designed the 1156 socket for that purpose; it is just some months old.

    AMD's Socket AM3, however, is based on AM2, and that was already 4 years ago ... nobody thought about on-chip graphics back then.
  • weilin - Monday, February 08, 2010 - link

    It's very likely that when Intel overhauled their chipsets going from 775 to 1156 (late 2009), they took into account the pins necessary for video, whereas when AMD overhauled theirs from 939 to AM2/AM3 (~2006), this idea of Fusion was just that... an idea...
  • StevoLincolnite - Monday, February 08, 2010 - link

    I personally think the new processors will work in AM3 boards; however, doing so would then deactivate the graphics. AMD has enjoyed some great backwards compatibility of late and it's a good selling point, so it wouldn't surprise me if they engineered a solution around it.

    Plus, if you were going to throw this chip into an AM3 board, chances are you have a Radeon 3xxx or higher IGP anyway, and the GPU on the CPU wouldn't be required in such a case. (Especially considering there is a substantial number of boards with Side-port memory - why make that go to waste?)

    I guess we have to play the waiting game for a while yet for more details to emerge, but currently I'm running an AM3 Athlon II 620 in an ancient AM2 board (not AM2+) and it works a treat; if a new board is required for me to upgrade that system I wouldn't lose any sleep over it. :P
  • mesiah - Tuesday, February 09, 2010 - link

    Buying this chip and then disabling the GPU would totally defeat the purpose of the chip.
  • Griswold - Tuesday, February 09, 2010 - link

    You mean just like many are doing with the current Clarkdales?
  • mino - Saturday, February 13, 2010 - link

    CPU vs. GPU usability:
    Clarkdale 10:1
    Llano 10:5

    Hope you get the picture. Clarkdale just has an on-package bundled G45 northbridge.
    Llano is/will be a fully integrated design.
  • Jjoshua2 - Monday, February 08, 2010 - link

    What I would like to know is whether there is any advantage for the common user, and whether there is any advantage for the common AT reader. I think the common user is fine with an Atom that can play Hulu, while the AT reader wants games and is interested in CUDA projects like Folding@home - will this help that?
  • techwriters4breakfast - Monday, February 08, 2010 - link

    Went to this CPU article with Anand as the usual author,

    then read the first phrase: "After cashing Intel's check..."

    I was prepared for some Intel propaganda, hehe.
  • acejj26 - Monday, February 08, 2010 - link

    Does no one edit slides before they publish them? "Across the broad?" Shouldn't PR slides go through several editors before being made public?
  • jjjpflynn - Tuesday, February 09, 2010 - link

    Perhaps you should read more carefully.

    "...across the broad operating range of this core..."

    Nothing wrong with it at all.
  • UNHchabo - Monday, February 08, 2010 - link

    The sentence involved the phrase "...across the broad operating range..."

    I can see why you thought the line ended there though.
  • Kibbles - Monday, February 08, 2010 - link

    In that X-ray, can you tell where the CPU/GPU are?
  • Kiijibari - Monday, February 08, 2010 - link

    NOO, because it is just a plot of the x86 core ... it says so on the picture ...
  • Kibbles - Monday, February 08, 2010 - link

    So that's one single core?
  • Calin - Tuesday, February 09, 2010 - link

    Yes, one single core.
    The final processor will have four of those (each core includes L2 cache), then some "glue" logic, memory controller and so on, then the graphic unit.
  • Hiravaxis - Monday, February 08, 2010 - link

    I don't think DirectX 11 is of much value to a first iteration CPU/GPU integration like this.

    The GPU won't be powerful enough to take advantage of any of the Dx11 functions, so it's really just a check box for AMD.

    But kudos to them if they can get this out to compete with Sandy Bridge!
  • tcube - Saturday, March 20, 2010 - link

    Just do the math, man... this is NOT bulk we're talking about, this is SOI/HKMG... that means you will get it in 3GHz flavours, meaning it can push about 2 TFLOPS. At that horrendous throughput it is an equal match for today's discretes. Think: 4x the speed, 32nm vs 40nm, think a much denser SOI vs bulk process (provided it will not have less than 4x fewer ROPs than the HD5870). This baby is going to play Crysis like a charm... not to mention new games. All this provided the thing is not going to be memory starved! So hell yeah! Give me 4-laned 2GHz DDR3 and I won't need a discrete any longer! This thing will prove itself or fail depending on the I/O it has... let's pray AMD gets the balance right this time around! And let's pray we get an SOI incarnation of the HD5870 as the next generation...
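
    The "about 2 TFLOPS" figure roughly works out if each stream processor issues a multiply-add (2 FLOPs) per cycle - keeping in mind the 480-SP count and the 3 GHz clock are both speculation from the thread, not announced specs:

```python
# Peak single-precision throughput: SPs × FLOPs/cycle × clock.
# SP count and 3 GHz clock are speculative figures from the thread.
sps = 480                  # stream processors assumed for the APU
flops_per_sp_cycle = 2     # one multiply-add per SP per cycle
clock_ghz = 3.0            # speculated clock for a 32nm SOI part

tflops = sps * flops_per_sp_cycle * clock_ghz / 1000   # 2.88 TFLOPS
```

    That's 2.88 TFLOPS peak, so the ballpark holds - the weak link is the assumption that a GPU's shader array could actually run anywhere near 3 GHz.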
  • stlbearboy - Monday, February 08, 2010 - link

    They are not using DX11 for the graphics; they are after the DirectCompute/OpenCL functions. The hope is that software will be compiled using those API calls and thus will be faster on AMD silicon.
  • MadMan007 - Monday, February 08, 2010 - link

    If a program is compiled for a standardized API, it won't favor one architecture over another due to the compiler; rather, the architecture with the better implementation will run the code faster.
  • Tanclearas - Tuesday, February 09, 2010 - link

    ... so the "architecture" that has DX11 hardware (the "AMD silicon") would presumably be faster (or "favored") if a program is compiled with the standardized DC/OCL API.

    That's pretty much what was said, so I'm not sure what you were getting at in your reply.
  • stlbearboy - Monday, February 08, 2010 - link

    DX11 has more compute features than DX10. So if a program is compiled in OpenCL, it will be able to perform more functions using the IGP on the AMD system than on the Intel system. Kind of a reverse of optimizations on the CPU, where Intel normally included the instructions first.
  • Alexvrb - Monday, February 08, 2010 - link

    Bingo. DirectCompute 5.0 requires a DirectX 11 chip. With DX10 you get DC 4.0, with DX10.1 you get DC 4.1. Both are largely inferior to DC 5.0.
  • Hiravaxis - Tuesday, February 09, 2010 - link

    Will DC 5.0 go anywhere without Intel backing it?
    Or will this tech overlap at all with Nvidia's efforts with Fermi?
  • Alexvrb - Tuesday, February 09, 2010 - link

    As Mr Perfect points out, DC 5.0 runs on any DX11 card. I'm sure it will run on future DX12 cards too. Even integrated DX11 solutions from Nvidia/AMD will be able to run DC 5.0 code, and thus accelerate a lot of parallel code. Not to mention that a piece of software can be written with both DC 4.0 and DC 5.0 code paths, much like having multiple rendering paths in 3D games.
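
    A minimal sketch of that multiple-code-path idea - query the device's compute capability at startup and dispatch accordingly (the function and the numeric feature levels here are illustrative stand-ins, not a real D3D API):

```python
# Hypothetical runtime dispatch between DirectCompute feature levels.
def pick_compute_path(feature_level: float) -> str:
    """Choose the best available compute path for a device."""
    if feature_level >= 5.0:
        return "cs_5_0"    # DX11 hardware: full DirectCompute 5.0 path
    if feature_level >= 4.0:
        return "cs_4_x"    # DX10/10.1 hardware: reduced DC 4.0/4.1 path
    return "cpu"           # no capable GPU: plain CPU fallback

# e.g. a DX10.1 part would get the reduced path:
path = pick_compute_path(4.1)
```

    Same pattern games already use for DX9/DX10/DX11 rendering paths, just applied to compute.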
  • Mr Perfect - Tuesday, February 09, 2010 - link

    AMD/ATI, Nvidia and Microsoft have all signed on to DX11 (which contains DirectCompute 5), so it's not waiting on Intel.
