Motherboards Memory Storage Cases/Cooling/PSUs IT Computing Displays Mobile Mac CPUs & Chipsets Video Digital Cameras Linux Gadgets Systems Trade Shows Guides Home Increase Font Size Decrease Font Size Change Page Size
The Era of Tera: Intel Reveals more about 80-core CPU
The Era of Tera: Intel Reveals more about 80-core CPU
Date: February 11th, 2007
Topic: CPU & Chipset
Manufacturer: Intel
Author: Anand Lal Shimpi
Buy the Intel BX80571E7500 Duo E7500 2.93GHz
Blank
 Directron $129.99
 Newegg $119.99
 CircuitCity $119.99
 
 

With no Spring Intel Developer Forum happening this year in the US, we turn to the International Solid-State Circuits Conference (ISSCC) for an update on Intel's ongoing R&D projects. Normally we'd hear about these sorts of research projects on the final day of IDF, these days presented by Justin Rattner, but this year things are a bit different. The main topic at hand today is one of Intel's Tera-scale computing projects, but before we get to the chip in particular we should revisit the pieces of the puzzle that led us here to begin with.

Recapping Tera-Problems

At the Spring 2005 Intel Developer Forum, Justin Rattner outlined a very serious problem for multi-core chips of the future: memory bandwidth. We're already seeing these problems today, as x86 single, dual and quad core CPUs currently all have the same amount of memory bandwidth. The problem becomes even more severe when you have 8, 16, 32 and more cores on a single chip.

The obvious solution to this problem is to use wider front side and memory buses that run at higher frequencies, but that solution is only temporary. Intel's slide above shows that a 6-channel memory controller would require approximately 1800 pins, and at that point you get into serious routing and packaging constraints. Simply widening the memory bus and relying on faster memory to keep up with the scaling of cores on CPUs isn't sufficient for the future of microprocessors.

So what do you do when a CPU's access to data gets slower and more constrained? You introduce another level in the memory hierarchy of course. Each level of the memory hierarchy (register file, L1/L2/L3 cache, main memory, hard disk) is designed to help mask the latency of accessing data at the level immediately below it. The clear solution to keeping massively multi-core systems fed with data then is to simply put more memory on die, maybe an L4 cache perhaps?

The issue you run into here is that CPU die space is expensive, and the amount of memory we'd need to keep tens of cores fed is more than a few megabytes of cache can provide. Instead of making the CPU die wider, Intel proposed to stack multiple die on top of each other. A CPU die, composed of many cores, would simply be one layer in a chip that has integrated DRAM or Flash or both. Since the per-die area doesn't increase, the number of defects don't go up per die.

 

Memory bandwidth improves tremendously, as your DRAM die can have an extremely wide bus to connect directly to your CPU cores. Latency is also much improved as the CPU doesn't have to leave the package to get data stored in any of the memory layers.

Obviously there will still be a need for main memory, as Intel is currently estimating that a single layer could house 256MB of memory. With a handful of layers, and a reasonably wide external memory bus, keeping a CPU with tens of cores fed with data now enters the realm of possibility.

A year and a half later, Rattner was back but this time he was tackling another aspect of the era of tera - bus bandwidth. Although 3D die stacking will help keep many cores on a single die fed with data, the CPU still needs to communicate with the outside world. FSB technology, especially from Intel, has remained relatively stagnant over the past several years. If we're talking about building CPUs with tens of cores, not only will they need tons of memory bandwidth but they'll also need a very fast connection to the outside world.

Intel's research into Silicon Photonics has produced a functional hybrid silicon laser demonstrated at the Intel Developer Forum late last year. The idea is that optical buses can offer much better signaling speed and power efficiency than their electrical equivalents, resulting in the ideal bus for future massively multi-core CPUs.

 

Justin Rattner's keynotes talked about some of Intel's Tera-scale projects, with 3D die stacking delivering terabytes of bandwidth needed for the next decade of CPUs and silicon photonics enabling terabits of I/O for connecting these CPUs to the rest of the system. The final vector that Rattner spoke about, was delivering a teraflop of performance. The CPU Rattner spoke of was a custom design by Intel that featured 80 cores on a single die, and today Intel revealed a lot more about its Teraflop CPU, the architecture behind it and where it fits in with the future of Intel CPUs.

 

The Chip   Next Page

 
  Index

Tools Share
Find lowest prices Find the lowest prices
Digg   del.icio.us   E-mail  
Print This Article Print this article  

25 Comments - Last by F1N3ST, 1005 days ago
Username:
Password:
pwn dongle by joex444, 1013 days ago
attack switch

Reply
In other words... by Justin Case, 1013 days ago
In other words, Intel is doing the same that IBM and AMD (with Cell and Torrenza + Fusion), only with some made-up numbers and more Powerpoint charts. Unless they vastly improve their compilers' paralellization, or come up with a full suite of software optimized for multi-core chips (80? It's hard enough to take full advantage of 4!), this will remain something that "can be done", but which most people will have no use for.

Reply
here is a solution, where is the problem by sprockkets, 1013 days ago
We have a solution to the problem of ever increasing CPU speed. My question is, who here needs it?

For those who need to open 80 Firefox tabs, video encoding, virus scanning and watching a HD movie, at the same time?

Data sets did need to get bigger, but check this out: Music files started out at small sampling rates till about Win98 they got to the cd standard. It stopped there since no one needs it bigger than that, that is, 44.1khz and 16 bit resolution. If you can hear 96/192khz 24bit music better, fine, but we have others saying that 128kbps mp3 was cd quality.

Video resolutions made their way from 640x480 to now around 1600x1200, and widescreen varients of that. Color depth sits at around 32bit. Can you see it improving beyond that?

OK, so we can what, go 3D now, holographic?

Sorry to you Intel and AMD, but the vast majority of the people you sell your technology to can live off a $30 processor and $50 of RAM, the smallest HDD, and a $30 optical drive which does everything.

Would be cool to see a motherboard with built in DDR3 or 4 memory for the cpu/gpu AMD Fusion core, and have 2GB of it, with 32GB of flash built on as well. Let's go for silent computing, you know, back in the day when all processors only had tiny heatsinks on them!!!

Reply
RE: here is a solution, where is the problem by mino, 1012 days ago
They need it (to sell). Period.

Reply
RE: here is a solution, where is the problem by cscpianoman, 1012 days ago
The average consumer might not need it, but large industries will be grabbing at these things faster than you can imagine. Think of health care, for example, the trend is to move towards genetic manipulations/prescreening. These industries want to download a person's entire genetic information, process it, and return it to you with the results of Alzhiemer's, cancer, and heart problems in a matter of minutes. Furthermore, the entertainment industry would love to create more special effects and render them that much faster. I'm sure if they could Pixar would already be placing an order for these. There are hundreds of applications out there that require the power and capability of multi-cores. Sure the consumer may not need it, but the consumer only accounts for less than approx. 5% of what Intel, AMD or whoever makes.

Reply
RE: here is a solution, where is the problem by Navitron, 1012 days ago
In the words of bill gates "No one will need more than 637 kb of memory for a personal computer." You sound just like him :P Don't bash the technology just because "right now" we don't need it. But what about in 10-20 years, you still think your core 2 duo is gonna cut it in 15 years? Can a IBM 80386 run doom 3? will todays AMD and Intels run -insert game here- 10 years from now.

So don't assume just because we don't need it now doesn't mean we wont need it in 3 years.

Reply
RE: here is a solution, where is the problem by Larso, 1012 days ago
So, why did we ever bother invent plastic materials? Or why invent the laser? The laser is a good example of an invention that was expected to be a useless curiosity but turned out to be hugely usefull.

But this case isn't even comparable to that. There are indeed problems waiting to be solved with this solution. All servers with more than a handfull of CPU's could be cut down in size and power usage tremendously, and what about supercomputers? They are going to be extra extremely powerfull when they change to this kind of cpu's.

And by the way, you have to be quite narrowminded to not see the (sales) potential of supercomputing at home. Lets have computer games with scary intelligent AI's :)

Reply
RE: here is a solution, where is the problem by joex444, 1012 days ago
What part of the article was confusing to you?

NOT FOR RETAIL SALE, COMMERCIAL USES ONLY.

I got the idea, guess you didn't. PWNT!

Reply
3rd paragraph doesn't make sense? by notposting, 1012 days ago
quote:

The obvious solution to this problem is to use wider front side and memory buses that run at higher frequencies, but that solution is only temporary. Intel's slide above shows that a 6-channel memory controller would require approximately 1800 pins, and at that point you get into serious routing and packaging constraints. Simply widening the memory bus and relying on faster memory to keep up with the scaling of cores on CPUs isn't sufficient for the future of microprocessors.


The picture above this shows the Terascale slide:
http://images.anandtech.com/reviews/cpu/intel/terascaleISSCC/terascale.jpg

Reply
What idiot said this: by mino, 1012 days ago
"Since the per-die area doesn't increase, the number of defects don't go up per die."

Any sensible person knows that defect-rate is(mostly) dependent on the number of functional units(i.e. transistors), provided the defect-rate off a single unit is set.

The fact that it is NOW mostly tied to die-area is caused exactly by the fact we do NOT use stacked-die aproach yet.

Otherwise a nice news piece. Thanks AT.

Reply
Comments Page 1 of 3

Unlicensed Software at Your Last Company
Anonymously Report Unlicensed Software with Our Form Now. Get Up to $1 Million.
We Buy Laptop and PC Memory! Sell to Us!
Min of 25 pieces required. Call us today at 239.354.1230.
Special Offer from The Economist
Get 12 issues of The Economist for $12. US subscribers only.
Free Forrester Risk Management Report
Demystifying Enterprise Risk Management. Download Free With Registration.
Download Microsoft Visual Studio ® Team System
Streamline Dev processes, Reduce time to market. Try Microsoft Visual Studio Team System, FREE!




Latest news by
DailyTech

 November 20, 2009

Blank
Blank
Blank
Blank
Blank
Blank
Blank
Blank
Blank

 November 19, 2009

Blank
Blank
Blank
Blank
Blank
Blank
Blank
Blank


Blank
Blank
Blank
Blank
Blank
Blank
Blank
Blank
Blank
Blank
more CPU & Chipset Discussions



pipeboost
Copyright © 1997-2009 AnandTech, Inc. All rights reserved. Terms, Conditions and Privacy Information.
Click Here for Advertising Information