Intel's Larrabee Architecture Disclosure: A Calculated First Move
by Anand Lal Shimpi & Derek Wilson on August 4, 2008 12:00 AM EST- Posted in
- GPUs
A Tribute to Michael Abrash: The ISA
Some people idolize athletes. Others gravitate towards entertainers. While Derek is a hockey fan and a musician who loves watching movies, his real passion lead him in a different direction. And he's also going to devolve into first person singular for a minute to tell you a little more about that.
At the time I was a high school student who needed a good project outside the curriculum to teach to our C++ programming class (this was another one of the excellent projects Jo Adams set her students upon). My good friend Tom Macleod and I had just learned enough calculus and advanced geometry to be dangerous: we decided to write a 3D graphics engine in order to learn and teach graphics programming to the class.
To support this endeavor, I spent a bit of cash (well, my parent's cash anyway) on some graphics and game programming books for the occasion, and the one that really stood out (the one that set the course of my life) was Michael Abrash's Graphics Programming Black Book Special Edition. This giant tome contained quite a bit of collected wisdom regarding the art and science of code optimization and graphics programming as well as some great details about the development of Quake.
Not only was his book an incredible source of information and inspiration for me personally, but if there was ever an x86 assembly guru and graphics programming god that could help take the design of an instruction set architecture for Larrabee to a whole other dimension, it is Michael Abrash. And our information indicates that he has done just that.
This isn't to say that others on the Larrabee team don't deserve a spotlight; it's just exciting to see the guy who got me hooked on computer graphics programming (which lead to my interest in hardware) show up on such an impressive graphics hardware design team.
For those who haven't idolized Abrash, his Wikipedia entry helps explain his luminary status in the game industry:
"Michael Abrash is a highly regarded technical writer, and one of the top optimization and 80x86 assembly language programmers, a reputation cemented by his 1990 book Zen of Assembly Language Volume 1: Knowledge. Before getting into technical writing, Abrash was a game programmer, having written his first commercial game in 1982. After working at Microsoft on graphics and assembly code for Windows NT 3.1, he returned to the game industry in the mid-1990s to work on Quake for id Software. Some of the technology behind Quake is documented in Abrash's Graphics Programming Black Book. After Quake was released, Abrash returned to Microsoft to work on natural language research, then moved to the Xbox team, until 2001. In 2002, Abrash went to work for RAD Game Tools, where he co-wrote the advanced Pixomatic software renderer, which emulates the functionality of a DirectX 7-level graphics card and is used as the software renderer in such games as Unreal Tournament 2004."
Intel brought Abrash on as a consultant to help define the Larrabee instruction set. For the longest time, extensions to x86 (e.g. SSE4) were done by Intel engineers at the request of the software community. With every iteration of SSE the game industry was always happier but never truly satisfied with the extensions to x86 that Intel introduced. When Intel set out to define the extensions to x86 that would be used in Larrabee, it sought out visionaries within the game industry to help define that spec rather than creating hardware and defining the ISA internally. One thing we've consistently heard from game developers about Larrabee is that the ISA makes more sense than any other approach they have seen from ATI or NVIDIA. Larrabee's ISA was designed in part by the game industry, for that very industry.
Interestingly enough, while reluctant to go into details about the Larrabee ISA itself, Intel did tell us that fewer than 5% of the instructions are graphics specific. What they found is that creating overly specialized instructions doesn't always do that much good as they can be hard for compilers to use effectively and difficult to hand optimize with as well. Rather, having a good selection of generally applicable and powerful instructions is a better way to go.
One of the advantages of developing the compiler in parallel with the ISA itself is that they can easily test and adapt both as needed to understand how best to balance the ISA. As the vast majority of developers will rely on compilers to generate highly performant code, making sure the ISA is a good fit for compilers is essential. At the same time, because of the renewed interest in software graphics engines Larrabee is stirring up in the Old Guard of real-time 3D computer graphics, having icons like Michael Abrash on the team will help make sure that the ISA is not only compiler friendly but will also be attractive to those who wish to achieve Zen through assembly optimization.
Which brings us to an interesting point.
101 Comments
View All Comments
Shinei - Monday, August 4, 2008 - link
Some competition might do nVidia good--if Larrabee manages to outperform nvidia, you know nvidia will go berserk and release another hammer like the NV40 after R3x0 spanked them for a year.Maybe we'll start seeing those price/performance gains we've been spoiled with until ATI/AMD decided to stop being competitive.
Overall, this can only mean good things, even if Larrabee itself ultimately fails.
Griswold - Monday, August 4, 2008 - link
Wake-up call dumbo. AMD just started to mop the floor with nvidias products as far as price/performance goes.watersb - Monday, August 4, 2008 - link
great article!You compare the Larrabee to a Core 2 duo - for SIMD instructions, you multiplied by a (hypothetical) 10 cores to show Larrabee at 160 SIMD instructions per clock (IPC). But you show non-vector IPC as 2.
For a 10-core Larrabee, shouldn't that be x10 as well? For 20 scalar IPC
Adamv1 - Monday, August 4, 2008 - link
I know Intel has been working on Ray Tracing and I'm really curious how this is going to fit into the picture.From what i remember Ray Tracing is a highly parallel and scales quite well with more cores and they were talking about introducing it on 8 core processors, it seems to me this would be a great platform to try it on.
SuperGee - Thursday, August 7, 2008 - link
How it fit's.GPU from ATI and nV are called HArdware renderers. Stil a lot of fixed funtion. Rops TMU blender rasterizer etc. And unified shader are on the evolution to get more general purpouse. But they aren't fully GP.
This larrabee a exotic X86 massive multi core. Will act as just like a Multicore CPU. But optimised for GPU task and deployed as GPU.
So iNTel use a Software renderer and wil first emulate DirectX/OpenGL on it with its drivers.
Like nv ATI is more HAL with as backup HEL
Where Larrabee is pure HEL. But it's parralel power wil boost Software method as it is just like a large bunch of X86 cores.
HEL wil runs fast, as if it was 'HAL' with LArrabee. Because the software computing power for such task are avaible with it.
What this means is that as a GFX engine developer you got full freedom if you going to use larrabee directly.
Like they say first with a DirectX/openGL driver. Later with also a CPU driver where it can be easy target directly. thus like GPGPU task. but larrabee could pop up as extra cores in windows.
This means, because whatever you do is like a software solution.
You can make a software rendere on Ratracing method, but also a Voxel engine could be done to. But this software rendere will be accelerated bij the larrabee massive multicore CPU with could do GPU stuf also very good. But will boost any software renderer. Offcourse it must be full optimised for larrabee to get the most out of it. using those vector units and X86 larrabee extention.
Novalogic could use this to, for there Voxel game engine back in the day's of PIII.
It could accelerate any software renderer wich depend heavily on parralel computing.
icrf - Monday, August 4, 2008 - link
Since I don't play many games anymore, that aspect of Larrabee doesn't interest me any more than making economies of scale so I can buy one cheap. I'm very interested in seeing how well something like POV-Ray or an H.264 encoder can be implemented, and what kind of speed increase it'd see. Sure, these things could be implemented on current GPUs through Cuda/CTM, but that's such an different kind of task, it's not at all quick or easy. If it's significantly simpler, we'd actually see software sooner that supports it.cyberserf - Monday, August 4, 2008 - link
one word: MATROXGuuts - Monday, August 4, 2008 - link
You're going to have to use more than one word, sorry... I have no idea what in this article has anything to do with Matrox.phaxmohdem - Monday, August 4, 2008 - link
What you mean you DON'T have a Parhelia card in your PC? WTF is wrong with you?TonyB - Monday, August 4, 2008 - link
but can it play crysis?!