Inside Intel: From Silicon to the World
by Anand Lal Shimpi on February 11, 2002 3:58 AM EST
The next technology that interested us was a brand new adder circuit. An adder circuit performs integer addition and is a critical part of the ALU in any CPU. Most of today’s CPUs add two numbers together using what’s known as 2’s complement addition. How is this any different from normal addition? It’s not really when you’re just adding two positive numbers; the numbers are added bit by bit, so two 4-bit numbers, 2 and 4, would add like this: 0010 + 0100 = 0110, which is 6.
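As a rough illustration of the bit-by-bit addition described above, here is a minimal Python sketch of a ripple-carry add on 4-bit values. The function name `add_bits` and the 4-bit default width are our own choices for the example and say nothing about how any real ALU is wired.

```python
def add_bits(a: int, b: int, width: int = 4) -> str:
    """Add two unsigned integers one bit at a time and return a binary string.

    Illustrative only: a software model of a ripple-carry adder.
    """
    carry = 0
    result = 0
    for i in range(width):
        bit_a = (a >> i) & 1          # i-th bit of a
        bit_b = (b >> i) & 1          # i-th bit of b
        total = bit_a + bit_b + carry
        result |= (total & 1) << i    # sum bit for this position
        carry = total >> 1            # carry into the next position
    # Any carry out of the top bit is dropped, as in fixed-width hardware.
    return format(result, f"0{width}b")


print(format(2, "04b"), "+", format(4, "04b"), "=", add_bits(2, 4))
```

Because the carry out of the top bit is discarded, adding 1 to 1111 wraps around to 0000, which is the same fixed-width behavior that makes 2’s complement work.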
The need for 2’s complement addition comes into play when you’re dealing with negative numbers. We’ll spare you the details of why this is done, but to find the negative, or complement, of a number using the 2’s complement method you simply take the positive number, flip all the bits, and add 1. For example, the answer we got before, 6 (0110), negated would be: 0110 with its bits flipped is 1001, and adding 1 gives 1010, which represents -6.
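The flip-and-add-1 rule can be sketched in a few lines of Python. The helper name `twos_complement` and the 4-bit width are assumptions made for this example only.

```python
def twos_complement(value: int, width: int = 4) -> int:
    """Negate a value in fixed-width 2's complement: flip all bits, then add 1."""
    mask = (1 << width) - 1       # e.g. 0b1111 for 4 bits
    inverted = ~value & mask      # flip every bit within the width
    return (inverted + 1) & mask  # add 1, discarding any carry out


six = 0b0110
neg_six = twos_complement(six)
print(format(six, "04b"), "negated is", format(neg_six, "04b"))
```

Applying the same operation twice returns the original value, so negating 1010 gives 0110 again.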
Why is this important? Say your extremely fast processor is adding away and all of a sudden it has to subtract a number. The processor must stop what it’s doing and find the 2’s complement of that number before it can continue adding. With processor pipelines getting longer and longer, such a stall would wreak havoc on performance. So modern-day ALUs use what is known as Dual Rail Domino logic for their adders.
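To see why the complement is needed before a subtraction can proceed, here is a small Python sketch that subtracts by adding the 2’s complement. The function name `subtract` and the 4-bit width are our own illustrative choices; the point is only that the adder itself never changes, so the complement has to be available first.

```python
def subtract(a: int, b: int, width: int = 4) -> int:
    """Compute a - b by adding the 2's complement of b (fixed width)."""
    mask = (1 << width) - 1
    neg_b = (~b + 1) & mask     # step 1: complement of b must be found first
    return (a + neg_b) & mask   # step 2: an ordinary add does the rest


print(format(subtract(6, 2), "04b"))  # 6 - 2
print(format(subtract(2, 6), "04b"))  # 2 - 6, a negative result in 2's complement
```

In this model step 2 cannot start until step 1 finishes, which is the dependency the text describes as a stall.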
The theory behind a Dual Rail Domino adder is that you should compute the complement of a number in parallel with the actual computation of that number. Then if you have to subtract a number you don’t have to wait until the complement is calculated, as it’s already ready for use. The problem with this approach is obvious: you need twice the amount of circuitry to implement a single adder, because a second adder circuit has to calculate the complement in parallel.
The Pentium 4’s double-pumped ALUs are actually only 16 bits wide; because they run at twice the core clock, it takes both halves of a cycle, a single clock in total, to produce one 32-bit result. It’s fairly obvious that Intel will be moving to 32-bit ALUs in the future (considering that all the high-speed ALU demos we’ve seen have been of 32-bit ALUs), which will mean even more circuitry in processors to calculate larger 32-bit integers and larger 32-bit complements.
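The idea of building a 32-bit result out of 16-bit pieces can be sketched as follows. This is a software illustration under our own assumptions (the name `add32_in_halves` is hypothetical) and does not represent Intel’s actual circuit: the low 16 bits are added first, and the carry feeds the add of the high 16 bits.

```python
MASK16 = 0xFFFF


def add32_in_halves(a: int, b: int) -> int:
    """Add two 32-bit values using two 16-bit additions with a carry between them."""
    # First 16-bit operation: low halves.
    low = (a & MASK16) + (b & MASK16)
    carry = low >> 16
    # Second 16-bit operation: high halves plus the carry from the low half.
    high = ((a >> 16) & MASK16) + ((b >> 16) & MASK16) + carry
    # Combine, dropping any carry out of bit 31.
    return ((high & MASK16) << 16) | (low & MASK16)


print(hex(add32_in_halves(0x0000FFFF, 0x00000001)))
```

The high half depends on the carry from the low half, which is why the two 16-bit operations happen one after the other rather than at the same time.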
Necessity is the mother of invention, and thus Intel has come up with a way to reduce the size of their dual rail domino adders. The methods employed in reducing the size of the complementary adder circuit are held pretty dear to Intel, so they aren’t readily revealed. They call their reduced size implementation a Complementary Signal Generator (CSG) because it is logic that generates the complement of a number represented through a signal. Intel’s CSG draws less power than conventional designs and uses fewer transistors, both of which will matter more as they move towards higher-speed multi-GHz 32-bit ALU designs.