Instructions Gone Wild: Safe Instruction Recognition

The biggest fear with conventional in-order architectures is what happens if you have a high latency instruction that needs a piece of data that isn't available in the caches.

Since in-order microprocessors have to execute the instructions in order, the execution units remain idle until the CPU is able to retrieve the data it needs from main memory - a process that could easily take over a hundred clock cycles. The problem is that during these clock cycles, power is expended but no work is getting done - which is the exact opposite of what we want in an ultra low power microprocessor.

Out of order processors would get around this problem by simply scheduling around the dependent instruction. The scheduler would simply select the next instruction that was ready for execution and work would progress while the data dependent instruction waited for data for main memory. We've already established that a full OoOE core would be too power hungry for Atom, but relying on a pure in-order design also has the potential to be inefficient. Intel's Austin team found a clever middle ground for Atom.

It's called the Safe Instruction Recognition (SIR) algorithm and it works like this. If Atom is executing a long latency floating point operation followed by a short latency integer op you would traditionally stall until the FP op is complete (as we described above). The SIR algorithm looks at the two instructions and determines whether or not there are any data dependencies between the two (e.g. C = A + B followed by D = C + F), if there aren't then Atom will allow the "younger", shorter latency operation to proceed ahead of the longer FP operation.

SIR addresses a very specific case but it sprinkles a little bit of out-of-order goodness into the Atom's otherwise very strict in-order design. I wouldn't be too surprised if future iterations of Atom expand the situations in which these sort of out-of-order tricks are allowed.

2-Issue and In-Order: Intel's Version of the Cell's PPE Return of the CISC: Macro-op Execution
Comments Locked

46 Comments

View All Comments

Log in

Don't have an account? Sign up now