Turning Single into Multi-Threaded with Speculative Threading

Intel demonstrated a particularly interesting research project called Mitosis, a combined hardware and compiler approach to implementing speculative threading.

Modern out-of-order microprocessors speculatively execute code that they predict will be needed in the future, thus improving processor utilization and overall performance. This research project proposes applying the same sort of speculative execution at the thread level: threads are created on the fly and speculatively executed by idle cores in a system in order to improve performance on future multi-core CPUs.

The Mitosis project relies on both hardware and software (compiler) support to work. First, on the software side, blocks of code that have very few inputs and outputs are detected and considered for use as a separate thread.
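As a rough illustration of what "very few inputs and outputs" could mean, the Python sketch below counts the values a block reads before writing them (its inputs) and the values it writes (its possible outputs). The statement representation and the name block_interface are assumptions made for this example; the article does not describe how the Mitosis compiler actually performs this analysis.

```python
def block_interface(statements):
    """Return (inputs, outputs) for a straight-line block of code.

    statements: list of (written_var, [read_vars]) tuples in program order.
    This representation is hypothetical and chosen only for illustration.
    """
    written = set()   # variables the block has assigned so far
    inputs = set()    # variables read before the block assigns them
    for target, sources in statements:
        for var in sources:
            if var not in written:
                inputs.add(var)
        written.add(target)
    # Without knowing what later code reads, every written variable is
    # conservatively treated as a possible output.
    return inputs, written


# Example block: t = a + b; u = t * 2; r = u - a
example = [("t", ["a", "b"]), ("u", ["t"]), ("r", ["u", "a"])]
inputs, outputs = block_interface(example)
```

A block with a small inputs set needs little data handed to it when it is spun off, which is presumably what makes it an attractive candidate.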

The entry and exit points of the current working thread are marked, and the portion of the thread that would be split off is separated. The new thread is then fed the appropriate input data that it needs to begin its execution.

With the single thread now split into two, both threads are sent to the multi-core processor and executed in parallel. When the speculative thread finishes, its result is checked to make sure that the data is still valid; if so, the result is committed and all is well. If the result is invalid, the thread must be thrown away. Since we're talking about a single-threaded application to begin with, no performance is lost, only power: the core the speculative thread ran on would otherwise have been idle.
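The speculate, validate, then commit or discard flow described above can be sketched in a few lines of Python. This is a software-only toy, not how Mitosis works internally: the names run_with_speculation, first_half, second_half and predict_input are all hypothetical, and Python threads are used here only to show the control flow, not to gain real parallel speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_speculation(first_half, second_half, predict_input, state):
    """Run second_half early on a guessed input while first_half executes.

    Returns (result, committed), where committed reports whether the
    speculative result was kept. All names here are illustrative.
    """
    predicted = predict_input(state)
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Speculative thread: starts before its real input is known.
        future = pool.submit(second_half, predicted)
        # Original thread: produces the value second_half really needs.
        actual = first_half(state)
        speculative_result = future.result()
    if actual == predicted:
        # Validation passed, so the speculative work is committed.
        return speculative_result, True
    # Validation failed: throw the speculative work away and run the
    # second half normally, as a single-threaded program would have.
    return second_half(actual), False


# Example: the second half depends on the first half's output.
result, committed = run_with_speculation(
    first_half=lambda s: s + 1,
    second_half=lambda x: x * 10,
    predict_input=lambda s: s + 1,
    state=5,
)
```

On a misprediction the program does the same work it would have done anyway, which mirrors the article's point that a failed speculation costs power rather than performance.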

On the hardware side, there is one major change needed to implement Mitosis:

A global register file and a register versioning table are needed to keep track of which cores hold the latest and most correct register values. Some additional logic is also needed to help validate the outcome of these threads.
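To make the idea of a register versioning table more concrete, here is a toy Python model that records which core most recently wrote each register. The class name, its methods, and the simple version counter are assumptions for illustration only; the article gives no detail on how Intel's hardware table is actually organized.

```python
class RegisterVersionTable:
    """Toy model: remembers which core holds the newest copy of a register."""

    def __init__(self):
        self.latest = {}  # register name -> (version number, core id)
        self.values = {}  # (core id, register name) -> value on that core

    def write(self, core, reg, value):
        # Each write bumps the register's version and records the writer.
        version, _ = self.latest.get(reg, (0, None))
        self.latest[reg] = (version + 1, core)
        self.values[(core, reg)] = value

    def owner(self, reg):
        # Core that produced the most recent version of this register.
        return self.latest[reg][1]

    def read_latest(self, reg):
        # Fetch the value from whichever core wrote the register last.
        return self.values[(self.owner(reg), reg)]


# Example: core 0 writes r1, then core 1 overwrites it.
table = RegisterVersionTable()
table.write(0, "r1", 10)
table.write(1, "r1", 20)
newest = table.read_latest("r1")
```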

The end result of all of this is pretty promising, especially for single threaded applications that would otherwise get no benefit from being run on a multi-core CPU.

In order to demonstrate the performance potential, the Intel researchers working on the project took a look at performance in the Olden benchmark suite. The Olden suite was chosen because it is a set of code that is extremely difficult to parallelize. The graph below shows performance improvement over a single Out of Order core:

The green bars show the performance improvement going from single to dual core; the red bars show the performance improvement from having the benchmark and its data stored entirely within L1 cache (no cache misses); and finally, the yellow bars show the performance improvement due to the use of the Mitosis compiler/hardware modifications with a dual core CPU.

As you can see, a 2 - 3x performance improvement in many cases is nothing short of impressive. But keep in mind that this project is in its very early stages of research; as promising as it looks, it may take 5 - 10 years for the research to make its way into the real world.


14 Comments


  • Questar - Wednesday, August 24, 2005

    The 5-10 year part is speculation of Anand. Intel never said it would take that long. I'll bet two years. It doesn't take 5 years to write a compiler or add a chip feature.
  • Anand Lal Shimpi - Wednesday, August 24, 2005

    The Intel rep that did the demo was the one that provided the 5 - 10 year estimate. This research is in its very early stages, but the promising first results mean it will probably get more support.

    Take care,
    Anand
  • drpepper128 - Wednesday, August 24, 2005

    Is it just me or are we missing something here?
    To me it seems that the real power of Mitosis is that companies would not have to worry about writing code that is multi-threaded. Instead they can have single-threaded code and use the compiler to multi-thread it. This is where the real power of multi-core processors could come from. Some day when we have 100 core processors we will need something like a compiler to figure things out for us; otherwise a company's costs would skyrocket. Think somewhere along the lines of graphics cards.
  • JarredWalton - Wednesday, August 24, 2005

    I was thinking if they could get Mitosis into the chips (rather than required compiler support) then it would benefit practically *any* application. The only time it wouldn't help performance would be when your CPU was either fully loaded on every core, or perhaps if the multiple threads start using up resources that could be better used on stuff other than speculative execution.
