Thread It Like It's Hot

Hyper Threading was a great technology that was simply first introduced on the wrong processor. The execution units of any modern microprocessor are power hungry and consume a lot of die space, so the last thing you want is for them to sit idle with nothing to do. So you implement various tricks to keep them fed and working as often as possible: you increase cache sizes so they rarely have to wait on main memory, you integrate a memory controller so that trips to main memory are as speedy as possible, you prefetch data that you think you'll need in the future, you predict branches, and so on.

Enabling simultaneous multi-threaded (SMT) execution is one of the most power efficient uses of a microprocessor's transistor budget, as it requires only a minimal increase in die size but can greatly improve the utilization of a CPU's execution units. SMT, or Hyper Threading as Intel calls it, does this by dispatching two threads of instructions to an individual processor core at the same time without increasing the available execution resources. Parallelism is paramount to extracting peak performance out of any out-of-order core: double the number of instructions the core can look at to extract parallelism from, and you increase your likelihood of getting work done instead of waiting on other instructions to retire or data to come back from memory.
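To make the idea concrete, here is a deliberately simplified toy model in Python. It is only a sketch under stated assumptions, not a description of how Nehalem schedules work: there is a single execution unit that issues one instruction per cycle, each thread is a hypothetical string of tokens where `x` is an instruction and `m` is a cycle spent waiting on memory, and the first ready thread always wins the unit. The function name `simulate` and the two sample streams are made up for illustration.

```python
def simulate(threads):
    """Run instruction streams on one shared execution unit (toy model).

    Each stream is a string: 'x' is an instruction that needs the unit for
    one cycle, 'm' is a cycle spent waiting on memory.
    Returns (busy_cycles, total_cycles).
    """
    pos = [0] * len(threads)  # index of the next token for each thread
    busy = 0
    cycles = 0
    while any(p < len(t) for p, t in zip(pos, threads)):
        issued = False
        for i, stream in enumerate(threads):
            if pos[i] >= len(stream):
                continue  # this thread has finished
            if stream[pos[i]] == "m":
                pos[i] += 1  # memory waits of different threads overlap
            elif not issued:
                pos[i] += 1  # this thread gets the unit this cycle
                issued = True
            # otherwise the thread is ready but the unit is taken, so it waits
        busy += issued
        cycles += 1
    return busy, cycles


# Hypothetical instruction streams, chosen only to show the two extremes.
memory_bound = "xmmxmm"
compute_bound = "xxxx"

for label, stream in (("memory bound", memory_bound),
                      ("compute bound", compute_bound)):
    one_busy, one_cycles = simulate([stream])
    two_busy, two_cycles = simulate([stream, stream])
    print(f"{label}: one thread {one_busy}/{one_cycles} cycles busy, "
          f"two threads {two_busy}/{two_cycles} cycles busy")
```

In this toy model two copies of the memory bound stream finish in 7 cycles instead of the 12 they would need back to back, because one thread's instructions fill the other's stalls. Two copies of the compute bound stream gain nothing, since the unit was never idle to begin with.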

In the Pentium 4 days, enabling Hyper Threading required less than a 5% increase in die size but resulted in anywhere from a 0 to 35% increase in performance. On the desktop we rarely saw a big boost in performance except in multitasking scenarios, but these days multithreaded software is far more common than it was six years ago when Hyper Threading first made its debut.


This table shows what needed to be added, partitioned, shared or unchanged to enable Hyper Threading on Intel's Core microarchitecture

When the Pentium 4 made its debut, however, all we really had to worry about was die size; power consumption had yet to become a big issue (which the P4 promptly changed). These days power efficiency, die size and performance all go hand in hand, and thus the benefits of Hyper Threading must also be looked at from the power perspective.

I took a small sampling of benchmarks, ranging from POV-Ray, which scales very well with more threads, to iTunes, an application that couldn't care less if you had more than two cores. What we're looking at here is the performance and power impact of Hyper Threading:

Intel Core i7-965 (Nehalem 3.2GHz) | POV-Ray 3.7 Beta 29 | Cinebench R10 1CPU    | Race Driver GRID
HT Disabled                        | 3239 PPS / 207W     | 4671 CBMarks / 161.8W | 103 fps / 300.7W
HT Enabled                         | 4202 PPS / 233.7W   | 4452 CBMarks / 159.5W | 102.9 fps / 302W

 

Looking at POV-Ray, we see a 30% increase in performance for a roughly 13% increase in total system power consumption, which comfortably exceeds Intel's 2:1 rule for performance improvement vs. increase in power consumption. The single threaded Cinebench test shows a slight decrease in both performance and power consumption (negligible), and Race Driver GRID is effectively unchanged on both counts.
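For anyone who wants to check the arithmetic, the short Python sketch below recomputes the percentage changes from the numbers in the table above. The data layout and the helper names `pct_change` and `ht_impact` are my own choices for illustration; only the benchmark figures come from the measurements.

```python
# (performance score, total system power in watts) with HT off and on,
# copied from the table above.
results = {
    "POV-Ray 3.7 Beta 29": {"off": (3239, 207.0), "on": (4202, 233.7)},
    "Cinebench R10 1CPU": {"off": (4671, 161.8), "on": (4452, 159.5)},
    "Race Driver GRID": {"off": (103.0, 300.7), "on": (102.9, 302.0)},
}


def pct_change(before, after):
    """Percentage change going from `before` to `after`."""
    return (after - before) / before * 100.0


def ht_impact(entry):
    """Return (performance change %, power change %) from enabling HT."""
    perf_off, power_off = entry["off"]
    perf_on, power_on = entry["on"]
    return pct_change(perf_off, perf_on), pct_change(power_off, power_on)


for name, entry in results.items():
    perf, power = ht_impact(entry)
    print(f"{name}: performance {perf:+.1f}%, power {power:+.1f}%")
```

For POV-Ray the performance gain works out to a little more than 2.3 times the power increase, which is where the comparison against the 2:1 rule comes from.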

When Hyper Threading improves performance, it does so at a reasonable increase in power consumption. When performance isn't impacted, neither is power consumption. This time around Hyper Threading has no real drawbacks. While before the only way to get it was with a processor that was too hot and barely competitive, today Intel offers it on an architecture that we actually like. Hyper Threading is actually the first indication of Nehalem's true strength: not performance, but rather power efficiency...

74 Comments

  • Spectator - Monday, November 3, 2008

    that sht is totally logical.

    And Im proper impressed. I would do that.

    you can re-process your entire stock at whim to satisfy the current market. that sht deserves some praise, even more so when die shrinks happen. Its an apparently seemless transition. Unless world works it out and learns how to mod existing chips?

    Chukkle. but hey im drunk; and I dont care. I just thought that would be a logical step. Im still waiting for cheap SSD's :P

    Spectator.
  • tential - Monday, November 3, 2008

    We already knew nehalem wasn't going to be that much of a game changer. The blog posts you guys had up weeks ago said that because of the cache sizes and stuff not to expect huge gains in performance of games if any. However because of hyperthreading I think there also needs to be some tests to see how multi tasking goes. No doubt those gains will be huge. Virus scanning while playing games and other things should have extremely nice benefits you would think. Those tests would be most interesting although when I buy my PC nehalem will be mainstream.
  • npp - Monday, November 3, 2008

    I'm very curious to see some scientific results from the new CPUs, MATLAB and Mathematica benchmarks, and maybe some more. It's interesting to see if Core i7 can deliver something on these fronts, too.
  • pervisanathema - Monday, November 3, 2008

    I was afraid Nehalem was going to be a game changer. My wallet is grateful that its overall performance gains do not even come close to justifying dumping my entire platform. My x3350 @ 3.6GHz will be just fine for quite some time yet. :)

    Additionally, its relatively high price means that AMD can still be competitive in the budget to low mid range market which is good for my wallet as well. Intel needs competition.
  • iwodo - Monday, November 3, 2008

    Since there are virtually no performance lost when using Dual Channel. Hopefully we will see some high performance DDR3 with low Latency next year?
    And which means apart from having half the core, Desktop version doesn't look so bad.

    And since you state the Socket 1366 will be able to sit a Eight Core inside, i expect the 11xx socket will be able to suit a Quad Core as well?

    So why we dont just have 13xx Socket to fit it all? Is the cost really that high?
  • QChronoD - Monday, November 3, 2008

    How long are they going to utilize this new socket??
    $284 for the i7-920 isn't bad, but will it be worth the extra to buy a top end board that will appreciate a CPU upgrade 1-2 years later? Or is this going to be useless once Intel Ticks in '10?
  • steveyballme - Monday, November 3, 2008

    We worked side by side with Intel to be sure that Vista was optimised for running on this thing!

    http://fakesteveballmer.blogspot.com
  • Strid - Monday, November 3, 2008

    Great article. I enjoyed reading it. One thing I stumbled upon though.

    "The PS/2 keyboard port is a nod to the overclocking crowd as is the clear CMOS switch."

    What makes a PS/2 port good for overclockers? I see the use for the clear CMOS switch, but ...
  • 3DoubleD - Monday, November 3, 2008

    In my experience USB keyboards do not consistently allow input during the POST screen. If you are overclocking and want to enter the BIOS or cancel an overclock you need a keyboard that works immediately once the POST screen appears. I've been caught with only a USB keyboard and I got stuck with a bad overclock and had to reset the CMOS to gain control back because I couldn't cancel the overclock.
  • Clauzii - Monday, November 3, 2008

    I thought the "USB Legacy support" mode was for exactly that? So legacy mode is for when the PC are booted in DOS, but not during pre?