The basic building block of Bulldozer is the dual-core module, pictured below. AMD wanted better performance than simple SMT (à la Hyper-Threading) would allow, but without resorting to the full duplication of resources we get in a traditional dual-core CPU. The result is a duplication of integer execution resources and L1 data caches, but a sharing of the front end and FPU. AMD still refers to this module as being dual-core, although it's a departure from the more traditional definition of the term. In the early days of multi-core x86 processors, dual-core designs were simply two single-core processors stuck on the same package. Today we still see simple duplication of identical cores in a single processor, but moving forward it's likely that we'll see more heterogeneous multi-core systems. AMD's Bulldozer architecture may be unusual, but it challenges the conventional definition of a core in a way that we're probably going to face one way or another in the not too distant future.


A four-module, eight-core Bulldozer

The bigger issue with Bulldozer isn't one of core semantics, but rather how threads get scheduled on those cores. Ideally, threads with shared data sets would get scheduled on the same module, while threads that share no data would be scheduled on separate modules. The former allows more efficient use of a module's L2 cache, while the latter guarantees each thread has access to all of a module's resources when there's no tangible benefit to sharing.
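To make the ideal case concrete, here is a minimal Python sketch of data-aware placement. It is not how any real scheduler works: the function name `ideal_placement`, the data-set labels, and the greedy strategy are all assumptions made for illustration. It assumes cores 2m and 2m+1 belong to module m, and it simply gives up if there are more groups than modules.

```python
from collections import defaultdict

def ideal_placement(thread_data, num_modules=4, cores_per_module=2):
    """Map each thread to a core so that threads sharing a data set
    land on the same module and unrelated threads get their own module.

    thread_data: dict of thread name -> data-set label (hypothetical input).
    Returns a dict of thread name -> core number.
    """
    # Group threads by the data set they touch.
    groups = defaultdict(list)
    for thread, data in thread_data.items():
        groups[data].append(thread)

    # Split each group into module-sized chunks (two threads per module).
    chunks = []
    for threads in groups.values():
        for i in range(0, len(threads), cores_per_module):
            chunks.append(threads[i:i + cores_per_module])

    # This sketch does not attempt to pack leftovers onto shared modules.
    if len(chunks) > num_modules:
        raise ValueError("more thread groups than modules in this simple sketch")

    placement = {}
    for module, chunk in enumerate(chunks):
        for slot, thread in enumerate(chunk):
            placement[thread] = module * cores_per_module + slot
    return placement

# t0 and t1 share data set "A"; t2 and t3 work on unrelated data.
print(ideal_placement({"t0": "A", "t1": "A", "t2": "B", "t3": "C"}))
```

In this example the two threads sharing data end up on cores 0 and 1 (one module), while the two independent threads each get a module to themselves.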

This ideal scenario isn't how threads are scheduled on Bulldozer today. Instead of intelligent core/module scheduling based on the memory addresses touched by a thread, Windows 7 currently just schedules threads on Bulldozer in order. Starting from core 0 and going up to core 7 in an eight-core FX-8150, Windows 7 will schedule two threads on the first module, then move to the next module, and so on. If the threads happen to be working on the same data, then Windows 7's scheduling approach makes sense. If the threads are working on different data sets, however, Windows 7's current treatment of Bulldozer is suboptimal.
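The in-order behavior described above can be sketched in a few lines of Python. This is only a toy model of the placement order, not Windows 7's actual scheduler; the names `module_of`, `in_order_placement`, and `modules_used` are made up for the example, and it assumes cores 2m and 2m+1 sit in module m.

```python
CORES_PER_MODULE = 2
NUM_MODULES = 4  # a four-module, eight-core FX-8150

def module_of(core):
    """Return the module a core belongs to (cores 0-1 -> module 0, etc.)."""
    return core // CORES_PER_MODULE

def in_order_placement(num_threads):
    """Place thread i on core i, filling cores 0, 1, 2, ... in order."""
    total_cores = CORES_PER_MODULE * NUM_MODULES
    if num_threads > total_cores:
        raise ValueError("this toy model places at most one thread per core")
    return list(range(num_threads))

def modules_used(cores):
    """Return the sorted list of distinct modules occupied by the given cores."""
    return sorted({module_of(core) for core in cores})

placement = in_order_placement(4)
print(placement, modules_used(placement))
```

With four threads, this model fills cores 0 through 3 and therefore occupies only two of the four modules, leaving the other two idle while each pair of threads contends for a shared front end and FPU.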

AMD and Microsoft have been working on a patch to Windows 7 that improves scheduling behavior on Bulldozer. The result is two hotfixes that should both be installed on Bulldozer systems. Both hotfixes require Windows 7 SP1; they will refuse to install on a pre-SP1 installation.

The first update simply tells Windows 7 to schedule all threads on empty modules first, then on shared cores. The second hotfix increases Windows 7's core parking latency if there are threads that need scheduling. There's a performance penalty you pay to sleep/wake a module, so if there are threads waiting to be scheduled they'll have a better chance to be scheduled on an unused module after this update.
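As a rough illustration of what the first hotfix changes, the Python sketch below generates a "modules first" core order: one core from every module before any module's second core is used. It is a toy model, not the hotfix's actual logic; `modules_first_order` and `place` are hypothetical names, and it assumes cores 2m and 2m+1 belong to module m.

```python
def modules_first_order(num_modules=4, cores_per_module=2):
    """Return core numbers ordered so every module gets one thread
    before any module has to host a second one."""
    order = []
    for slot in range(cores_per_module):      # first core of each module, then second
        for module in range(num_modules):
            order.append(module * cores_per_module + slot)
    return order

def place(num_threads, num_modules=4, cores_per_module=2):
    """Pick cores for num_threads threads using the modules-first order."""
    order = modules_first_order(num_modules, cores_per_module)
    if num_threads > len(order):
        raise ValueError("this toy model places at most one thread per core")
    return order[:num_threads]

print(modules_first_order())
print(place(4))
```

In this model, four threads land on cores 0, 2, 4, and 6, so each has a module to itself; sharing only begins with the fifth thread.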

Note that neither hotfix enables optimal scheduling on Bulldozer. Rather than being thread-aware and scheduling dependent threads on the same module and independent threads across separate modules, the updates simply move to a better default case of scheduling on modules first. This should improve performance in most cases, but there's a chance that some workloads will see a performance reduction. AMD tells me that it's still working with OS vendors (read: Microsoft) to better optimize for Bulldozer. If I had to guess, I'd say that we may see the next big step forward with Windows 8.

AMD was pretty honest when it described the performance gains FX owners can expect to see from this update. In its own blog post on the topic AMD tells users to expect a 1 - 2% gain on average across most applications. Without any big promises I wasn't expecting the Bulldozer vs. Sandy Bridge standings to change post-update, but I wanted to run some tests just to be sure.

The Test

Motherboard: ASUS P8Z68-V Pro (Intel Z68), ASUS Crosshair V Formula (AMD 990FX)
Hard Disk: Intel X25-M SSD (80GB), Crucial RealSSD C300
Memory: 2 x 4GB G.Skill Ripjaws X DDR3-1600 9-9-9-20
Video Card: ATI Radeon HD 5870 (Windows 7)
Video Drivers: AMD Catalyst 11.10 Beta (Windows 7)
Desktop Resolution: 1920 x 1200
OS: Windows 7 x64 SP1 w/ BD Hotfixes

79 Comments


  • Beenthere - Friday, January 27, 2012 - link

    The Win 7 Hot Fix speaks for iteslf. As noted it's a small bump - but it's free. It's not AMD's fault that Microsucks O/Ss sucks. It's reported Linux does a better job of scheduling, probably because it's used on a lot of servers with heavy work loads.

    I always tell people to buy what makes them happy. If you're happy with a product from a convicted criminal corporation and chose to support their efforts to eliminate consumer choice and drive up PC hardware prices - that's your choice and you're perfectly free to do so. Bashing AMD is not going to change reality however, no matter how disappointed you are in them.

    In reality ANY of todays current model CPUs have more than enough computing power for 90+ percent of PC users. If all you do is run benchmarks then you could be misinformed...

    http://www.theinquirer.net/inquirer/news/2120866/i...
  • gamerk2 - Friday, January 27, 2012 - link

    Funny, considering different Linux distributions use different schedulers. Lets not also forget there is OVERHEAD to doing a lot of processing within the scheduler, and keeping track of thread/resource use can be a pain.
  • sor - Saturday, January 28, 2012 - link

    Huh? The process scheduler in Linux is dependent on which version of the kernel you have. Any current distribution should be using CFS. You may be confusing this with the options of IO schedulers.
  • B3an - Friday, January 27, 2012 - link

    Oh grow up you immature moron (typical Linux user!). And Apple are FAR worse now than MS ever was, as well as bigger. BTW it's not the 1990's anymore,
  • frozentundra123456 - Saturday, January 28, 2012 - link

    I cant believe people are still blaming Microsoft for bulldozers failure. It seems to me that the responsibility of a company is to bring out a product that works in the current environment, i.e. that works efficiently with win 7. Especially when you control a small portion of the market, you should make a product that "just works". You shouldnt expect the software to be rewritten for your product. And Intel doesnt seem to have any problem making processers that work efficiently with the current environment.
  • Morg. - Tuesday, January 31, 2012 - link

    I like the *could* be misinformed -- if Intel didn't want benchmarks and reviewers to like them, I'm pretty sure they wouldn't do anything for it ;)
  • cigar3tte - Friday, January 27, 2012 - link

    I don't normally bother, but there is way too many here...
    Worth wild = worthwhile
    It's = its
    Their = There
    Your = you're
    Losses = loses

    And yeah, it's more of a free fix than free performance. Bulldozer users are getting back what was lost, rather than gaining something.
  • snouter - Friday, January 27, 2012 - link

    "there are way too many"

    I don't normally bother either.
  • jonup - Saturday, January 28, 2012 - link

    You can always look at a glass as half-full or half-empty.

    @typos: Sad part is that some of them are native speakers.
  • gevorg - Friday, January 27, 2012 - link

    Can the Sandy Bridge CPUs benefit from this by any chance?
