FSB Impact on Performance

We've alluded to FSB bandwidth being a fundamental limitation in Intel's multiprocessor architecture, and now we're here to address the issue a bit further.
A major downside to Intel's reliance on an external North Bridge is that implementing multiple high-speed FSB interfaces is very expensive, and it becomes a difficult engineering problem once you grow beyond 2-way configurations. Unfortunately, Intel's solution isn't a very elegant one: regardless of whether you're running 1, 2 or 4 Xeon processors, they all share the same 64-bit FSB connection to the North Bridge.

The following diagram should help illustrate the bottleneck:

In the case of a 4-way Xeon MP system with a 400MHz FSB, the 64-bit bus delivers 3.2GB/s in total, so with all four processors competing for it each one can be offered a maximum of only 800MB/s of bandwidth to the North Bridge. If you try running a single-processor Pentium 4 3.0GHz with a 400MHz FSB you'll note a significant performance decrease, and that's while still giving the processor the full 3.2GB/s of FSB bandwidth. Cut that down to 800MB/s and the performance of the processor would suffer tremendously.
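The arithmetic behind those figures can be sketched in a few lines. This is a rough illustration rather than a model of real bus behavior: it assumes "400MHz" means 400 million transfers per second on the 64-bit data path, uses decimal megabytes, and splits the bus evenly between processors, ignoring arbitration and coherency traffic. The function names are made up for this example.

```python
# Peak bandwidth of a shared front-side bus, split evenly across CPUs.
# Assumptions (not measurements): 64-bit data path, 400 million transfers
# per second, decimal MB, and a perfectly even split with no bus overhead.

FSB_WIDTH_BITS = 64
FSB_TRANSFERS_PER_SEC = 400_000_000


def fsb_peak_mb_per_sec(width_bits=FSB_WIDTH_BITS,
                        transfers_per_sec=FSB_TRANSFERS_PER_SEC):
    """Peak bus bandwidth in MB/s (decimal megabytes)."""
    bytes_per_transfer = width_bits // 8
    return bytes_per_transfer * transfers_per_sec / 1_000_000


def per_cpu_share_mb_per_sec(num_cpus):
    """Even share of the bus when every CPU is competing for it."""
    return fsb_peak_mb_per_sec() / num_cpus


if __name__ == "__main__":
    for cpus in (1, 2, 4):
        share = per_cpu_share_mb_per_sec(cpus)
        print(f"{cpus}-way: {share:.0f} MB/s per CPU")
```

With these assumptions the 1-way, 2-way and 4-way cases work out to 3200, 1600 and 800 MB/s per processor, matching the 3.2GB/s and 800MB/s figures quoted above.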

It is because of this limitation that Intel must rely on larger on-die L3 caches to hide the FSB bottleneck; the more information that can be stored locally in the Xeon's on-die cache, the less frequently the Xeon must request data over the heavily trafficked FSB.

What's even worse about this shared FSB is that the problem grows as you increase the number of CPUs and their clock speed. A 2-way Xeon system won't feel the FSB bottleneck as much as a 4-way Xeon MP, and a 4-way Xeon MP running at 3GHz will be hurting even more than a 4-way 2.0GHz Xeon MP. It's not a nice situation to be in, and there's nothing you can do to skirt the issue, which is where AMD's solution begins to look much more appealing:

First, remember that each Opteron has its own on-die North Bridge and memory controller, so there are no external chipsets to deal with. Each Opteron CPU features three point-to-point HyperTransport links, each delivering 3.2GB/s of bandwidth in each direction (6.4GB/s full duplex). The advantage is clear: as you scale the number of CPUs in an Opteron server there are no FSB bottlenecks to worry about. Scalability on the Opteron is king, which is the result of designing the platform first and foremost for enterprise-level server applications.
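To put the two designs side by side, here is a back-of-the-envelope comparison using only the numbers quoted in this article. It is a sketch under stated assumptions: every HyperTransport link is counted at its peak rate, the shared FSB is split evenly, and the function names are hypothetical. How many of the three links actually connect to other CPUs depends on the system layout, which this calculation doesn't capture.

```python
# Per-CPU peak link bandwidth: point-to-point links vs. a shared bus.
# Figures are the ones quoted in the article; everything is peak, in MB/s.

HT_LINKS_PER_CPU = 3
HT_MB_PER_SEC_PER_DIRECTION = 3200   # 3.2GB/s each way per link
SHARED_FSB_MB_PER_SEC = 3200         # 64-bit FSB at 400MHz, shared by all Xeons


def opteron_link_mb_per_sec(full_duplex=True):
    """Peak bandwidth across all HyperTransport links of one Opteron."""
    directions = 2 if full_duplex else 1
    return HT_LINKS_PER_CPU * HT_MB_PER_SEC_PER_DIRECTION * directions


def xeon_fsb_share_mb_per_sec(num_cpus):
    """Even share of the single FSB when all Xeons compete for it."""
    return SHARED_FSB_MB_PER_SEC / num_cpus


if __name__ == "__main__":
    for cpus in (1, 2, 4):
        print(f"{cpus}-way: Xeon {xeon_fsb_share_mb_per_sec(cpus):.0f} MB/s per CPU, "
              f"Opteron {opteron_link_mb_per_sec()} MB/s per CPU")
```

The point of the comparison is the trend rather than the absolute numbers: the Opteron figure stays constant per CPU as processors are added, while the Xeon share of the FSB shrinks.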

Intel may be able to add 64-bit extensions to their Xeon MPs, but the performance bottlenecks that exist today will continue to plague the Xeon line until there's a fundamental architecture change.


58 Comments


  • Fraggster - Tuesday, March 2, 2004 - link

    intel=pwnd again :)
  • Jason Clark - Tuesday, March 2, 2004 - link

    64Bit tests are next on our agenda, once there is an Extended 64bit version of SQL Server.... :) We're looking into other avenues as well.

    Andreas, windows 2003 enterprise is what we used.
  • fukka - Tuesday, March 2, 2004 - link

    Would the Opterons gain any advantage using a 64bit OS (aka Linux) and a database that is much bigger than 4GB in size?

    That would be interesting to see, but I suppose the IA32e will address that advantage...
  • andreasl - Tuesday, March 2, 2004 - link

    Hey Anand have you thought about moving to Server 2003 instead of running 2000? And any chance of seeing 64-bit results anytime soon? (does a 64-bit version of your app even exist?)
  • christophergorge - Tuesday, March 2, 2004 - link

    Opteron only works with ECC registered memory. They only come up to DDR333.
  • raptor666 - Tuesday, March 2, 2004 - link

    Maybe because 4 way boards might not support it.

    Just a guess but honestly i'm not sure.

    Peter

  • tolgae - Tuesday, March 2, 2004 - link

    Stupid question probably but why didn't you use DDR400 on the Opteron?
  • CRAMITPAL - Tuesday, March 2, 2004 - link

    No surprises here... Anyone with a clue has known for a year that Opteron/A64 is a far superior architecture to anything Intel builds, sells, or plans to produce in the next two years.
