PCI Express Graphics

When the first 440LX motherboards hit the streets with AGP support, it was exciting to finally have a new slot on motherboards that had been littered with PCI and ISA slots for so long. That excitement returns with the new PCI Express x16 slots that have found their way onto 925X and 915 boards. So, what is the big deal behind PCI Express as a graphics bus?

For starters, AGP and PCI are parallel buses, while PCI Express is serial. Rather than sending multiple bits at a time, PCI Express sends only one bit per clock in each direction. Mildly confusing is the fact that multiple PCI Express lanes can be connected to one device (giving us PCI Express x4, x8, x16, and so on). Why is an array of serial interfaces different from a parallel interface? We're glad that you asked.

Signaling is generally more difficult using a parallel communication protocol. One of the problems is making sure that all the data being sent in parallel makes it to its destination in a timely fashion (along with all the signaling and control flow data included with each block of data sent). This makes circuit board layout tricky at times, and forces signaling over cables to relatively short distances using equal-length lines (e.g. an IDE cable). The care that must be taken to get all the bits to their destination intact and together also limits signaling speed. Standard 32-bit PCI runs at 33MHz. DDR memory is connected to the rest of the system in parallel and runs at a few hundred MHz. On the other hand, a single PCI Express lane signals at 2.5GHz, and the design is intended to scale well beyond that.
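
To put rough numbers on that comparison, here is a quick back-of-the-envelope sketch in Python (our own illustration, assuming standard 32-bit/33MHz PCI and a first-generation 2.5GHz PCI Express lane):

```python
# Rough peak-bandwidth comparison: classic PCI vs. one PCI Express lane.
# Assumes standard 32-bit/33MHz PCI and first-gen 2.5GHz PCIe signaling.

PCI_CLOCK_HZ = 33_000_000        # standard PCI clock
PCI_BUS_WIDTH_BITS = 32          # bits moved per clock, in parallel

PCIE_CLOCK_HZ = 2_500_000_000    # first-gen PCIe lane signaling rate
PCIE_LANE_WIDTH_BITS = 1         # one bit per clock, per direction

pci_peak_mb = PCI_CLOCK_HZ * PCI_BUS_WIDTH_BITS / 8 / 1e6
pcie_raw_mb = PCIE_CLOCK_HZ * PCIE_LANE_WIDTH_BITS / 8 / 1e6

print(f"PCI peak (shared bus):  {pci_peak_mb:.0f} MB/s")
print(f"PCIe x1 raw, one way:   {pcie_raw_mb:.1f} MB/s (before encoding overhead)")
```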

The downside of this enhanced speed and bandwidth is bus utilization. Obviously, if we are sending data serially, we are only sending one bit every clock cycle. This is 32 times less data per clock cycle than the current PCI bus. Add to that the fact that all low level signaling and control information needs to come over the same single line (well, PCIe actually uses differential signaling - two wires per bit - but who's counting). On top of that, serial links don't react well to long strings of ones or long strings of zeros (the receiver needs regular transitions to recover the clock), so extra signaling overhead - 8b/10b encoding, in PCI Express's case - is used to keep the bit stream well behaved. Parallel signaling has its share of problems, but a serial bus will always have lower utilization. Even in cases where a serial bus has a bandwidth advantage over a parallel bus, latency may still be higher with the serial bus.
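
To see what that encoding overhead does to the raw numbers, here is another small sketch (the 8-bits-in-10 ratio comes from the 8b/10b scheme itself; the script is just our illustration):

```python
# Effective per-lane bandwidth after 8b/10b encoding (first-gen PCIe).
# Every 8 bits of data are sent as a 10-bit symbol on the wire, which
# guarantees frequent transitions at the cost of 20% of the raw rate.

RAW_RATE_BPS = 2_500_000_000   # 2.5Gb/s raw signaling, per lane, per direction
ENCODING_EFFICIENCY = 8 / 10   # 8 payload bits per 10 wire bits

effective_bps = RAW_RATE_BPS * ENCODING_EFFICIENCY
print(f"Effective: {effective_bps / 8 / 1e6:.0f} MB/s per lane, per direction")
# -> 250 MB/s; sixteen lanes give 16 * 250MB/s = 4GB/s each way
```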

Fortunately, PCI Express is a very nice improvement over the current PCI bus. It's point-to-point, so we don't need to deal with bus arbitration; it's serial, so it will be easy to route on a motherboard (with just four data wires for PCI Express x1) and will scale up in speed more easily. It's also backwards compatible with PCI from the software's perspective (which means that developers will have an easier time porting their software).
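
Since bandwidth scales linearly with lane count, the per-lane figure above extends directly to wider links. A tiny sketch of that scaling (again our own illustration, using the post-encoding per-lane number):

```python
# Per-direction bandwidth by lane count (first-gen PCIe, after 8b/10b).
LANE_MB_S = 250  # effective MB/s per lane, per direction, from the math above

for lanes in (1, 4, 8, 16):
    print(f"PCIe x{lanes:<2}: {lanes * LANE_MB_S / 1000:.2f} GB/s per direction")
```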

Unfortunately, it will be harder for users to "feel" the advantages of PCI Express over PCI, especially while the transition is under way: motherboards will keep supporting "legacy" PCI slots and buses, and companies will have to find the sweet spot between their PCI and PCI Express (or AGP and PCI Express) based cards. Software won't immediately take advantage of the added bandwidth because it is common practice (and simply common sense) to develop for the widest audience and highest level of compatibility when dealing with any type of computing.

Even after game developers make the most of PCI Express x16 in its current form, end users won't see that much benefit - there's a reason that high end GPUs have huge numbers of internal registers and a 35GB/s connection to hundreds of megs of local GDDR3. By the time games come out that would even think of using 4GB/s up from, and 4GB/s down to, main memory, we'll have even more massive amounts of still faster RAM strapped onto graphics boards. The bottom line is that the real benefit will present itself in applications that require communication with the rest of the system, like video streaming and editing, or offloading some other type of work from the CPU onto the graphics card.
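
To see why local memory still wins for graphics workloads, compare how long it takes to move a working set over the x16 link versus over the card's own memory bus (a sketch; the 256MB working set is a hypothetical figure chosen purely for illustration):

```python
# Time to move a working set over PCIe x16 vs. a card's local memory bus.
# The 256MB working set is a hypothetical figure, chosen for illustration.

WORKING_SET_MB = 256
PCIE_X16_GB_S = 4.0    # one direction of a first-gen PCIe x16 link
LOCAL_MEM_GB_S = 35.0  # ballpark bandwidth to a high-end GPU's local GDDR3

for name, gb_s in (("PCIe x16", PCIE_X16_GB_S), ("local GDDR3", LOCAL_MEM_GB_S)):
    ms = WORKING_SET_MB / 1024 / gb_s * 1000
    print(f"{name:12}: {ms:5.1f} ms to move {WORKING_SET_MB}MB")
```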

Comments

  • khirareq - Friday, April 1, 2005

    Um, sorry, but I feel that I need to point something out.


    You state a number of times that the pins need to be twisted in order to secure the HSF. If you read the leaflet that's included with the CPU, it states that the pins are twisted in order to release the HSF for removal.

    Intel's Manual Download (>10meg):
    http://support.intel.com/support/processors/sb/CS-...

    Screenshot of the page:
    http://photobucket.com/albums/v337/khirareq/?actio...

    I discovered this at work the other night after spending some time trying to work out how to remove one, and resigned myself to reading the manual (turns out the HSF was faulty and jammed in the board anyway).
  • Pete - Tuesday, June 29, 2004

    Anand, not to get too confrontational, but have I offended you in such a way that you choose not to reply to my questions? I'm not sure why my surprise at the 6800U's gains in Far Cry isn't worth remarking on.

    I'd appreciate an answer. If you take exception to my questioning your numbers, I'd be satisfied with a reply to that effect, and I'd readily apologize if I've offended you with my perhaps overly blunt questioning.
  • justly - Sunday, June 27, 2004

    Anand, thank you for the response, and for the effort you put forth in getting it.

    A few (minor) questions could still be asked about mechanical stability, but it is much more believable than the electrical issue.

    Again, thank you.
  • Anand Lal Shimpi - Friday, June 25, 2004

    justly

    As promised, I got together with Intel to talk about their statement. Intel has revised their statement and instead states that the ~40 lbs of pressure is used for mechanical stability and not for the stability of the electrical connections - good call :)

    As you already mentioned, LGA-775 is a different story, since it needs the pressure to maintain contact with the pins. Apparently the heatsink doesn't need to apply as much pressure as before, since mechanical stability isn't an issue with LGA-775.

    So in the end it wasn't a heat transfer issue or an electrical issue, purely mechanical.

    I've made the appropriate corrections to the piece.

    Take care,
    Anand
  • Pete - Tuesday, June 22, 2004

    Hi Anand,

    Any comment re: my previous post on the 6800U Far Cry numbers? Just checking if they're right. Thanks.
  • Cygni - Tuesday, June 22, 2004

    I've actually discussed Prescott a little with a designer at Intel's Folsom facility (although this person worked on the Granite Bay chipset and then some Centrino work). He can't really figure out the chip either, but he believes that the entire purpose of Prescott hasn't been taken out from under wraps yet. Possibly mechanisms to combat the problems with increases in clock speed etc... things that are on the core, just not activated (a la HT). I guess we will see. Maybe the purpose of Prescott is to ready technologies and processes to combat Hammer's successor when it appears? Neither of us were sure.
  • stephenbrooks - Tuesday, June 22, 2004

    Those software compilation scores do not look pretty for Intel. Looks like they'll be approaching 5GHz before a Prescott-like processor will beat even an FX-53! 8-\ New CPU core, please...
  • araczynski - Tuesday, June 22, 2004

    very nice article, like the depth.

    sounds like the bottom line (for my tastes) is to get the 6800U and forget the Intel line for another year.
  • Anand Lal Shimpi - Tuesday, June 22, 2004

    ThePlagiarmaster

    Sorry, I completely forgot to post my reply to your post :)

    We started using Gordian Knot because that's what we found was most recommended for high quality DivX ripping. Instead of just benchmarking every codec/ripping tool for our CPU reviews, what I'd rather do is compare all of the codecs/tools and figure out which one truly offers the best quality - then it's the performance using that configuration that matters. After all, who cares if AMD or Intel is faster if it's on an application that no one actually uses? That's not the point of a real world benchmark.

    Give us time, and we will not disappoint. I've already talked to Derek about doing such an article, but now I think I'm going to push up its priority a bit.

    Take care,
    Anand
  • ThePlagiarmaster - Tuesday, June 22, 2004

    Anand.

    I take it no comment means you're off benchmarking dvd2avi for a DivX showdown?? :)

    Pumpkinierre,

    You're welcome :) Hopefully we'll get some benchmarks here, proving once and for all who rules the DivX roost. At least Anand's users would be more informed in the end. For anyone interested, LOOK HERE:
    http://www.hardocp.com/article.html?art=NjMwLDU=
    Looks like a 20% victory for AMD64 in DivX (dvd2avi). A quick look lower on the page shows Intel (3.6GHz, 3.4EE) with about the same 20% victory in DivX (Xmpeg frontend). Perhaps Anand can end it all by testing one against the other?

    Maybe a whole article could be done on this? With, say, ripping to DivX, ripping R9 retail to DVD5 (CCE/TMPGEnc etc.?), ripping MP3s, etc. I'm sure there are more CPU-intensive ideas, but the point is finding the best app to do the same job on both platforms. Rather than a blanket statement like 'Intel is better than AMD at DivX' when it's not clear that's true. Not with so many frontends to choose from that do the same job, and CLEARLY they perform DRASTICALLY differently on each CPU (AMD/Intel). With games it's cut and dried (no frontends, just the game itself), but apps are a different story.

    Plag
