"PWShort" and "Broadcast" Communication Technologies

The NVIDIA nForce 790i (Ultra) SLI chipset will introduce a couple of new data transfer technologies created to reduce some of the typical latencies of GPU-to-GPU communication and to improve bandwidth utilization wherever possible. "Posted-Write Shortcut" (PWShort) and "Broadcast" allow NVIDIA-based systems with multiple GPUs to accelerate some of the standard functionality that has, in some configurations, resulted in performance bottlenecks.

Posted-Write Shortcut



In a 2-way SLI configuration, with the cards installed in the two PCI Express 2.0 slots, the 790i SPP handles the routing of some portion of the traffic between each GPU as well as requests for access to main system memory and communications to and from the CPU. Traffic can also be relayed between cards by way of the SLI bridge. Typically, data sent from one GPU to the other would travel by way of the PCIe controller, which would forward the packet on to the memory controller, where the message would be decoded, parsed, and eventually returned to the PCIe controller for final disposition. This is a somewhat inefficient use of both memory controller and PCIe controller resources, and it consumes a tremendous amount of additional bandwidth for what should be a simple GPU-to-GPU transfer.



The nForce 790i SPP now includes the ability to peek inside each data packet coming from any device serviced by the chipset's PCIe controller; it can then forward the message directly to its final destination if the intended target is indeed the tandem GPU. Most of these types of communications are data transfers required to keep the two cards' frame buffers synchronized so that each GPU is always reading and modifying the most recent video information. The improved point-to-point data interface made possible by PWShort technology acts to greatly reduce the latency of these types of transfers and ensure more efficient use of the memory-to-PCIe controller link resources.
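The routing decision described above can be sketched as a toy model. This is only an illustration under stated assumptions: the device labels, the `Packet` type, and the `route` function are hypothetical names invented here, and NVIDIA has not published how the SPP actually inspects or forwards packets.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical device labels for a 2-way SLI system (assumption, not NVIDIA's naming).
GPUS = {"gpu0", "gpu1"}


@dataclass(frozen=True)
class Packet:
    source: str         # device that issued the transfer
    destination: str    # intended final target
    posted_write: bool  # True for a posted write (no completion expected)


def route(packet: Packet, pwshort_enabled: bool = True) -> List[str]:
    """Return the hops a packet takes in this simplified model."""
    targets_peer_gpu = (
        packet.posted_write
        and packet.destination in GPUS
        and packet.destination != packet.source
    )
    if pwshort_enabled and targets_peer_gpu:
        # Shortcut: the PCIe controller forwards straight to the other GPU.
        return [packet.source, "pcie_controller", packet.destination]
    if packet.destination in GPUS:
        # Legacy path: round trip through the memory controller and back.
        return [
            packet.source,
            "pcie_controller",
            "memory_controller",
            "pcie_controller",
            packet.destination,
        ]
    # Anything else (for example system memory) goes via the memory controller.
    return [packet.source, "pcie_controller", "memory_controller", packet.destination]
```

In this model a GPU-to-GPU posted write takes three hops with the shortcut and five without it, which is the kind of saving on the memory-to-PCIe controller link that the description suggests.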

Broadcast



With systems containing multiple GPUs, the CPU often needs to send the same data to each GPU - for example, when texture, geometry, or other common rendering data must be transferred over the FSB, ultimately intended for receipt by all GPUs. Normally the CPU must send a separate set of data packets to target each GPU. This congests the FSB with additional CPU-to-GPU traffic, consuming bandwidth that could be better used for other purposes such as main memory read or write requests or cache coherency with quad-core CPUs.



In addition to the direct link technology previously explained, Broadcast technology allows the CPU to send a single message, which is then received, replicated, and broadcast to all GPUs, eliminating the need for multiple, near-identical transfers over the FSB. This allows for more timely access to other resources that must share this common interface. We're not exactly sure how the SPP can control what the CPU does and does not send over the FSB, and it seems likely that the system drivers have also been changed to accommodate this new technology; either way, there's definitely more to this than we have shared so far. We will be sure to pass along anything we learn.
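The FSB saving that Broadcast promises comes down to simple counting, sketched below. The function names `fsb_transfers` and `replicate` are hypothetical, and the model assumes every message is destined for all GPUs; it says nothing about how the chipset or drivers actually implement the feature.

```python
from typing import Dict, Iterable


def fsb_transfers(num_messages: int, num_gpus: int, broadcast: bool) -> int:
    """Count CPU-to-chipset transfers needed to deliver each message to every GPU."""
    if num_messages < 0 or num_gpus < 1:
        raise ValueError("need num_messages >= 0 and num_gpus >= 1")
    # Without Broadcast the CPU repeats each message once per GPU;
    # with Broadcast it sends each message over the FSB only once.
    return num_messages if broadcast else num_messages * num_gpus


def replicate(message: bytes, gpu_ids: Iterable[str]) -> Dict[str, bytes]:
    """Model the chipset-side fan-out: one incoming message, one copy per GPU."""
    return {gpu: message for gpu in gpu_ids}
```

For example, `fsb_transfers(100, 3, broadcast=False)` gives 300 while `fsb_transfers(100, 3, broadcast=True)` gives 100, so in this idealized model the FSB traffic for shared data no longer grows with the number of GPUs.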

790i users with single-GPU systems should not see any appreciable gains from these new technologies; both are aimed at improving communication efficiency in multi-GPU configurations. It's also difficult for us to quantify exactly what kind of impact these new features have on overall system performance. Isolating and attributing gains to either of these innovations is exceedingly difficult, since we can't just turn these features on and off at will. While 2-way SLI stands to benefit, we can't help but think the PWShort and Broadcast technologies stand to benefit 3-way and 4-way SLI configurations even more; perhaps we will see better scaling with additional GPUs on 790i.


19 Comments


  • ATWindsor - Wednesday, March 19, 2008 - link

    I couldn't agree more, features is all well and good, but only if things works and are stable. No wonder people find it daunting to build a computer, even when you have done it several times you risk going into som "trap" with things not working the way it should, more focus on this in reviews please.

    AtW
  • theYipster - Tuesday, March 18, 2008 - link

    I agree with Lopri in every regard. AT needs to stop masquerading these technical showcase articles as reviews. In addition to what Lopri already mentioned, I would add that AT failed to a) address the long standing concerns held throughout the enthusiast community over nForce product quality (regarding the paragraph on PWM design... very undwerwhelming considering that it doesn't offer support to its claim) and b) failed to provide a fair assessment of the value proposition these boards provide. The article states that the 790i provides a noticeable step up in performance over previous generations, and that owners of previous boards would find upgrading worthwhile. This is a bold claim, as such an upgrade would cost nearly $1000 (when factoring in new DDR3 RAM) and would not even include a new CPU or graphics card. Yes, the NB runs a bit cooler and can OC a bit farther, but how and why is that worth $1000, even to the enthusiast who can afford it easily? Lets also consider the grander scheme of things. What worth is it for someone who enjoys the latest and greatest to spend $350 on a board when Nehalem will change all the rules in less than a year. At least previous generations (as well as Intel's X38) provide some shelf life.

    In any case, Overclock3d.net has a very informative review of the Striker II Extreme which covers almost everything Lopri mentioned.

  • ssiu - Tuesday, March 18, 2008 - link

    "The EVGA 790i Ultra also handled our QX9770 sample with relative ease. We were able to benchmark and play games without incident at 400MHz FSB, our mark of excellence when it comes to quad-core overclocking."

    That is a low standard of excellence for a high-end chip. The Q9300/Q9450 overclockers are going to cry.
  • greylica - Tuesday, March 18, 2008 - link

    Mwahaha, some will say :
    " Now we can finnaly play crysis ! "
    Well done, 66 fps...
  • n0nsense - Tuesday, March 18, 2008 - link

    We can for a very long time.
    I do it with 1920x1200 at all Med + 4AA
    I have the 680i (P5N32-E SLI) + E6300@2.8GHz (not the maximum, but lower fan speed = less noise) + 4GB OCZ ReaperX @ 800MHz 4-4-3-12 1T and single reference design 8800GT from ASUS at stock clock (the only modified sing, is stock cooler replaced with Arctic Cooling Accelero S1 which reduced card temp by 25C)
    As you can see MB - year old, CPU 1.5 years old.
    I can't tell you the exact fps, but it's completely smooth playing.
    I expect next generation to bring same smooth play at all very high + all filterings for existing games.

    BTW, where 9800x2 in SLI tests on this 790i ?
  • SpaceRanger - Tuesday, March 18, 2008 - link

    When do you think nVidia will be putting out these boards for AMD CPU's. The only thing I see for AMD CPU's are boards that support CROSSFIRE, but not SLI.
  • ap90033 - Tuesday, March 18, 2008 - link

    It just costs to MUCH. I got 8 GIGS DDR2 800 an E8400 and a Single 8800GTS 512 meg, and I have the CPU Running at 3.6 (I am looking to try 3.8 maybe) and I can play any game maxed except Crysis. I can play it at high at 1024x768. I looked at SLI but its to danged expensive, I had 1220$ to spend and decided to get the most performance for the money. I wish they would quit going up in price on these motherboards, hey Nvidia, you do know I can get a GREAT Overclockers motherboard with good features (NO SLI OF COURSE) for $80 right? Why would I pay $250+ more for the board, another $200+ more for DDR3 Ram, and another $250 for another 8800GTS just so "some" games would run 15% faster? Are you nuts??? 10-15% but it costs like $800 MORE???? I think Ill save my $800 or so and use it on my next video card upgrade, my next CPU upgrade, and the next video card upgrade after that! LOL
  • krnmastersgt - Tuesday, March 18, 2008 - link

    Because this isn't meant at the people that want the best price/performance, this is for the uber-high end user, the extreme benchmarker/extreme gamer, of course by your logic SLI and CrossFire are stupid wastes of money since the performance doesnt scale linearly, but this is meant for enthusiasts and therefore you shouldn't compare it with something like a P35 board.
  • crimson117 - Tuesday, March 18, 2008 - link

    As an example, I was helping configure a Dell for a home office user, non-gamer, no video editing, etc, but he was fairly well-off money-wise. While picking options, at one point I said something about some component being "plenty for most users" and he replied (in a nice way) "I'm not most users"; so we went with the upgraded version even though the price performance, especially for his usage pattern, didn't make fiscal sense.

    The moral is there are people out there who get satisfaction over having the absolute best no matter the price.

    Relatedly, an experiment found that people perceive $90 wine as tasting better than $10 wine, even when it was secretly the same exact wine (http://www.news.com/8301-13580_3-9849949-39.html).
