Original Link: http://www.anandtech.com/show/1366

NV45 Preview: On Package HSI

by Derek Wilson on June 28, 2004 1:00 AM EST



Introduction

In our recent look at the new Intel 925X platform and PCI Express graphics, we bore witness to the final verdict on bridged vs. native performance. It is easy to see that, for both ATI and NVIDIA, the difference between PCIe and AGP is negligible under current DirectX and OpenGL games. Of course, we will take another look at the question when we get boards with both AGP and PCIe slots on them, just to further seal the deal.

But performance differences weren't the only complaint that opponents of non-native PCIe GPUs leveled against NVIDIA. And after last year's struggle through NV3x, NVIDIA just can't afford to lose any battles. So, what is NVIDIA doing about it? We brought you the answer from Computex 2004: NV45.



NVIDIA's NV45 looks just like a PCIe 6800 Ultra.


In short, NV45 is NV40 with an on package PCIe to AGP bridge (which NVIDIA calls "HSI" or "High Speed Interconnect"). And just what do we get from this package integration? Let's find out, shall we?




Bridged or Native: Six of One, Half Dozen of Another?

Before looking at how NV45 solves some of NVIDIA's problems, we must first learn what those problems are. To understand this, we need to take a step back and look at the transition to PCI Express on a larger scale, and why ATI chose native while NVIDIA decided on bridged.

If ATI had gone with a bridged solution rather than native, we wouldn't even be having this discussion. Their choice to go native was based on their internal assessment of the playing field. Obviously, we will eventually end up with all GPUs sporting native PCIe interfaces. Bridging does add a little latency and reduces the maximum possible bandwidth benefit from PCIe (especially if it wasn't possible to overclock the AGP interface to what would be 16x speeds, as NVIDIA has done with their solution). Native solutions are also easier for OEMs to integrate onto their boards. Based on all this, we can't really say that ATI made a bad decision to go with native PCI Express. So, why did NVIDIA take a different stance on the issue?
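For a rough sense of the bandwidth figures behind this argument, the peak theoretical rates can be worked out from the bus parameters. The sketch below assumes the standard AGP figures (32-bit bus, roughly 66.67MHz base clock) and first-generation PCIe figures (2.5GT/s per lane with 8b/10b encoding). The "16x" multiplier is not a standard AGP mode and is used here only to mirror the overclocked bridge link described above. Function and constant names are our own, for illustration, and real-world throughput will be lower than these peaks.

```python
# Peak theoretical bus bandwidth; illustrative names, not from any vendor tool.

AGP_BASE_CLOCK_HZ = 66_666_667       # AGP base clock, roughly 66.67 MHz
AGP_BUS_WIDTH_BYTES = 4              # 32-bit parallel bus

PCIE1_LANE_RATE_BPS = 2_500_000_000  # PCIe 1.x raw signaling rate per lane
# 8b/10b encoding carries 8 payload bits for every 10 bits on the wire.
PCIE1_PAYLOAD_BPS = PCIE1_LANE_RATE_BPS * 8 // 10


def agp_bandwidth_mb_s(multiplier: int) -> float:
    """Peak AGP bandwidth in MB/s for a transfer multiplier (1, 2, 4, 8, ...)."""
    return AGP_BASE_CLOCK_HZ * AGP_BUS_WIDTH_BYTES * multiplier / 1e6


def pcie1_bandwidth_mb_s(lanes: int) -> float:
    """Peak PCIe 1.x bandwidth in MB/s, per direction, for a lane count."""
    return PCIE1_PAYLOAD_BPS / 8 * lanes / 1e6


if __name__ == "__main__":
    print(f"AGP 8x:             {agp_bandwidth_mb_s(8):7.0f} MB/s")
    # Assumption: "16x" models the overclocked GPU-to-bridge link.
    print(f"AGP at '16x' rate:  {agp_bandwidth_mb_s(16):7.0f} MB/s")
    print(f"PCIe x16, each way: {pcie1_bandwidth_mb_s(16):7.0f} MB/s")
```

Note that AGP bandwidth is shared between both directions of traffic, while the PCIe figure applies to each direction independently, which is where a native part keeps its theoretical edge.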

NVIDIA is trying to be cautious about the issue. With native PCI Express support, it would be necessary to fabricate twice as many different types of GPUs as it currently does, with the only difference being the PCIe interface. Regardless of the size of a company, putting an ASIC to silicon is not something to be taken lightly: it's really expensive, especially when talking about potentially low yield, 222M transistor parts. At the same time, with separate GPUs specifically targeted at PCIe, it is necessary to estimate how many PCIe parts will be needed. This means estimating the adoption rate of PCIe. It's much easier to make a bunch of GPUs and bridges and play it by ear. This all makes logical sense, and as long as there is a negligible impact on performance and costs (in other words, value) delivered to the end users, we can't fault NVIDIA for going in this direction.



NV45 without the fan and shroud installed.


And since we've shown performance not to be an issue, in terms of the end users, only cost remains to be seen. Of course, in our assessment of why NVIDIA went with a bridge, we didn't address one of the advantages that ATI has with its native solution: ease of integration by OEMs. More importantly, the disadvantage of the bridge solution isn't simply the inclusion of another discrete component (and another possible point of failure), but the continued need to route another parallel bus on an already packed card. Granted, the close proximity of the HSI to the GPU on NVIDIA cards makes the routing problem much less significant, but it would still be easier just to route serial lines from the connector to the GPU.

There are multiple squishy issues surrounding this, so we will do a little speculation using what we know as a basis. OEMs are essentially locked into selling their video cards at set price points, such as the current famous $499, $399, and $299 targets. They can't be competitive unless they meet these goals. They can't meet these goals and stay profitable unless ATI and NVIDIA keep the cost of their components down. It's certainly more cost effective for NVIDIA to pass on the cost of a bridge chip along with its GPUs than for ATI to have to deal with the impact of constructing a whole separate line of parts. Of course, we have no clue what kind of prices ATI and NVIDIA charge for their components. As there is competition in the marketplace, it would make sense for parts targeted at similar markets to have similar prices. But how much does NVIDIA take into consideration the increased cost to OEMs of integrating its bridge? Maybe, if ATI has to charge more to cover its fabrication costs and NVIDIA has some room to breathe, NVIDIA can afford to. But it's more likely that the OEMs would have to eat the cost, or pass it on to the consumer.

The unfortunate bottom line is that we really don't know the facts and details of what goes on behind the scenes, but there is a certain allure to not having to deal with a discrete bridge component. So, what's an IHV who's decided against native support to do? Enter NV45.




NV45's on Package PCIe to AGP HSI

Our discussion brings us to the conclusion that an external bridge is not going to incur a tangible performance hit, but it may cause headaches and added cost on the OEM's side due to complications with additional component layout and parallel bus routing (which certainly complicates pricing and profitability). And the solution is to move the bridge off of the card and onto the GPU package.



The tiny rectangle underneath the GPU is the HSI.


Now, not only is the HSI out of the vendors' hair, but it will also be cooled by the same HSF that sits on the GPU. This GPU looks and feels like a native solution to vendors (and by our benchmarks, it even acts very much like a native solution).

Dropping this thing on the package may seem out of left field, but even Intel has gone the additional-component-on-package route in the past; the original Pentium Pro had on package cache. A precedent like this begs the question: why didn't anyone think of this in the first place? This really makes perfect sense, especially if rumors are true that it's difficult to get a hold of HSI components unless bundled with a GPU.

So what's the downside? Well, there's still the issue of having less than PCIe bandwidth. This isn't going to be an issue for games in their current state, and won't likely be a real bottleneck in the future. Even on AGP, framebuffer read speed was only a couple of hundred MB/s on NVIDIA cards (and even less on ATI). The ability to get good traffic back over the bus has more to do with the GPU than the availability of bandwidth at this point.
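To put the framebuffer read figure in perspective, here is a back-of-the-envelope sketch of how long it takes to pull one frame back over the bus. The 200MB/s rate is our own stand-in for "a couple of hundred MB/s", the frame is assumed to be 1600x1200 at 32-bit color, and 4000MB/s is the theoretical per-direction peak of a first-generation x16 PCIe link. None of these are measured values.

```python
def readback_ms(width: int, height: int, bytes_per_pixel: int,
                bandwidth_mb_s: float) -> float:
    """Time in milliseconds to transfer one full frame at a sustained rate."""
    frame_bytes = width * height * bytes_per_pixel
    return frame_bytes / (bandwidth_mb_s * 1e6) * 1000


if __name__ == "__main__":
    # Assumed rates: 200 MB/s for observed reads, 4000 MB/s for the PCIe peak.
    for label, rate in (("observed-class read rate", 200.0),
                        ("PCIe x16 theoretical peak", 4000.0)):
        ms = readback_ms(1600, 1200, 4, rate)
        print(f"{label}: {ms:.2f} ms per frame")
```

Under these assumptions, the gap between the two results comes from how fast the GPU can feed data back, not from the width of the pipe, which is the point made above.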

The real thing that NVIDIA loses out on with a bridge is the ability to run multiple video streams up and down the PCIe bus with as much theoretical ease as ATI has. We are working hard to come up with some up and downstream PCIe benchmarks that will tell the real story about what is possible with framebuffer reads, and what is possible with video I/O on native and bridged PCIe GPUs.

But for now, we have NV45. No tangible performance impact under current games due to bridging (though a few odd numbers here and there with our current beta drivers), and now no added headache or extra development cost to OEMs over the ATI solution. With all that cleared up, let's head on to the tests.




The Card and The Test

Well, we don't know what NV45 will be called, we don't know when it will be available, and we don't actually know if these are the final clocks. But that's what we get with a preview or first look. Please note that these speeds could change before the final product is released. Our NV45 is running at 435MHz core and 1.1GHz memory. If our 6800 Ultra Extreme sample from NVIDIA had not been DOA (despite a couple hours of on-site BIOS flashing help from NVIDIA's Jim Black), this is the speed at which it should have run.

Of course, we are hearing something closer to 460MHz core clock (1.2GHz memory) for most vendors who have 6800 Ultra Extreme parts coming out, but that remains to be seen. In any event, since there's no real performance impact, we will finally be able to bring you numbers that are representative of what we should have seen, if we had gotten a working Ultra Extreme sample. Of course, it's on a 3.4GHz P4 EE running DDR2 RAM, so it's not really comparable to the numbers that we ran on the AMD Athlon 64 3400+, but we have some vendors' 6800 Ultra Extreme parts coming to the lab soon enough, and perhaps before then, we'll switch around our graphics test platform a little.

Here's the test platform we used.

Performance Test Configuration

Processor(s):
    Intel Pentium 4 3.4GHz EE Socket 775
    Intel Pentium 4 3.4GHz EE Socket 478
RAM:
    2 x 512MB Micron DDR2 533
    2 x 512MB Corsair 3200XL (Samsung 2-2-2-5)
Hard Drive(s):
    Seagate Barracuda 7200.7
Video AGP & IDE Chipset Drivers:
    Intel Chipset Driver 6.0.0.1014
    Intel Application Accelerator 4.0.0.6211
Video Card(s):
    nVidia NV45
    nVidia GeForce 6800 Ultra PCI Express
    nVidia GeForce 6800 Ultra AGP 8X
    ATI Radeon X800 XT PCI Express
    ATI Radeon X800 XT AGP 8X
Video Drivers:
    nVidia 61.45 Graphics Drivers
    ATI Catalyst 4.6 beta
Operating System(s):
    Windows XP Professional SP1
Power Supply:
    HiPro 470W (Intel)
    Vantec Stealth 470W Aluminum
Motherboards:
    Intel 925XCV (Intel 925X) Socket 775
    Intel D875PBZ (Intel 875P) Socket 478

And here are the numbers that we've all been waiting for.




EVE: The Second Genesis Performance

It seems like even at 1600x1200 with everything turned on, EVE is a little more CPU or system limited than graphics card limited. This game is, however, a beautiful way to spend circa 33fps.

[Charts: EVE: The Second Genesis (2 graphs)]




F1 Challenge '99-'02 Performance

At every setting except 16x12 with 4xAA and 8xAF enabled, this benchmark looks a little CPU limited. The PCIe NV45 still can't put in a better performance than the AGP version of the 6800 Ultra.

[Charts: F1 Challenge '99-'02 (4 graphs)]




FarCry Performance

Without AA and AF, FarCry ends up seeing a massive performance drop for anything that's an NVIDIA PCIe solution. We are still looking into this matter. With AA and AF enabled, however, a 6.7% increase in performance over the AGP version of the card is an impressive recovery.

[Charts: FarCry (4 graphs)]




Final Fantasy XI Benchmark Performance

We aren't seeing the NV45 do as well as we would expect here. We have only a negligible improvement over the PCIe version of the 6800U. Of course, this benchmark is very sensitive to multiple factors, so this could be a driver issue.

[Chart: Final Fantasy XI Benchmark 2]




Halo Performance

Even though the ATI cards still maintain the performance lead here, NV45 makes a valiant effort to catch up. It seems that core clock speed really helps out when it comes to Halo. Hopefully this will bode well for overclockers of next generation GPUs.

[Charts: Halo (2 graphs)]




Homeworld 2 Performance

The performance increase here is definitely a good return on investment. NV45 comes out on top in both tests that we ran under Homeworld 2.

[Charts: Homeworld 2 (2 graphs)]




Jedi Knight: Jedi Academy Performance

With about a 6.4% increase at 16x12, NV45 does pretty well for itself. The benefit doesn't come out as much at lower resolutions, but that could be due to a little CPU limitation being imposed at 12x10.

[Charts: Jedi Knight: Jedi Academy (2 graphs)]




Neverwinter Nights: Shadow of the Undrentide

Here, NV45 doesn't give much of an advantage, but it does keep the part up at the top of the heap. It does seem that in the two benchmarks without AA and AF, NV45 helped alleviate the slight lag in the PCIe 6800U.

[Charts: Neverwinter Nights: Shadow of the Undrentide (4 graphs)]




Unreal Tournament 2004

Without AA and AF (and even with AA and AF at a low resolution), NV45 keeps up just fine with the top performers. At high resolution with AA and AF on, the GPU with the on package HSI pulls ahead of the other NVIDIA cards to just under the ATI offerings.

[Charts: Unreal Tournament 2004 (4 graphs)]




Warcraft III: The Frozen Throne

One would tend to think that VSYNC was enabled for this benchmark, but honestly, while it runs, we see FRAPS hitting higher and lower numbers. To be fair, in the future, we may start looking at this benchmark in OpenGL mode. We spoke with Blizzard recently, and they mentioned that the two rendering methods have converged on visual quality. They maintain that the only remaining differences are due to inherent properties of the different APIs, which keep the two paths from producing identical scenes, pixel for pixel, in Warcraft III.

Regardless of all that, there's no real difference between any of the top cards in either of these benchmarks.

[Charts: Warcraft III: The Frozen Throne (2 graphs)]




Wolfenstein: Enemy Territory

The New Low Res(TM), 1280x1024, looks CPU limited here, with not much difference between NV45 and the 6800U flavors. When we turn on AA/AF or increase the resolution (or both), NV45 pulls ahead. At best, we get a 7.5% performance increase for a little more than an 8% clock speed increase over the 6800 Ultra. That's definitely not too shabby.
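As a quick check on that comparison, the arithmetic can be sketched as follows. The 400MHz baseline for the 6800 Ultra is inferred from the 435MHz NV45 core clock and the 35MHz difference quoted in this article; the helper names and the "scaling efficiency" ratio are our own shorthand, not a standard metric.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100


def scaling_efficiency(perf_gain_pct: float, clock_gain_pct: float) -> float:
    """Fraction of the clock speed increase that shows up as performance."""
    return perf_gain_pct / clock_gain_pct


if __name__ == "__main__":
    # Assumption: 400 MHz baseline inferred from 435 MHz minus the 35 MHz gap.
    clock_gain = percent_increase(400, 435)
    best_case = scaling_efficiency(7.5, clock_gain)
    print(f"Core clock increase: {clock_gain:.2f}%")
    print(f"Best-case scaling:   {best_case:.0%} of the clock increase")
```

A ratio close to 1 suggests the test is bound by the GPU core, while a ratio near 0 points to a CPU or system limit, as we suspect at 1280x1024 here.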

[Charts: Wolfenstein: Enemy Territory (4 graphs)]




Final Words

So, what's the verdict? An on-package bridge will help NVIDIA relieve OEMs of any fears that they may have had about working with a non-native solution, and we aren't seeing any real performance drop due to the existence of a bridge. There is the bandwidth issue, which is very application-specific and doesn't affect anything that we normally test in our graphics suite (though we are looking for ways to incorporate a test).

On the other hand, NV45 in its current incarnation is only a 35MHz overclock beyond the 6800 Ultra. This does help close the already small gap in the occasional benchmark where PCIe lags AGP. In a case or two, we see performance gains approaching the percent difference in clock speed (at 1600x1200 with 4xAA/8xAF anyway).

If NV45 doesn't make its way out until later, it's possible that we will be seeing this at a slightly different clock speed (the sample that we saw at Computex was clocked much lower than the sample we have in our labs). And in the end, it's not just a performance question, but a value question. Will NVIDIA charge more money for this part than for the discrete HSI solution paired with a GPU? Will OEMs save significant amounts of money on board costs? We will really have to wait and see what happens in the retail market over time to understand the answers to these questions.

There are still those PCIe adoption issues to worry about on both sides. How many on package chips to produce versus how many discrete HSI components to build is a tough decision. Having options is what NVIDIA is all about this time around. Last year, they were painted into a bit of a corner with NV3x and a 130nm fab process that wasn't quite what they wanted (or so the rumors go). This year, NVIDIA wants to be ready for anything. Hopefully, both ATI and NVIDIA will get what they deserve for building quality parts this time around.
