Original Link: http://www.anandtech.com/show/1583


One of the first things we thought of when we heard that NVIDIA was going to try to bring back the multi-GPU craze was the single board solution. Even back in the 3dfx days, there was Obsidian ready with the single board SLI solution. Gigabyte is hitting multi-GPU technology hard out of the gate with a single board 6600 GT solution dubbed the 3D1. We were able to get our hands on this and two new motherboards from Gigabyte last week for a round of holiday testing.

The two major focuses of the article will be to explore any advantages offered by the 3D1 over two-card SLI solutions, and to take a first look at the performance of the GA-8AENXP Dual Graphic Intel SLI offering from Gigabyte. This is the 925XE version of the earlier announced 915P based Dual Graphic board.

The reader should understand this before beginning the review: these solutions are somewhat limited in application until NVIDIA changes its philosophy on multi-GPU support in ForceWare drivers. In order to get any multi-GPU support at all, the driver must detect an SLI capable motherboard. This means that we had to go back to the 66.81 driver in order to test Intel SLI. It also means that even if the 3D1 didn't require a special motherboard BIOS in order to boot video, it wouldn't be able to run in SLI mode unless it were in an SLI motherboard.

As it stands, the optimal single card solution can't be had until NVIDIA allows multi-GPU functionality to be enabled on motherboards without explicit SLI support. Combine this with a multi-GPU graphics card that doesn't require special BIOS hooks to POST, and we have a universal single card solution. Until then, bundling the GA-K8NXP-SLI motherboard and 3D1 is a very good solution for Gigabyte. Those who want to upgrade to PCI Express and a multi-GPU solution immediately have a viable option here. They get the motherboard needed to run an SLI system and two GPUs in one package with less hassle.

For now, we are very interested in taking a look at the first of many innovations that are sure to come out of the graphics card vendors' multi-GPU R&D departments.

The Hardware

First up on the chopping block is the Gigabyte 3D1. Gigabyte is touting this board as being a 256bit 256MB card, but this is only really half the story. Each of the 6600GT GPUs is only privy to half of the bandwidth and RAM on the card. The 3D1 is obviously a 256MB card, since it physically has 256MB of RAM on board. But, due to the nature of SLI, two separate 128bit/128MB busses do not translate to the same power as a 256bit/256MB setup does on a single GPU card. The reason for this is that a lot of duplicate data needs to be stored in local memory off of each GPU. Also, when AA/AF are applied to the scene, we see cases where both GPUs end up becoming memory bandwidth limited. Honestly, it would be much more efficient (and costly) to design a shared memory system from which both GPUs could draw, if NVIDIA knew someone was going to drop the chips on one PCB. Current NVIDIA SLI technology really is designed to connect graphics cards together at the board level rather than at the chip level, and that inherently makes the 3D1 design a bit of a kludge.

Of course, without the innovators, we would never see great products. Hopefully, Gigabyte will inspire NVIDIA to take single-board multi-chip designs into account when building future multi-GPU options into silicon. Even if we do see a "good" shared memory design at some point, the complexity added would be orders of magnitude beyond what this generation of SLI offers. We would certainly not expect to see anything more than a simple massage of the current feature set in NV5x.

The 3D1 does ship with a default RAM overclock of 60MHz (120MHz effective), which will end up boosting memory intensive performance a bit. Technically, since this card just places two 6600GT cards physically on a single board and permanently links their SLI interfaces, there should be no other performance advantages over other 6600GT SLI configurations.
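To put the overclock in perspective, here is a rough sketch of the per-GPU peak memory bandwidth arithmetic. The 128-bit bus width and the 60MHz overclock come from the text above; the 500MHz (1000MHz effective) stock memory clock is our assumption for a reference 6600 GT, and the helper name is purely illustrative.

```python
# Peak memory bandwidth per GPU = (bus width in bytes) x (effective transfer rate).
# ASSUMPTION: a stock 6600 GT runs its memory at 500MHz (1000MHz effective, DDR).
# The bus width and the 60MHz overclock are the figures quoted in the article.

STOCK_MEM_MHZ = 500    # assumed stock memory clock
OVERCLOCK_MHZ = 60     # 3D1 factory overclock (120MHz effective)
BUS_WIDTH_BITS = 128   # each GPU only sees its own 128-bit bus


def peak_bandwidth_gbs(bus_width_bits: int, mem_clock_mhz: float) -> float:
    """Theoretical peak bandwidth in GB/s (10^9 bytes/s) for DDR memory."""
    effective_mts = mem_clock_mhz * 2  # two transfers per clock
    return bus_width_bits / 8 * effective_mts / 1000


stock_gbs = peak_bandwidth_gbs(BUS_WIDTH_BITS, STOCK_MEM_MHZ)
gigabyte_gbs = peak_bandwidth_gbs(BUS_WIDTH_BITS, STOCK_MEM_MHZ + OVERCLOCK_MHZ)
overclock_pct = OVERCLOCK_MHZ / STOCK_MEM_MHZ * 100

print(f"Stock 6600 GT: {stock_gbs:.2f} GB/s per GPU")
print(f"3D1:           {gigabyte_gbs:.2f} GB/s per GPU")
print(f"Memory clock increase: {overclock_pct:.0f}%")
```

Under that assumption, each GPU on the 3D1 gets roughly 12% more theoretical bandwidth than a stock card, but it is still limited to its own 128-bit bus.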

One thing that the 3D1 loses over other SLI solutions is the ability to turn off SLI and run two cards with more than two monitors on the output. Personally, I really enjoy running three desktops and working on the center one. It just exudes a sense of productivity that far exceeds single and dual monitor configurations.

The only motherboard that can run the 3D1 is the GA-K8NXP-SLI. These products will be shipping together as a bundle within the month, and will cost about as much as buying a motherboard and two 6600GT cards. As usual, Gigabyte has managed to pack just about everything but the kitchen sink into this board. The two physical x16 PCIe connectors are wired up with x16 and x8 electrical connections. It's an overclocker-friendly setup (though overclocking is beyond the scope of this article), and easy to set up and get running. We will have a full review of the board coming along soon.

As it pertains to the 3D1, when connected to the GA-K8NXP-SLI, the x16 PCIe slot is broken into two x8 connections, one dedicated to each GPU. This requires that the motherboard's SLI card be flipped to the single setting rather than SLI.

Under the Intel solution, Gigabyte is hoping that NVIDIA will decide to release full-featured multi-GPU drivers that don't require SLI motherboard support. Their GA-8AENXP Dual Graphic is a very well done 925XE board that parallels their AMD solution. On this board, Gigabyte went with x16 and x4 PCI Express graphics connections. SLI performance is, unfortunately, not where we would expect it to be. It's hard to tell exactly where these limitations are coming from, given that drivers for Intel lag those for the AMD platform. One interesting thing to note is that whenever we had more than one graphics card plugged into the board, the card in the x4 PCIe slot (the bottom PCI Express slot on the motherboard) took the master role. There was no BIOS option to select which PCI Express slot to boot first, as there was on the AMD board. Hopefully, this will be updated in a BIOS revision. We don't think that this explains the SLI performance (as we've seen other Intel boards perform at less than optimal levels), but having the SLI master in an x4 PCIe slot probably isn't going to help.
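For a sense of what the different link widths mean, the sketch below works out theoretical per-direction bandwidth for the x16, x8 and x4 connections mentioned above. It assumes first-generation PCI Express signaling (2.5 GT/s per lane with 8b/10b encoding, or 250 MB/s per lane per direction); the function name is illustrative, and real-world throughput will be lower than these peak figures.

```python
# ASSUMPTION: both boards use first-generation PCI Express, where each lane
# carries 250 MB/s of payload in each direction (2.5 GT/s, 8b/10b encoded).
MB_PER_LANE = 250


def pcie1_bandwidth_mbs(lanes: int) -> int:
    """Theoretical one-way bandwidth in MB/s for a PCIe 1.x link."""
    return lanes * MB_PER_LANE


# Link widths mentioned in the article.
for lanes in (16, 8, 4):
    print(f"x{lanes}: {pcie1_bandwidth_mbs(lanes)} MB/s per direction")
```

On paper, the x4 slot offers a quarter of the bandwidth of a full x16 link and half that of the x8 links each 3D1 GPU gets on the GA-K8NXP-SLI.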

The revision on the GA-8AENXP Dual Graphic that we have is 0.1, so there is definitely some room for improvement.

But let's take a look at the numbers and see what the tests have to say.

The Test

These SLI systems can pull quite a lot of power, so we employed our 520W and 600W OCZ Powerstream PSUs to put voltage to our parts. We needed to use the 66.81 ForceWare drivers for the Intel system in order to get SLI to work.

Our single card NVIDIA 6600 GT tests were run on the AMD Athlon 64 platform.

Performance Test Configuration
Processor(s): AMD Athlon 64 FX-53; Intel Pentium 4 EE 3.4GHz
RAM: 2 x 512MB OCZ PC3200 Platinum Rev. 2 (AMD); 2 x 512MB PC2-4200 (Intel)
Hard Drive(s): Seagate 120GB 7200RPM IDE (8MB Buffer)
Motherboard & IDE Bus Master Drivers: Intel Chipset INF; NVIDIA nForce 6.31 (Beta)
Video Card(s): NVIDIA GeForce 6600 GT; Gigabyte 3D1 SLI
Video Drivers: NVIDIA ForceWare 71.20 (AMD); NVIDIA ForceWare 66.81 (Intel)
Operating System(s): Windows XP Professional SP2
Motherboards: Gigabyte GA-K8NXP-SLI; Gigabyte GA-8AENXP Dual Graphic
Power Supply: 520W OCZ Powerstream PSU; 600W OCZ Powerstream PSU

The GeForce 6800 Ultra numbers shown in the graphs are included as a reference point from our previous SLI tests, just to show how single card, single GPU performance compares to the 3D1 and SLI solutions. Since the previous tests were performed on an A64 4000+, the numbers are just to be used as a reference rather than a direct comparison.

Doom 3 Performance

There's a 2.7% difference in frame rate between the 3D1 and the 2 x 6600GT SLI solution at 16x12 under Doom 3 without AA. This performance improvement is all due to the memory clock speed increase over the stock 6600 GT speed on the 3D1. The extra 120MHz with which each GPU can hit memory helps to make up for the limited bandwidth to each chip. Off the bat, we don't see any performance gains inherent in going with a single card SLI solution.

[Graphs: Doom 3]

With about a 5.3% performance increase, the 3D1's lead over the stock 2 x 6600 GT solution is simply due to its 12% memory clock speed increase. Of course, it is good to confirm that there are no negatives that come from going with a single card SLI solution here.

[Graphs: Doom 3]

Throughout this test, the Intel SLI solution performs very poorly, putting in numbers between one-quarter and three-quarters of the potential shown on the AMD platform. The fact that the Intel system is not as swift a performer under Doom 3 in general doesn't help here either, but we are working with GPU limited tests that help to negate that factor.

Far Cry v1.3 Performance

With Far Cry v1.3, we basically see the same thing that we had seen under Doom 3 with no AA and AF. Performance is just about equal to normal SLI with stock 6600 GT cards except for a small percentage gain that is likely due to the memory boost of the Gigabyte solution. Again, the Intel solution starts off slow, putting in numbers that just barely best the single 6600 GT.

[Graphs: Far Cry]

This time around, the 3D1, 2 x 6600 GT, and 6600 GT on the AMD system are all stuck at about the same performance point. However, the Intel SLI system makes a surprise showing and posts numbers which show that the second GPU doesn't actually need to remain useless. It just goes to show that there is some potential lying dormant in this hardware, which NVIDIA needs to unleash by flipping the switch in their drivers to allow multi-GPU on non-SLI motherboards.

[Graphs: Far Cry]

Half-Life 2 Performance

Unfortunately, we were unable to test the Intel platform under Half-Life 2. We aren't quite sure how to explain the issue that we were seeing, but in trying to run the game, the screen would flash between each frame. There were other visual issues as well. Due to these issues, performance was not comparable to our other systems. Short of a hard crash, this was the worst possible kind of problem that we could have seen. It is quite possible that NVIDIA fixed this issue somewhere between the 66.81 and 71.20 drivers, but on Intel systems we are unable to test any other driver at this time.

We are also not including the 6800 Ultra scores, as the numbers that we were using as a reference (from our article on NVIDIA's official SLI launch) were run using the older version 6 of Half-Life 2 as well as older (different) versions of our timedemos.

Continuing our trend with Half-Life 2 and graphics card tests, we've benched the game in 5 different levels in two resolutions with and without 4xAA/8xAF enabled. We've listed the raw results in these tables for those who are interested. For easier analysis, we've taken the average performance of each of the 5 level tests and compared the result in our graphs below.

Half-Life 2 1280x1024 noAA/AF (fps)
                             at_c17_12  at_canals_08  at_coast_05  at_coast_12  at_prison_05
NVIDIA GeForce 6600 GT            76.1          71.3          115         94.9            80
2x NVIDIA 6600 GT (AMD SLI)       77.2          98.8        118.5        117.5         116.6
Gigabyte 2x6600GT 3D1             77.3          99.1        118.7        117.9         118.2

Half-Life 2 1600x1200 noAA/AF (fps)
                             at_c17_12  at_canals_08  at_coast_05  at_coast_12  at_prison_05
NVIDIA GeForce 6600 GT            61.1          55.7         91.5         69.3          57.6
2x NVIDIA 6600 GT (AMD SLI)       73.5          85.8        110.8        104.9          92.9
Gigabyte 2x6600GT 3D1             73.6            87        111.4          106          94.6

Half-Life 2 1280x1024 4xAA/8xAF (fps)
                             at_c17_12  at_canals_08  at_coast_05  at_coast_12  at_prison_05
NVIDIA GeForce 6600 GT            40.5          40.1         74.8         54.9          45.1
2x NVIDIA 6600 GT (AMD SLI)       45.2          47.8         92.4           81            61
Gigabyte 2x6600GT 3D1             45.7          47.8         94.1         82.8          62.3

Half-Life 2 1600x1200 4xAA/8xAF (fps)
                             at_c17_12  at_canals_08  at_coast_05  at_coast_12  at_prison_05
NVIDIA GeForce 6600 GT            27.3          27.2         43.3         32.7          28.1
2x NVIDIA 6600 GT (AMD SLI)       33.8          35.3         58.1         49.2          39.8
Gigabyte 2x6600GT 3D1             33.8          35.3         59.3         50.8          40.6

The 3D1 averages about one half to one frame per second higher than two stock clocked 6600 GTs in SLI mode. This equates to a difference of absolutely nothing with frame rates of near 100fps. Even the overclocked RAM doesn't help the 3D1 here.
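For readers who want to check the averaging themselves, here is a small sketch that takes the per-level frame rates from the tables above, averages the five levels for each configuration, and reports the 3D1's lead over stock 6600 GT SLI. The dictionary layout and variable names are ours, chosen for illustration.

```python
from statistics import mean

# Frame rates (fps) copied from the Half-Life 2 tables, in level order:
# at_c17_12, at_canals_08, at_coast_05, at_coast_12, at_prison_05
SINGLE = "6600 GT"
SLI = "2x 6600 GT (AMD SLI)"
CARD_3D1 = "3D1"

results = {
    "1280x1024 noAA/AF": {
        SINGLE: [76.1, 71.3, 115, 94.9, 80],
        SLI: [77.2, 98.8, 118.5, 117.5, 116.6],
        CARD_3D1: [77.3, 99.1, 118.7, 117.9, 118.2],
    },
    "1600x1200 noAA/AF": {
        SINGLE: [61.1, 55.7, 91.5, 69.3, 57.6],
        SLI: [73.5, 85.8, 110.8, 104.9, 92.9],
        CARD_3D1: [73.6, 87, 111.4, 106, 94.6],
    },
    "1280x1024 4xAA/8xAF": {
        SINGLE: [40.5, 40.1, 74.8, 54.9, 45.1],
        SLI: [45.2, 47.8, 92.4, 81, 61],
        CARD_3D1: [45.7, 47.8, 94.1, 82.8, 62.3],
    },
    "1600x1200 4xAA/8xAF": {
        SINGLE: [27.3, 27.2, 43.3, 32.7, 28.1],
        SLI: [33.8, 35.3, 58.1, 49.2, 39.8],
        CARD_3D1: [33.8, 35.3, 59.3, 50.8, 40.6],
    },
}

# Average the five levels for every card at every setting.
averages = {
    setting: {card: mean(fps) for card, fps in cards.items()}
    for setting, cards in results.items()
}

# Lead of the 3D1 over two stock 6600 GT cards in SLI, in fps.
leads = {setting: avg[CARD_3D1] - avg[SLI] for setting, avg in averages.items()}

for setting, lead in leads.items():
    avg = averages[setting]
    print(f"{setting}: 3D1 {avg[CARD_3D1]:.1f} fps, "
          f"SLI {avg[SLI]:.1f} fps, lead {lead:.2f} fps")
```

Across all four settings the lead stays between roughly half a frame and one frame per second.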

[Graphs: Half-Life 2 Average Performance]

We see more of the same when we look at performance with anti-aliasing and anisotropic filtering enabled. The Gigabyte 3D1 performs on par with 2 x 6600GT cards in SLI. Since the 3D1's RAM is overclocked, this means that the bottleneck under HL2 is somewhere else when running in SLI mode. We've seen GPU and RAM speed impact HL2 performance pretty evenly under single GPU conditions.
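One way to see that memory speed isn't the limiting factor is to compare the 3D1's per-level gain over stock SLI with the size of its memory overclock. The sketch below does this for the 1600x1200 4xAA/8xAF results from the tables above, using the 12% memory clock figure quoted in the Doom 3 section; the variable names are illustrative.

```python
# 1600x1200 4xAA/8xAF frame rates (fps) from the Half-Life 2 tables.
levels = ["at_c17_12", "at_canals_08", "at_coast_05", "at_coast_12", "at_prison_05"]
sli_stock = [33.8, 35.3, 58.1, 49.2, 39.8]
gigabyte_3d1 = [33.8, 35.3, 59.3, 50.8, 40.6]

MEM_OVERCLOCK_PCT = 12.0  # 3D1 memory clock increase over a stock 6600 GT

# Percentage gain of the 3D1 over two stock 6600 GT cards, per level.
gains_pct = {
    level: (new / old - 1) * 100
    for level, old, new in zip(levels, sli_stock, gigabyte_3d1)
}

for level, gain in gains_pct.items():
    print(f"{level}: {gain:+.1f}% frame rate for +{MEM_OVERCLOCK_PCT:.0f}% memory clock")
```

If these tests were purely memory bandwidth bound, we would expect gains approaching the size of the overclock; instead, the largest gain in this set is a little over 3%, and two levels show no gain at all.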

[Graphs: Half-Life 2 Average Performance]

Unreal Tournament 2004 Performance

Under a DX8.1 game, the 3D1 and 6600GT SLI solutions still run neck and neck, with the Intel SLI coming in slower than the single 6600GT on the AMD platform.

[Graphs: Unreal Tournament 2004]

Kicking on AA and AF just serves to push the performance of the Intel solution lower.

[Graphs: Unreal Tournament 2004]

Wolfenstein: Enemy Territory Performance

Our tests wouldn't be complete without a Q3A based OpenGL game to round out the low end. Performance characteristics don't look much different here.

[Graphs: Wolfenstein: Enemy Territory]

The Intel board, though, does better when AA and AF are enabled than when they aren't.

[Graphs: Wolfenstein: Enemy Territory]

Final Words

We would like to begin our conclusion by thanking Gigabyte for being the first to come out with such a creative and bleeding edge product. We love to see companies pushing the envelope wherever possible, and this kind of thinking is what we want to see. Of course, it might be a little easier to work on technology if NVIDIA weren't so tight on restricting what they will and will not enable in their drivers.

Unfortunately, in light of the performance tests, there really isn't much remarkable to say about the 3D1. In fact, unless Gigabyte can become very price competitive, there isn't much reason to recommend the 3D1 over a 2-card SLI solution. Currently, buying all the parts separately would cost the same as what Gigabyte is planning to sell the bundle for.

The drawbacks to the 3D1 are its limited application (it will only run on the GA-K8NXP-SLI), the fact that it doesn't perform any better than 2-card SLI, and the fact that the user loses a DVI and an HD-15 display connection when compared to the 2-card solution.

Something like this might be very cool for use in a SFF with a motherboard that has only one physical PCIe x16 connector with the NVIDIA SLI chipset. But until we see NVIDIA relax their driver restrictions, and unless Gigabyte can find a way to boot their card on non-Gigabyte boards, there aren't very many other "killer" apps for the 3D1.

The Gigabyte 3D1 does offer single card SLI in a convenient package, and the bundle will be quite powerful for those who choose to acquire it. But we aren't going to recommend it all the same.

As for the Intel solution, a lot rests on NVIDIA's shoulders here as well. With their new Intel chipset coming down the pipeline at some point in the future, it could be that they just don't want it to work well with others. Maybe they just want to sell more of their own parts. Maybe they are actually concerned that the end user won't have the best possible experience on hardware that hasn't been fully tested and qualified to work with SLI. In the end, we will have to wait and see what comes out of NVIDIA in terms of support for other hardware and the concoctions that their partners and customers cook up.

ATI should take note of the issues that NVIDIA is dealing with now, as there are many ways that they could take advantage of the present landscape.

Again, while we can't recommend the Gigabyte 3D1 over standard 6600 GT SLI solutions, we do hope to see other products like this step up to the plate. Ideally, in future single card, multi-GPU solutions, we would like to see full compatibility with any motherboard, the use of true 256-bit memory busses for each GPU (in order to see scalability apply well to memory-intensive settings as well - multiple NV41 GPUs would be nice to see), and three or four external display connectors rather than just two. It may be a lot to ask, but if we're expected to pay for all that silicon, we want to have the ability to take full advantage of it.
