Original Link: https://www.anandtech.com/show/2745



I'm not really sure why we have NDAs on these products anymore. Before we even got our Radeon HD 4890, before we were even briefed on it, NVIDIA contacted us and told us that if we were working on a review to wait. NVIDIA wanted to send us something special.

Then in the middle of our Radeon HD 4890 briefing what do we see but a reference to a GeForce GTX 275 in the slides. We hadn't even laid hands on the 275, but AMD knew what it was and where it was going to be priced.

If you asked NVIDIA what the Radeon HD 4890 was, you'd probably hear something like "an overclocked 4870". If you asked AMD what the GeForce GTX 275 was, you'd probably get "half of a GTX 295".

The truth of the matter is that neither of these cards is particularly new; each is simply a new balance of processors, memory, and clock speeds at a new price point.

As the prices on the cards that already offered a very good value fell, higher end and dual GPU cards remained priced significantly higher. This created a gap in pricing between about $190 and $300. AMD and NVIDIA saw this as an opportunity to release cards that fell within this spectrum, and they are battling intensely over price. Both companies withheld final pricing information until the very last minute. In fact, when I started writing this intro (Wednesday morning) I still had no idea what the prices for these parts would actually be.

Now we know that both the Radeon HD 4890 and the GeForce GTX 275 will be priced at $250. This has historically been a pricing sweet spot, offering a very good balance of performance and cost before we start to see hugely diminishing returns on our investments. What we hope for here is a significant performance bump up from the GTX 260 core 216 and Radeon HD 4870 1GB class of performance. We'll wait until we get to the benchmarks to reveal whether that's what we actually get, and whether we should just stick with what's already good enough.

At a high level, here's what we're looking at:

| | GTX 285 | GTX 275 | GTX 260 Core 216 | GTS 250 / 9800 GTX+ |
|---|---|---|---|---|
| Stream Processors | 240 | 240 | 216 | 128 |
| Texture Address / Filtering | 80 / 80 | 80 / 80 | 72 / 72 | 64 / 64 |
| ROPs | 32 | 28 | 28 | 16 |
| Core Clock | 648MHz | 633MHz | 576MHz | 738MHz |
| Shader Clock | 1476MHz | 1404MHz | 1242MHz | 1836MHz |
| Memory Clock | 1242MHz | 1134MHz | 999MHz | 1100MHz |
| Memory Bus Width | 512-bit | 448-bit | 448-bit | 256-bit |
| Frame Buffer | 1GB | 896MB | 896MB | 512MB |
| Transistor Count | 1.4B | 1.4B | 1.4B | 754M |
| Manufacturing Process | TSMC 55nm | TSMC 55nm | TSMC 65nm | TSMC 55nm |
| Price Point | $360 | ~$250 | $205 | $140 |

| | ATI Radeon HD 4890 | ATI Radeon HD 4870 | ATI Radeon HD 4850 |
|---|---|---|---|
| Stream Processors | 800 | 800 | 800 |
| Texture Units | 40 | 40 | 40 |
| ROPs | 16 | 16 | 16 |
| Core Clock | 850MHz | 750MHz | 625MHz |
| Memory Clock | 975MHz (3900MHz data rate) GDDR5 | 900MHz (3600MHz data rate) GDDR5 | 993MHz (1986MHz data rate) GDDR3 |
| Memory Bus Width | 256-bit | 256-bit | 256-bit |
| Frame Buffer | 1GB | 1GB | 512MB |
| Transistor Count | 959M | 956M | 956M |
| Manufacturing Process | TSMC 55nm | TSMC 55nm | TSMC 55nm |
| Price Point | ~$250 | ~$200 | $150 |

We suspect that this will be quite an interesting battle and we might have some surprises on our hands. NVIDIA has been talking about their new drivers which will be released to the public early Thursday morning. These new drivers offer some performance improvements across the board as well as some cool new features. Because it's been a while since we talked about it, we will also explore PhysX and CUDA in a bit more depth than we usually do in GPU reviews.

We do want to bring up availability. This will be a hard launch for AMD but not for NVIDIA (though some European retailers should have the GTX 275 on sale this week). As for AMD, we've seen plenty of retail samples from AMD partners and we expect good availability starting today. If this ends up not being the case, we will certainly update the article to reflect that later. NVIDIA won't have availability until the middle of the month (we are hearing April 14th).

NVIDIA hasn't been hitting their launches as hard lately, and we've gotten on them about that in past reviews. This time, we're not going to be as hard on them for it. The fact of the matter is that they've got a competitive part coming out in a time frame very near the launch of an AMD part at the same price point. We have no interest in returning to the "old days" of paper-launched parts that were only ever seen in the pages of hardware review sites, but we certainly understand the need for companies to get their side of the story out there when launches are this close to one another, and we're not going to fault anyone for that. Not being available for purchase is its own problem.

From the summer of 2008 to today we've seen one of the most heated and exciting battles in the history of the GPU. NVIDIA and AMD have been pushing back and forth with differing features, good baseline performance with strengths in different areas, and incredible pricing battles in the most popular market segments. While AMD and NVIDIA fight with all their strength to win customers, the real beneficiary has consistently been the end user. And we certainly feel this launch is no exception. If you've got $250 to spend on graphics and were wondering whether you should save up for the GTX 285 or save money and grab a sub-$200 part, your worries are over. There is now a card for you. And it is good.



New Drivers From NVIDIA Change The Landscape

Today, NVIDIA will release its new 185 series driver. This driver not only enables support for the GTX 275, but also affects performance across NVIDIA's lineup in a good number of games. We retested our NVIDIA cards with the 185 driver and saw some very interesting results. For example, take a look at before and after performance with Race Driver: GRID.

As we can clearly see, in the cards we tested, performance decreased at lower resolutions and increased at 2560x1600. This seemed to be the biggest example, but we saw flattened resolution scaling in most of the games we tested. This definitely could affect the competitiveness of the part depending on whether we are looking at low or high resolutions.

Some trade-off was made to improve performance at ultra high resolutions at the expense of performance at lower resolutions. The cause could be anything from something simple, like added driver overhead (and thus more CPU limitation), to something much more complex; we haven't been told exactly what creates this situation. With higher end hardware the decision makes sense, as resolutions lower than 2560x1600 tend to perform fine anyway, while 2560x1600 is more GPU limited and could benefit from a boost in most games.

Significantly different resolution scaling characteristics can be appealing to different users. An AMD card might look better at one resolution, while the NVIDIA card could come out on top at another. In general, we think these changes make sense, but it would be nicer if the driver automatically figured out the best approach based on the hardware and the resolution in use (and thus didn't degrade performance at lower resolutions).

In addition to the performance changes, we see the addition of a new feature. In the past we've seen the addition of filtering techniques, optimizations, and even dynamic manipulation of geometry to the driver. Some features have stuck and some just faded away. One of the most popular additions to the driver was the ability to force Full Screen Antialiasing (FSAA), enabling smoother edges in games. This feature was more important at a time when most games didn't have an in-game way to enable AA: the driver took over and implemented AA even on games that didn't offer an option to adjust it. Today the opposite is true and most games allow us to enable and adjust AA.

Now we have the ability to enable a feature, which isn't available natively in many games, that could either be loved or hated. You tell us which.

Introducing driver enabled Ambient Occlusion.

What is Ambient Occlusion, you ask? Well, look into a corner or around trim or anywhere that looks concave in general. These areas will be a bit darker than the surrounding areas (depending on the depth and other factors), and NVIDIA has included a way to simulate this effect in its 185 series driver. Here is an example of what AO can do:

Here's an example of what AO generally looks like in games:

This, as with other driver enabled features, significantly impacts performance and might not work in every game or at every resolution. Whether gamers like Ambient Occlusion will depend on the visual impact it has in a specific game and on whether performance remains acceptable. There are already games that make use of ambient occlusion natively, and some games on which NVIDIA hasn't been able to implement AO at all.

There are different methods of rendering an ambient occlusion effect, and NVIDIA implements a technique called Horizon Based Ambient Occlusion (HBAO for short). The advantage is that this method is likely highly optimized to run well on NVIDIA hardware; the downside is that developers give up control over the ultimate quality and technique used for AO if they leave it to NVIDIA to handle. On top of that, if a developer wants to guarantee that the feature works for everyone, they would need to implement it themselves, as AMD doesn't offer a parallel solution in its drivers (in spite of the fact that its hardware is easily capable of running AO shaders).
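NVIDIA hasn't shared the internals of its HBAO implementation, so purely as an illustration of this class of technique, here is a deliberately simplified screen-space ambient occlusion kernel sketched in CUDA (this is not NVIDIA's shader, and not true horizon-based AO): each pixel samples a small neighborhood of the depth buffer and gets darkened in proportion to how many nearby samples sit closer to the camera. Real implementations sample around the surface point in 3D, weight samples by distance and angle, and blur the result, but the core idea is the same.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Toy screen-space ambient occlusion: darken a pixel based on how many of its
// screen-space neighbors are closer to the camera (i.e. likely occluders).
// An illustration of the general idea only, not NVIDIA's HBAO shader.
__global__ void simpleSSAO(const float* depth, float* occlusion, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    float center = depth[y * width + x];
    const int radius = 4;       // sample a small neighborhood in screen space
    const float bias = 0.002f;  // ignore tiny depth differences on flat surfaces

    int samples = 0, occluders = 0;
    for (int dy = -radius; dy <= radius; dy += 2) {
        for (int dx = -radius; dx <= radius; dx += 2) {
            int sx = x + dx, sy = y + dy;
            if (sx < 0 || sx >= width || sy < 0 || sy >= height) continue;
            ++samples;
            // A neighbor noticeably closer to the camera partially occludes us.
            if (center - depth[sy * width + sx] > bias) ++occluders;
        }
    }
    // 1.0 = fully lit; corners and crevices end up with lower values.
    occlusion[y * width + x] = 1.0f - 0.75f * (float)occluders / (float)samples;
}

int main()
{
    const int W = 256, H = 256;
    float *depth, *occ;
    cudaMallocManaged(&depth, W * H * sizeof(float));  // unified memory keeps the demo short
    cudaMallocManaged(&occ,   W * H * sizeof(float));

    // Synthetic depth buffer: a step in the floor, so a "corner" runs down x = W/2.
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x)
            depth[y * W + x] = (x < W / 2) ? 0.50f : 0.55f;

    dim3 block(16, 16), grid((W + 15) / 16, (H + 15) / 16);
    simpleSSAO<<<grid, block>>>(depth, occ, W, H);
    cudaDeviceSynchronize();

    printf("occlusion on open floor: %.2f, next to the step: %.2f\n",
           occ[128 * W + 10], occ[128 * W + W / 2 + 1]);

    cudaFree(depth);
    cudaFree(occ);
    return 0;
}
```

Even this toy version makes the cost obvious: every shaded pixel turns into a couple dozen extra memory reads, which is a big part of why enabling AO in the driver carries a real performance hit.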

We haven't done extensive testing with this feature yet, looking at either quality or performance. Only time will tell whether this addition ends up being gimmicky or really hits home with gamers. And if more developers create games that natively support the feature, we wouldn't even need the option. But it is always nice to have something new and unique to play around with, and we are happy to see NVIDIA pushing effects in games forward by all means possible, even to the point of including effects like this in the driver.

In our opinion, lighting effects like this belong in engine and game code rather than in the driver, but until that happens it's always great to have an alternative. We wouldn't think it a bad idea if AMD picked up on this and did it too, but whether it is more worthwhile to do this or to spend that energy encouraging developers to adopt this and comparable techniques for more complex lighting is totally up to AMD. And we wouldn't fault them either way.



The Cards and The Test

In the AMD department, we received two cards. One was an overclocked part from HIS and the other was a stock clocked part from ASUS. Guess which one AMD sent us for the review. No, it's no problem, we're used to it; this is what happens with cards from NVIDIA all the time. They argue and argue for the inclusion of overclocked numbers in GPU reviews when it's their GPU we're looking at, and of course when the tables are turned, so are the opinions. We sincerely appreciate ASUS sending us this card, and we used it for our tests in this article. The original intent of trying to get hold of two cards was to run CrossFire numbers, but we only have one GTX 275 and we would prefer to wait until we can compare the two before getting into that angle.

The ASUS card also includes a utility called Voltage Tweaker that allows gamers to increase some voltages on their hardware to help improve overclocking. We didn't have the chance to play with the feature ourselves, but more control is always a nice feature to have.

For the Radeon HD 4890 our hardware specs are pretty simple. Take a 4870 1GB and overclock it. Crank the core up 100 MHz to 850 MHz and the memory clock up 75 MHz to 975 MHz. That's the Radeon HD 4890 in a nutshell. However, to reach these clock levels, AMD revised the core by adding decoupling capacitors, new timing algorithms, and altered the ASIC power distribution for enhanced operation.  These slight changes increased the transistor count from 956M to 959M. Otherwise, the core features/specifications (texture units, ROPs, z/stencil) remain the same as the HD4850/HD4870 series.
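On paper, the memory overclock buys a little more than it might sound like; GDDR5 transfers data at four times the base memory clock, so on the 256-bit bus the bump works out to the following (simple arithmetic from the spec table on the first page):

```latex
\begin{align*}
\text{HD 4870 1GB: } & 900\ \text{MHz} \times 4 \times \frac{256\ \text{bits}}{8\ \text{bits/byte}} = 115.2\ \text{GB/s} \\
\text{HD 4890: }     & 975\ \text{MHz} \times 4 \times \frac{256\ \text{bits}}{8\ \text{bits/byte}} = 124.8\ \text{GB/s} \quad (+8.3\%)
\end{align*}
```

Pair that with the 13% core overclock and you have a reasonable frame of reference for the kind of gains to expect in the benchmarks.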

Most vendors will also be selling overclocked variants that run the core at 900 MHz. AMD would like to treat these overclocked parts as a separate entity altogether, but we will continue to treat them as enhancements of the stock version whether they come from NVIDIA or AMD. In our eyes, the difference between, say, an XFX GTX 275 and an XFX GTX 275 XXX is XFX's call; the latter is their enhancement of the stock part. We aren't going to look at the XFX 4890 and the XFX 4890 XXX any differently. In doing reviews of vendors' cards, we'll consider overclocked performance closely, but for a GPU launch we will be focusing on the baseline version of the card.

On the NVIDIA side, we received a reference version of the GTX 275. It looks similar to the design of the other GT200 based hardware.

Under the hood here is the same setup as half of a GTX 295, but with higher clock speeds. That means the GTX 275 has the memory size and bus width of the GTX 260 (448-bit), but the shader count of the GTX 280 (240 SPs). On top of that, the GTX 275 posts clock speeds closer to the GTX 285 than the GTX 280: the core clock is up 31MHz from the GTX 280 to 633MHz, the shader clock is up 108MHz to 1404MHz, and the memory clock rises to 1134MHz (2268MHz data rate). This means that in shader limited cases we should see performance closer to the GTX 285, while in bandwidth limited cases we'll still be faster than the GTX 260 core 216 thanks to the across-the-board clock speed boost.
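Working from the spec table on the first page, the arithmetic behind that framing looks like this (shader throughput scales with SP count times shader clock, memory bandwidth with bus width times data rate, and GDDR3's data rate is twice the listed clock):

```latex
\begin{align*}
\text{Shader: }    & \frac{240 \times 1404}{240 \times 1476} \approx 95\%\ \text{of a GTX 285}, \qquad \frac{240 \times 1404}{216 \times 1242} \approx 1.26\times\ \text{a GTX 260 core 216} \\
\text{Bandwidth: } & \frac{448\ \text{bits}}{8} \times 2268\ \text{MT/s} = 127.0\ \text{GB/s} \quad \text{vs. } 159.0\ \text{GB/s (GTX 285)} \text{ and } 111.9\ \text{GB/s (GTX 260 core 216)}
\end{align*}
```

In other words, the GTX 275 gives up only about 5% of the GTX 285's shader throughput but about 20% of its memory bandwidth, which is why we expect it to look best in shader-bound situations.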

Rather than just an overclock of a pre-existing card, this is a blend of two existing configurations with an overclock on top. And sure, it's also half a GTX 295, and that is convenient for NVIDIA. But it's not just that it's different: this setup should have a lot to offer, especially in games that aren't bandwidth limited.

That wraps it up for the cards we're focusing on today. Here's our test system, which is the same as for our GTS 250 article except for the addition of a couple drivers.

The Test

Test Setup
CPU Intel Core i7-965 3.2GHz
Motherboard ASUS Rampage II Extreme X58
Video Cards ATI Radeon HD 4890
ATI Radeon HD 4870 1GB
ATI Radeon HD 4870 512MB
ATI Radeon HD 4850
NVIDIA GeForce GTX 285
NVIDIA GeForce GTX 280
NVIDIA GeForce GTX 275
NVIDIA GeForce GTX 260 core 216
Video Drivers Catalyst 8.12 hotfix, 9.4 Beta for HD 4890
ForceWare 185.65
Hard Drive Intel X25-M 80GB SSD
RAM 6 x 1GB DDR3-1066 7-7-7-20
Operating System Windows Vista Ultimate 64-bit SP1
PSU PC Power & Cooling Turbo Cool 1200W


The New $250 Price Point: Radeon HD 4890 vs. GeForce GTX 275

Here it is, what you've all been waiting for. And it's a tie. Pretty much. These cards stay pretty close in performance across the board.

Looking at Age of Conan, we see something we didn't expect: NVIDIA is actually performing on par with AMD in this benchmark. NVIDIA's come a long way toward closing the gap in this one, and for this comparison it's paid off a bit. Despite the fact that this one is essentially a tie, NVIDIA gets props for being competitive here.

While NVIDIA usually owns Call of Duty benchmarks, the 4890 outpaces the GTX 275 at 16x10 and 19x12, while the GTX 275 leads at the 30" panel resolution. As long as it's still playable this isn't a huge deal, but the fact that most people who might want one of these GPUs have lower resolution monitors isn't in NVIDIA's favor.

Crysis Warhead is really close in performance again.

AMD leads in Fallout 3, and this is the first game where we've seen a consistent, significant difference favoring one card over the other.

FarCry 2 takes us back to the norm with both cards performing essentially the same.

The 4890 does have a pretty hefty lead under Race Driver GRID. The gap does close at higher resolution, but it's still a gap in AMD's favor.

Left 4 Dead is also pretty much a tie, with the card you would want changing depending on the resolution of your monitor.

Overall, this is really a wash. These parts are very close in performance and very competitive.



What will an Extra $70 Get You? Radeon HD 4890 vs. Radeon HD 4870 1GB

The short answer is more performance. We see across the board improvement in performance from the highly clocked 4890. In some games the improvement is large while in others it is just a nice perk. But moving up to this price point we do still see diminishing returns. The effect isn't as significant as it is above $300, but it's still a big price gap to cover for the gain. Only individual gamers can really decide whether the added performance is worth it to them.



Another Look at the $180 Price Point: 260 core 216 vs. 4870 1GB

With the change in performance from NVIDIA's 185 drivers, we wanted to break out one of the other most important price points. With lower resolutions sometimes showing degraded performance and higher resolutions popping up with improvements, this price point is worth exploring.

Overall, it looks like the 4870 1GB is just a little bit faster in one or two games, but these cards are still pretty well matched after the new driver. The scaling differences really wouldn't change our minds either way. Both of these cards are good options.



Putting this PhysX Business to Rest

Let me put things in perspective. Remember our Radeon HD 4870/4850 article that went up last year? It was a straight crown-robbing on ATI’s part; NVIDIA had no competitively priced response at the time.

About two hours before the NDA lifted on the Radeon HD 4800 series we got an urgent call from NVIDIA. The purpose of the call? To attempt to persuade us to weigh PhysX and CUDA support as major benefits of GeForce GPUs. A performance win by ATI shouldn’t matter, the argument went, because ATI can’t accelerate PhysX in hardware and can’t run CUDA applications.

The argument NVIDIA gave us was preposterous. The global economy was weakening and NVIDIA cautioned us against recommending a card that in 12 months would not be the right choice because new titles supporting PhysX and new CUDA applications would be coming right around the corner.

The tactics didn’t work, obviously, and history showed us that despite NVIDIA’s doomsday warnings, Radeon HD 4800 series owners didn’t live to regret their purchases. Yes, the global economy did take a turn for the worse, but no, NVIDIA’s PhysX and CUDA support hasn’t done anything to incite buyer’s remorse in anyone who purchased a 4800 series card. The only thing those users got were higher frame rates. (Note that if you did buy a Radeon HD 4870/4850 and severely regretted your purchase due to a lack of PhysX/CUDA support, please post in the comments.)

This wasn’t a one time thing. NVIDIA has delivered the same tired message at every single opportunity. NVIDIA’s latest attempt was to punish those reviewers who haven’t been sold on the PhysX/CUDA messages by not sending them GeForce GTS 250 cards for review. The plan seemed to backfire thanks to one vigilant Inquirer reporter.

More recently we had our briefing for the GeForce GTX 275. The presentation for the briefing was 53 slides long. Now, the length wasn’t bothersome, but let’s look at the content of those slides:

| Slides About... | Number of Slides in NVIDIA's GTX 275 Presentation |
|---|---|
| The GeForce GTX 275 | 8 |
| PhysX/CUDA | 34 |
| Miscellaneous (DX11, Title Slides, etc...) | 11 |

You could argue that NVIDIA truly believes that PhysX and CUDA support are the strongest features of its GPUs. You could also argue that NVIDIA is trying to justify a premium for its much larger GPUs rather than having to sell them as cheap as possible to stand up to an unusually competitive ATI.

NVIDIA’s stance is that when you buy a GeForce GPU, it’s more than just how well it runs games. It’s about everything else you can run on it, whether that means in-game GPU accelerated PhysX or CUDA applications.

Maybe we’ve been wrong this entire time. Maybe instead of just presenting you with bar charts of which GPU is faster we should be penalizing ATI GPUs for not being able to run CUDA code or accelerate PhysX. Self reflection is a very important human trait, so let’s see if NVIDIA is truly right about the value of PhysX and CUDA today.



The Widespread Support Fallacy

NVIDIA acquired Ageia, the company that wanted to sell you another card to put in your system to accelerate game physics: the PPU. That idea didn’t go over too well. For starters, no one wanted another *PU in their machine. And secondly, there were no compelling titles that required it. At best we saw mediocre games with mildly interesting physics support, or decent games with uninteresting physics enhancements.

Ageia’s true strength wasn’t its PPU chip design; many companies could do that. What Ageia did that was quite smart was acquire an up-and-coming game physics API, polish it up, and give it away for free to developers. The physics engine was called PhysX.

Developers can use PhysX, for free, in their games. There are no strings attached, no licensing fees, nothing. If a developer wants support there are fees of course, but it’s still a great way of cutting down development costs. The physics engine in a game is responsible for all modeling of Newtonian forces within the game; the engine determines how objects collide, how gravity works, etc...

If developers wanted to, they could enable PPU accelerated physics in their games and do some cool effects. Very few developers wanted to because there was no real install base of Ageia cards and Ageia wasn’t large enough to convince the major players to do anything.

PhysX, being free, was of course widely adopted. When NVIDIA purchased Ageia what they really bought was the PhysX business.

NVIDIA continued offering PhysX for free, but it killed off the PPU business. Instead, NVIDIA worked to port PhysX to CUDA so that it could run on its GPUs. The same catch-22 from before exists: developers don’t have to include GPU accelerated physics, and most don’t, because they don’t like alienating their non-NVIDIA users. It’s all about hitting the largest audience, and not everyone can run GPU accelerated PhysX, so most developers don’t use that aspect of the engine.

Then we have NVIDIA publishing slides like this:

Indeed, PhysX is one of the world’s most popular physics APIs - but that does not mean that developers choose to accelerate PhysX on the GPU. Most don’t. The next slide paints a clearer picture:

These are the biggest titles NVIDIA has with GPU accelerated PhysX support today. That’s 12 titles, three of which are big ones; as for most of the rest, well, I won’t go there.

A free physics API is great, and all indicators point to PhysX being liked by developers.

The next several slides in NVIDIA’s presentation go into detail about how GPU accelerated PhysX is used in these titles and how poorly ATI performs when GPU accelerated PhysX is enabled (because ATI can’t run CUDA code on its GPUs, the GPU-friendly code must run on the CPU instead).

We normally hold manufacturers accountable for their performance claims; it was about time we did something about these other claims as well. Shall we?

Our goal was simple: we wanted to know if the GPU accelerated PhysX effects in these titles were useful. And if they were, would they be enough to make us pick an NVIDIA GPU over an ATI one if the ATI GPU was faster?

To accomplish this I had to bring in an outsider. Someone who hadn’t been subjected to the same NVIDIA marketing that Derek and I had. I wanted someone impartial.

Meet Ben:


I met Ben in middle school and we’ve been friends ever since. He’s a gamer of the truest form. He generally just wants to come over to my office and game while I work. The relationship is rarely harmful; I have access to lots of hardware (both PC and console) and games, and he likes to play them. He plays while I work and isn't very distracting (except when he's hungry).

These past few weeks I’ve been far too busy for even Ben’s quiet gaming in the office. First there were SSDs, then GDC and then this article. But when I needed someone to play a bunch of games and tell me if he noticed GPU accelerated PhysX, Ben was the right guy for the job.

I grabbed a Dell Studio XPS I’d been working on for a while. It’s a good little system, the first sub-$1000 Core i7 machine in fact ($799 gets you a Core i7-920 and 3GB of memory). It performs similarly to my Core i7 testbeds so if you’re looking to jump on the i7 bandwagon but don’t feel like building a machine, the Dell is an alternative.

I also set up its bigger brother, the Studio XPS 435. Personally I prefer this machine; it’s larger than the regular Studio XPS, albeit more expensive. The larger chassis makes working inside the case and upgrading the graphics card a bit more pleasant.


My machine of choice, I couldn't let Ben have the faster computer.

Both of these systems shipped with ATI graphics, obviously that wasn’t going to work. I decided to pick midrange cards to work with: a GeForce GTS 250 and a GeForce GTX 260.



PhysX in Sacred 2: There, but not tremendously valuable

The first title on the chopping block? Sacred 2.

This was Ben’s type of game. It’s a Diablo-style RPG. It’s got a Metacritic score of 71 out of 100, which indicates “mixed or average reviews”.

I let Ben play Sacred 2 for a while, first with PhysX disabled and then with it enabled. His response after it was enabled? “The game feels a little choppier but I don’t really notice anything.”

Derek and I were hovering over his shoulder at times and eventually Derek pointed out the leaves blowing in the wind. “Did they do that before?”, Derek asked. “I didn’t even notice them”, was Ben’s reply.


Sacred 2 without GPU accelerated PhysX


Sacred 2 with GPU accelerated PhysX - It's more noticeable here than in the game itself

We left Ben alone for him to play for a while. His verdict mirrored ours. The GPU accelerated PhysX effects in Sacred 2 were hardly noticeable, and when they were, they didn’t really do anything for the game at all. To NVIDIA’s credit, a Diablo-style RPG isn’t really the best place for showing off GPU accelerated physics.

Ben wanted a different style of game, something more actiony. He needed explosions, perhaps that would convince him (and all of us) of the value of GPU-accelerated PhysX. We moved to the next game on the list.



PhysX in Warmonger: Fail

Cryostasis is a title due out this year; unfortunately there is no playable demo, just a tech demo. Next.

Metal Knight Zero, MKZ for short, was another game on NVIDIA’s list. Once more, no playable demo, just a tech demo. We need real games here, people, real titles, if you’re trying to convince someone to buy NVIDIA on the merits of PhysX.

Warmonger, ah yes, now we have a playable game. Warmonger is a first person shooter that uses GPU accelerated PhysX to enable destructible environments. Allow me to quote NVIDIA:

The first thing about Warmonger is that it runs horribly slow on ATI hardware, even with GPU accelerated PhysX disabled. I’m guessing ATI’s developer relations team hasn’t done much to optimize the shaders for Radeon HD hardware. Go figure.

The verdict here (aside from: I don’t want to play Warmonger), was that the GPU accelerated PhysX effects were not very, well, impressive. You could destroy walls, but the game itself wasn’t exactly fun so it didn’t matter. The realistic cloth that you could shoot holes through? Yeah, not terribly realistic looking.


Look at the hyper realistic cloth! Yeah, it looks like a highly advanced game from 6 years ago.

Warmonger itself wasn’t a triple A first person shooter, and the GPU accelerated PhysX effects on top of it weren’t going to make the game any better. Sorry guys, none of us liked this one. PC Gamer gave it a 55/100. Looks like we weren’t alone. Next.



The Unreal Tournament 3 PhysX Mod Pack: Finally, a Major Title

Unreal Tournament 3. Metacritic gives it an 83 for “Generally favorable reviews” and NVIDIA released a PhysX mod pack for it last year. Now we’re getting somewhere.

The mod pack consists of three levels that use GPU accelerated PhysX. The rest of the game is left unchanged. You can run these levels without GPU acceleration, but they’re much slower.

The three levels are HeatRay, Lighthouse and Tornado. Guess what the PhysX does in Tornado?

Ben and I played HeatRay together (aw, cute). First the PhysX enabled level with GPU acceleration turned off, then with it turned on and then the standard level that doesn’t use any GPU accelerated PhysX at all.

Turning the PhysX acceleration on made a huge difference, we both agreed. The game was much faster, much more playable. The most noticeable PhysX effect was hail falling from the sky, and lots of it. You could blow up signs in the level but the hail was by far the most noticeable part. Note that I said noticeable, not desirable.


See all of the white pellets? Yeah, that's what PhysX got us in UT3.

Playing the normal version of the HeatRay map was far more fun for both of us. The hail was distracting. Each of the hundreds of pellets hit the ground and bounced off in a physically accurate manner, but in doing so it sounded like I was running through a tunnel full of bead curtains suspended from the ceiling. Not to mention the visual distraction of tons of pellets hitting the ground all of the time. Ben and I both liked the level better without the hail. The point of the hail? Not to make the level cooler, but rather to truly stress the PPU/GPU: particles are one of the most difficult things to do on the CPU, but they work very well on the GPU. This wasn’t a fun level; this was a benchmark.
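That hail is also a neat illustration of why this kind of effect ends up on the GPU in the first place: each pellet is independent, and the per-pellet math (apply gravity, move, bounce) is trivial but repeated thousands of times every frame, which is exactly the shape of work GPUs are built for. As a rough sketch of the idea only (this is not PhysX code, just one CUDA thread per pellet):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One thread per hail pellet: apply gravity, integrate, bounce off the ground.
// Not PhysX -- just a sketch of the data-parallel shape of the problem.
__global__ void stepPellets(float* posY, float* velY, int count, float dt)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= count) return;

    velY[i] += -9.8f * dt;          // gravity
    posY[i] += velY[i] * dt;        // semi-implicit Euler integration

    if (posY[i] < 0.0f) {           // hit the ground plane at y = 0
        posY[i] = 0.0f;
        velY[i] = -velY[i] * 0.4f;  // inelastic bounce
    }
}

int main()
{
    const int N = 100000;           // tens of thousands of pellets, no problem
    float *posY, *velY;
    cudaMallocManaged(&posY, N * sizeof(float));  // unified memory keeps the demo short
    cudaMallocManaged(&velY, N * sizeof(float));
    for (int i = 0; i < N; ++i) { posY[i] = 50.0f + (i % 100); velY[i] = 0.0f; }

    const float dt = 1.0f / 60.0f;  // one 60fps frame per step
    for (int frame = 0; frame < 600; ++frame) {
        stepPellets<<<(N + 255) / 256, 256>>>(posY, velY, N, dt);
    }
    cudaDeviceSynchronize();
    printf("pellet 0 after 10 seconds: y = %.2f m\n", posY[0]);

    cudaFree(posY);
    cudaFree(velY);
    return 0;
}
```

Scaling this from a hundred pellets to a hundred thousand barely registers on a GPU, which is why particles are the go-to showcase for GPU physics, and also why they risk being a benchmark rather than a gameplay improvement.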

Tornado was the turning point for us. As the name implies, there’s a giant tornado flying through this capture the flag level. The tornado is physically accurate, if you shoot rockets at it, they fly around and get sucked into the funnel or redirected depending on their angle of incidence. It’s neat.

The tornado sucks up everything around it but if you’re looking to relive Wizard of Oz fantasies I’ve got bad news: you are immune from its sucking power. You just stay on the ground and lose health. Great.

Ben’s take on the tornado level? “It was neat”. I agreed. Not compelling enough for me to tattoo PhysX on my roided up mousing-arm, but the most impressive thing we’d seen thus far.



Mirror’s Edge: Do we have a winner?

And now we get to the final test. Something truly different: Mirror’s Edge.

This is an EA game. Ben had to leave before we got to this part of the test, he does have a wife and kid after all, so I went at this one alone.

I’d never played Mirror’s Edge. I’d seen the videos; it looked interesting. You play as a girl, Faith, a runner. You run across rooftops, through buildings, it’s all very parkour-like. You’re often being pursued by “blues”, police officers, as you run through the game. I won’t give away any plot details here, but this game, I liked.

The GPU accelerated PhysX impacted things like how glass shatters and the presence of destructible cloth. We posted a video of what the game looks like with NVIDIA GPU accelerated PhysX enabled late last year:

"Here is the side by side video showing better what DICE has added to Mirror's Edge for the PC with PhysX. Please note that the makers of the video (not us) slowed down the game during some effects to better show them off. The slow downs are not performance related issues. Also, the video is best viewed in full screen mode (the button in the bottom right corner)."

 

In Derek’s blog about the game he said the following:

“We still want to really get our hands on the game to see if it feels worth it, but from this video, we can at least say that there is more positive visual impact in Mirror's Edge than any major title that has used PhysX to date. NVIDIA is really trying to get developers to build something compelling out of PhysX, and Mirror's Edge has potential. We are anxious to see if the follow through is there.”

Well, we have had our hands on the game and I’ve played it quite a bit. I started with PhysX enabled. I was looking for the SSD-effect. I wanted to play with it on then take it away and see if I missed it. I played through the first couple of chapters with PhysX enabled, fell in lust with the game and then turned off PhysX.

I missed it.

I actually missed it. What did it for me was the way the glass shattered. When I was being pursued by blues and they were firing at me as I ran through a hallway full of windows, the hardware accelerated PhysX version was more believable. I felt more like I was in a movie than in a video game. Don’t get me wrong, it wasn’t hyper realistic, but the effect was noticeable.

I replayed a couple of chapters and then played some new ones with PhysX disabled, before turning it back on and repeating the test.

The impact of GPU accelerated PhysX was noticeable. EA had done it right.

The Verdict?

So am I sold? Would I gladly choose a slower NVIDIA part because of PhysX support? Of course not.

The reason why I enjoyed GPU accelerated PhysX in Mirror’s Edge was because it’s a good game to begin with. The implementation is subtle, but it augments an already visually interesting title. It makes the gameplay experience slightly more engrossing.

It’s a nice bonus if I already own an NVIDIA GPU; it’s not a reason for buying one.

The fact of the matter is that Mirror’s Edge should be the bare minimum requirement for GPU accelerated PhysX in games. The game has to be good to begin with and the effects should be the cherry on top. Crappy titles and gimmicky physics aren’t going to convince anyone. Aggressive marketing on top of that is merely going to push people like us to call GPU accelerated PhysX out for what it is. I can’t even call the overall implementations I’ve seen in games half baked; the oven isn’t even preheated yet. Mirror’s Edge so far is an outlier. You can pick a string of cheese off of a casserole and like it, but without some serious time in the oven it’s not going to be a good meal.

Then there’s the OpenCL argument. NVIDIA won’t port PhysX to OpenCL, at least not anytime soon. But Havok is being ported to OpenCL, which means that by the end of this year all games that use OpenCL Havok can use GPU accelerated physics on any OpenCL compliant video card (NVIDIA, ATI and Intel when Larrabee comes out).

While I do believe that NVIDIA and EA were on to something with the implementation of PhysX in Mirror’s Edge, I do not believe NVIDIA is strong enough to drive the entire market on its own. Cross platform APIs like OpenCL will be the future of GPU accelerated physics; they have to be, simply because NVIDIA isn’t the only game in town. The majority of PhysX titles aren’t GPU accelerated even on NVIDIA hardware, and I suspect it won’t take too long for OpenCL accelerated Havok titles to equal that number once the port is ready.

Until we get a standard for GPU accelerated physics that all GPU vendors can use or until NVIDIA can somehow convince every major game developer to include compelling features that will only be accelerated on NVIDIA hardware, hardware PhysX will be nothing more than fancy lettering on a cake.

You wanted us to look at PhysX in a review of an ATI GPU, and there you have it.



CUDA - Oh there’s More

Oh, I’m not done. Other than PhysX, NVIDIA is stressing CUDA as another huge feature that no other GPU maker in the world has.

For those who aren’t familiar, CUDA is a programming interface to NVIDIA hardware. Modern day GPUs are quite powerful, easily capable of churning out billions if not a trillion instructions per second when working on the right dataset. The problem is that harnessing such power is a bit difficult. NVIDIA put a lot of effort into developing an easy to use interface to the hardware and eventually it evolved into CUDA.
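To make that concrete, here's roughly what the smallest possible CUDA program looks like: a function marked to run on the GPU, launched across a million threads, with explicit copies between system memory and the card's memory. This is a generic illustration, not code from any of the applications discussed here.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// The __global__ qualifier marks a function that runs on the GPU.
// Each thread computes exactly one element of the output array.
__global__ void addArrays(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main()
{
    const int n = 1 << 20;              // one million elements
    size_t bytes = n * sizeof(float);

    // Host-side data.
    float *ha = (float*)malloc(bytes);
    float *hb = (float*)malloc(bytes);
    float *hc = (float*)malloc(bytes);
    for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

    // Copy to the GPU, run the kernel, copy the result back.
    float *da, *db, *dc;
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    addArrays<<<(n + 255) / 256, 256>>>(da, db, dc, n);   // 256 threads per block

    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
    printf("c[0] = %.1f (expect 3.0)\n", hc[0]);

    cudaFree(da); cudaFree(db); cudaFree(dc);
    free(ha); free(hb); free(hc);
    return 0;
}
```

The appeal is that this is still recognizably C; the catch, as with PhysX, is that it only runs on NVIDIA's GPUs.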

Now CUDA only works on certain NVIDIA GPUs and certainly won’t talk to Larrabee or anything in the ATI camp. Both Intel and ATI have their own alternatives, but let’s get back to CUDA for now.

The one area where GPU computing has already had a tremendous impact is the HPC market. The applications in that space lend themselves very well to GPU programming, and thus we see incredible CUDA penetration there. What NVIDIA wants, however, is CUDA in the consumer market, and that’s a little more difficult.

The problem is that you need a compelling application and the first major one we looked at was Elemental’s Badaboom. The initial release of Badaboom fell short of the mark but over time it became a nice tool. While it’s not the encoder of choice for people looking to rip Blu-ray movies, it’s a good, fast way of getting your DVDs and other videos onto your iPod, iPhone or other portable media player. It only works on NVIDIA GPUs and is much faster than doing the same conversion on a CPU if you have a fast enough GPU.

The problem with Badaboom was that, like GPU accelerated PhysX, it only works on NVIDIA hardware and NVIDIA isn’t willing to give away NVIDIA GPUs to everyone in the world - thus we have another catch 22 scenario.

Badaboom is nice. If you have a NVIDIA GPU and you want to get DVD quality content onto your iPod, it works very well. But spending $200 - $300 on a GPU to run a single application just doesn’t seem like something most users would be willing to do. NVIDIA wants the equation to work like this:

Badaboom -> You buy a NVIDIA GPU

But the equation really works like this:

Games (or clever marketing) -> You buy a NVIDIA GPU -> You can also run Badaboom

Now if the majority of applications in the world required NVIDIA GPUs to run, then we’d be dealing in a very different environment, but that’s not reality in this dimension.



The Latest CUDA App: MotionDSP’s vReveal

NVIDIA had more slides in its GTX 275 presentation about non-gaming applications than it did about how the 275 performed in games. One such application is MotionDSP’s vReveal - a CUDA enabled video post processing application that can clean up poorly recorded video.

The application’s interface is simple:

Import your videos (anything with a supported codec on your system pretty much) and then select enhance.

You can auto-enhance with a single click (super useful) or even go in and tweak individual sliders and settings on your own in the advanced mode.

The changes you make to the video are visible on the fly, but the real time preview is faster on a NVIDIA GPU than if you rely on the CPU alone.

When you’re all done, simply hit save to disk and the video will be re-encoded with the proper changes. The encoding process takes place entirely on the GPU but it can also work on a CPU.

First let’s look at the end results. We took three videos, one recorded using Derek’s wife’s Blackberry and two from me on a Canon HD cam (but at low res) in my office.

I relied on vReveal’s auto tune to fix the videos and I’ve posted the originals and vReveal versions on YouTube. The videos are below:

In every single instance, the resulting video looks better. While it’s not quite the technology you see in shows like 24, it does make your videos look better and it does do it pretty quickly. There’s no real support for video editing here and I’m not familiar enough with the post processing software market to say whether or not there are better alternatives, but vReveal does do what it says it does. And it uses the GPU.

Performance is also very good on even a reasonably priced GPU. It took 51 seconds for the GeForce GTX 260 to save the first test video, while my Dell Studio XPS 435’s Core i7 920 took just over 3 minutes to do the same task.

It’s a neat application. It works as advertised, but it only works on NVIDIA hardware. Will it make me want to buy a NVIDIA GPU over an ATI one? Nope. If all things are equal (price, power and gaming performance) then perhaps. But if ATI provides a better gaming experience, I don’t believe it’s compelling enough.

First, the software isn’t free - it’s an added expense. Badaboom costs $30, vReveal costs $50. It’s not the most expensive software in the world, but it’s not free.

And secondly, what happens if your next GPU isn’t from NVIDIA? While vReveal will continue to work, you no longer get GPU acceleration. A vReveal-like app written in OpenCL will work on all three vendors’ hardware, as long as they support OpenCL.

If NVIDIA really wants to take care of its customers, it can start by giving away vReveal (and Badaboom) to people who purchase these high end graphics cards. If you want to add value, don’t tell users that they should want these things, give it to them. The burden of proof is on NVIDIA to show that these CUDA enabled applications are worth supporting rather than waiting for cross-vendor OpenCL versions.

Do you feel any differently?



The Rest of the Performance Charts

Age of Conan Performance




Call of Duty World at War Performance




Crysis Warhead Performance




Fallout 3 Performance




FarCry 2 Performance




Left 4 Dead Performance




Race Driver GRID Performance




Power Consumption

With both of these competitors built on TSMC's 55nm manufacturing process we get to see how power efficient they are. At idle the GeForce GTX 275 draws the least amount of power, while under load the Radeon HD 4890 is cooler.



Final Words

NVIDIA is competitive at this new price point of $250 depending on what resolution you look at. We also see some improvement from NVIDIA's new 185 series driver and get a new feature to play with in the form of Ambient Occlusion. We did look at PhysX and CUDA again, and, while we may be interested in what is made possible by them, there is still a stark lack of compelling content that takes advantage of these technologies. We can't recommend prioritizing PhysX and CUDA over performance, and performance is where a GPU needs to compete. Luckily for NVIDIA, the GTX 275 does.

The fact that its worst-case performance is still better than that of the GTX 260 core 216, while in the best case it can hit the level of the GTX 280, is a definite plus for the GTX 275. It often posted performance more in line with its bigger brothers than with parts $50+ cheaper. This is pretty sweet for a $250 card, especially as many games these days rely very heavily on shader performance. The GeForce GTX 275 is a good fit for this price point, and is a good option. But then there's the Radeon HD 4890.

The 4890, basically a tweaked and overclocked 4870, does improve performance over the 4870 1GB and puts up good competition for the GTX 275. On a pure performance level the 4890 and GTX 275 trade blows at different resolutions. The 4890 tends to look better at lower resolutions while the GTX 275 is more competitive at high resolutions. At 1680 x 1050 and 1920 x 1200 the 4890 is nearly undefeated. At 2560 x 1600, it seems to be pretty much a wash between the two cards.

At the same time, there are other questions, like that of availability. With these parts performing so similarly, and prices being essentially equal, the fact that the AMD part can be bought starting today while we have to wait for the NVIDIA part is an advantage for AMD. However, we have to factor in that AMD driver support doesn't have the best track record as of late for new game titles, and that NVIDIA's developer relations seem more effective than AMD's, which could mean more titles that run better on NVIDIA hardware in the future. So what to go with? Really it depends on what resolutions you're targeting and what the prices end up being. If you've got a 30" display then either card will work; it's just up to your preference and the items we talked about earlier. If you've got a 24" or smaller display (1920x1200 or below), then the Radeon HD 4890 is the card for you.

AMD tells us that most retailers will feature mail-in rebates of $20, a program which was apparently underwritten by AMD. Could AMD have worried they weren't coming in at a high enough performance level late in the game and decided to throw in an extra incentive? Either way, not everyone likes a mail-in rebate. I much prefer the instant variety, and mail-in rebate offers don't make decisions for me. We still compare products based on their MSRP (which is likely the price they'll be back at once the rebate goes away). This is true for both AMD and NVIDIA parts.

There will also be overclocked variants of the GTX 275 to compete with the overclocked variants from AMD. The overclock on the AMD hardware is fairly modest, but does make a difference and the same holds true for the GTX 275 products in early testing. We'll have to take a look at how such parts compare in the future along with SLI and CrossFire.  In the meantime, we have another interesting battle at the $250 price point.
