Stage 2: Vertex Processing

At the front of the 3D pipeline we have what has commonly been referred to as one or more vertex engines. These "engines" are essentially a collection of pipelined execution units, such as adders and multipliers. The execution units are replicated, so that there are multiple adders, multipliers, and so on, in order to exploit the fact that most of the data they will be working on is highly parallel in nature.
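To make that parallelism concrete, here is a minimal Python sketch of one common kind of vertex operation, a 4x4 matrix transform. Neither ATI nor NVIDIA has disclosed what their engines actually execute, so treat this purely as an illustration; the names `transform_vertex` and `translation` are hypothetical.

```python
# Hypothetical sketch: transform vertices by a 4x4 matrix.
# Not a model of any ATI or NVIDIA vertex engine.

def transform_vertex(matrix, vertex):
    """Multiply a 4x4 row-major matrix by a 4-component vertex (x, y, z, w)."""
    # Each output component is an independent dot product: 4 multiplies
    # and 3 adds that do not depend on the other three components, so
    # replicated multipliers and adders can work on them at the same time.
    return tuple(
        sum(matrix[row][col] * vertex[col] for col in range(4))
        for row in range(4)
    )


def translation(tx, ty, tz):
    """Build a 4x4 matrix that moves a point by (tx, ty, tz)."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


vertices = [(0.0, 0.0, 0.0, 1.0), (1.0, 2.0, 3.0, 1.0)]
matrix = translation(10.0, 0.0, -5.0)

# Every vertex is also independent of every other vertex, which is a
# second source of parallelism on top of the per-component one.
moved = [transform_vertex(matrix, v) for v in vertices]
print(moved)
```

The four dot products per vertex share no intermediate results, which is the property that makes replicated functional units worthwhile.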

The functional units that make up these vertex engines are all 32-bit floating point units, regardless of whether we're talking about an ATI or NVIDIA architecture. In terms of the efficiency of these units, ATI claims that there is rarely a case when they process fewer than 4 vertex operations every clock cycle, while NVIDIA says that the NV35 can execute at least 3 vertex operations every clock cycle, giving a range of 3 to 4 ops per clock.

It's difficult to figure out why the discrepancy exists without looking at both architectures at a very low level, which, as we mentioned at the beginning of this article, is nearly impossible because both manufacturers want to keep their secrets closely guarded.

An interesting difference that exists between the graphics pipeline and the CPU pipeline is the prevalence of branches in graphics code. As you will remember from our articles detailing the CPU world, branches occur quite commonly in code (e.g. 20% of all integer instructions are branches in x86 code). A branch is any piece of code where a decision must be made, the outcome of which determines which instruction to execute next. For example, a general branch would be:

If "Situation A" then begin executing the following code

Else, if "Situation B" then execute this code
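In graphics terms, such a branch could look like the following minimal Python sketch. The diffuse lighting function and its names are hypothetical and are not taken from any ATI or NVIDIA hardware or driver; the sketch only shows a decision whose outcome selects which instructions run next.

```python
# Hypothetical sketch of a branch in lighting code.

def diffuse_intensity(normal, light_dir):
    """Lambert diffuse term for unit-length 3-component vectors."""
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    if n_dot_l > 0.0:
        # "Situation A": the surface faces the light, so it is lit.
        return n_dot_l
    else:
        # "Situation B": the surface faces away, so it receives no light.
        return 0.0


facing = diffuse_intensity((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
away = diffuse_intensity((0.0, 0.0, 1.0), (0.0, 0.0, -1.0))
print(facing, away)
```

Until `n_dot_l` has been computed, the hardware does not know which of the two paths it will take, which is why a pipelined design has to either wait or guess.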

As you can guess, branches are far less common in the graphics world. Extremely complex lighting algorithms, as well as vertex processing in general, are much more likely to contain branches than any other sort of graphics code. Obviously, in any case where branches exist, you will want to be able to predict the outcome of a branch before evaluating it in order to avoid costly pipeline stalls. Luckily, because of the high-bandwidth memory subsystem that GPUs are paired with, as well as the limited number of branches in graphics code to begin with, the branch predictors in these GPUs don't have to be too accurate. Whereas in the CPU world you need to be able to predict branches with ~95% accuracy, the requirements are nowhere near as stringent in the GPU world. NVIDIA insists that their branch predictor is significantly more accurate/efficient than ATI's; however, it is difficult to back up those claims with hard numbers.
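A rough back-of-the-envelope calculation shows why branch frequency matters as much as predictor accuracy. The sketch below uses the 20% branch share and ~95% accuracy quoted above for CPUs; the 20-cycle misprediction penalty, the 2% branch share, and the 70% accuracy used for the GPU case are illustrative assumptions only, not figures from ATI or NVIDIA.

```python
# Hypothetical model: average stall cycles added per instruction by
# mispredicted branches. All GPU-side numbers are assumptions.

def stall_cycles_per_instruction(branch_fraction, accuracy, penalty_cycles):
    """Expected stall cycles per instruction caused by mispredictions."""
    misprediction_rate = 1.0 - accuracy
    return branch_fraction * misprediction_rate * penalty_cycles


PENALTY = 20  # assumed pipeline refill cost in cycles

# CPU-like case: 20% branches, ~95% prediction accuracy (from the text).
cpu_cost = stall_cycles_per_instruction(0.20, 0.95, PENALTY)

# GPU-like case: assumed 2% branches and a much weaker 70% predictor.
gpu_cost = stall_cycles_per_instruction(0.02, 0.70, PENALTY)

print(f"CPU-like: {cpu_cost:.3f} stall cycles per instruction")
print(f"GPU-like: {gpu_cost:.3f} stall cycles per instruction")
```

Under these assumed numbers, the much less accurate predictor still costs fewer stall cycles per instruction, simply because there are so few branches to mispredict.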


19 Comments


  • Anonymous User - Thursday, October 16, 2003

    After reading this article, how can I determine whether a GeForceFX 5600 card has the NV30 core or the NV35 core? I'm currently interested in purchasing one, but neither the retail boxes nor the manuals on the manufacturers' web sites say anything about the type of core used. Did NVIDIA correct themselves by using the NV35 core before releasing their 5600 cards to the market? Or are there NV30-based 5600 cards on the retail shelves too? Help is appreciated. Thanks.
  • Anonymous User - Saturday, September 6, 2003

    You should be ashamed. The linking of words to completely unrelated MARKETING ADS is absolutely ridiculous...as if you don’t have ENOUGH ads already.


    -J
  • Shagga - Saturday, August 9, 2003

    I certainly found the article informative. I read the article with a view to making a decision on which card to purchase over the next week or so and to be honest the article said enough to convince me to sit tight. I also felt there is more to come from both ATI and nVidia and the results which are presented are perhaps not entirely complete. This is pointed out by Anand and at $499 I need to be making the right choice, however, Anand did succeed in convincing me to wait a tad longer.

    Good article I thought.
  • Anonymous User - Friday, August 1, 2003

    Please stop using Flash graphics!
  • Pete - Tuesday, July 22, 2003

    It's only fair that I praise the article, as well. As I said above, in the initial article comment thread, I congratulated Anand on what I thought was a well-written article. I appreciate his lengthy graphics pipeline summary, his extensive image quality investigation, and his usual even-handed commentary (though I had problems with the latter two).
  • Pete - Tuesday, July 22, 2003

    I think this is a great article with a few significant flaws in its benchmarking.

    Firstly, the Doom 3 numbers. Anand acknowledged that he could not get the 9800P 256MB to run the tech demo properly, yet he includes the numbers anyway. This strikes me as not only incorrect but irresponsible. People will see 9800P 256MB numbers and note that its extra memory makes no difference over its 128MB sibling, yet only if they read the article carefully would they know that the driver Anand used limits the 9800P 256MB to only 128MB, essentially crippling the card.

    Also, note the difference between Medium Quality and High Quality modes in Doom 3 is only anisotropic filtering (AF), which is enabled in HQ mode. Note that forcing AF in the video card's drivers, rather than via the application, will result in higher performance and potentially lower image quality! This was shown to be the case in a TechReport article on 3DM03 ("3DMurk"), in forum discussions at B3D, and in an editorial at THG. Hopefully this will be explored fully once a Doom 3 demo is released to the public, and we have more open benchmarking of this anticipated game.

    Secondly, Anand's initial Quake 3 5900U numbers seemed way off compared to other sites that tested the same card in similar systems at the same settings. At 1600x1200 with 4xAA 8xAF, Anand was scoring over 200fps, well higher than any other review. And yet, after weeks of protest in the forum thread on this article, all that happened was the benchmark results for 12x10 and 16x12 were removed. The text, which notes:

    "The GeForceFX 5900 Ultra does extremely well in Quake III Arena, to the point where it is CPU/platform bound at 1600x1200 with 4X AA/8X Anisotropic filtering enabled."

    was left unchanged, even though it was based on what many assumed were erroneous benchmark data. I can only conclude that the data were indeed erroneous, as they have been removed from the article. Sadly, the accompanying text has not been edited to reflect that.

    Thirdly, the article initially tested Splinter Cell with AA, though the game does not perform correctly with it. The problem was that NVIDIA's drivers automatically disable AA if it's selected, yielding non-AA scores for what an unsuspecting reviewer believes is an AA mode. ATi's drivers allow AA, warts and all, and thus produce appropriately diminished benchmark numbers, along with corresponding AA errors. The first step at correcting this mistake was to remove all Splinter Cell graphs and place a blurb in the driver section of the review blaming ATi for not disabling AA. Apparently a second step has been taken, expunging Splinter Cell from the article text altogether. Strangely, Splinter Cell is still listed in the article's drop-down menu as p. 25; clicking will bring you to the one last Quake 3 graph with the incorrect analysis, noted above.

    Finally, a note on the conclusion:

    "What stood out the most about NVIDIA was how some of their best people could look us in the eye and say "we made a mistake" (in reference to NV30)."

    What stands out most to me is that NVIDIA still can't look people in the eye and say they made a mistake by cheating in 3DMark03. Recent articles have shown NVIDIA to be making questionable optimizations (that may be considered cheats in the context of a benchmark) in many games and benchmarks, yet I see only a handful of sites attempt to investigate these issues. ExtremeTech and B3D noted the 3DMark03 "optimizations." Digit-Life has noted CodeCreatures and UT2K3 benchmark "optimizations," and Beyond3D and AMDMB have presented pictorial evidence of what appears to be the reason for the benchmark gains. NVIDIA appears to currently foster a culture of cutting corners without the customer's (and, hopefully, reviewer's) knowledge, and they appear reticent to admit it at all.

    I realize this post comes off as harsh against both Anand and NVIDIA. In the initial comment thread on this article, I was gentler in my (IMO, constructive) criticism. As the thread wore on for weeks without a single change in the multiple errors perceived in the original article, I gradually became more curt in my requests for corrections. Anand elicits possibly the greatest benefit of the doubt of any online hardware reviewer I know, as I've read his site and enjoyed the mature and thoughtful personality he imbued it with for years. I'm sorry to say his response--rather, his lack of response, as it was only Evan and Kristopher, not Anand, that replied to the original article thread--was wholly unsatisfactory, and the much belated editing of the article into what you read today was unsatisfactory as well. I would have much preferred Anand(tech) left the original article intact and appended a cautionary note or corrected benchmarks and commentary, rather than simply cutting out some of the questionable figures and text.

    Consider this post a summation of the criticism posted in the original article thread. I thought it would be useful to put this article in context, and I hope it is taken as constructive, not destructive, criticism. The 5900 is no doubt a superior card to its predecessor. I also believe this article, in its current form, presents an incomplete picture of both the 5900U and its direct competition, ATi's 9800P 256MB. Hopefully the long chain of revelations and commentary sparked by and after this article will result not in hard feelings, but in more educated, thorough, and informative reviews.

    I look forward to Anandtech's next review, which I believe has been too long in coming. :)
