The 4GB Question, Mantle’s Teething Issues, & the Test

Though not strictly a feature of R9 285 or Tonga, before diving into our benchmark breakdown we would like to spend a moment pondering VRAM capacity and how it impacts the R9 285.

When it comes to VRAM capacity the R9 285 is in a particularly odd position for a few different reasons. First and foremost, the R9 285 is a 2GB standard card that is replacing the 3GB standard R9 280. Despite the R9 285 in most other ways being a lateral move from the R9 280 (including price), this is the one area where the R9 285 is a clear downgrade, losing 33% of its predecessor’s VRAM capacity.

Second, midrange and high-end cards in general are in a bit of an odd spot due to the combination of a ready supply of 4Gb GDDR5 chips and the current-generation consoles. The use of 4Gb chips allows a standard 256-bit memory bus card to accommodate 4GB of VRAM, and in the PlayStation 4’s case these chips are used in 16-bit (clamshell) mode to give the console a full 8GB of memory. So a 2GB card is not only somewhat behind the times as far as cutting-edge RAM goes, it also has only ¼ of the RAM capacity of the current-gen consoles, which is a potential problem for playing console ports on the PC (at least without sacrificing asset quality).
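To illustrate the arithmetic at work here, a quick sketch (the per-chip interface widths are the standard GDDR5 configurations; nothing here is specific to any one board design):

```python
def vram_capacity_gb(bus_width_bits, chip_density_gbit, bits_per_chip=32):
    """Total VRAM from bus width, per-chip density (gigabits), and per-chip interface width."""
    chips = bus_width_bits // bits_per_chip
    return chips * chip_density_gbit / 8  # 8 bits per byte

# R9 285: 256-bit bus, eight 2Gb chips in 32-bit mode -> 2GB
print(vram_capacity_gb(256, 2))                     # 2.0
# Same bus populated with 4Gb chips -> 4GB (a 4GB R9 285)
print(vram_capacity_gb(256, 4))                     # 4.0
# PlayStation 4: 256-bit bus, sixteen 4Gb chips in 16-bit clamshell mode -> 8GB
print(vram_capacity_gb(256, 4, bits_per_chip=16))   # 8.0
```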

Finally, midrange cards have been stuck at 2GB for some time now. In AMD’s ecosystem this has been the case informally since the 2GB 6950 fell to $250 in the middle of 2011, and formally since the 7850 launched with 2GB back in 2012. So depending on your starting point, 2GB of VRAM has been the standard for midrange cards for 2-3 years, which is about as long as we’d expect to go before we outgrow any given RAM capacity.

The question in our minds then is this: is 2GB enough VRAM for a $250 video card? All things considered we’ll always take more VRAM; there’s no performance penalty for having it, but there’s also no benefit unless you can put it to good use. And to that end, at least in our current benchmarks, that’s generally not the case.

While we don’t have a 4GB card to use as a control at this time, of all of our benchmarks, the only Direct3D benchmarks that seem to show any signs of being impacted by 2GB of VRAM are Battlefield 4 and Thief. Even in those cases these signs are only occurring at 2560x1440 with MSAA and SSAA respectively, both of which tend to chew up memory to store the necessary anti-aliasing buffers. Otherwise if we drop down to 1920x1080, even with the aforementioned MSAA/SSAA, the 2GB R9 285 seems perfectly content.
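As a very rough illustration of why the anti-aliasing buffers matter at 2560x1440 but not 1920x1080, here is a back-of-the-envelope estimate of multisampled color + depth target sizes (a sketch only; real engines allocate far more than this for G-buffers, textures, and other resources, so treat these as illustrative deltas rather than actual VRAM footprints):

```python
def aa_target_mb(width, height, samples, color_bytes=4, depth_bytes=4):
    """Approximate size of a multisampled color + depth render target, in MB."""
    return width * height * samples * (color_bytes + depth_bytes) / (1024 ** 2)

for width, height in ((1920, 1080), (2560, 1440)):
    for samples in (1, 4):
        print(f"{width}x{height} @ {samples}x: {aa_target_mb(width, height, samples):.0f} MB")
# 1920x1080 @ 1x: ~16 MB    1920x1080 @ 4x: ~63 MB
# 2560x1440 @ 1x: ~28 MB    2560x1440 @ 4x: ~112 MB
```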

The one global exception to this is in the case of Mantle, which throws a wrench in matters since it gives developers direct control over memory access. For both Thief and BF4, the Mantle renderers in these games are far more at home with 3GB+ of VRAM, and ignoring the present issues with Mantle on Tonga (more on this later), 2GB just isn’t cutting it when Mantle is involved, which is something we’ve already seen on other 2GB cards such as the R9 270 series.

The short answer to our question then is that whether 2GB is enough VRAM is going to depend on the resolution and API used. For AMD’s stated goal of being a 2560x1440 gaming card the R9 285 is already at risk of coming up short, and this is only going to get worse as more graphically advanced games come down the pipeline, especially console ports that aren’t being held back by last-generation consoles. On the other hand 1920x1080 is solid for the moment, and it may continue to be that way for some time.

Ultimately, due to overall performance the R9 285 is not our first choice for a 2560x1440 gaming card – we’d suggest a minimum of the R9 280X – but the lack of VRAM isn’t doing it any favors here. Otherwise 1920x1080 should fare better, but whether that holds true for what’s increasingly becoming a 3+ year upgrade cycle for video cards remains to be seen. With 2GB cards having been the $250 standard for so long, a 4GB card is looking like the safer bet right now, which is all the more reason we’re interested in seeing just what the premium for the 4GB R9 285 will be. Very rarely do we suggest the higher capacity version of a video card, but the R9 285 may prove to be the exception.

Mantle: Teething Problems

Shifting gears, for the launch of the R9 285 AMD is advising reviewers and users alike that Mantle performance on Thief and Battlefield 4 is not going to be up to snuff right now. The reason for this is simple, but the potential ramifications are a bit more complex.

Because Tonga is based on a new GPU – and a newer version of GCN no less – the developers of Thief and Battlefield 4 have not had the opportunity to optimize their games for Tonga products. If you have ever used some of the lower-end GCN products (e.g. Cape Verde) then you’ve seen first-hand that these games are already hit & miss depending on the GPU in use, and Tonga extends that limitation. Meanwhile, though AMD’s admission doesn’t extend to its drivers, we would expect that there is some work the company needs to do to better account for the minor architectural differences, even if Mantle is a thin driver API.

The complexity then stems from the fact that this is basically the first litmus test for how well Mantle (and potentially other low level APIs) will handle new hardware in the future, and at this time AMD is close to failing this test. On the one hand Mantle is up and running; both Thief and Battlefield 4’s Mantle rendering paths work on R9 285 despite neither game having seen the GPU before, and as far as we can tell there are no immediate rendering errors. However the fact that Mantle performance has significantly regressed and at this point is below Direct3D performance is not what we’d like to see.

[Chart: Radeon R9 285 Mantle Performance]

In explaining the situation, AMD tells us that this is an application level issue due to these games not being familiar with Tonga, and that this can be fixed through further patches. And ultimately if nothing else, these Tonga teething issues would be limited to these two games since they’re the only Mantle games to be released before Tonga.

The bind this puts AMD in, and why this is a bad omen for Mantle, is that if low level APIs are to take off then these kinds of forward compatibility issues cannot occur. Though even high level APIs aren’t perfect – we’ve seen OS and driver updates break very old D3D and OpenGL games over time – high level APIs are forward compatible enough that virtually all games will work on newer hardware. And in the cases where they don’t, due to the abstraction-heavy nature of these APIs the problem and the solution are likely at the driver level. Mantle’s current state on the other hand puts the resolution in the hands of game developers, who unlike hardware vendors cannot necessarily be counted on to update their games to account for new hardware, especially given the front-loaded nature of video game sales.
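To make the structural difference concrete, consider a purely hypothetical sketch (not how Thief or Battlefield 4 are actually written) of why a low-level renderer struggles with unseen hardware: tuning decisions that a Direct3D driver would make on the game’s behalf instead live in the application, keyed to the GPUs the developer knew about at ship time.

```python
# Hypothetical sketch: app-side tuning under a low-level API. All names and
# values here are invented for illustration; they are not Mantle API calls.

KNOWN_GPU_TUNINGS = {
    "Tahiti": {"compute_queues": 2, "memory_heap": "device_local"},
    "Hawaii": {"compute_queues": 8, "memory_heap": "device_local"},
    # "Tonga" shipped after the game did, so no tuned entry exists for it.
}

CONSERVATIVE_DEFAULTS = {"compute_queues": 1, "memory_heap": "generic"}

def select_tuning(gpu_family):
    # Unknown hardware gets the conservative fallback, which can easily end up
    # slower than the Direct3D path the driver vendor has already optimized.
    return KNOWN_GPU_TUNINGS.get(gpu_family, CONSERVATIVE_DEFAULTS)

print(select_tuning("Hawaii"))  # tuned path
print(select_tuning("Tonga"))   # untuned fallback -> the regression we're seeing
```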

For the moment Mantle is still in beta and very clearly so, with Thief and Battlefield 4 serving as proof of concept for the API. For that reason AMD still has time to contemplate the issue and ensure Mantle is more readily forward-compatible. But it’s going to be very hard justifying using Mantle if we see these kinds of regressions on non-beta drivers with games that were built against the non-beta SDK. AMD needs to ensure the shipping version of Mantle doesn’t suffer from these teething issues.

On a tangential note, this does raise the question of how well Direct3D 12 may handle the issue. By its vendor-limited nature Mantle has the opportunity to work even lower than a cross-vendor low level API like Direct3D 12, but D3D12 is still going to be low level and exposed to some of these hazards. For that reason it will be interesting to keep an eye on Direct3D development over the next year to see how Microsoft and its partners handle the issue. We would expect to see Microsoft have a better handle on forward-compatibility – in their position they pretty much have to – but if nothing else we’re curious just what it will take from game developers, API developers, and hardware developers alike to ensure that necessary level of forward-compatibility.

The Test

For the launch of the R9 285 AMD has released beta driver version 14.300.1005, which identifies itself as Catalyst 14.7 (though we suspect this will not be the final Catalyst version number). As is to be expected for a launch involving a new GPU architecture, this launch driver is from a new driver branch (14.300) to account for the new hardware. With that said, based on our examination of this driver’s performance it does not appear to be significantly different from Catalyst 14.7 (14.200) for existing Radeon products.

Our R9 285 sample meanwhile is Sapphire’s R9 285 Dual-X OC. As this is a factory overclocked model, for the purposes of our testing we will be testing this card both at its factory clockspeeds (965MHz core/5.6GHz memory) and at the R9 285 reference clockspeeds (918MHz/5.5GHz), underclocking the card for the latter. The bulk of our comparisons in turn will be drawn from the reference clockspeeds, but we do want to note that of the 5 R9 285 cards currently available for sale at Newegg, only a single (non-Sapphire) model is shipping without some kind of factory overclock. Consequently, while we are looking to establish a reliable performance baseline, retail cards should perform a bit closer to our card’s factory overclocked performance.
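For reference, the gap between the two clock configurations works out to a little over 5% on the core and just under 2% on the memory (simple arithmetic, not measured performance):

```python
# Delta between Sapphire's factory overclock and AMD's reference clocks.
factory_core, reference_core = 965, 918   # core clock, MHz
factory_mem, reference_mem = 5.6, 5.5     # effective memory clock, GHz

print(f"Core overclock:   {(factory_core / reference_core - 1) * 100:.1f}%")  # ~5.1%
print(f"Memory overclock: {(factory_mem / reference_mem - 1) * 100:.1f}%")    # ~1.8%
```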

CPU: Intel Core i7-4960X @ 4.2GHz
Motherboard: ASRock Fatal1ty X79 Professional
Power Supply: Corsair AX1200i
Hard Disk: Samsung SSD 840 EVO (750GB)
Memory: G.Skill RipjawZ DDR3-1866 4 x 8GB (9-10-9-26)
Case: NZXT Phantom 630 Windowed Edition
Monitor: Asus PQ321
Video Cards: AMD Radeon R9 290
Sapphire R9 285 Dual-X OC
AMD Radeon R9 280X
AMD Radeon R9 280
AMD Radeon R9 270
AMD Radeon HD 7850
AMD Radeon HD 6870
NVIDIA GeForce GTX 770
NVIDIA GeForce GTX 760
NVIDIA GeForce GTX 660
NVIDIA GeForce GTX 560 Ti
Video Drivers: NVIDIA Release 340.52 WHQL
AMD Catalyst 14.300.1005 Beta
OS: Windows 8.1 Pro

Comments

  • mczak - Wednesday, September 10, 2014

    This is only partly true. AMD cards nowadays can stay at the same clocks in multi-monitor mode as in single monitor mode, though it's a bit more limited than on GeForces. Hawaii and Tonga can keep the same low clocks (and thus idle power consumption) with up to 3 monitors, as long as they are all identical (or more accurately, as long as they all use the same display timings). But if they have different timings (even if it's just 2 monitors), they will always clock the memory to the max clock (this is where nvidia Kepler chips have an advantage - they will stay at low clocks even with 2, but not 3, different monitors).
    Actually I believe if you have 3 identical monitors, current Kepler GeForces won't be able to stick to the low clocks, but Hawaii and Tonga can. Unfortunately I wasn't able to find the numbers for the GeForces - the ht4u.net R9 285 review has the numbers for it; sorry I can't post the link as it won't get past the anandtech forum spam detector, which is lame.
  • Solid State Brain - Thursday, September 11, 2014

    A twin monitor configuration where the secondary display is smaller / has a lower resolution than the primary one is a very common (and logical) usage scenario nowadays, and that's what AMD should sort out first. I'm positively surprised that on the newer Tonga GPUs frequencies remain low if both displays are identical (according to the review you pointed out), but I'm not going to purchase a different display (or limit my selection) to take advantage of that when there's no need to with equivalent NVIDIA GPUs.
  • mczak - Thursday, September 11, 2014

    Fixing this is probably not quite trivial. The problem is that if you reclock the memory you can't honor memory requests for display scan-out for some time. So for a single monitor, what you do is reclock during the vertical blank. But if you have several displays with different timings this won't work, for obvious reasons, whereas if they have identical timings you can just run them essentially in sync, so they have their vertical blanks at the same time.
    I don't know how nvidia does it. One possibility would be a large enough display buffer (but I think it would need to be on the order of ~100kB or so, so not quite free in terms of hw cost).
  • PEJUman - Thursday, September 11, 2014

    I've used multi-monitor with AMD & NVIDIA cards. I would take that 30W hit if it meant things worked well.
    NVIDIA: too aggressive with low power mode. If you have video on one screen & a game on the other, it will remain at the clock speed of the 1st event (if you start the video before the game loads, it will be stuck at the video clocks).

    I use a 780 Ti currently; the R9 290X I had previously worked better, as it would always clock up...
  • hulu - Wednesday, September 10, 2014

    The conclusions section of Crysis: Warhead seems to be copy-pasted from Crysis 3. R9 285 does not in fact trail GTX 760.
  • thepaleobiker - Wednesday, September 10, 2014

    @Ryan - A small typo on the last page, last line of first paragraph - "Functionally speaking it’s just an R9 285 with more features"

    It should be R9 280, not 285. Just wanted to call it out for you! :)

    Bring on more Tonga, AMD!
  • FriendlyUser - Wednesday, September 10, 2014

    I would like to note that if memory compression is effective, it should not only improve bandwidth but also reduce the need for texture memory. Maybe 2GB with compression is closer to 3GB in practice, at least if the ~40% compression advantage is true.

    Obviously, there is no way to predict the future, but I think your conclusion concerning 2GB boards should take compression into account.
  • Spirall - Wednesday, September 10, 2014

    If GCN 1.2 (instead of a GCN 2.0) is what AMD has to offer as the new architecture for next year's cards, then Maxwell (judging by 750 Ti vs. 260X tests) will hit AMD hard in terms of performance per watt and production cost (not price), and thus its net income.
  • shing3232 - Wednesday, September 10, 2014

    The 750 Ti uses a better 28nm process called HPM while the rest of the 200 series uses HPL; that's the reason why Maxwell is so efficient.
  • Spirall - Wednesday, September 10, 2014

    I'm afraid this won't be enough (but I hope it is). Anyway, as NVIDIA is expected to launch their 256-bit Maxwell card soon, we'll have the answer shortly.
