With NVIDIA currently between GPU generations, things have been relatively quiet on the professional graphics front for the company. At the high end, NVIDIA released the Quadro M6000 back in 2015, bringing its fully enabled GM200 GPU into the professional market. Now, just over a year later, the company is giving the Quadro a refresh with a newer, higher capacity model.

NVIDIA Quadro Specification Comparison
Specification | M6000 (24GB) | M6000 (12GB) | K6000 | 6000
CUDA Cores | 3072 | 3072 | 2880 | 448
Texture Units | 192 | 192 | 240 | 56
ROPs | 96 | 96 | 48 | 48
Core Clock | N/A | N/A | 900MHz | 574MHz
Boost Clock | ~1140MHz | ~1140MHz | N/A | N/A
Memory Clock | 6.6Gbps GDDR5 | 6.6Gbps GDDR5 | 6Gbps GDDR5 | 3Gbps GDDR5
Memory Bus Width | 384-bit | 384-bit | 384-bit | 384-bit
FP64 | 1/32 FP32 | 1/32 FP32 | 1/3 FP32 | 1/2 FP32
TDP | 250W | 250W | 225W | 204W
GPU | GM200 | GM200 | GK110 | GF100
Architecture | Maxwell 2 | Maxwell 2 | Kepler | Fermi
Transistor Count | 8B | 8B | 7.1B | 3B
Manufacturing Process | TSMC 28nm | TSMC 28nm | TSMC 28nm | TSMC 40nm
Launch Date | 03/22/2016 | 03/19/2015 | 07/23/2013 | N/A
Launch Price (MSRP) | $5000 | $5000 | $5000 | $5000
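
For a sense of what the memory rows mean in practice, peak memory bandwidth can be estimated as the per-pin data rate multiplied by the bus width. The following is a minimal sketch using only figures from the table; the function name and the dictionary layout are illustrative assumptions, and the results are theoretical peaks rather than measured throughput.

```python
def peak_bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    data_rate_gbps: per-pin data rate in Gbps (the "Memory Clock" row).
    bus_width_bits: memory bus width in bits (the "Memory Bus Width" row).
    """
    # Each pin moves data_rate_gbps gigabits per second; divide by 8 for bytes.
    return data_rate_gbps * bus_width_bits / 8


# (data rate in Gbps, bus width in bits), taken from the table above.
cards = {
    "M6000 (24GB)": (6.6, 384),
    "M6000 (12GB)": (6.6, 384),
    "K6000": (6.0, 384),
    "6000": (3.0, 384),
}

for name, (rate, width) in cards.items():
    print(f"{name}: {peak_bandwidth_gbs(rate, width):.1f} GB/s")
```

By this estimate both M6000 models land at roughly 317 GB/s, so the refresh changes capacity rather than bandwidth.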

When the original Quadro M6000 was launched, NVIDIA outfitted it with 12GB of VRAM in a 24x4Gb configuration, a large amount of memory for the time but not the full amount a GM200 card could be equipped with. Now this week the company is giving the card a mid-cycle upgrade by increasing its VRAM capacity, replacing the 12GB model with a 24GB model utilizing higher density 8Gb GDDR5 memory chips.
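
The capacity figures follow directly from chip count and chip density, keeping in mind that densities are quoted in gigabits (Gb) while capacities are quoted in gigabytes (GB). Here is a minimal sketch of that arithmetic; the function name is illustrative, and it assumes the refreshed card keeps the same 24-chip layout as the original.

```python
GBIT_PER_GBYTE = 8  # 8 gigabits in one gigabyte


def vram_capacity_gb(chip_count: int, chip_density_gbit: int) -> float:
    """Total VRAM in GB for chip_count chips of chip_density_gbit gigabits each."""
    return chip_count * chip_density_gbit / GBIT_PER_GBYTE


# Original M6000: 24 chips of 4Gb each.
original_m6000 = vram_capacity_gb(24, 4)

# Refreshed M6000: assumed to keep 24 chips, now at 8Gb each.
refreshed_m6000 = vram_capacity_gb(24, 8)

print(f"Original:  {original_m6000:.0f}GB")
print(f"Refreshed: {refreshed_m6000:.0f}GB")
```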

The target market for the 24GB M6000 is relatively straightforward: certain segments of the professional visualization market need all of the VRAM they can get, so for NVIDIA ecosystem users this should be a welcome upgrade. At the same time, since 8Gb GDDR5 has been on the market for some time now, I’m surprised it has taken NVIDIA this long to bring GM200 to its maximum 24GB capacity. Nonetheless, this does give NVIDIA bragging rights for the highest capacity professional graphics card, surpassing the 16GB FirePro W9100, though it’s worth noting that AMD should be able to push that card to 32GB if they want the final bragging rights.

Meanwhile, NVIDIA’s press materials also briefly note that the updated Quadro M6000 ships with some new temperature and clockspeed management options, presumably via newer firmware, though details are limited. The new M6000 features “More discrete GPU clock options for a better customer experience when running their application” and “Greater software temperature control to keep the GPU temperature below the hardware slowdown threshold for the best user experience.” NVIDIA’s professional cards (Quadro & Tesla) feature more performance controls than we see on consumer cards (which just run as fast as they can). From the description, I expect that NVIDIA has added some new, finer-grained options that give better control over automatic throttling by letting users manually set both the maximum clockspeed and the maximum temperature. For single card workstations this is rarely an issue, but for large arrays of cards (e.g. Quadro VCA), keeping all of the cards in lockstep with regard to performance is a desired feature.

Finally, since this is a mid-cycle refresh, the new 24GB Quadro M6000 will be launching this week. It will be a drop-in replacement in NVIDIA’s product stack, and will occupy the previous M6000’s spot at $5000.




  • carnachion - Tuesday, March 22, 2016

    Where are the new Teslas!!
  • ImSpartacus - Tuesday, March 22, 2016

    Apparently not arriving for a while (in meaningful quantities), else this update wouldn't be necessary.
  • damianrobertjones - Tuesday, March 22, 2016

    Does Tesla work for Anandtech?
  • nathanddrews - Tuesday, March 22, 2016

    Love seeing GPUs with tons of RAM, even if I likely won't be using it anytime soon. The Sony quote... 10x performance boost... compared to what?
  • zoxo - Tuesday, March 22, 2016

    If you are limited by memory size, it's reasonable to assume a very large performance gain.
  • yannigr2 - Tuesday, March 22, 2016

    Looking at the Angry Birds movie trailer, I would say they are more "good scenario" limited.
  • Ian Cutress - Tuesday, March 22, 2016

    If it's the difference between finding a memory element in VRAM compared to spinning out to disk/DRAM and doing a PCIe transfer while that warp is idle, then 10x is conservative. Ideally warps with information at hand would jump ahead, but you still end up with some async kernel waiting on data at some point, depending on how the algorithm is run.

    I think the standard taught methodology with CUDA is that you need 24-30 FLOPs per DRAM read/write access to get peak performance. If you have to go outside VRAM for that cache line, then the algorithm had better iterate over its own data to keep spinning and maintain high perf.
  • nathanddrews - Tuesday, March 22, 2016

    That makes sense. Assuming a 4K render path for movies (Sony has been pushing 4K for a while now), I can see them benefiting from the increase. Even in my own amateur experience with doing 4K effects in AE, it consumes all 32GB of my system memory in a heartbeat.
  • kefkiroth - Tuesday, March 22, 2016

    This graphics card has more memory than some phones have storage.
  • Eden-K121D - Tuesday, March 22, 2016

    I saw what you did there *cough* Apple *cough*
