As we covered briefly in our live blog of this morning’s keynote, NVIDIA has publicly updated their roadmap with the announcement of the GPU family that will follow 2014’s Maxwell family. That new family is Volta, named after Alessandro Volta, the physicist credited with the invention of the battery.

At this point we know very little about Volta other than its name and one of its marquee features, but that’s consistent with how NVIDIA has handled roadmap announcements before. NVIDIA has for the last couple of years operated on an N+2 schedule for their public GPU roadmap, so with the launch of Kepler behind them we had been expecting a formal announcement of what was to follow Maxwell.

In any case, Volta’s marquee feature will be stacked DRAM, which places DRAM very close to the GPU by putting it on the same package and connecting it to the GPU using through-silicon vias (TSVs). High-bandwidth, on-package RAM is not new technology, but it is still relatively exotic. In the GPU world the most notable shipping product using it would be the PS Vita, which has 128MB of RAM connected in a wide-IO (but not TSV) manner. Meanwhile NVIDIA competitor Intel will be using a form of embedded DRAM for the highest-performance GT3e iGPU in their forthcoming Haswell generation CPUs.

The advantage of stacked DRAM for a GPU is that its locality brings with it both bandwidth and latency benefits. In terms of bandwidth, the memory bus can be both faster and wider than an external memory bus, depending on how it’s configured. Specifically, the close location of the DRAM to the GPU makes it practical to run a wide bus, while the short traces can allow for higher clockspeeds. Meanwhile the proximity of the two devices means that latency should be a bit lower. Much of the latency lies in the RAM fetching the required cells, but at the clockspeeds GDDR5 already operates at, the memory buses on a GPU are relatively long, so there are some savings to be gained.

NVIDIA is targeting a 1TB/sec bandwidth rate for Volta, which to put things in perspective is over 3x what GeForce GTX Titan currently achieves with its 384bit, 6Gbps/pin memory bus (288GB/sec). This would imply that Volta is shooting for something along the lines of a 1024bit bus operating at 8Gbps/pin, or possibly an even larger 2048bit bus operating at 4Gbps/pin. Volta is still years off, but this at least gives us an idea of what NVIDIA needs to achieve to hit their 1TB/sec target.
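To make the arithmetic behind those figures explicit, here is a minimal sketch of the peak-bandwidth calculation. The function name is ours, and the two Volta configurations are only the speculative bus widths and pin rates floated above, not anything NVIDIA has announced.

```python
def peak_bandwidth_gb_per_s(bus_width_bits: int, rate_gbps_per_pin: float) -> float:
    """Peak memory bandwidth in GB/sec.

    Each pin carries rate_gbps_per_pin gigabits per second, so the whole
    bus carries bus_width_bits * rate gigabits; divide by 8 for gigabytes.
    """
    return bus_width_bits * rate_gbps_per_pin / 8


# GeForce GTX Titan: 384bit bus at 6Gbps/pin (figures from the article).
titan = peak_bandwidth_gb_per_s(384, 6)

# Hypothetical Volta configurations that would land at roughly 1TB/sec.
volta_narrow_fast = peak_bandwidth_gb_per_s(1024, 8)
volta_wide_slow = peak_bandwidth_gb_per_s(2048, 4)

print(f"Titan:            {titan:.0f} GB/sec")
print(f"1024bit @ 8Gbps:  {volta_narrow_fast:.0f} GB/sec")
print(f"2048bit @ 4Gbps:  {volta_wide_slow:.0f} GB/sec")
print(f"Ratio vs. Titan:  {volta_narrow_fast / titan:.2f}x")
```

Both hypothetical configurations come out to the same peak figure, which is why the choice between a narrower, faster bus and a wider, slower one is left open here.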

What will be interesting to see is how NVIDIA handles the capacity issues brought on by on-package RAM. It’s no secret that DRAM is physically rather big, and especially so for GDDR. Moving all of that RAM onto the package seems unlikely, especially when consumer video cards are already pushing 6GB (Titan). For high-end GPUs this may mean NVIDIA is looking at a split RAM configuration, with the on-package RAM acting as a cache or small pool of shared memory, while a much larger pool of slower memory is attached via an external bus.

At this point Volta does not have a date attached to it, unlike Maxwell, which had a 2013 date attached when it was first named. That date of course slipped to 2014, and while it’s never been made clear why, the fact that Kepler slipped from 2011 to 2012 is a reminder that NVIDIA is still tied to TSMC’s production schedule due to their preference to launch new architectures on new nodes. Volta in turn will have some desired node attached to its development, but we don’t know which at this time.

With TSMC shaking up its schedule in an attempt to catch up to Intel on both nodes and technology, the lack of a date ultimately is not surprising, since it’s difficult at best to predict when the appropriate node will be ready 3 years out. On that note, it’s interesting that while NVIDIA has specifically mentioned that FinFET transistors will be used on their Parker SoC, they have not mentioned FinFET for Volta. The question came up at their investor meeting, and while it wasn’t specifically denied, we were also left with no reason to expect Volta to be using FinFET, so make of that what you will.

Meanwhile, in NVIDIA tradition they’ve also thrown out a very rough estimate of Volta’s performance by plotting their GPUs against a chart of FP64 performance per watt. Today Kepler is already at roughly 5.5 GFLOPS/watt for K20X, while Volta is plotted at 24ish. Like the rest of the GPU industry NVIDIA remains power constrained, so at equal TDPs we’d expect roughly four times the performance of K20X, which would put total FP64 performance at around 5 TFLOPS. But again, all of this is very early for a GPU that will not be released for years.
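As a rough check on that estimate, the sketch below scales the chart’s efficiency figures to a fixed power budget. The 235W TDP used for K20X is an assumption taken from NVIDIA’s published spec rather than from the roadmap itself, and the Volta number is only an eyeballed reading of the chart, so treat the result as a ballpark.

```python
# Efficiency figures as read off NVIDIA's roadmap chart (approximate).
K20X_GFLOPS_PER_WATT = 5.5
VOLTA_GFLOPS_PER_WATT = 24.0  # "24ish" on the chart

# Assumption: K20X's 235W TDP, not stated in the roadmap.
ASSUMED_TDP_WATTS = 235

# How much more FP64 work Volta would do per watt.
efficiency_ratio = VOLTA_GFLOPS_PER_WATT / K20X_GFLOPS_PER_WATT

# FP64 throughput at the same power budget, converted from GFLOPS to TFLOPS.
volta_fp64_tflops = VOLTA_GFLOPS_PER_WATT * ASSUMED_TDP_WATTS / 1000

print(f"Efficiency ratio:     {efficiency_ratio:.2f}x")
print(f"Estimated Volta FP64: {volta_fp64_tflops:.2f} TFLOPS at {ASSUMED_TDP_WATTS}W")
```

Given how imprecise a chart reading is, this lands in the same neighborhood as the roughly 4x and roughly 5 TFLOPS figures above rather than pinning down an exact number.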

Finally, while Volta is capturing the majority of the press due to the fact that it’s the newest GPU coming out of NVIDIA, this latest roadmap does also offer a bit more on Maxwell. Maxwell’s marquee feature, as it turns out, is unified virtual memory. CUDA already has a unified virtual address space available, so this would seemingly go beyond that. In practice such a technology is important for devices integrating a GPU and a CPU onto the same package, which is what the AMD-led Heterogeneous System Architecture seeks to exploit. For NVIDIA, their Parker SoC will be based on Maxwell for the GPU and Denver for the CPU, so this looks to be a feature specifically set up for Parker and Parker-like products, where NVIDIA can offer their own CPU integrated with a Maxwell GPU.


  • ImSpartacus - Wednesday, March 20, 2013 - link

    I think Nvidia is licensing the ARM v8 ISA so Denver is there own design, but I'm not certain. Reply
  • Kevin G - Wednesday, March 20, 2013 - link

    Denver being nVidia's own custom CPU design has been known in the rumor circles for awhile, though nVidia has yet to officially confirm.

    One of the odd rumors about Project Denver was that it was nVidia's response to Larrabee but from the ARM side. nVidia was going to tack a wide vector unit onto their own custom ARM cores to utilize them fully programmable shader hardware. Seeing Maxwell and Denver as part of the Parker SoC ends that rumor chain.
    Reply
  • chizow - Sunday, March 24, 2013 - link

    Tesla is already Nvidia's pre-emptive response keeping Larrabee at arm's length for the last few years. Project Denver imo is Nvidia's attempt to enter the server market with 64-bit ARM and their attempt to remove x86 from the equation in these supercomputers. Basically, it would be their other half to the "hetereogenous computing model" they've been evangelizing for years. Since they can't get an x86 license. the smartest move for them is to marginalize x86 which is why you see them pushing Tegra, Android, and Project Denver on the server side. Surprised they haven't been a bigger backer of Windows RT. Reply
  • mayankleoboy1 - Tuesday, March 19, 2013 - link

    At this point, we know more about Volta, than we know about Maxwell. Rather ridiculous. Reply
  • vFunct - Tuesday, March 19, 2013 - link

    You'll also be seeing 3-D stacked-chip technology using through-silicon vias in many other upcoming devices. It was a big trend in EDA software over the last few years, and so will be seeing more of it as designers take advantage of the software that allows it to happen.

    on a side-note, the contrast for the article text is way too low. i actually had to edit the CSS using web inspector to read the article. The text color is way too light..
    Reply
  • MrSpadge - Wednesday, March 20, 2013 - link

    Not sure if they changed anything in-between, but on an IPS screen of normal resolution it looks absolutely fine / normal in FF 19. Reply
  • watersb - Tuesday, March 19, 2013 - link

    Great coverage, very much appreciated!

    Not many details yet, but how does Volta compare with JEDEC-standard Wide I/O? Isn't that in the same time-frame?
    Reply
