Micron Technology this week confirmed that it had begun mass production of GDDR5X memory. As revealed last week, the first graphics card to use the new type of graphics DRAM will be NVIDIA’s upcoming GeForce GTX 1080 graphics adapter powered by the company’s new high-performance GPU based on its Pascal architecture.

Micron’s first production GDDR5X chips (or G5X, as NVIDIA calls them) will operate at 10 Gbps and will enable memory bandwidth of up to 320 GB/s for the GeForce GTX 1080. That is only a little less than the 336 GB/s of NVIDIA’s current-generation flagship GeForce GTX Titan X/980 Ti, despite those cards' much wider 384-bit memory bus. NVIDIA’s GeForce GTX 1080 video cards are expected to hit the market on May 27, 2016, and presumably Micron has been helping NVIDIA stockpile memory chips for the launch for some time now.

NVIDIA GPU Specification Comparison

| | GTX 1080 | GTX 1070 | GTX 980 Ti | GTX 980 | GTX 780 |
|---|---|---|---|---|---|
| TFLOPs (FMA) | 9 TFLOPs | 6.5 TFLOPs | 5.6 TFLOPs | 5 TFLOPs | 4.1 TFLOPs |
| Memory Clock | 10Gbps GDDR5X | GDDR5 | 7Gbps GDDR5 | 7Gbps GDDR5 | 6Gbps GDDR5 |
| Memory Bus Width | 256-bit | ? | 384-bit | 256-bit | 384-bit |
| VRAM | 8 GB | 8 GB | 6 GB | 4 GB | 3 GB |
| VRAM Bandwidth | 320 GB/s | ? | 336 GB/s | 224 GB/s | 288 GB/s |
| Est. VRAM Power Consumption | ~20 W | ? | ~31.5 W | ~20 W | ? |
| TDP | 180 W | ? | 250 W | 165 W | 250 W |
| GPU | "GP104" | "GP104" | GM200 | GM204 | GK110 |
| Manufacturing Process | TSMC 16nm | TSMC 16nm | TSMC 28nm | TSMC 28nm | TSMC 28nm |
| Launch Date | 05/27/2016 | 06/10/2016 | 05/31/2015 | 09/18/2014 | 05/23/2013 |
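
The VRAM bandwidth figures in the table follow from the per-pin data rate and the bus width. As a rough cross-check, here is a minimal Python sketch; the function name `peak_bandwidth_gbs` is purely illustrative and the input values are the ones listed in the table.

```python
def peak_bandwidth_gbs(pin_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gb/s) times bus width (pins), divided by 8 bits per byte."""
    return pin_rate_gbps * bus_width_bits / 8


# (per-pin data rate in Gb/s, bus width in bits), as listed in the table
cards = {
    "GTX 1080": (10, 256),
    "GTX 980 Ti": (7, 384),
    "GTX 980": (7, 256),
    "GTX 780": (6, 384),
}

for name, (rate, width) in cards.items():
    print(f"{name}: {peak_bandwidth_gbs(rate, width):.0f} GB/s")
```

This is a theoretical peak only; it says nothing about the bandwidth a real workload achieves.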

Earlier this year Micron began to sample GDDR5X chips rated to operate at 10 Gbps, 11 Gbps and 12 Gbps in quad data rate (QDR) mode with 16n prefetch. However, it looks like NVIDIA decided to be conservative and only run the chips at the lowest of those three data rates.
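
To give a feel for what the 16n prefetch implies, the sketch below works out two idealized quantities: how many bytes one chip moves per access, and how often the internal memory array must deliver a prefetch to sustain a given pin rate. The function names are illustrative. The 32-bit chip interface comes from the article, while the 8n prefetch used for the GDDR5 comparison is an assumption taken from general knowledge of GDDR5, not from this article.

```python
def access_granularity_bytes(prefetch_n: int, interface_bits: int) -> int:
    """Bytes one chip transfers per access: prefetch depth times interface width, in bytes."""
    return prefetch_n * interface_bits // 8


def array_rate_ghz(pin_rate_gbps: float, prefetch_n: int) -> float:
    """Idealized rate (GHz) at which the array must supply prefetches to keep the pins busy."""
    return pin_rate_gbps / prefetch_n


# GDDR5X in QDR mode: 16n prefetch, 32-bit interface per chip
print(access_granularity_bytes(16, 32))  # 64
# GDDR5 for comparison: 8n prefetch (assumption, not stated in the article)
print(access_granularity_bytes(8, 32))   # 32

# The three sampled speed grades, all with 16n prefetch
for rate in (10, 11, 12):
    print(f"{rate} Gbps per pin -> array rate {array_rate_ghz(rate, 16)} GHz")
```

Under these assumptions, doubling the prefetch lets the per-pin rate rise without a matching rise in the array's own rate, at the cost of a larger minimum access size.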

As reported, Micron’s first GDDR5X memory ICs (integrated circuits) feature 8 Gb (1 GB) capacity, sport a 32-bit interface, and use 1.35 V supply and I/O voltages as well as a 1.8 V pump voltage (Vpp). The chips come in 190-ball BGA packages measuring 14×10 mm, so they will take up a little less space on graphics cards than GDDR5 ICs.
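
These per-chip figures line up with the GTX 1080's headline specifications. The sketch below is a back-of-the-envelope check; it assumes one chip per 32-bit slice of the 256-bit bus (no two chips sharing a channel), which the article does not state explicitly. All constant names are illustrative.

```python
BUS_WIDTH_BITS = 256       # GTX 1080 memory bus width
CHIP_INTERFACE_BITS = 32   # interface width of one GDDR5X chip
CHIP_CAPACITY_GBIT = 8     # capacity of one chip, in gigabits
PIN_RATE_GBPS = 10         # per-pin data rate, in Gb/s

# Assumption: one chip per 32-bit slice of the bus
chips = BUS_WIDTH_BITS // CHIP_INTERFACE_BITS

total_capacity_gb = chips * CHIP_CAPACITY_GBIT / 8                # gigabits -> gigabytes
per_chip_bandwidth_gbs = CHIP_INTERFACE_BITS * PIN_RATE_GBPS / 8  # Gb/s -> GB/s
total_bandwidth_gbs = chips * per_chip_bandwidth_gbs

print(f"{chips} chips, {total_capacity_gb:.0f} GB total")
print(f"{per_chip_bandwidth_gbs:.0f} GB/s per chip, {total_bandwidth_gbs:.0f} GB/s total")
```

Under that assumption, eight chips give the card's 8 GB of VRAM and 320 GB/s of peak bandwidth.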

The announcement by Micron indicates that the company will be the only supplier of GDDR5X memory for NVIDIA’s GeForce GTX 1080 graphics adapters, at least initially. Just as importantly, GDDR5X is real: it is in mass production now, and it can indeed replace GDDR5 as a cost-efficient solution for gaming graphics cards. How affordable is GDDR5X? It should not be too expensive, particularly as it's designed as an alternative to more complex technologies such as HBM, but this early in the game it's definitely a premium product over tried and true (and widely available) GDDR5.

Source: Micron

59 Comments

  • Schecter1989 - Thursday, May 12, 2016

    Small question: why is it that recently the ratings of memory speed have changed to Gb/s instead of GHz? When I see my memory clock using MSI Afterburner or Precision X they both show 3500 MHz. Which of course with DDR equals out to the 7 GHz we used to advertise memory as. What does the change to Gb/s signify? And if the GDDR5X chip runs at 10 Gb/s, wouldn't that mean that if you convert that 10 Gbps from bits to bytes you would only get a bandwidth of 1.25 GB/s?

    What exactly am I missing here? Lol
  • zepi - Thursday, May 12, 2016

    10 Gbps is unambiguous, whereas 1 GHz can be 1, 2, 4 or even 8 Gbps depending on the modulation used. Maybe that is the reason?
  • SunnyNW - Thursday, May 12, 2016

    I'm not a memory expert, but from what I understand... the 10 Gb/s is the per-pin speed rating of the memory. So with a 256-bit bus (like the GTX 1080 has) that's 256 pins multiplied by 10 Gb/s per pin (1.25 GB/s), giving you the 320 GB/s of memory bandwidth.
  • SunnyNW - Thursday, May 12, 2016

    And to answer your other question, I believe the change from GHz to Gb/s is to make it easier to compare between different memory technologies, i.e. GDDR, HBM, etc.
    Also, these GPUs use 32-bit memory controllers, and as per the article each of these 8 Gb (1 GB) GDDR5X chips has a 32-bit interface, therefore giving you 320 Gb/s of bandwidth per IC, or 40 GB/s. Hope that helps at least a little, and if anyone would like to correct anything I said or add to it, feel free.
  • BurntMyBacon - Friday, May 13, 2016

    I'll just add that there are 8 ICs on the GTX 1080. At 40 GB/s per IC you get a total of 320 GB/s (as advertised).
  • Gigaplex - Thursday, May 12, 2016

    Because they're trying to increase (or at least maintain) bandwidth while lowering clock speeds. Lower clock speeds mean less power usage. It also means it looks slower than older generation hardware when you simply quote the GHz rating.
  • BurntMyBacon - Friday, May 13, 2016

    @Schecter1989: "why is it that recently the ratings of memory speed have changed to Gb/s instead of GHz?"

    You need more information for the GHz rating to mean anything. For instance, I can have a 100 GHz clock rate, but if I can only transmit 1 bit every 100 cycles, then I only have a transfer rate of 1 Gbps. Gbps is more meaningful for transfers. 1 Gbps will give me the same amount of data over the same amount of time whether I use a 100 GHz clock or a 500 MHz clock. Given that GDDR5X can transmit 2 bits per cycle (DDR mode) or 4 bits per cycle (QDR mode) but doesn't change its frequency, it makes a lot of sense to start listing transfer speed in terms of Gbps per pin.

    @Schecter1989: "When I see my memory clock using MSI Afterburner or Precision X they both show 3500 MHz."

    This is the actual clock frequency.

    @Schecter1989: "Which of course with DDR equals out to the 7 GHz we used to advertise memory as."

    This is a misnomer. There is no clock frequency of 7 GHz with DDR. DDR simply transfers 2 bits per cycle, one on each edge (rising edge / falling edge) of the clock. SDR is an acronym invented after the introduction of DDR for the purposes of distinguishing it from the older memory that can only transfer on one of the edges of the clock. Marketing likes to throw the 7 GHz number around because it is far more impressive to show the larger "effective frequency" to explain the doubling of bandwidth given the same bus width than it is to explain how the same frequency and bus width can give you double the bandwidth through a more clever clock scheme. BTW, QDR transfers on both edges and both levels, but the technical details would be better left to a reputable tech site to do a full writeup on than a random post in the comments section.

    @Schecter1989: "if the GDDR5X chip runs at 10 Gb/s, wouldn't that mean that if you convert that 10 Gbps from bits to bytes you would only get a bandwidth of 1.25 GB/s?"

    Correct. You are almost done. This is the pin speed. The bus on the GTX 1080 is 256 bits wide. Multiply that by the bandwidth you show (1.25 GB/s) and you get the advertised 320 GB/s.
  • Schecter1989 - Saturday, May 14, 2016

    Thanks @BurntMyBacon, your response covered my issues. I did get the 320 GB/s when multiplying the pin speed by the actual pin count. I got the same result, I just was unsure if that was the correct method to find your actual full bandwidth potential.

    Ok, so if the old style of cards were being advertised in the GHz range, then what was the faster MHz rating really giving us? Did pin speeds ever change throughout the past 10 years?

    Basically like with the change from the 6xx series to the 7xx. I purchased a GTX 770 back when I first built my machine. Now this card advertised a 7 GHz speed on the memory, and when the GTX 680 arrived it had 6 GHz. So if, as you're saying, the pin speed is the actual measure of performance, does that mean that the so-called +1 GHz in speed really did not do anything for me? If of course their pin speeds were the same?
  • jasonelmore - Thursday, May 12, 2016

    It is starting to look like Pascal is memory bandwidth starved and that any increase in bandwidth over the 10 Gbps rate will increase performance quite a bit. Nvidia is purposely holding back Pascal because they know they are going to be on 16nm for 2-3 years.

    We shall see when HBM finds its way onto Pascal. The P100 is already doing 21 TFLOPS half precision compared to 9 TFLOPS on the 1080.
  • Yojimbo - Thursday, May 12, 2016

    Purposely holding it back? I don't think so. Seems like a silly "conspiracy theory". They have to worry too much about competition for that. Their gross margins would plummet if they spent billions of dollars developing and manufacturing a highly capable chip and then didn't take full advantage of it, pairing it with sub-optimal memory configurations, because their selling prices would be lower.

    I also don't think they'll be at 16nm for 2 to 3 years. The data center segment is becoming more important than the mobile segment. The 20nm process catered to mobile chips. I doubt the foundries will do the same with 10nm because mobile growth has stalled and they won't want to miss out on the opportunity to make data center chips.

    Finally, HBM already has found its way to Pascal in the Tesla P100. I assume you meant consumer Pascal. I doubt HBM will find its way onto a consumer graphics-oriented GPU that has a die size close to that of GP104. GDDR5X still has headroom above its implementation in the 1080. A consumer variant of the GP100 would use HBM2, I assume, but it has a die size of ~600 mm^2 compared to GP104's ~330 mm^2.

    As far as half-precision, does it not also use half the bandwidth (per operation)? If they allow the 1080 to do half-precision then the bandwidth shouldn't be any more of an issue than it is for single precision usage. The P100 has the bandwidth it has because it's beneficial for the compute tasks it's designed to be used for. The 1080 has the bandwidth it does presumably because that bandwidth is well-balanced for graphics workloads, and that shouldn't change if it's a half-precision graphics workload (but where is that going to be used?). Some people may want to use the 1080 for half-precision compute and in that scenario the card may be bandwidth starved in some instances, but if it is it should also be bandwidth starved in similar single precision compute workloads.
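
The clock-versus-data-rate reasoning in the comment thread above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the 256-bit bus width used for the GTX 680 and GTX 770 is an assumption that does not appear in the article or the comments.

```python
def pin_rate_gbps(clock_ghz: float, bits: int, cycles: int = 1) -> float:
    """Per-pin data rate in Gb/s when `bits` bits are sent every `cycles` clock cycles."""
    return clock_ghz * bits / cycles


def bandwidth_gbs(rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s for a given per-pin rate and bus width."""
    return rate_gbps * bus_width_bits / 8


# Two very different clocks that both deliver 1 Gb/s per pin
slow_link = pin_rate_gbps(100.0, bits=1, cycles=100)  # 100 GHz clock, 1 bit every 100 cycles
ddr_link = pin_rate_gbps(0.5, bits=2)                 # 500 MHz clock, 2 bits per cycle (DDR)
print(slow_link, ddr_link)

# A 3500 MHz clock with 2 bits per cycle: the "7 GHz" figure is really 7 Gb/s per pin
gddr5_rate = pin_rate_gbps(3.5, bits=2)
print(gddr5_rate)

# Assumption: GTX 680 and GTX 770 both use a 256-bit memory bus
gtx680 = bandwidth_gbs(6, 256)
gtx770 = bandwidth_gbs(7, 256)
print(gtx680, gtx770)
```

Read this way, the advertised "6 GHz" and "7 GHz" figures are themselves per-pin data rates, so on an equal bus width the higher figure corresponds to proportionally more peak bandwidth.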
