One of the more interesting consequences of GPUs being built on TSMC’s 28nm process for an extended period of time is that it has forced both vendors to compensate and compromise in order to have product lines that cover the nearly five-year span. Traditional upgrade cycles got thrown out of the window, and instead we saw a number of refreshes and updates, culminating in both AMD and NVIDIA taking their top GPUs right to the 28nm reticle limit of ~600mm². Such large GPUs have typically been the crossover point between graphics and compute parts, incorporating high-end features such as ECC memory and faster double precision (FP64) compute capabilities. However, for the reticle riders, AMD and NVIDIA went another route, building what are arguably the ultimate graphics GPUs with the highest FP32 performance possible.

I mention this because it puts the GPU vendors into the position of doing unconventional things with their GPUs. Nowhere is this more evident than in the new FirePro card AMD is announcing today. The FirePro S9300 X2 is the latest entry into the FirePro S series lineup, and it marks the first (and possibly only) time we’ll see AMD’s Fiji GPU used to power an HPC-grade compute card. The end result is an interesting product that at times will be wickedly powerful for a 300W card, and at other times will have to cope with the abilities and limitations of a GPU that wasn’t designed for the traditional HPC market.

AMD FirePro S Series Specification Comparison

|                       | FirePro S9300 X2   | FirePro S9170     | FirePro S9150     | FirePro S9000     |
|-----------------------|--------------------|-------------------|-------------------|-------------------|
| Stream Processors     | 2 x 4096           | 2816              | 2816              | 1792              |
| Boost Clock           | 850MHz             | 930MHz            | 900MHz            | 900MHz            |
| Memory Clock          | 1Gbps HBM          | 5Gbps GDDR5       | 5Gbps GDDR5       | 5.5Gbps GDDR5     |
| Memory Bus Width      | 2 x 4096-bit       | 512-bit           | 512-bit           | 384-bit           |
| VRAM                  | 2 x 4GB            | 32GB              | 16GB              | 6GB               |
| FP32                  | 13.9 TFLOPs        | 5.2 TFLOPs        | 5.1 TFLOPs        | 3.2 TFLOPs        |
| FP64                  | 0.8 TFLOPs (1/16)  | 2.6 TFLOPs (1/2)  | 2.5 TFLOPs (1/2)  | 0.8 TFLOPs (1/4)  |
| Transistor Count      | 2 x 8.9B           | 6.2B              | 6.2B              | 4.31B             |
| TDP                   | 300W               | 275W              | 235W              | 225W              |
| Cooling               | Passive            | Passive           | Passive           | Passive           |
| Target Market         | HPC                | HPC               | HPC               | HPC + VDI         |
| Manufacturing Process | TSMC 28nm          | TSMC 28nm         | TSMC 28nm         | TSMC 28nm         |
| Architecture          | GCN 1.2            | GCN 1.1           | GCN 1.1           | GCN 1.0           |
| GPU                   | Fiji               | Hawaii            | Hawaii            | Tahiti            |
| Launch Date           | Q2 2016            | 07/2015           | 08/2014           | 08/2012           |
| Launch Price          | $5999              | $3999             | N/A               | N/A               |

As alluded to by the name, the S9300 X2 is a dual Fiji card, integrating a pair of AMD’s last and most powerful 28nm GPUs. In the interests of delivering a more efficient 300W card, AMD clocks S9300 X2’s GPUs at 850MHz, giving the card a theoretical 13.9 TFLOPs of FP32 compute performance. Meanwhile on the memory side AMD leaves the card’s HBM memory untouched, with each GPU getting 512GB/sec of memory bandwidth, for an aggregate 1TB/sec of bandwidth. Like its graphics counterpart, the Radeon Pro Duo, the S9300 X2 is designed to be the fastest thing available in a single card, at least for the niche where Fiji shines.
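
The headline throughput and bandwidth figures can be reproduced from the spec table with a little arithmetic. The sketch below is only an illustration: the function and constant names are made up for this example, and it assumes the usual convention that each stream processor retires one fused multiply-add (counted as 2 FLOPs) per clock.

```python
# Theoretical peak figures for the FirePro S9300 X2, using spec-table values.
# Assumption: one fused multiply-add per stream processor per clock = 2 FLOPs.

GPUS = 2
STREAM_PROCESSORS_PER_GPU = 4096
BOOST_CLOCK_GHZ = 0.850
FLOPS_PER_SP_PER_CLOCK = 2  # assumed FMA convention

HBM_BUS_WIDTH_BITS = 4096   # per GPU
HBM_DATA_RATE_GBPS = 1.0    # per pin


def peak_fp32_tflops(gpus, sps_per_gpu, clock_ghz,
                     flops_per_clock=FLOPS_PER_SP_PER_CLOCK):
    """Peak FP32 throughput in TFLOPs (GFLOPs divided by 1000)."""
    return gpus * sps_per_gpu * flops_per_clock * clock_ghz / 1000.0


def bandwidth_gb_per_sec(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth for one GPU in GB/sec (bits converted to bytes)."""
    return bus_width_bits * data_rate_gbps / 8.0


fp32 = peak_fp32_tflops(GPUS, STREAM_PROCESSORS_PER_GPU, BOOST_CLOCK_GHZ)
per_gpu_bw = bandwidth_gb_per_sec(HBM_BUS_WIDTH_BITS, HBM_DATA_RATE_GBPS)
aggregate_bw = GPUS * per_gpu_bw

print(f"FP32: {fp32:.1f} TFLOPs")
print(f"Bandwidth: {per_gpu_bw:.0f} GB/sec per GPU, "
      f"{aggregate_bw:.0f} GB/sec aggregate")
```

Under those assumptions the result is roughly 13.9 TFLOPs of FP32 and 512GB/sec per GPU, in line with the figures quoted above; the aggregate 1024GB/sec is the "1TB/sec" number.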

Ever since Fiji made its consumer debut nine months ago, I have been pondering whether AMD would attempt to deploy it in a FirePro card. Fiji is arguably built for graphics first and foremost; its FP64 performance is capped at 1/16th FP32 performance, it lacks ECC memory, and it’s limited to just 4GB of memory per GPU. Given the expectations set by “traditional” HPC cards such as the FirePro S9170 – which offers 4-8x the memory and 3x the FP64 performance – Fiji seemingly can’t stack up. However in building the ultimate graphics GPU, AMD also built the ultimate FP32 compute GPU – one that on paper delivers far more FP32 performance than any other HPC card – and this is where the company will be running with this card.
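
To make the comparison with the S9170 concrete, here is a rough sketch that derives the FP64 and memory ratios from the spec-table values. The dictionary layout and names are invented for this example. Note that applying the 1/16 rate to 13.9 TFLOPs gives about 0.87 TFLOPs, slightly above the 0.8 TFLOPs listed in the table, which appears to be rounded down.

```python
# Spec-table figures; the FP64 rate is a fraction of FP32 throughput.
S9300_X2 = {"fp32_tflops": 13.9, "fp64_rate": 1 / 16,
            "vram_gb_per_gpu": 4, "gpus": 2}
S9170 = {"fp32_tflops": 5.2, "fp64_rate": 1 / 2,
         "vram_gb_per_gpu": 32, "gpus": 1}


def fp64_tflops(card):
    """FP64 throughput implied by the FP32 figure and the FP64 rate."""
    return card["fp32_tflops"] * card["fp64_rate"]


def total_vram_gb(card):
    """Memory across the whole card (not a single unified pool on dual-GPU cards)."""
    return card["vram_gb_per_gpu"] * card["gpus"]


s9300_fp64 = fp64_tflops(S9300_X2)
s9170_fp64 = fp64_tflops(S9170)
fp64_advantage = s9170_fp64 / s9300_fp64

# The S9170's memory advantage, measured against one Fiji GPU and against the card.
vram_vs_one_gpu = S9170["vram_gb_per_gpu"] / S9300_X2["vram_gb_per_gpu"]
vram_vs_card = total_vram_gb(S9170) / total_vram_gb(S9300_X2)

print(f"S9300 X2 FP64: {s9300_fp64:.2f} TFLOPs, S9170 FP64: {s9170_fp64:.2f} TFLOPs")
print(f"S9170 FP64 advantage: {fp64_advantage:.1f}x")
print(f"S9170 memory advantage: {vram_vs_card:.0f}x to {vram_vs_one_gpu:.0f}x")
```

The "4-8x the memory" range depends on whether the S9170's 32GB is compared against the S9300 X2's combined 8GB or against the 4GB that each Fiji GPU can actually address.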

The end result is that the S9300 X2 is an interesting niche product designed for certain market segments that need strong FP32 performance above all else – and, everything else held equal, don’t use massive data sets. It’s a somewhat narrow niche as a result, but one AMD believes they can do very well in given what kind of FP32 performance the S9300 X2 is capable of, especially as NVIDIA doesn’t have an FP32 HPC-focused dual-GPU card of their own.

If you follow the HPC market then the market segments AMD is going after should sound familiar to you. Oil and gas (geosciences) has long been an FP32-centric field – something NVIDIA exploited a few years back as well with the Tesla K10 – and AMD will be chasing after this market with the S9300 X2. AMD will also be trying to push farther into the neural network market, and this is an area where the S9300 X2 may be uniquely suited. Popular GPU neural network implementations don’t use FP32 math; rather, they use even lower precision FP16 math. And though the S9300 X2’s FP16 throughput is merely equal to its FP32 throughput, internally Fiji supports natively storing FP16 data types, which will significantly reduce register pressure on the card, and register pressure is almost always a concern for HPC kernel development.
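
The storage side of the FP16 argument can be seen even on a CPU. The snippet below is a host-side illustration only, using Python's standard `struct` module to pack the same (arbitrarily chosen) values as FP32 and as IEEE half precision. It shows the halved footprint and the precision that is given up; it says nothing about how Fiji actually allocates registers.

```python
import struct

# Arbitrary sample values, chosen only for illustration.
values = [0.1, 1.5, 3.14159, 1024.0]
count = len(values)

# 'f' is IEEE 754 single precision (4 bytes), 'e' is half precision (2 bytes).
fp32_bytes = struct.pack(f"<{count}f", *values)
fp16_bytes = struct.pack(f"<{count}e", *values)

print(f"FP32 storage: {len(fp32_bytes)} bytes")
print(f"FP16 storage: {len(fp16_bytes)} bytes")

# Round-trip through FP16 to see how much precision the smaller format keeps.
fp16_roundtrip = struct.unpack(f"<{count}e", fp16_bytes)
max_rel_error = max(abs(a - b) / abs(a) for a, b in zip(values, fp16_roundtrip))
print(f"Worst relative error after FP16 round trip: {max_rel_error:.2e}")
```

Half the bytes per value is what lets a kernel keep twice as many operands in the same amount of storage, at the cost of roughly three decimal digits of precision.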

AMD will also be looking to exploit the products of their Boltzmann Initiative – now formally called the Radeon Open Compute Platform (ROCm) – which will be near or at production quality by the time the S9300 X2 ships. With AMD’s newest card providing the necessary muscle at the hardware level, the company is looking towards ROCm’s heterogeneous compiler to close the gap with NVIDIA on the software side, with the HIPify tools to further bridge that gap by giving developers the means to port their CUDA applications over to AMD’s platform. AMD has already seen some success with ROCm with the geosciences firm CGG, and they’re hoping to continue this trend as the ROCm platform reaches production quality.
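
To give a feel for what source-level porting involves, here is a deliberately tiny toy that renames a handful of CUDA runtime identifiers to their HIP counterparts. It is not the HIPify tool and makes no claim about how that tool works internally; the function name `toy_hipify`, the mapping table, and the sample snippet are all made up for this illustration, and real ports involve much more than renaming.

```python
import re

# A small, hand-picked subset of CUDA runtime names and their HIP equivalents.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

# Try longer names first so cudaMemcpyHostToDevice is not cut short at cudaMemcpy.
_NAMES = sorted(CUDA_TO_HIP, key=len, reverse=True)
_PATTERN = re.compile(r"\b(" + "|".join(re.escape(n) for n in _NAMES) + r")\b")


def toy_hipify(source: str) -> str:
    """Replace the known CUDA identifiers in `source` with HIP names (toy only)."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(1)], source)


# Hypothetical CUDA host code, used only as input text for the toy.
cuda_snippet = """#include <cuda_runtime.h>
cudaMalloc(&d_x, n * sizeof(float));
cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);
cudaFree(d_x);
"""

hip_snippet = toy_hipify(cuda_snippet)
print(hip_snippet)
```

The takeaway is only that much of the CUDA runtime API has a direct HIP analogue, which is what makes tool-assisted porting plausible in the first place.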

Wrapping things up, when it’s released the S9300 X2 will take its place alongside the rest of AMD’s FirePro S series lineup. Continuing to ship alongside it will be the S9100 series cards, which are based on AMD’s Hawaii GPU and complement the S9300 X2 with traditional HPC-centric features such as ECC memory and high performance FP64. The FirePro S9300 X2 will be shipping this quarter with an MSRP of $5999.

19 Comments

  • nismotigerwvu - Thursday, March 31, 2016

    Honestly, this seems like a really smart move by AMD. They had a card sitting in the stack, found a niche for it and put it out there. The margins are MUCH higher here than as a Radeon Pro Duo and it didn't take much R&D to make it happen. It all comes back to having the right tool for the job. Sure this is a more specialized tool than we typically see, but I'm sure there will be people that will be quite happy to have it. It's not like it is the only card on the market either. If you have massive datasets or primarily need FP64 grunt, then this card won't be all that compelling compared to the rest of the market.
  • zangheiv - Thursday, March 31, 2016

    Perfect GPU for HPC, cloud HW Virtualization as well as large commercial 3D content displays
  • Pork@III - Thursday, March 31, 2016

    Yes for 1/3 of 3D content 4GB is enough.
  • StereoPixel - Thursday, March 31, 2016

    Is AMD HPC Software has FP16 support?
  • MrSpadge - Thursday, March 31, 2016

    OpenCL surely has, that's enough.
  • BillyONeal - Thursday, March 31, 2016

    > Popular GPU neutral network

    Did you mean neural network?
  • Ryan Smith - Thursday, March 31, 2016

    Well wouldn't a *neutral* neural network be a good thing?;-)

    But yes, you are correct. Thanks!
  • SeanJ76 - Monday, April 11, 2016

    AMD is GARBAGE!!
  • MLSCrow - Tuesday, November 22, 2016

    Well Google just invested in these for their Cloud platform and Deep Learning, which apparently rely heavily on FP16/32 performance. This should really boost confidence in AMD. Their stock jumped from $6.5 to about $9.00 as a result. The fact that Vega will be succeeding the chips that are in this S9300x2, are half the size via the 14nm FinFET process (not 1/4 as one might think as 14nm FinFET is 1/2 the size of 28nm Bulk), are more powerful and more efficient, with HBM 2.0, means that AMD could literally put 4x Vega GPU's on a card the same size as the S9300x2.

    The implications of that should speak for themselves.
