Exploring Thunderbolt 3 eGFX Performance, Feat. PowerColor's Gaming Station & Radeon RX Vega 56 Nano
by Ganesh T S on February 13, 2019 10:00 AM EST
Platform Analysis and Bandwidth Implications
The arrangement of the Thunderbolt 3 controller and any associated bridge chips / hubs inside the eGFX enclosure has a significant bearing on the performance. This is particularly important if the additional ports in the eGFX enclosure (USB / Ethernet / SATA etc.) are used simultaneously with the eGPU. In this section, we first determine the layout of the various ports in order to identify how the bandwidth is shared and which ports bottleneck each other. Following this, we do some tests to determine how the bandwidth sharing works for real-world scenarios.
Gaming Station - Internal Layout
The picture below shows the bus layout of the Intel NUC8i7BEH (Bean Canyon) NUC with the PowerColor Gaming Station eGFX enclosure attached to its Thunderbolt 3 port. The Radeon RX Vega 56 Nano is installed inside the enclosure, and all five USB 3.0 ports are connected to USB 3.0 external storage drives. The Ethernet port is also active.
The hardware report generated by HWiNFO shows that the Gaming Station uses a 2-port version of the Alpine Ridge controller for peripherals. One downstream port supports the x16 slot for the GPU, while the second downstream port acts as a USB 3.1 embedded endpoint. Simply put, this acts as a USB host interface, and PowerColor's design connects a Genesys Logic GL3520 hub chip to it. The hub supports four downstream ports. An ASMedia ASM2115 is connected to the first, and that enables the unadvertised SATA port inside the enclosure. The second port is connected to an ASIX AX88179 USB 3.0 to Gigabit Ethernet adapter, and this enables the LAN port in the enclosure. The third port connects to another GL3520 hub chip, and the remaining four USB 3.0 ports (one in front and the three in the rear) hang off this second hub. The fourth port leads to the top USB 3.0 port in the front panel. This breakdown clearly shows that any bandwidth-sensitive USB peripheral needs to be connected to the top USB 3.0 Type-A port in the front panel in order to avoid bandwidth sharing as much as possible.
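The hub arrangement described above can be written down as a small tree, which makes it easy to count how many hub chips sit between each port and the Thunderbolt controller's USB host interface. This is only a sketch of the layout as reported by HWiNFO; the node labels (including "Front USB (second)" for the front port that is not the top one) are names made up for this illustration.

```python
# Hub tree as described in the text. Keys are the host interface and the
# hub chips; values list what hangs off each downstream port, in port order.
# All labels are illustrative names, not identifiers reported by any tool.
TOPOLOGY = {
    "TB3 USB host": ["GL3520 hub A"],
    "GL3520 hub A": [
        "ASM2115 (SATA)",
        "AX88179 (Gigabit Ethernet)",
        "GL3520 hub B",
        "Front USB (top)",
    ],
    "GL3520 hub B": [
        "Front USB (second)",
        "Rear USB 1",
        "Rear USB 2",
        "Rear USB 3",
    ],
}


def hub_hops(topology, root):
    """Return {endpoint: number of hub chips between it and the root}."""
    hops = {}
    stack = [(child, 0) for child in topology[root]]
    while stack:
        node, hubs_above = stack.pop()
        if node in topology:
            # The node is a hub: everything below it crosses one more hub.
            stack.extend((child, hubs_above + 1) for child in topology[node])
        else:
            hops[node] = hubs_above
    return hops


hops = hub_hops(TOPOLOGY, "TB3 USB host")
for endpoint, count in sorted(hops.items(), key=lambda item: item[1]):
    print(f"{endpoint}: {count} hub(s) upstream")
```

The top front port, the SATA bridge, and the Ethernet adapter each sit behind a single hub, while the other four USB ports sit behind two and share the first hub's third downstream port among themselves.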
Thunderbolt 3 is advertised as providing 40 Gbps of external I/O bandwidth. But, this comes with some fine print. The 40 Gbps capability can only be achieved if we have display outputs being routed out through the Thunderbolt links. Intel's own technical brief clears up the situation.
The Thunderbolt 3 link between an eGFX enclosure and a host system is not meant to carry display traffic, and hence, we are effectively left with a PCIe 3.0 x4 link. Based on the board layout deciphered above, we configured the following setup:
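For reference, the raw capacity of a PCIe 3.0 x4 link follows from the standard PCIe 3.0 parameters (8 GT/s per lane, 128b/130b encoding). The short calculation below is a sketch of that arithmetic only; it ignores packet and protocol overhead, and it says nothing about the Thunderbolt controller's own data-path limit, which this article quotes as 22 Gbps in each direction for data-only traffic.

```python
GT_PER_LANE = 8.0      # PCIe 3.0 signalling rate per lane, in GT/s
ENCODING = 128 / 130   # 128b/130b line-code efficiency
LANES = 4              # width of the link left for the eGPU


def pcie3_link_gbps(lanes=LANES):
    """Per-direction PCIe 3.0 bandwidth in Gbps, before protocol overhead."""
    return GT_PER_LANE * ENCODING * lanes


link = pcie3_link_gbps()
print(f"PCIe 3.0 x{LANES}: {link:.2f} Gbps per direction")
```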
- Run the Radeon RX Vega 56 Nano as the display-driving GPU for the Bean Canyon NUC (Intel NUC8i7BEH)
- Connect a 2TB Samsung Portable SSD T5 to USB Port #1
- Connect a 240GB SanDisk Extreme Portable SSD to USB Port #4
- Set up a fio workload to write sequential data within an 8GB span to the Samsung T5 for a pre-defined time interval
- Set up a fio workload to read sequential data from an 8GB span in the SanDisk Extreme Portable SSD for a pre-defined time interval
- Set up two different GPU workloads:
  - Run one instance of an OpenCL bandwidth test to transfer data from the host DRAM to the GPU VRAM and another instance to transfer data from the GPU VRAM to the host.
  - Run custom 3DMark Time Spy Extreme and Fire Strike workloads with only the graphics tests enabled
Various combinations of the above workloads were simultaneously processed to determine the real-world effects of bandwidth sharing.
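The two fio workloads in the list above could be expressed roughly as follows. The article does not give the actual job parameters, so the block size, the 60-second runtime, and the file paths here are all placeholder assumptions; only the sequential read/write direction and the 8GB span come from the text. The sketch just builds the command lines and prints them rather than running fio.

```python
import shlex


def fio_args(name, rw, filename, span="8g", runtime_s=60, block_size="128k"):
    """Build a fio command line for a time-based sequential workload.

    runtime_s, block_size and filename are illustrative defaults, not the
    values used in the article's testing.
    """
    return [
        "fio",
        f"--name={name}",
        f"--filename={filename}",
        f"--rw={rw}",              # "write" or "read" selects sequential I/O
        f"--size={span}",          # confine the I/O to an 8GB span
        f"--bs={block_size}",
        "--direct=1",              # bypass the OS page cache
        "--time_based",            # keep looping over the span...
        f"--runtime={runtime_s}",  # ...until this many seconds have passed
    ]


# Hypothetical mount points for the two portable SSDs.
write_job = fio_args("t5-seq-write", "write", "/mnt/t5/fio.dat")
read_job = fio_args("extreme-seq-read", "read", "/mnt/extreme/fio.dat")

print(shlex.join(write_job))
print(shlex.join(read_job))
```

Launching both command lines at the same time, alongside the GPU workloads, would correspond to the "All Interfaces Active" style of experiment.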
PowerColor Gaming Station - Bandwidth Sharing Analysis (Gbps)

|Experiment|Read Port #4|Write Port #1|PCIe Device to Host|PCIe Host to Device|
|---|---|---|---|---|
|USB Reads Only|2.498|-|-|-|
|USB Writes Only|-|2.775|-|-|
|USB Traffic Only|2.322|2.251|-|-|
|PCIe Device to Host Only|-|-|14.552|-|
|PCIe Host to Device Only|-|-|-|14.096|
|PCIe Traffic Only|-|-|8.472|8.440|
|All Interfaces Active|2.159|1.574|8.122|8.096|
The above bandwidth numbers show that the best-case numbers of around 10.2 Gbps / 9.6 Gbps for bidirectional bandwidth (USB and PCIe traffic combined, with all interfaces active) are quite far from the 22 Gbps / 22 Gbps capability of Thunderbolt 3 in this type of data-only scenario. Though the above numbers do not reflect the traffic necessary to keep the eGPU active as a display driver, it can be safely said that the USB 3.0 hub layout and the role of the second Alpine Ridge port as a USB 3.0 endpoint prevent the realization of the full potential of Thunderbolt 3 in the Gaming Station. That said, most enclosures in the price range (~USD 300) adopt a similar USB 3.0 hub strategy.
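The per-direction totals can be reproduced from the "All Interfaces Active" row of the table. The sketch below assumes the quoted best-case figures are simply the USB and PCIe numbers added up in each direction (USB reads flow toward the host, USB writes away from it); under that assumption the sums come out at roughly 10.3 Gbps and 9.7 Gbps, in line with the figures cited above.

```python
# "All Interfaces Active" row from the table, in Gbps.
ALL_ACTIVE = {
    "usb_read": 2.159,   # Read Port #4 (drive to host)
    "usb_write": 1.574,  # Write Port #1 (host to drive)
    "pcie_d2h": 8.122,   # PCIe device to host
    "pcie_h2d": 8.096,   # PCIe host to device
}
TB3_DATA_GBPS = 22.0     # data-only Thunderbolt 3 figure quoted in the text


def directional_totals(row):
    """Sum the traffic flowing toward the host and toward the enclosure."""
    to_host = row["usb_read"] + row["pcie_d2h"]
    to_device = row["usb_write"] + row["pcie_h2d"]
    return to_host, to_device


to_host, to_device = directional_totals(ALL_ACTIVE)
for label, total in (("to host", to_host), ("to device", to_device)):
    share = total / TB3_DATA_GBPS * 100
    print(f"{label}: {total:.3f} Gbps ({share:.1f}% of {TB3_DATA_GBPS:.0f} Gbps)")
```

In both directions the combined traffic stays below half of the 22 Gbps figure.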
The artificial test cases in the above sub-section do show that the bandwidth numbers take a slight hit as more players come into the fray. However, the question of whether there is a significant impact on real-world applications is not answered. To address that, we use the second set of GPU workloads. The two graphics tests in the Futuremark 3DMark Time Spy Extreme came in at 20.68 fps and 16.03 fps respectively in the absence of any USB traffic. Meanwhile, when loading traffic on both USB ports, the tests came in at 20.85 fps and 15.97 fps. These numbers are within the scope of variation from run to run even in the absence of USB traffic. Similar behavior was seen for the Fire Strike workload: the two tests scored 90.3 fps and 71.45 fps in the absence of USB traffic, and 89.88 fps and 72.51 fps after USB traffic was turned on.
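To put the frame-rate differences in perspective, the relative change for each graphics test can be computed directly from the figures above. The test labels (GT1, GT2) are shorthand used for this sketch only.

```python
# (fps without USB traffic, fps with USB traffic), taken from the text.
RESULTS = {
    "Time Spy Extreme GT1": (20.68, 20.85),
    "Time Spy Extreme GT2": (16.03, 15.97),
    "Fire Strike GT1": (90.3, 89.88),
    "Fire Strike GT2": (71.45, 72.51),
}


def percent_change(baseline, loaded):
    """Relative change from the no-USB-traffic run, in percent."""
    return (loaded - baseline) / baseline * 100.0


deltas = {name: percent_change(*fps) for name, fps in RESULTS.items()}
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}%")
```

All four deltas are within about 1.5% of the baseline, and they do not all point in the same direction, which is consistent with ordinary run-to-run variation rather than a bandwidth bottleneck.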
The average bandwidth for USB traffic over the duration of the graphics tests was around 2.3 Gbps / 2.3 Gbps, which is almost the same as what we saw for the 'USB traffic only' case in the previous sub-section's table. This doesn't preclude the possibility of occasional hiccups in the data transfer speeds. For most common eGPU workloads, users will rarely see any performance bottlenecking on the USB or eGPU side of matters; in our testing there's been enough bandwidth to keep the USB ports properly fed, even when the GPU is dealing with a gaming workload of its own.