Date: 2020-08-17

AnandTech Live Blog: The newest updates are at the top. This page will auto-update; there's no need to manually refresh your browser.

08:13PM EDT - A100 couldn't just scale up V100 - L2 memory bandwidth wouldn't keep up

08:12PM EDT - Continually streaming data improves utilization

08:11PM EDT - 2x efficiency

08:11PM EDT - 3x L1 bandwidth, 2x in-flight capacity

08:11PM EDT - New load-global-store-shared copy bypassing the register file

08:11PM EDT - A100 uses 32-thread tensor cores to reduce instructions required

08:10PM EDT - Improved speeds and feeds, and efficiency

08:10PM EDT - 6K bytes per clock per SM for sparse

08:10PM EDT - A100 data bandwidth increases based on algorithm requirements

08:09PM EDT - FP32 workloads can now use TF32 ops; supports a 20x improvement for sparse data

08:09PM EDT - Tensor core supports more data types

08:08PM EDT - Fixed-size networks

08:08PM EDT - A100 targeted strong scaling

08:08PM EDT - Each layer is parallelised - A100 is 2.5x V100 for dense FP16

08:07PM EDT - DL strong scaling

08:07PM EDT - Strong scaling

08:06PM EDT - Even wins against unreleased chips

08:06PM EDT - A100 dominates in per-chip performance as well

08:06PM EDT - Records on MLPerf with A100 Pods

08:06PM EDT - IEEE-compliant FP64 for MatMul

08:05PM EDT - Performance uplift against V100

08:05PM EDT - Increased L1, async data movement

08:05PM EDT - More efficient, improves perf with sparsity

08:05PM EDT - Next-Gen Tensor Core

08:04PM EDT - 2x-7x improvements over V100 overall

08:04PM EDT - Elastic GPU, scale out with 3rd Gen NVLink

08:04PM EDT - 1.6 TB/sec HBM2 bandwidth

08:03PM EDT - 6912 CUDA Cores

08:03PM EDT - A100: 54-56B transistors

08:03PM EDT - Jack Choquette from NV

08:02PM EDT - Intel's John Sell, ex-Microsoft, is the chair for the session

08:00PM EDT - Open question if they'll talk about Ampere for environments other than HPC, but this session is also about 'Gaming', so you never know

07:58PM EDT - First talk of the GPU session is from NVIDIA, on the A100 performance and the Ampere architecture