Date: 2020-08-18
AnandTech Live Blog: The newest updates are at the top.

08:05PM EDT - FP19 support

08:04PM EDT - on EW2 stage

08:04PM EDT - Convert data to FP and push down the pipe

08:03PM EDT - Use sliding window to minimize access
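A rough sketch of the sliding-window idea, not the chip's actual datapath: a 1D convolution where a small window buffer keeps the last few inputs, so each input is fetched once instead of once per overlapping output. The function names and the read counter are made up for illustration.

```python
from collections import deque


def conv1d_naive(signal, kernel):
    """Re-fetch every input under the kernel for each output position."""
    k = len(kernel)
    reads = 0
    out = []
    for i in range(len(signal) - k + 1):
        acc = 0
        for j in range(k):
            acc += signal[i + j] * kernel[j]
            reads += 1  # one fetch from "memory" per multiply
        out.append(acc)
    return out, reads


def conv1d_sliding(signal, kernel):
    """Fetch each input once and keep the last k values in a window buffer."""
    k = len(kernel)
    window = deque(maxlen=k)  # oldest value drops out automatically
    reads = 0
    out = []
    for x in signal:
        window.append(x)
        reads += 1  # one fetch per input element, total
        if len(window) == k:
            out.append(sum(w * c for w, c in zip(window, kernel)))
    return out, reads


signal = [1, 2, 3, 4, 5, 6]
kernel = [1, 0, -1]
naive_out, naive_reads = conv1d_naive(signal, kernel)
sliding_out, sliding_reads = conv1d_sliding(signal, kernel)
```

Both versions produce the same outputs; the naive loop does (N - K + 1) * K fetches while the windowed one does N.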

08:02PM EDT - minimize data movement

08:02PM EDT - data reuse and fused ops
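To illustrate what fusing ops buys in general (a toy sketch, not how this NPU implements it): the unfused version writes two intermediate buffers between steps, while the fused version computes scale, bias, and ReLU in one pass per element. The functions and the scale/bias/ReLU pipeline are assumptions picked for the example.

```python
def unfused(xs, w, b):
    """Three separate passes, each writing a full intermediate buffer."""
    scaled = [x * w for x in xs]          # intermediate buffer 1
    biased = [s + b for s in scaled]      # intermediate buffer 2
    return [max(0.0, v) for v in biased]  # ReLU


def fused(xs, w, b):
    """One pass: each element is loaded once and stored once."""
    return [max(0.0, x * w + b) for x in xs]


xs = [-1.0, 0.5, 2.0]
unfused_out = unfused(xs, 2.0, -0.5)
fused_out = fused(xs, 2.0, -0.5)
```

The results match; the difference is that the fused form never materializes the intermediate buffers, which is where the data movement goes.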

08:02PM EDT - This is the tensor engine throughput

08:02PM EDT - Each core has three engines: Tensor, Pooling, Memory

08:01PM EDT - PCIe 4.0 x16

08:01PM EDT - Command processor above all four cores

08:01PM EDT - 192 MB local memory, distributed shared, no DDR

08:01PM EDT - 4 cores with ring bus

08:00PM EDT - Flexible to support future activation functions

08:00PM EDT - Optimization for GEMM as well

08:00PM EDT - Lots of Alibaba workloads are convolution-related

08:00PM EDT - Goal: a high-throughput, low-latency, power-efficient design

08:00PM EDT - Lots of business on inferencing

07:59PM EDT - Development in early 2018

07:58PM EDT - Former Huawei GPU architect