A surprise at this year’s IFA is the previously unannounced Kirin 970 SoC hitting the show floor. Normally Huawei announces a new SoC with plenty of press details, and we were expecting perhaps some musings towards what is next from Huawei (it’s usually around this time of year), but this time they pushed it through to the show floor without any pomp and show (or any notice). Cue my surprise when I saw it…

The headline that Huawei seems to want to promote is the addition of dedicated neural network silicon inside the Kirin 970, dubbed the Neural Processing Unit (NPU). The sticker performance of the NPU is rated at 1.92 TFLOPs of FP16, which for reference, is about 3x what the Kirin 960's GPU alone can do on paper (~0.6 TFLOPs FP16). Or to put this in practical terms, Huawei says that the NPU is capable of discerning 2005 images per minute from internal testing, compared to 97 images per minute without the NPU – and presumably on the CPU – using the Kirin Thundersoft software (likely a future brand name). Obviously, depending on the implementation and power use, I would expect Huawei to try and leverage the NPU as much as possible in upcoming designs.
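As a quick sanity check on those sticker numbers, the ratios can be worked out directly. This is a minimal sketch that uses only the figures quoted above; the variable names are illustrative, and the ~0.6 TFLOPs GPU figure is the approximate on-paper value, so treat the results as rough.

```python
# Vendor sticker figures quoted in the text above (not independent measurements).
NPU_TFLOPS_FP16 = 1.92
KIRIN_960_GPU_TFLOPS_FP16 = 0.6      # approximate on-paper figure
IMAGES_PER_MIN_WITH_NPU = 2005
IMAGES_PER_MIN_WITHOUT_NPU = 97      # presumably running on the CPU

# Ratio of NPU peak FP16 throughput to the Kirin 960 GPU's peak.
throughput_ratio = NPU_TFLOPS_FP16 / KIRIN_960_GPU_TFLOPS_FP16

# Ratio of image recognition rates with and without the NPU.
image_ratio = IMAGES_PER_MIN_WITH_NPU / IMAGES_PER_MIN_WITHOUT_NPU

print(f"NPU vs Kirin 960 GPU (FP16, on paper): {throughput_ratio:.1f}x")
print(f"Image recognition with vs without NPU: {image_ratio:.1f}x")
```

On these numbers the on-paper throughput advantage is roughly 3.2x and the image recognition advantage is roughly 20.7x.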

HiSilicon High-End Kirin SoC Lineup

| SoC | Kirin 970 | Kirin 960 | Kirin 950/955 |
|---|---|---|---|
| CPU | 4x A73 @ 2.40 GHz, 4x A53 @ 1.80 GHz | 4x A73 @ 2.36GHz, 4x A53 @ 1.84GHz | 4x A72 @ 2.30/2.52GHz, 4x A53 @ 1.81GHz |
| GPU | ARM Mali-G72MP12, ? MHz | ARM Mali-G71MP8, 1037MHz | ARM Mali-T880MP4, 900MHz |
| LPDDR4 Memory | ? | 2x 32-bit LPDDR4 @ 1866MHz, 29.9GB/s | 2x 32-bit LPDDR4 @ 1333MHz, 21.3GB/s |
| Interconnect | ? | ARM CCI-550 | ARM CCI-400 |
| Storage | ? | UFS 2.1 | eMMC 5.0 |
| ISP/Camera | Dual ISP | Dual 14-bit ISP (Improved) | Dual 14-bit ISP, 940MP/s |
| Encode/Decode | 2160p60 Decode; 2160p30 Encode | 2160p30 HEVC & H.264 Decode & Encode; 2160p60 HEVC Decode | 1080p H.264 Decode & Encode; 2160p30 HEVC Decode |
| Integrated Modem | Kirin 970 Integrated LTE (Category 18); DL = 1200 Mbps, 4x20MHz CA, 128-QAM | Kirin 960 Integrated LTE (Category 12/13); DL = 600Mbps, 4x20MHz CA, 64-QAM; UL = 150Mbps, 2x20MHz CA, 64-QAM | Balong Integrated LTE (Category 6); DL = 300Mbps, 2x20MHz CA, 64-QAM; UL = 50Mbps, 1x20MHz CA, 16-QAM |
| Sensor Hub | ? | i6 | i5 |
| NPU | Yes | No | No |
| Mfc. Process | TSMC 10nm | TSMC 16nm FFC | TSMC 16nm FF+ |
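
The memory bandwidth figures listed for the Kirin 960 and Kirin 950/955 follow from the bus width and the memory clock. The sketch below assumes the listed MHz value is the memory clock and that LPDDR4 transfers data twice per clock (double data rate); the function name is illustrative. The Kirin 970's memory configuration was not disclosed, so it is left out.

```python
def peak_bandwidth_gbs(channels: int, bits_per_channel: int, clock_mhz: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes).

    Assumes double data rate: two transfers per clock cycle.
    """
    bus_bytes = channels * bits_per_channel / 8      # bytes moved per transfer
    transfers_per_second = clock_mhz * 1e6 * 2       # DDR: two transfers per clock
    return bus_bytes * transfers_per_second / 1e9

kirin_960 = peak_bandwidth_gbs(channels=2, bits_per_channel=32, clock_mhz=1866)
kirin_950 = peak_bandwidth_gbs(channels=2, bits_per_channel=32, clock_mhz=1333)

print(f"Kirin 960:     {kirin_960:.1f} GB/s")
print(f"Kirin 950/955: {kirin_950:.1f} GB/s")
```

Under those assumptions the results round to the 29.9GB/s and 21.3GB/s listed in the table.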

Other details for the Kirin 970 show improvements over the Kirin 960. First is the move to TSMC’s 10nm process, from 16nm FFC. The Kirin 960 launched a few months before other high-end smartphone SoCs built on 10nm hit the shelves, so Huawei is matching its competitors here. The core configuration is the same as the 960, with four ARM Cortex A73 cores and four ARM Cortex A53 cores, this time clocked at 2.4 GHz and 1.8 GHz respectively. The integrated graphics is the newest Mali G72, announced alongside the A75/A55 processors earlier this year, in an MP12 configuration. The GPU frequency was not listed.

Other sticker features include a dual ISP for motion detection and low light enhancement, support for HDR10 with 4K60 decoding, 4K30 encoding, and an LTE Category 18 modem, which Huawei states is good for 1.2 Gbps download. I’d assume that this is 4x carrier aggregation with 128-QAM. The Kirin 970 will also ship with an embedded Security Engine, supporting TEE and inSE.
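
For a rough feel of what a 1.2 Gbps download rate requires, peak LTE throughput can be estimated by counting spatial streams (carriers times MIMO layers). The sketch below rests on rules of thumb that are my assumptions, not Huawei figures: one stream on one 20 MHz carrier carries about 75 Mbps at 64-QAM, and the rate scales with bits per symbol. The MIMO layer counts are also assumed, since Huawei did not state them.

```python
# Rule-of-thumb assumption: ~75 Mbps per spatial stream per 20 MHz carrier at 64-QAM.
BASE_MBPS_64QAM = 75.0
BITS_PER_SYMBOL = {"64-QAM": 6, "256-QAM": 8}

def per_stream_mbps(modulation: str) -> float:
    """Approximate peak rate of one spatial stream on one 20 MHz carrier."""
    return BASE_MBPS_64QAM * BITS_PER_SYMBOL[modulation] / BITS_PER_SYMBOL["64-QAM"]

def peak_downlink_mbps(streams: int, modulation: str) -> float:
    """Approximate peak downlink rate for a given total number of spatial streams."""
    return streams * per_stream_mbps(modulation)

# Stream counts below assume 2x2 MIMO per carrier unless noted (hypothetical configs).
cat6 = peak_downlink_mbps(streams=2 * 2, modulation="64-QAM")      # 2 carriers x 2 layers
cat12 = peak_downlink_mbps(streams=4 * 2, modulation="64-QAM")     # 4 carriers x 2 layers
twelve_streams = peak_downlink_mbps(streams=12, modulation="256-QAM")

print(cat6, cat12, twelve_streams)
```

With these approximations, 4 and 8 streams at 64-QAM land on the 300 Mbps and 600 Mbps listed for the older modems, while 12 streams at 256-QAM is one way to reach 1200 Mbps.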

Huawei’s final declarations on the NPU state that it offers 25x the performance of a CPU with 50x the energy efficiency, and that it uses a new HiAI (HiSilicon AI) nomenclature.
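
Taken together, the two claims imply something about power draw. The sketch below assumes that "energy efficiency" means performance per watt and that both multipliers refer to the same workload; Huawei did not specify either, so this is an illustration only.

```python
# Huawei's claimed multipliers for the NPU relative to the CPU.
PERFORMANCE_RATIO = 25.0     # work per second, NPU / CPU
EFFICIENCY_RATIO = 50.0      # assumed to mean work per joule, NPU / CPU

# power = performance / efficiency, so the implied power ratio is:
implied_power_ratio = PERFORMANCE_RATIO / EFFICIENCY_RATIO

# Energy needed for one fixed unit of work is the inverse of efficiency.
implied_energy_per_task_ratio = 1.0 / EFFICIENCY_RATIO

print(f"Implied NPU power vs CPU:      {implied_power_ratio:.2f}x")
print(f"Implied energy per task vs CPU: {implied_energy_per_task_ratio:.2f}x")
```

If the assumptions hold, the NPU would run at about half the CPU's power while doing the work, and would spend about 2% of the energy per task.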

Huawei’s CEO, Richard Yu, has a keynote later this week and we also have some meetings with Huawei. I’m going to probe for details. The only smartphones with Kirin 970 on the show floor were generic models hooked up to development boards. Any devices coming to market (such as a Mate 10) will be a few weeks away, judging by launches in previous years.

11 Comments

  • jjj - Friday, September 01, 2017

    The NPU should be in collaboration with Cambricon Technologies.
  • VeixES - Friday, September 01, 2017

    That 1200Mbps LTE Cat18 is probably not from 4CA with DL-128QAM.
    More likely, max performance is achieved with 4CA where 2 of the carriers use 4x4 MIMO (vs 2x2 MIMO on the other 2) and DL-256QAM, or with 3CA where all carriers use 4x4 MIMO with DL-256QAM. That makes 12 data streams total with either option.
    The upcoming Qualcomm X20 modem will also be capable of 12 streams.
  • levizx - Friday, September 01, 2017

    Considering 128QAM is not defined in 3GPP Rel.13, I'd say the chart is most definitely wrong. You are probably right about the 12-stream 256QAM, but it could also be some sort of 5x20MHz combination or 2x10+4x20.
  • peevee - Friday, September 01, 2017

    No A75/A55 - not interested.
  • Santoval - Friday, September 01, 2017

    Still too early, it was just announced. Expect the first shipments in Q4, most likely December '17. Q1 2018 for reasonably high volume.
  • levizx - Friday, September 01, 2017

    You won't see A75/A55 until next year. And there won't be any material difference since, by ARM's own admission, there won't be any efficiency gain going from A73 to A75.
    So long as SoCs are still thermal/power limited, there's no point in upgrading if it negatively affects time-to-market.
  • peevee - Wednesday, September 06, 2017

    I am more interested in performance gains from A55. After all, these are all thermally limited environments. Speculative OO just wastes power.
  • Tigran - Friday, September 01, 2017

    Can't open pics in the gallery.
  • Ryan Smith - Friday, September 01, 2017

    Thanks! Fixed.
  • Tigran - Friday, September 01, 2017

    Thanks, it's OK now.
