Due to a lack of total vertical integration and heavy focus on IP licensing, one of the more interesting aspects of the SoC development pipeline is that we get to see the architectures and chips developed and announced in a very decoupled fashion. As opposed to the PC industry where there is heavy vertical integration and architectures are usually announced close to (if not at the same time as) the first silicon and even the first finished products, in the SoC space we will frequently see the complete development cadence in public – architectures, chips, and finally finished products. This has its own ups and downs, and while it means we’ll potentially hear about a new architecture long before it’s available in a product, on the other hand we get a lot more visibility into what’s coming down the development pipeline, at least for companies that develop IP for licensing or purchase it for use in their SoCs.

This brings us to the matter of ARM and today’s announcements. ARM is a massive ISA, CPU, and GPU licensor, and its product lineup is the textbook case for early visibility. At a time when the first high-end 20nm SoCs are just now hitting consumer devices in products such as the Samsung Galaxy Note 4 International and Apple iPhone 6, ARM’s development pipeline and product marketing teams are already looking at next year’s products and what processor designs to prepare for them. For ARM’s GPU group in particular, which now functionally operates on a yearly release cadence, the roll-out of Mali 700 equipped devices means that it’s time to announce the designs for next year’s GPUs.

To that end, today ARM is announcing the Mali 800 series. Designed for inclusion in 2015+ SoCs, the Mali 800 series is the immediate successor to the current Mali 700 series. Mali 800 is still based on the same common Midgard architecture as the Mali 600 and Mali 700 series, and as a result there isn’t much to discuss from an architectural standpoint. Midgard, with its unusual all-ILP/no-TLP design, is still a modern GPU architecture that supports up to OpenGL ES 3.1, the Android Extension Pack, and Direct3D 11.1, so ARM has little reason to change it.

Consequently while still based on Midgard, the 800 series is a successive round of optimization for the Midgard designs. For this generation ARM has focused on further improving Midgard’s area and power efficiency while adding a handful of new features not found in the existing Mali 700 series. This refresh spans from ARM’s highest-end designs to lowest-end designs, and coupled with Mali’s multi-core scalability spans the SoC GPU market from top to bottom.


The Midgard Shader Core

ARM Mali 800 Series

| | T860 | T830 | T820 |
|---|---|---|---|
| Core Configurations | 1-16 | 1-4 | 1-4 |
| ALU Pipes Per Core | 2 | 2 | 1 |
| Texture Units Per Core | 1 | 1 | 1 |
| FP32 FLOPs Per Clock (MAD) | 20 - 320 | 20 - 80 | 10 - 40 |
| OpenGL ES | ES 3.1 + AEP | ES 3.1 + AEP | ES 3.1 + AEP |
| Direct3D | FL 11_1 | FL 9_3 | FL 9_3 |
| 10-Bit YUV | Yes | Optional | Optional |

 

Mali-T860

We’ll start off with ARM’s new high-end Mali design, the Mali-T860. Since ARM reformed their product naming with the Mali 700 series, the lineup is much easier to follow, and as given away by the T860’s name, it’s the immediate successor to the T760.

Like T760 before it, T860 is ARM’s most feature packed and most powerful Mali design. The underlying design uses the more common Midgard configuration of 2 arithmetic pipelines per core, with the overall design scaling up to 16 cores. Coupled with the single texture unit per core, the throughput of a T860 design can scale from 20 FLOPs (10 MADs) and 1 texel per clock up to 320 FLOPs and 16 texels per clock. As a result this design can be scaled up and down as needed to cover both phones and tablets just by varying the number of cores.
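To make the scaling arithmetic concrete, here is a minimal sketch of how the per-clock figures fall out of the core count. The 10 FLOPs per pipeline per clock constant is derived from the numbers above (20 FLOPs per 2-pipeline core); the function and constant names are our own illustration, not anything ARM publishes.

```python
# Derived from the article's figures: a 2-pipe core does 20 FLOPs (10 MADs) per clock,
# so each arithmetic pipeline contributes 10 FLOPs per clock. Names are hypothetical.
FLOPS_PER_PIPE_PER_CLOCK = 10


def per_clock_throughput(cores, pipes_per_core=2, texture_units_per_core=1):
    """Return (FP32 FLOPs per clock, texels per clock) for a Midgard-style design."""
    if cores < 1:
        raise ValueError("a design needs at least one shader core")
    flops = cores * pipes_per_core * FLOPS_PER_PIPE_PER_CLOCK
    texels = cores * texture_units_per_core
    return flops, texels


# Walk a few T860 configurations between the 1-core and 16-core extremes.
for cores in (1, 4, 8, 16):
    flops, texels = per_clock_throughput(cores)
    print(f"T860 MP{cores}: {flops} FLOPs/clock, {texels} texels/clock")
```

Note that these are per-clock peaks only. Actual throughput depends on the clock speed a given SoC vendor ships at, which ARM's announcement does not specify.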

From a functionality standpoint, T860 will be the only Mali 800 part to support the 800 series’ fullest feature set. Specifically, support for Direct3D feature level 11_1 is limited to T860. Otherwise common to the entire 800 series, T860 also supports OpenGL ES 3.1, the Android Extension Pack, and OpenCL 1.2.

Meanwhile one notable feature addition for the Mali 800 family is support for native (and full speed) 10-bit YUV input and output. At the moment this feature addition is going to be of limited value, but HEVC is expected to make significant use of 10-bit YUV, so adding support here is laying the groundwork for HEVC in future products, and for that matter will go hand-in-hand with ARM’s new video processing block and display controller block, which are also being announced today.

Elsewhere from a performance standpoint ARM is offering the usual high level performance estimates. However it should be noted that these are compared to the two-generation old T628, and there aren’t similar numbers to work from for T760. In any case, compared to T628 ARM expects an equal configuration T860 to be some 45% more energy efficient on the same process node. And seeing as how mobile performance gains are almost entirely an exercise in energy efficiency, this would represent a very significant increase in energy efficiency (and ultimately sustainable performance) for their designs.
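As a rough illustration of what a 45% efficiency gain could mean in practice, the sketch below assumes that "energy efficient" means performance per watt, which is our reading rather than something ARM has spelled out. Under that assumption the same gain can be spent either as more performance at the same power or as less power at the same performance. The function names are hypothetical.

```python
def relative_performance_at_same_power(efficiency_gain):
    """Performance relative to the baseline if power is held constant."""
    return 1.0 + efficiency_gain


def relative_power_at_same_performance(efficiency_gain):
    """Power relative to the baseline if performance is held constant."""
    return 1.0 / (1.0 + efficiency_gain)


gain = 0.45  # ARM's T860 vs. T628 estimate, same configuration and process node
print(f"Same power:       {relative_performance_at_same_power(gain):.2f}x performance")
print(f"Same performance: {relative_power_at_same_performance(gain):.2f}x power")
```

In other words, under this reading a 45% efficiency gain works out to roughly a 31% power reduction at equal performance, not a 45% reduction.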


ARM Frame Buffer Compression - From Our Look at Midgard Earlier This Year

That said, by making a two-generation-old comparison ARM also gets to roll up the benefits of their AFBC frame buffer compression technology, which was first introduced on the Mali 700 series. AFBC is something the company is banking on heavily due to its high bandwidth savings, and ARM considers it one of their greatest feature advantages for the 800 series as well as the 700 series.

Finally, something to also keep in mind though is that while ARM’s same-node comparison is the fairest way to look at architectural efficiency, you’re highly unlikely to see T628 manufactured on 20nm+ processes. So on top of the architectural gains over the years, the real world performance gains for T860 should be better still due to the newer process node.

Mali-T830 & Mali-T820

Also being announced today alongside the T860 and rounding out the new Mali 800 family are the T830 and T820. These parts are best described as ARM’s low-end and mainstream designs, and are the successors to the T720. With Mali T860 essentially scaling down to cover most of the mid-range, the T820 and T830 are intended to be lower performance, lower power consuming parts that are optimized around power and die size needs.

For the Mali 800 series ARM is bifurcating the T720’s market a bit to offer different blends of die size and performance. The T720’s immediate successor is the T820, and like its predecessor it is a single arithmetic pipeline design that is focused first and foremost on die size. Meanwhile new to the 800 series, though still a successor of sorts to the T720, is the Mali-T830. This is a more powerful design that, while still focused on die size efficiency, brings the number of pipelines to two per core (like T860), offering better performance in exchange for a slightly larger die size.

Other than the difference in the number of pipelines, the T820 and T830 designs are extremely similar. Both of them can be scaled up to 4 cores, allowing for some performance scaling. This puts the throughput of T820 designs at a range of 10 FLOPs and 1 texel per clock to 40 FLOPs and 4 texels per clock, while T830 will scale from 20/1 to 80/4 respectively.
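The two ranges quoted above can be reproduced from just the pipeline count and the 4-core ceiling. The sketch below is a small lookup built from the article's figures; the dictionary layout and helper name are our own, and the 10 FLOPs per pipeline per clock figure is inferred from the single-pipeline T820's 10 FLOPs per core.

```python
# Per-design parameters taken from the article; the structure itself is illustrative.
FLOPS_PER_PIPE_PER_CLOCK = 10
DESIGNS = {
    "T820": {"pipes_per_core": 1, "texture_units_per_core": 1, "max_cores": 4},
    "T830": {"pipes_per_core": 2, "texture_units_per_core": 1, "max_cores": 4},
}


def throughput_range(name):
    """Return ((min FLOPs, min texels), (max FLOPs, max texels)) per clock."""
    design = DESIGNS[name]
    flops_per_core = design["pipes_per_core"] * FLOPS_PER_PIPE_PER_CLOCK
    texels_per_core = design["texture_units_per_core"]
    smallest = (flops_per_core, texels_per_core)  # single-core configuration
    largest = (flops_per_core * design["max_cores"],
               texels_per_core * design["max_cores"])
    return smallest, largest


for name in DESIGNS:
    (lo_flops, lo_texels), (hi_flops, hi_texels) = throughput_range(name)
    print(f"{name}: {lo_flops}/{lo_texels} to {hi_flops}/{hi_texels} (FLOPs/texels per clock)")
```

One thing this makes visible is that the T830 doubles arithmetic throughput over the T820 at the same core count while texel throughput stays the same, since both designs have one texture unit per core.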

From a feature standpoint, as previously mentioned only T860 gets the fullest Mali feature set while the other Mali 800 parts will be a bit more modest. T830 and T820 only support Direct3D up to feature level 9_3, while for the more mobile-centric world they will be on par with the T860 and support OpenGL ES 3.1 and the Android Extension Pack. Meanwhile 10-bit YUV support is present here, however for T820 and T830 it is an optional feature that will depend on which specific version of the core is licensed, so we will likely see a mix of retail products that do and do not include it.

Finally from a performance standpoint ARM is once again offering some high level guidance compared to the Mali 600 series, specifically the Mali-T622 in this case. Depending on the 800 design used, ARM tells us that performance should be up to 55% better or area efficiency will be 50% better. Presumably the area efficiency comparison is for T820 while the performance comparison is for T830.

Closing Thoughts

Launching alongside the new Mali GPUs today are a series of updates for the rest of ARM’s graphics stack, which will see the Mali-V video block and Mali-DP display controllers updated as well. Along with the general strength of the Mali GPUs, expect to see ARM focus heavily on the synergy between these parts, including their common support for AFBC and of course the benefits of having all graphics components developed together. We’re covering these in another article, but we wanted to quickly point out where the Mali GPUs fit in the bigger picture of ARM’s announcements today.

Finally, while ARM doesn't have complete control over consumer devices (since they only sell designs to chipmakers), they are providing a rough estimate of when to expect Mali 800 GPUs to begin appearing in devices. According to ARM we should expect to see Mali 800 devices starting in late 2015, or roughly a year from now. This is consistent with the Mali 700 series, which was announced almost a year ago to the day and has only very recently started to show up in consumer devices. To that end we would expect to start seeing Mali 800 SoC designs announced in the first half of next year, with consumer devices to follow as per ARM's timetable.


28 Comments


  • knightspawn1138 - Tuesday, October 28, 2014 - link

    I keep wondering when or if Intel will decide to license some of these GPU architectures and start integrating them into their Core or Atom lineups. I imagine that Intel doesn't "need" to pay someone else for a graphics solution since they already build their own (and the HD series GPU cores are decent, but still way behind AMD and nVidia). It would be nice to see Intel tack on a GPU that is actively trying to compete with other big dogs in the GPU space.
  • DanNeely - Tuesday, October 28, 2014 - link

    They used PowerVR in a few atoms. They never worked well; powerVRs drivers were worse for gaming than Intels on Windows. It was even worse on Linux because PowerVR only dropped a single binary driver for whatever kernel was current when the chip launched and then refused to fix any of the major bugs or even just recompile for newer kernels. After that experience I doubt Intel is going to go back to using a 3rd party GPU again.
  • knightspawn1138 - Tuesday, October 28, 2014 - link

    I do have one of those cursed Atoms with the Poulsbo video chip (Acer Aspire One netbook). You're right, that was a nightmare - horrendous performance, no driver support, I had to work a few hours of dark magic to get Ubuntu or Mint to display to the screen correctly. Still, I would love to see Intel develop a chip with a GPU that comes close to being in the same class as the nVidia or AMD graphic subsystems.
  • frenchy_2001 - Tuesday, October 28, 2014 - link

    I would love for Intel to bite the bullet and license Nvidia's graphic IP, either Kepler (a good start) or even Maxwell. Nvidia understood quite a while ago that most of the magic happened in the drivers and have dedicated huge resources to it. An intel processor with integrated nvidia video would be quite nice, particularly on Atom. HTPC or media stick in a single SoC...
  • lilmoe - Tuesday, October 28, 2014 - link

    Or pull an AMD and buy NVidia all together.

    I agree though, a Broadwell/Skylake CPU with a Maxwell/Pascal integrated GPU on Intel's 14nm/10nm process would be really amazing in both performance and power efficiency for mainstream devices. Add stacked DRAM to that with some unified memory goodness and it'll give mid-range dedicated GPUs a run for their money (probably even some of the high end).
  • ant6n - Wednesday, October 29, 2014 - link

    So if a T860 has 320 flops per clock, and if they run at say 1GHZ, that gives 320 gflops. That would be similar to a Geforce 820M (276.1-366.3 gflops according to wikipedia).

    I guess it's gonna be a while before we can have Crysis-like games on a tablet/chromebook.
  • vFunct - Thursday, October 30, 2014 - link

    If that's supposed to be their high-end part, it already seems outdated for something that should arrive in devices in about 3 years.

    Apple's A8X in the iPad is good for about 200-300GFlops, and the Tegra K1 is at 350+ Gflops.. and both of those are out right now.
  • kron123456789 - Sunday, November 2, 2014 - link

    But, 820M is the same GPU as 720M and 620M and it's 4 year old Fermi.
