HyperLane Technology

Another new addition to the A-Series GPU is Imagination's “HyperLane” technology, which promises to vastly expand the flexibility of the architecture in terms of multi-tasking as well as security. Imagination GPUs have had virtualization abilities for some time now, and this has given them an advantage in focus areas such as automotive designs.

The new HyperLane technology is said to be an extension to virtualization, going beyond it in terms of separation of tasks executed by a single GPU.

In your usual rendering flows, there are different kinds of “master” controllers, each handling the dispatching of workloads to the GPU: geometry is handled by the geometry data master, pixel processing and shading by the 3D data master, 2D operations by the 2D data master, and compute workloads by, you guessed it, the compute data master.

In each of these processing flows various blocks of the GPU are active for a given task, while other blocks remain idle.

HyperLane technology is said to enable full task concurrency on the GPU hardware, with multiple data masters able to be active simultaneously and executing work dynamically across the GPU’s hardware resources. In essence, the whole GPU becomes multi-tasking capable, receiving different task submissions from up to 8 sources (hence 8 HyperLanes).

The new feature sounded to me like a hardware-based scheduler for task submissions, although when I brought up this description the Imagination spokespeople were rather dismissive of the simplification, saying that HyperLanes go far deeper into the hardware architecture. For example, each HyperLane can be configured with its own virtual memory space, or arbitrary memory spaces can be shared across HyperLanes.

GPU resources can be split at the block level, with tasks running concurrently on different blocks, or shared in the time domain, with time-slices allotted to each HyperLane. HyperLanes can also be given priorities, for example prioritizing graphics while a background AI task uses whatever resources remain free.
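To make the prioritization idea concrete, here is a minimal sketch of priority-ordered resource splitting. It is only an illustration of the concept, not Imagination's actual mechanism: the `Lane` fields, the abstract "units" of GPU resource, and the `allocate` function are all hypothetical names. Only the limit of 8 HyperLanes comes from the article.

```python
from dataclasses import dataclass

MAX_HYPERLANES = 8  # the article states up to 8 submission sources


@dataclass
class Lane:
    name: str
    priority: int  # hypothetical: a higher value is served first
    demand: int    # hypothetical: abstract GPU "units" requested


def allocate(lanes, total_units):
    """Grant units in priority order; lower-priority lanes get what is left."""
    if len(lanes) > MAX_HYPERLANES:
        raise ValueError("at most 8 HyperLanes")
    remaining = total_units
    grants = {}
    for lane in sorted(lanes, key=lambda l: l.priority, reverse=True):
        granted = min(lane.demand, remaining)
        grants[lane.name] = granted
        remaining -= granted
    return grants


lanes = [
    Lane("background_ai", priority=1, demand=10),
    Lane("graphics", priority=5, demand=12),
]
grants = allocate(lanes, total_units=16)
print(grants)
```

In this toy model, graphics receives its full request and the background AI task is left with the remainder, which mirrors the example given above.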

The security advantages of such a technology also seem advanced, with the company citing use-cases such as isolation for protected content and rights management.

An interesting application of the technology is the synergy it allows between an A-Series GPU and the company’s in-house neural network accelerator IP. It would be able to share AI workloads between the two IP blocks, with the GPU for example handling the more programmable layers of a model while still taking advantage of the NNA’s efficiency for the fixed function fully connected layer processing.
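As a rough illustration of that split, the sketch below assigns each layer of a model to one of the two IP blocks by layer type. It assumes, purely for the example, that only fully connected layers have a fixed-function path on the NNA. The layer names, the `NNA_LAYER_TYPES` set and the `assign_layers` function are hypothetical and do not reflect any Imagination API.

```python
# Hypothetical: layer types assumed to have a fixed-function path on the NNA.
NNA_LAYER_TYPES = {"fully_connected"}


def assign_layers(layers):
    """Map each (name, kind) layer to "NNA" or "GPU" based on its kind."""
    return [
        (name, "NNA" if kind in NNA_LAYER_TYPES else "GPU")
        for name, kind in layers
    ]


# A made-up model mixing programmable (custom) and fully connected layers.
model = [
    ("custom_preprocess", "custom_op"),
    ("fc1", "fully_connected"),
    ("custom_activation", "custom_op"),
    ("fc2", "fully_connected"),
]
plan = assign_layers(model)
for name, target in plan:
    print(f"{name}: {target}")
```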

Three Dozen Other Microarchitectural Improvements

The A-Series comes with numerous other microarchitectural advancements that are said to benefit the GPU IP.

One such existing feature is the integration of a small dedicated CPU (which we understand to be RISC-V based) acting as a firmware processor, handling GPU management tasks that in other architectures might still be handled by drivers on the host system CPU. The firmware processor approach is said to achieve more performant and efficient handling of various housekeeping tasks such as debugging, data logging, GPIO handling and even DVFS algorithms. In contrast, DVFS for Arm Mali GPUs, for example, is still handled by the kernel GPU driver on the host CPUs.

An interesting new development feature that is enabled by profiling the GPU’s hardware counters through the firmware processor is creating tile heatmaps of execution resources used. This seems relatively banal, but isn’t something that’s readily available for software developers and could be extremely useful in terms of quick debugging and optimizations of 3D workloads thanks to a more visual approach.
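A tile heatmap of this kind can be sketched in a few lines. The example below assumes the profiler yields per-tile samples of the form (tile x, tile y, cycles). That sample format, the function names and the ASCII rendering are all assumptions for illustration, not the format of Imagination's tools.

```python
def build_heatmap(samples, tiles_x, tiles_y):
    """Accumulate a hypothetical per-tile cycle counter into a 2D grid."""
    grid = [[0] * tiles_x for _ in range(tiles_y)]
    for x, y, cycles in samples:
        grid[y][x] += cycles
    return grid


def render(grid, ramp=" .:*#"):
    """Draw the grid as text, scaling each tile relative to the busiest one."""
    peak = max(max(row) for row in grid) or 1  # avoid dividing by zero
    return "\n".join(
        "".join(ramp[(value * (len(ramp) - 1)) // peak] for value in row)
        for row in grid
    )


# Made-up samples for a 3x2 tile grid: (tile_x, tile_y, cycles).
samples = [(0, 0, 10), (1, 0, 40), (1, 0, 40), (2, 1, 20), (0, 1, 5)]
grid = build_heatmap(samples, tiles_x=3, tiles_y=2)
print(render(grid))
```

The busiest tile gets the densest character, so a hotspot stands out at a glance without reading raw counter values.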


143 Comments


  • mode_13h - Tuesday, December 3, 2019 - link

    > after over 20 years it's no longer a brand in and of itself

    Only 20 years? Pfft.

    After 55 years Ford's Mustang is still around, and it's now an electric SUV.

    And long after x86 is a thing of the past, you'd better believe Intel will *still* be using the Pentium branding for at least some of their CPUs.
  • Goshi112112 - Tuesday, December 3, 2019 - link

    Good
  • mode_13h - Wednesday, December 4, 2019 - link

    The idea of a super-wide SIMD seems somewhat at odds with tiled-rendering. Unless you can scale up your tile sizes (which might be how they got away with it), it seems that it'd be difficult to pack your 128-lane SIMD with conditionally-coherent threads, if you're also limiting the parallelism with a spatial coherency constraint.
  • lucam - Wednesday, December 4, 2019 - link

    I believe IMG has, also, proposed good solutions in the past. Problem was they never got to market as they never been licensed. We only have seen some low-midrange solution in some MediaTek SoC that never shined and nobody even bothered.
    Now the main question still remains, will IMG be able to license high end solutions to third parties in order to put our hands on?
    Otherwise it still will be another paper show off and nothing more...I am afraid...: 😦
  • mode_13h - Wednesday, December 4, 2019 - link

    This is not a new problem for chip (or IP) companies. The job of a good sales & marketing team is to engage with potential customers and figure out what specs their product would need to have to potentially win their business.

    Of course, whatever the competition & end-user markets do are wildcards you can't control.
  • vladx - Wednesday, December 4, 2019 - link

    I wouldn't consider the Helio P90 as low-midrange, in fact it's close to a Snapdragon 730 performance-wise.
  • lucam - Wednesday, December 4, 2019 - link

    Indeed...but we only have seen this just few months ago in the market and it's not even the Furian version. The only chips have seen around have the g8320 ...come on...they really are low..low...low range. I wished to have seen some 9XT around but it didn't happen and perhaps never will. Now look forward to seeing this new A series....but my doubts still remain...I hope to be wrong..
  • nvmnghia - Saturday, December 7, 2019 - link

    So today's smartphones have these for AI:
    - DSP
    - "neural engine"
    - CPU (is there an instruction/separate die area for this?)
    - GPU
  • mpbello - Monday, December 9, 2019 - link

    Are they going to offer open source drivers for this new series?
  • peevee - Monday, December 9, 2019 - link

    Why do you call wider vectors "thread-level parallelism"? Seems the opposite of the meaning of threads as threads must be able to execute different pieces of code.
