Original Link: http://www.anandtech.com/show/5826/mips-technologies-updates-processor-ip-lineup-with-aptiv-series
MIPS Technologies Updates Processor IP Lineup with Aptiv Series
by Ganesh T S on May 10, 2012 8:55 AM EST
ARM has been making waves over the past two years with plenty of processor and graphics IP announcements, but they are not alone in the game. MIPS Technologies, almost as old as ARM itself, also licenses RISC processors. With licensees like Broadcom and Sigma Designs, they have undoubtedly held the upper hand in the home entertainment / set-top-box arena as well as the networking space. However, success in the fast-growing mobile / tablet space has been hard for MIPS to come by, thanks to ARM being well-entrenched in that market.
Today, MIPS is introducing a range of new processor IP cores in the Aptiv lineup, similar to ARM's Cortex. The members of this lineup range from small microcontroller cores to triple dispatch superscalar ones. By introducing a member at each performance level to compete directly with offerings from ARM, MIPS has made its move in the processor IP battle.
MIPS last introduced a new processor IP core back in September 2010, the MIPS 1074K Coherent Processing System. Between September 2010 and now, ARM officially announced the Cortex-A15 (well after TI had announced an SoC based on it) and Cortex-A7. In the preceding year, the Cortex-A5 and the Cortex-M4 had been launched. The Aptiv series from MIPS introduces members which compete against each of these offerings.
Throughout the briefing, MIPS stressed that the standard DMIPS/MHz/core figure is not a reliable benchmark. Instead, they promoted CoreMark, in which their cores performed better than ARM's offerings. CoreMark consists of small, easy-to-understand ANSI C code with a realistic mixture of read/write operations, integer operations, and control operations. CoreMark has a total binary size of no more than 16K using gcc on an x86 machine (this small size makes it more convenient to run using simulation tools). We do agree with MIPS that it could be a better measure of L1 cache and branch prediction performance. Unfortunately, we don't have reliable CoreMark data for the upcoming Cortex-A15, and hence will be using DMIPS/MHz/core as a rough performance comparison metric in the rest of the piece.
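To make the DMIPS/MHz/core metric concrete, here is a minimal sketch of how it translates into a peak throughput figure. The function name and the 100 MHz clock are illustrative assumptions; the 1.25 DMIPS/MHz value is the one listed for the Cortex-M3 / Cortex-M4 in the microcontroller table of this article. The sketch assumes perfect scaling across cores, which real workloads rarely achieve.

```python
def aggregate_dmips(dmips_per_mhz: float, clock_mhz: float, cores: int = 1) -> float:
    """Peak Dhrystone throughput, assuming perfect scaling across cores."""
    return dmips_per_mhz * clock_mhz * cores


# 1.25 DMIPS/MHz is the Cortex-M3 / Cortex-M4 figure from the article's table.
# The 100 MHz clock is an arbitrary, hypothetical operating point.
print(aggregate_dmips(1.25, 100))     # 125.0
print(aggregate_dmips(1.25, 100, 2))  # 250.0
```

Because the metric is normalized per MHz and per core, it says nothing about achievable clock speed or multi-core scaling, which is one reason it is only a rough comparison tool.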
The Aptiv series being launched today consists of three families, the proAptiv, interAptiv and microAptiv. While proAptiv and interAptiv come in multi-core variants (with up to 6 for the former and 4 for the latter), the microAptiv family members are all single core.
The following tables present the various MIPS and ARM processor IP cores available for licensing in order of their performance. Note that multiple generations of processors are presented in the tables. The Cortex-A, R and M series cater to the application processor, real-time processing and microcontroller segments respectively. They are matched field for field by the proAptiv, interAptiv and microAptiv series being launched by MIPS today.
|MIPS and ARM High End IP Cores in Order of Performance|
In terms of processor IP cores catering to real-time applications, where high reliability (such as ECC support for the internal caches) and a low power footprint are also required, the interAptiv family and the Cortex-R series go head to head. MIPS also positions interAptiv family members as alternatives to the Cortex-A5 / A7 / A9. Nevertheless, the target markets for the Cortex-R series and the interAptiv series are similar (wireless baseband and automotive applications such as safety and powertrain control).
|MIPS and ARM Mid-Range IP Cores in Order of Performance|
In the microcontroller class processor IP cores, the microAptiv series is pitted against the Cortex-M series.
|MIPS and ARM Microcontroller Class IP Cores in Order of Performance|
|1.25||Cortex-M3 / Cortex-M4|
In the next few sections, we will look at the architectural details of the newly introduced processors.
The proAptiv family of processors can contain 1 to 6 proAptiv cores, each of which is claimed to occupy about half the area of a standard Cortex-A15 core. This is not implausible, given that some people in the industry feel that ARM's Cortex-A15 implementation takes up too much area for the advertised performance. However, it is likely that the NEON engine is being accounted for in the Cortex-A15 area while the proAptiv figure doesn't take into account the 32-bit SIMD engine (DSP ASE). [ Update: MIPS clarified that the DSP ASE is not a configurable block and is included in the quoted area. The precise area numbers for ARM are estimates only, since ARM has published no concrete specifications for the Cortex-A15. MIPS also attempted to remove the estimated area for NEON, with the desire to achieve as close to an “apples to apples” comparison in area as possible ].
The proAptiv core is a superscalar out-of-order CPU with quad instruction fetch and fused triple dispatch. In the absence of any dependencies, the CPU can issue up to four integer and two floating point operations. Multi-level TLBs and branch target buffers / sophisticated branch prediction aid in getting more than 60% better performance over the previous generation 1074K series. The FPU is dual issue and runs at the same speed as the CPU.
The proAptiv and interAptiv families implement EVA (Extended Virtual Addressing) in order to better utilize the available address space. Similar to the Cortex-A15, the IP includes a coherence manager and an integrated L2 cache controller with ECC support. While Cortex-A15 supports up to 32 cores, the proAptiv family supports up to 6. An interesting aspect of the Coherent Processing System (CPS) is the presence of a cluster power controller which does clock gating per core (common in other multi-core CPUs also) and voltage domain / gating per core. The latter has interesting applications in scenarios similar to ARM's big.LITTLE architecture. Instead of pairing a big core such as the A15 with a smaller one like the A7, MIPS suggests that licensees could implement a multi-core proAptiv system with some cores running at much lower frequencies and voltages to save power (since the proAptiv cores are already small compared to the A15 core).
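The appeal of running some cores at a lower frequency and voltage follows from the standard first-order model of CMOS dynamic power, P ≈ C·V²·f. The sketch below is a rough illustration of that textbook relation only: the function name and the operating point are hypothetical, switched capacitance is held constant, and leakage power is ignored. It does not reflect any figures from MIPS or ARM.

```python
def relative_dynamic_power(voltage_scale: float, frequency_scale: float) -> float:
    """Dynamic power relative to a baseline core, using P ~ C * V^2 * f.

    Both arguments are ratios against the baseline (1.0 means unchanged).
    Switched capacitance C is assumed constant; leakage is not modeled.
    """
    return voltage_scale ** 2 * frequency_scale


# Hypothetical operating point: half the clock at 80% of the supply voltage.
power = relative_dynamic_power(voltage_scale=0.8, frequency_scale=0.5)
print(f"{power:.0%} of baseline dynamic power")
```

Since voltage enters quadratically, a per-core voltage domain saves considerably more than frequency scaling alone, which is why per-core voltage gating matters more for this use case than per-core clock gating.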
| ||proAptiv||ARM Cortex A9||Qualcomm Krait||ARM Cortex A15|
|Pipeline Depth||13 stages||8 stages||11 stages||15 stages|
|Out of Order Execution||Y||Y||Y||Y|
|SIMD / Media Processing Engine||DSP ASE (32-bit wide)||Optional MPE (64-bit wide)||Y (128-bit wide)||Optional MPE (128-bit wide)|
|Process Technology||40nm / 28nm||40nm / 32nm||28nm||28nm|
|Typical Clock Speeds||1.2GHz*||1.2GHz||1.5GHz||2.5GHz|
While ARM expects the A15 to reach up to 2.5 GHz in the HP/G processes, MIPS only expects up to 1.5 GHz. That said, embedded applications using the proAptiv are likely to be power sensitive, and while peak performance of the A15 is likely to be much better than the proAptiv family, MIPS can tout the smaller size for equivalent performance as an advantage.
*Update: MIPS supplied detailed feedback on our architecture comparison, and I will leave it here for readers to take note:
- MIPS and ARM provide synthesizable IP. As such, these technologies can be implemented in any process geometry and node, with standard cells and memories. At that point, it all comes down to what target a customer shoots for, what physical IP libraries and memories they use, and other implementation-specific aspects.
- The 1.2 GHz figure for MIPS uses readily available TSMC 12-track SVt libraries and represents worst-case silicon corner results with production margins. MIPS projects that, using more aggressive implementation techniques and typical corner silicon, proAptiv implementations can reach 2.0-2.5 GHz (similar to the Cortex-A15) [ Editor's Note: The conditions under which the Cortex-A15 reaches 2.5 GHz are unclear ]
The interAptiv family brings multithreading to the table, something which ARM hasn't started implementing yet. As our Lava Xolo smartphone review revealed, implementing simultaneous multi-threading is highly beneficial for performance, particularly in current-day workloads.
MIPS claims that 3 interAptiv cores deliver performance similar to / slightly exceeding what could be obtained from 2x Cortex-A9 / 3x Cortex-A5 cores with the same silicon area. Of course, CoreMark numbers heavily favor the interAptiv cores.
In the interAptiv family, the CPU execution pipeline is shared by multiple threads, which helps hide the performance impact of memory access latencies: while one thread waits on memory, another can use the pipeline. Since interAptiv is targeted towards real-time workloads, a hardware scheduler enables better QoS.
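The latency-hiding effect of sharing one pipeline between threads can be illustrated with a toy cycle-level model. This is a deliberately simplified sketch, not a description of the interAptiv scheduler: it assumes a single-issue pipeline, a fixed-priority pick among ready threads, and that every thread does a fixed number of cycles of work followed by a fixed-latency memory stall. All names and numbers are hypothetical.

```python
def pipeline_utilization(threads: int, compute_cycles: int, miss_latency: int,
                         total_cycles: int = 10_000) -> float:
    """Fraction of cycles in which a shared single-issue pipeline does useful work."""
    work_left = [compute_cycles] * threads  # cycles of work before the next miss
    ready_at = [0] * threads                # first cycle each thread may issue again
    busy = 0
    for cycle in range(total_cycles):
        # Issue from the first ready thread (fixed priority, toy policy).
        for t in range(threads):
            if ready_at[t] <= cycle:
                busy += 1
                work_left[t] -= 1
                if work_left[t] == 0:
                    # The thread now waits on memory; others may use the pipeline.
                    ready_at[t] = cycle + 1 + miss_latency
                    work_left[t] = compute_cycles
                break
    return busy / total_cycles


# One thread idles the pipeline during every stall; a second thread fills the gap.
print(pipeline_utilization(threads=1, compute_cycles=10, miss_latency=10))  # 0.5
print(pipeline_utilization(threads=2, compute_cycles=10, miss_latency=10))  # 1.0
```

In this model a second thread recovers all the idle cycles only because the stall is no longer than the other thread's run of work. With longer memory latencies, more threads would be needed to keep the pipeline full.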
MIPS terms the threads as VPEs (virtual processing elements). The pipeline itself is 9 stages long and is in-order. An optional multi-threaded IEEE 754 FPU can be added if necessary. DSP ASE is available, as is EVA (similar to the proAptiv family). The CPS used with the multi-core interAptiv family has the same features as that used by the proAptiv.
Compared to the proAptiv, the interAptiv core architecture allows for core clock shutdown during outstanding bus requests, intelligent way selection in the L1 instruction cache and 32-bit L1 data cache access as options for power reduction. [ Update: Intelligent way selection in the L1 instruction cache is also available in the proAptiv family ]
In the TSMC 40nm G process, the interAptiv family members can run at up to 1 GHz for applications involving multi-threading with QoS and at up to 1.2 GHz for multi-threading without QoS. If DSP ASE is not desired, implementations can run at up to 1.5 GHz for networking applications. [ Update: The quoted frequency numbers are 'sweet spots' in terms of power consumption and other application-specific requirements. As mentioned in the previous section, the frequency of operation can be scaled depending on customer requirements and is not related to the presence or absence of DSP ASE / QoS ]
The microAptiv architecture is a superset of the M14K/c cores with microMIPS code compression support. With integrated DSP ASE, signal processing comes in at a lower cost. There are options to implement without caches / MMUs depending on the application.
This 5 stage pipeline architecture can run at up to 400 MHz in a 65nm LP process. MIPS also presented a side-by-side comparison of the Cortex-M4 and the microAptiv family:
Obviously, the extra features don't come without an area penalty. In a 90nm LP process, the Cortex-M4 has a floorplanned area of 0.17 mm² compared to 0.42 mm² for the microAptiv MCU (cacheless version). [ Update: MIPS claims that the area numbers are not an apples-to-apples comparison. Under similar implementation conditions in 90LP (that is, area optimized), MIPS expects the microAptiv family to have only 0.01 mm² of extra area. Our data is from ARM's Cortex-M4 specifications. We agree it is difficult to compare the area requirements, but readers should note that there is no free lunch when it comes to feature set vs. die area ]
MIPS launched the 1074K CPS in September 2010, and so far we have seen only one announcement of the processor core actually going into silicon. Plenty of companies seem to have licensed the IP, but we haven't seen any SoCs announced with the 1074K. eSilicon announced last year that it had taped out the 1074K CPS in Globalfoundries' 28nm process and that it is on the lookout for potential licensees of its hardened IP core. It appears that at least two years is the bare minimum between the announcement of an IP core and its volume shipment. ARM is in the same boat, with the Cortex-A15 being a known entity as far back as February 2011.
Given that the high end proAptiv core delivers performance similar to the Cortex-A15, it appears that MIPS is a little bit late to the game. Being late to the game and not delivering any advantage would have been a disaster. Fortunately, MIPS seems to have been frugal with the area compared to ARM. However, the lack of licensees using the cores in the family to make a push in the high end mobile space is also a detriment. While Qualcomm and Broadcom are MIPS licensees, they are fully committed to ARM as their architecture of choice in the fast-growing mobile space.
Although Google is paying attention to MIPS as a platform for Android, it looks likely that the contest for architecture of choice in the mobile / tablet space will become a two-way shootout between ARM and x86. That said, the easiest way to lose a fight is to not turn up for it. MIPS must continue to create high performance cores and try to get into mid-range smartphones / tablets for a start. They have a foothold in the low-end space, thanks to Ingenic's tablet platform.
However, the new proAptiv series does have some bright spots for consumers. One can look forward to more powerful home networking equipment and set-top boxes. The cores serve to ensure that ARM can't easily encroach upon MIPS's traditional turf. Changing consumer behaviour and the rising popularity of OTT streaming have given ARM a slight opening in the STB / STB replacement space. The new proAptiv cores should help MIPS defend this area.
Fortunately for MIPS, the interAptiv and microAptiv family members seem to hold the upper hand in the battle against ARM's lineup. In the interAptiv series, MIPS has stolen a march on ARM with its multi-threading feature. The integration of a powerful DSP engine in the microAptiv series should open up new markets and strengthen MIPS's position in its current ones.
General production availability of the proAptiv and interAptiv cores is slated to be in the middle of 2012. The microAptiv cores are available for production now. MIPS has also developed strategic relationships with multiple vendors for complementary IP and enabling technologies in order to speed up the SoC development of their licensees.
We look forward to seeing silicon based on the MIPS processor IP cores soon.