The proAptiv family of processors can contain 1 to 6 proAptiv cores, each of which is quoted as occupying about half the area of a standard Cortex-A15 core. This is not entirely implausible, given that some people in the industry feel that ARM's Cortex-A15 implementation takes up too much area for the advertised performance. However, it is likely that the NEON engine is being accounted for in the Cortex-A15 area while the proAptiv figure doesn't take into account the 32-bit SIMD engine (DSP ASE). [ Update: MIPS clarified that the DSP ASE is not a configurable block and is included in the quoted area. The precise area numbers for ARM are estimates only, since ARM has published no concrete specifications for the Cortex-A15. MIPS also attempted to remove the estimated area for NEON, with the aim of getting as close to an “apples to apples” area comparison as possible. ]

The proAptiv core is a superscalar out-of-order CPU with quad instruction fetch and fused triple dispatch. In the absence of any dependencies, the CPU can issue up to four integer and two floating-point operations per cycle. Multi-level TLBs, branch target buffers and sophisticated branch prediction help deliver more than 60% better performance than the previous-generation 1074K series. The FPU is dual issue and runs at the same speed as the CPU.

The proAptiv and interAptiv families implement EVA (Enhanced Virtual Addressing) in order to better utilize the available address space. Similar to the Cortex-A15, the IP includes a coherence manager and an integrated L2 cache controller with ECC support. While Cortex-A15 supports up to 32 cores, the proAptiv family supports up to 6. An interesting aspect of the Coherent Processing System (CPS) is the presence of a cluster power controller which does clock gating per core (also common in other multi-core CPUs) and voltage domain / gating per core. The latter has interesting applications in scenarios similar to ARM's big.LITTLE architecture. Instead of pairing a big core such as the A15 with a smaller one like the A7, MIPS suggests that licensees could implement a multi-core proAptiv system with some cores running at much lower frequencies and voltages to save power (since the proAptiv cores are already small compared to the A15 core).
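
To make the per-core voltage and frequency idea concrete, here is a minimal sketch using the textbook approximation that dynamic CMOS power scales with V² · f. The operating points below are hypothetical and chosen only for illustration; they are not MIPS or ARM figures, and leakage power is ignored.

```python
def relative_dynamic_power(voltage_scale: float, frequency_scale: float) -> float:
    """Dynamic power relative to nominal, using the approximation P ~ C * V^2 * f.

    Both arguments are ratios against the nominal operating point
    (1.0 means nominal, 0.0 means the core is power gated).
    """
    return voltage_scale ** 2 * frequency_scale


def cluster_power(cores):
    """Sum the relative dynamic power of each (voltage_scale, frequency_scale) pair."""
    return sum(relative_dynamic_power(v, f) for v, f in cores)


# Hypothetical 4-core cluster, all cores at the nominal operating point.
uniform = [(1.0, 1.0)] * 4

# Hypothetical mixed cluster: two cores at nominal, two cores at
# 80% voltage and 50% frequency (assumed values, not vendor data).
mixed = [(1.0, 1.0), (1.0, 1.0), (0.8, 0.5), (0.8, 0.5)]

if __name__ == "__main__":
    print(f"uniform cluster: {cluster_power(uniform):.2f}x nominal single-core power")
    print(f"mixed cluster:   {cluster_power(mixed):.2f}x nominal single-core power")
```

Under these assumed numbers, each slowed core draws roughly a third of the dynamic power of a nominal core, which is the kind of saving a per-core voltage domain is meant to expose.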

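Architecture Comparison

|                               | proAptiv              | ARM Cortex A9              | Qualcomm Krait   | ARM Cortex A15              |
|-------------------------------|-----------------------|----------------------------|------------------|-----------------------------|
| Decode                        | 3-wide                | 2-wide                     | 3-wide           | 3-wide                      |
| Pipeline Depth                | 13 stages             | 8 stages                   | 11 stages        | 15 stages                   |
| Out of Order Execution        | Y                     | Y                          | Y                | Y                           |
| Pipelined FPU                 | Y                     | Y                          | Y                | Y                           |
| SIMD / Media Processing Engine | DSP ASE (32-bit wide) | Optional MPE (64-bit wide) | Y (128-bit wide) | Optional MPE (128-bit wide) |
| Process Technology            | 40nm / 28nm           | 40nm / 32nm                | 28nm             | 28nm                        |
| Typical Clock Speeds          | 1.2GHz*               | 1.2GHz                     | 1.5GHz           | 2.5GHz                      |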

While ARM expects the A15 to reach up to 2.5 GHz in the HP/G processes, MIPS only expects up to 1.5 GHz. That said, embedded applications using the proAptiv are likely to be power sensitive, and while peak performance of the A15 is likely to be much better than the proAptiv family, MIPS can tout the smaller size for equivalent performance as an advantage.

*Update: MIPS supplied detailed feedback on our architecture comparison, and I will leave it here for readers to take note:

  • MIPS and ARM provide synthesizable IP. As such, these technologies can be implemented in any process geometry and node, with standard cells and memories. At that point, it all comes down to what target a customer shoots for, what physical IP libraries and memories they use, and other implementation-specific aspects.
  • The 1.2 GHz figure for MIPS uses readily available TSMC 12-track SVt libraries and represents worst-case silicon corner results with production margins. MIPS projects that with more aggressive implementation techniques and typical corner silicon, proAptiv implementations can reach 2.0-2.5 GHz (similar to the Cortex-A15) [ Editor's Note: The conditions under which the Cortex-A15 reaches 2.5 GHz are unclear ]

 


 


40 Comments


  • sheh - Thursday, May 10, 2012

    Even a Pentium 1 (if not even 486es or 386es, although with some missing useful instructions) can do preemptive multithreading fine, so it seems unlikely modern ARM CPUs are incapable of that much. There are also multi-core ARMs which by definition are multithreaded, so the article must be referring to something more elaborate, no?
  • quadrivial - Thursday, May 10, 2012

    MIPS licensees have some beastly processors designed for networking. One example company is Netlogic

    http://www.netlogicmicro.com/Products/MultiCore/in...

    Interestingly, AMD had a MIPS license and may still have one as the company that purchased the MIPS division in 2006 (back when AMD was selling several divisions to make ends meet) already had a MIPS license.
  • Penti - Thursday, May 10, 2012

    AMD's Alchemy MIPS line ended up with just that company, NetLogic Microsystems, via RMI. Whether they are still a MIPS licensee or not is unremarkable and beside the point; there are several unused x86 licenses too. Intel is still an ARM licensee (ARM11/ARM7) as well. They would still need to license the synthesizable IP cores if they don't want to make their own architecture/design. MIPS doesn't list AMD as a licensee, however. Broadcom itself has some impressive MIPS designs and happens to own NetLogic.
  • quadrivial - Friday, May 11, 2012

    I agree with you on almost everything.

    The big game changer here is a new focus on mobile combined with non-competitive chips. AMD has a game plan for laptops (Trinity). AMD has a plan for desktops and servers (Piledriver).

    What is AMD's move for tablets and smartphones? The best current option is the 28nm shrink of Brazos. That chip will be a possibility for tablets, but it will be too power hungry for phones. A redesign for lower power is too time-consuming (a couple of years). The better option could be to pair an existing CPU with an existing GPU. The Brazos GPU would work and, as a bonus, would be very competitive even if the processor launched in a few months (and adding another shader array wouldn't be too difficult). On the CPU side, an already-designed MIPS processor could be a great match. Power consumption and die size would decrease while performance would remain about the same (future integration of x86 like that done with Loongson would be possible if desired). In addition, companies that are skeptical of the x86 label on their mobile products would be satisfied.

    The question seems to become one of "what's the cost of a buy-in?"
  • Penti - Saturday, May 12, 2012

    They have chips, and chips coming, that are just fine for tablet PCs, including x86 Android tablets. Intel's work should prove useful there. Remember the third-gen iPad has a 42.5 Whr battery. That is more than low-end laptops or ultraportables like the 11" MBAir. They don't need phone IP; they only look at ARM for server use, as I said in another post under this article. They already sold off their mobile GPUs to Qualcomm and Broadcom, and just shrinking their GCN design wouldn't really work if they want to do it in a few hundred mW. Nvidia isn't terribly successful doing the same. The GPUs that ATi and AMD sold were designs aimed at that market. AMD only needs to compete against Atom chips with PowerVR graphics here. AMD has no interest in emulating x86 in software like the Chinese, who have no Intel x86 license. They might want to run ARM binaries in Android like Intel's platform, though. Remember that the manufacturing arm of AMD, GlobalFoundries, already manufactures ARM processors for other companies and is an ARM-licensed manufacturer. Brazos is manufactured at TSMC, and AMD is already moving towards real synthesizable x86 cores. Intel is busy manufacturing its own stuff. There are no MIPS processors designed for AMD's GF plants, really. Foundries are important here. When it came to Godson and Loongson, they were manufactured by a MIPS-licensed plant even if they were custom designs. When you choose a design that is ready, it needs to be tooled for the specific fab. That's why players like Nvidia stuck with 40nm LP and LPG for a long while.

    Look at AMD's upcoming Temash as far as tablets and thin, low-power ultraportables are concerned. They are looking at integrating GCN instead of the current Brazos Evergreen/VLIW-5.

    Designing a low-power CPU would be far easier than employing a new team of a few hundred engineers to build a new GPU architecture just for a flawed MIPS/mobile solution they don't need. AMD already has a two-track approach and designs two different x86-64 processor cores. One of them is highly synthesizable and low-powered. They only have one GPU architecture track, however. So they can already target what they need and spend their resources wisely. APUs are where they are heading, and they have no reason to deviate; they need to concentrate on making those great. These MIPS Technologies cores aren't even proven designs. The Chinese have their own designs in the devices running Android. AMD doesn't need to target phones just because Intel or Freescale (ex-Motorola) does, and so on. A tablet does fine with a sub-5W APU. A 50 Whr battery divided by 5W is just 10 hrs of computing as far as the SoC is concerned; add in all the peripherals and it is still easy to hit the magic number of about 10 hrs, for certain uses. A SoC idling at 2-3 W and a screen using 5 W at max brightness get you longer than most ultraportables. Those are laptop-sized machines. If they can do better, great; I don't think most OEMs care enough, though. No need for sub-1W there. It's not really competing against ARM machines if it is running Windows or x86 Android. It can hit the tablet form factor of those ARM tablets anyway. Those wishing to run phones on x86 have Intel to turn to in the next year or so. You will see larger chips running tablets, though. AMD can compete fine with Atom/next-gen Atom there. It is still a potential market of millions of units. Millions more when you factor in the embedded market and low-end laptops, which share the design.
  • quadrivial - Saturday, May 12, 2012

    AMD doesn't need to shrink its GCN design. Mobile GPU design is doing what desktop GPU design has already done (at a slightly accelerated pace, and with some of the experimentation unneeded because it has already been done). Nvidia used an older GPU model with some newer efficiencies added for Tegra 2/3. AMD's older/smaller designs (such as the 80-shader VLIW5 core from Brazos) are very similar to "new" and "innovative" designs such as the Mali 6xx, Adreno 3xx, SGX6xx, etc. There's no need for AMD to spend too much money or resources reinventing the wheel for either x86 or MIPS-based chips (MIPS has standard "slots" to drop in GPUs and other co-processors). AMD need only modify an existing design (though the possibility of using a new architecture still exists).

    AMD already does x86 emulation. They don't call it that and keep the ISS to themselves, but all the complex x86 instructions are decoded (actually re-encoded) into another instruction set (so-called micro-ops), which a RISC-style core then executes in hardware. The advantage of AMD doing this with MIPS is twofold. The first is that AMD has all the x86 experts and raw IP, so unlike China, AMD has a much better chance of making the hardware execute at the same rate as current "x86 only" chips. The second is that adding another architecture (MIPS) as a superset of x86 means that the superset architecture gains wherever x86 falls short and still incurs no penalties. That is to say, one could run an x86 OS and then run x86 code, but MIPS code made for MIPS-only chips (specifically low-power chips where decoder units use too much power) would also run on the x86 chips. This gives a clear path for migration from x86 to the better architecture (and with the current ARM scare, that's a huge marketing point).

    AMD has stated quite clearly that their future is in mobile. Giving up the biggest mobile market (when there's still no "winner") isn't good strategy.

    AMD is not tied to any fabrication process (they sold the remaining shares in GF). AMD only goes with whoever is cheapest while meeting the required specifications. MIPS designs have been proofed on 28nm TSMC (Brazos is 40nm TSMC fabrication, and the current 28nm GPU designs are made with TSMC in mind as well). The idea that MIPS would release designs that "aren't proven" is not completely logical. AMD and Intel can launch "unproven designs" because they design them and then sell them. MIPS designs cores, but relies on other companies to license them and make the chips. If a licensee's engineers don't like the samples or design (not a group that is likely to be affected by a "MIPS inside" logo), then MIPS can't turn a profit (for this reason, in addition to their reputation in the market, they have a huge incentive to make good designs).

    AMD has a chance with Brazos, but not beyond tablets. Even in the tablet market, there's the same problem that Intel is facing. Atom was forced to lengthen the pipeline to reduce decode power consumption, but that also brought branch prediction and performance penalties. In addition, Intel was forced to have only one core (two with HT). The net effect is that Krait (and presumably A15), with 2 real cores each having better CoreMark/MHz, has better performance while using less power. AMD will face huge performance-per-watt issues when going against either ARM or MIPS (an issue that won't ever completely disappear as long as x86 is used and will continue to be significant for a couple more fab sizes past 28nm).
  • Penti - Sunday, May 13, 2012

    Hardware is not emulation; Loongson does hardware-accelerated QEMU x86 emulation. It only does it that way because software projects don't need to worry about the semiconductor companies going after them. An x86 front end is always an x86 front end. It's the decode hardware they are not allowed to build. There is still a similar front end on RISC or VLIW CPUs. You don't design CPUs by putting together a few TTLs or ALU chips any more.

    Scaling down AMD GPUs makes them perform awfully; that was the only point, and they are still quite different from tile-based GPUs and their mobile drivers. An AMD GPU with just a few SPs can be outperformed by today's mobile GPUs. AMD isn't trying to do anything below tablets and that's fine; I'm not sure what you're trying to get at. I'm saying they shouldn't target anything less, and they already have two separate CPU designs going on. Two cores is enough for them. It's almost more than Intel has for x86, considering there hasn't been a real architectural update to Atom yet. AMD is already designing a new low-power CPU architecture (Jaguar core) and a separate and different server/desktop architecture (Steamroller line). AMD has no reason to go cooking VLIW-5/ARM or VLIW-5/MIPS chips regardless of who fabs them. They would have much less work, say, producing ARM with GF physical IP and Mali GPUs and just supporting a SoC that has all the software and tools already done by the partners. Not much use for it either. But others can do that and do much cooler stuff. The design/fab choice needs to happen a few years in advance, though, as you use tools / libraries aimed at a particular fab and process. That is why they can only fab Brazos at TSMC and Trinity at a couple of GF plants, and so on. Even a highly synthesizable design requires a lot of work to move to another fab and do a tapeout for a totally different process with different transistors and fab-specific tools and libraries. You don't shop around in the middle of a product cycle. Even an old-style die shrink takes a lot, something like Ivy Bridge even more (3D transistors), and moving to another fab and using other tools even more.

    Many MIPS vendors are like Qualcomm (a MIPS vendor too) with Krait: they have their own custom designs. Fabs with licenses also have their own designs based on the physical IP from MIPS/ARM, for that matter. I'm sure the new MIPS designs will be great as MCUs. What ISA they speak is essentially meaningless as long as the tools (compiler, platform, etc.) support it. A current 80 SP VLIW-5 AMD GPU uses more power than a quad-core Cortex A15 + overclocked Mali-400 MP4 + on-package DRAM together. RMI even used a Mali GPU with the ex-AMD MIPS design, and I'm sure AMD could even license it for a mobile x86 SoC if they really wanted to. They at least have no reason to start developing a mobile GPU and drivers. Their expertise in the field is found at Qualcomm and Broadcom today. It's substantial work to redesign their architecture and write suitable drivers that perform well, and that is really needed to target really low-powered SoCs. There is no reason to disparage x86 here; the 2007 Atom architecture made it into phones just fine, and a newer one with an updated architecture handles multicore just fine. ARM designs aren't made with 100,000 transistors any more. AMD only needs to compete against Atom tablets and laptops here. They are not planning on selling any mobile (phone) chips. Intel will eventually do that again in some real way (the first BlackBerrys used embedded 386 processors, for that matter). AMD's Jaguar core will have different front-end and decode hardware than Steamroller and Piledriver, for that matter. Features don't make the GPUs alike at all, for that matter. Remember AMD didn't sell off their (now) Adreno GPUs until 2008; they were very well aware of the market they divested from. If they had wanted to do mobile, they would have continued selling their IP to Qualcomm and others, or would have partnered with someone if they wanted to direct their own mobile SoC with their own then-mobile GPUs.

    You seem to forget that x86 has surpassed almost everything; it performs better than the advanced RISC chips with SIMD and hw virtualization, or for that matter EPIC. It also does quite well as far as embedded solutions are concerned. Compilers perform great with x86 too. (Intel MIC/Larrabee uses 50 RISCy Pentium cores; 2007-era Atom is down to ARM levels in terms of power consumption with Medfield, which is a true SoC with GPU / video encode and decode / ISP / memory controller / IO.) Intel's chips aimed at tablets will have a higher power consumption, though. That is where AMD comes in, nowhere else. Adding MIPS in the front end would be useless here; the firmware situation would be awful. The CPU would still speak AMD's macro-ops internally, nothing else. Next year you will have a dual-core "Medfield" Atom basically by just a die shrink, for that matter. That is what 28nm Krait is up against. Not the past.

    If AMD does anything significant it is to offer ARMv8 64-bit server SoCs.

    There are plenty of winners in the mobile field. It's dominated by companies such as TI, Qualcomm, Broadcom, Freescale, Renesas, Samsung, ST-Ericsson (STMicro plus the old EMP group) and so on. They have literally made billions and billions of chips. If they (AMD) can't do stuff better than the domestic Chinese tech companies that turn out ARM and MIPS solutions, they should of course stay out of doing the same thing and trying to compete with the same stuff. There are 2-3 really big fabless mobile SoC manufacturers already that dominate the market. We don't need AMD to come to the table when their old and sold-off tech is already powering the businesses that succeeded. I would, for that matter, love for Intel to design their own GPUs for the mobile market/SoC, but they reasonably can't. AMD did, and sold the business doing it, with much success following that technology. They have no reason to start over fighting bigger fabless businesses with their old technology powering them. When you exit something you quit for a long while, even if you would like to come back.
  • obiwanbill - Thursday, May 10, 2012

    What is "OTT streaming"?

    Maybe ... "Over the ??"

    My reco would be this ... Since you are PA (posting articles) and basically anybody, AAUASL (at any understanding and skill level), reads your articles ... on TFUOAA (the first use of an acronym) you should really include the FT (full text) for the acronym.

    I know, you MNUTAA (might not use the acronym again) in the article, but like others reading this PA (particular article), for many it is getting close to reading gibberish. You know, TBITH (the bar is too high) and many won't understand the article and thereby reduce their visits to your site because they simply won't understand the content.

    My nickel. (Pennies aren't made in Canada anymore so I can't share 2 cents)

    OB (Obiwanbill)
  • ganeshts - Thursday, May 10, 2012

    (Over The Top Streaming)

    Apologies for the oversight.
  • obiwanbill - Thursday, May 10, 2012

    I usually let these go, but, since I am giving feedback today ...

    On the first page, 3rd paragraph, "Betweem"

    You can delete this post after you fix the typo.

    OB
