RISC vs. CISC Revisited

The RISC vs. CISC discussion is never-ending. It started as soon as the first RISC CPUs entered the market in the mid-eighties. Just six years ago, Anand reported that AMD's CTO, Fred Weber, was claiming:

Fred said that the overhead of maintaining x86 compatibility was negligible, at the time around 10% of the die was the x86 decoder and that percentage would only shrink over time.

Just like Intel today, AMD claimed that the overhead of the complex x86 ISA was dwindling fast as the transistor budget grew exponentially with Moore's law. But the thing to remember is that high ranking managers will always make statements that fit their current strategy and vision. Most of the time there is some truth in it, but the subtleties and nuances of the story are the first victims in press releases and statements.

Now in 2014, it is good to put an end to all this discussion: the ISA is not a game changer, but it matters! AMD is now in a very good position to judge, as it will develop x86 and ARM CPUs with the same team, led by the same CPU architecture veteran. We listened carefully to what Jim Keller, the head of AMD's CPU architecture team, had to say in the 4th minute of this YouTube video:

"The big fundamental thing is that ARMv8 ISA has more registers (32), a three operand ISA, and spends less transistors on decoding and dealing with the complexities of x86. That allows us to spend more transistors on performance... ARM gives us some inherent architectural efficiency."

You can debate until you drop, but there is no denying that the x86 ISA requires more pipeline stages, and thus more transistors, to decode than any decent RISC ISA. As x86 instructions are variable length, fetching instructions is less efficient and requires more transistors. The instruction cache is also larger, as you need to store pre-decode information. The back-end might deal with RISC-like micro-ops, but as the end result must adhere to the rules of the x86 ISA, transistors are also spent on exception handling and condition codes.
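To make the variable-length point concrete, here is a minimal sketch in Python. It assumes a toy encoding of our own invention (the low four bits of an instruction's first byte hold its length); this is not how real x86 encodes length, and the function names are hypothetical. With fixed-width instructions every boundary is known up front, while with variable-length instructions each boundary depends on the one before it.

```python
def fixed_length_starts(code: bytes, width: int = 4) -> list[int]:
    """Fixed-width ISA: instruction i starts at i * width, so all
    boundaries can be computed independently of each other."""
    return list(range(0, len(code), width))


def variable_length_starts(code: bytes) -> list[int]:
    """Toy variable-length ISA (hypothetical, NOT real x86): the low
    4 bits of the first byte of each instruction give its length.
    The start of instruction n+1 is unknown until instruction n has
    been at least partially decoded, so the walk is sequential."""
    starts = []
    offset = 0
    while offset < len(code):
        starts.append(offset)
        length = code[offset] & 0x0F
        if length == 0:
            raise ValueError(f"invalid length byte at offset {offset}")
        offset += length
    return starts
```

Real hardware attacks the sequential walk with pre-decode bits and speculative length decoding at several byte offsets at once, which is exactly where the extra pipeline stages and transistors go.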

It's true that the percentage of transistors spent on decoding has dwindled over the years. But the number of cores has increased significantly, and the decode logic is replicated along with the cores. As a result, the x86 tax is not imaginary.
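A back-of-the-envelope sketch shows how a shrinking percentage can still mean a growing absolute cost. Only the 10% figure comes from the statement quoted earlier. The core counts, transistor budgets and the 2% figure are purely hypothetical numbers chosen for illustration, not measurements of any real chip.

```python
def decode_transistors(cores: int, transistors_per_core: int,
                       decode_percent: int) -> int:
    """Total transistors spent on decoding across all cores.
    Integer percent keeps the arithmetic exact."""
    return cores * transistors_per_core * decode_percent // 100


# Hypothetical single-core chip: 10% of 50M transistors goes to decode.
old_chip = decode_transistors(cores=1, transistors_per_core=50_000_000,
                              decode_percent=10)

# Hypothetical eight-core chip: only 2% of each 100M-transistor core.
new_chip = decode_transistors(cores=8, transistors_per_core=100_000_000,
                              decode_percent=2)

print(old_chip, new_chip)
```

Under these assumed numbers the decode share falls by a factor of five, yet the absolute number of transistors spent on decoding more than triples.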

Hardware Accelerators

While we feel that the ARMv8 ISA is definitely a competitive advantage for the ARM server SoCs, the hardware accelerators are a big mystery: we have no idea how large the performance or power advantage is in real software. It might be spectacular or it might be just another "offload works only in the rare case where all these conditions are met". Nevertheless, it is interesting to see how the ARM server SoC has many different integrated accelerators.

Most of them are the usual IPSec, TCP offloading engines, and cryptographic accelerators. It will be interesting to see if the ARM ecosystem can offer more specialized devices that can really outperform the typical Intel offerings.

One IP block that got my attention was Cavium's Regex accelerator. Regular expression accelerators are specialized in pattern recognition and can be very useful for search engines, network security, and data analytics. That seems to be exactly what we need in the current killer apps. But the devil is in the details: it will need software support, and preferably on a wide scale.
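To give an idea of the kind of work such an accelerator would take over, here is a minimal sketch of signature scanning done purely in software with Python's standard `re` module. The two signatures and the `scan_payload` helper are hypothetical examples of ours. This is not Cavium's API, and a real intrusion detection rule set would contain thousands of patterns.

```python
import re

# Hypothetical signatures, for illustration only.
SIGNATURES = {
    "sql_injection": re.compile(rb"(?i)union\s+select"),
    "path_traversal": re.compile(rb"\.\./"),
}


def scan_payload(payload: bytes) -> list[str]:
    """Return the names of all signatures found in a packet payload.
    On a general-purpose CPU every pattern is searched in turn, so the
    cost grows with both payload size and the number of signatures."""
    return [name for name, pattern in SIGNATURES.items()
            if pattern.search(payload)]
```

Calling `scan_payload` on every packet at line rate is the part that burns CPU cycles, which is why offloading it only pays off if the software stack actually routes its matching through the accelerator.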

78 Comments

  • patrickjchase - Thursday, December 18, 2014 - link

    It's been a while since I worked on this stuff, but I don't think that the statement that "CCN is very comparable to the ring bus found inside all Xeon processors beginning with Sandy Bridge" is quite right.

    CCN
  • patrickjchase - Thursday, December 18, 2014 - link

    Finishing my comment:

    CCN
  • stefstef - Wednesday, December 17, 2014 - link

    the idea of having an energy efficient design certainly will pay off. nvidia and samsung showed that having i.e. 4 cores and a fifth core dedicated to the energy management can be a good low cost solution. i dont often read the articles at anandtech because they are usually boring. although i am happy to place a coment here. arm rules in certain fields but in a couple of years only because intel will allow them to do so. every company needs a room to live in. another american breakfast for the chinese who will get their share in the processor market as well.
  • milli - Thursday, December 18, 2014 - link

    I don't understand how ARM is suddenly going to succeed while MIPS and PowerPC have already tried and failed. I feel that ARM is more of a market trend than anything else (in the server market).
    Even the current ARM server SOC manufacturers have already tried to penetrate the server market. Cavium and Broadcom already had custom designed low-power MIPS SOCs. IBM, Applied Micro and Freescale have had a bunch of low-power PowerPC options.
    By the time any of these products is released, Intel is going to have a better alternative thanks to their process advantage. No IT manager is going to manage to convince any of the corporate fat-cats that a huge overhaul is needed. Same story over again.
  • yuhong - Friday, December 19, 2014 - link

    "Unfortunately their 16GB DIMMs will only work with the Atom C2000, leading to the weird situation that the Atom C2000 supports more memory than the more powerful Xeon E3."
    I think the reason is software related. More precisely, the Memory Reference Code (MRC).
  • adrian1987 - Monday, January 5, 2015 - link

    Hi. The Haswell core can actually have a max IPC of 6 instructions per cycle using macro-fusion not 5 as listed here (assuming the code is ideal). It has 2 execution units that can handle fused ALU and branch instructions. Source: http://www.anandtech.com/show/6355/intels-haswell-...
  • aaronjoue - Tuesday, April 7, 2015 - link

    Here is the real micro server. http://www.ambedded.com.tw/pt_list.php?CM_ID=20140...
    http://wiki.ambedded.com.tw/index.php?title=MicroS...
    7 & 21 nodes in a chassis
    It support Ubuntu and open source Ceph.
