RISC vs. CISC Revisited

The RISC vs. CISC debate is never-ending. It started as soon as the first RISC CPUs entered the market in the mid-eighties. Just six years ago, Anand reported that AMD's CTO, Fred Weber, was claiming:

Fred said that the overhead of maintaining x86 compatibility was negligible, at the time around 10% of the die was the x86 decoder and that percentage would only shrink over time.

Just like Intel today, AMD claimed that the overhead of the complex x86 ISA was dwindling fast as the transistor budget grew exponentially with Moore's Law. But the thing to remember is that high-ranking managers will always make statements that fit their current strategy and vision. Most of the time there is some truth to it, but the subtleties and nuances of the story are the first victims in press releases and statements.

Now in 2014, it is good to put an end to all this discussion: the ISA is not a game changer, but it matters! AMD is now in a very good position to judge, as it will develop x86 and ARM CPUs with the same team, led by the same CPU architecture veteran. We listened carefully to what Jim Keller, the head of AMD's CPU architecture team, had to say in the 4th minute of this YouTube video:

"The big fundamental thing is that ARMv8 ISA has more registers (32), a three operand ISA, and spends less transistors on decoding and dealing with the complexities of x86. That allows us to spend more transistors on performance... ARM gives us some inherent architectural efficiency."

You can debate until you drop, but there is no denying that the x86 ISA requires more pipeline stages, and thus more transistors, to decode than any decent RISC ISA. As x86 instructions are variable length, fetching instructions is less efficient and requires more transistors. The instruction cache is also larger, as you need to store pre-decode information. The back-end might deal with RISC-like micro-ops, but the end result must still adhere to the rules of the x86 ISA, so transistors are spent on exception handling and condition codes.
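
To make the front-end difference concrete, here is a minimal sketch in C of the two fetch models. The fixed-width loop mirrors an ARMv8-style front-end, where every instruction is four bytes and every boundary is known up front; the variable-length loop mirrors an x86-style front-end, where the start of the next instruction is only known after the current one has been length-decoded. The `x86_insn_length()` helper is a hypothetical stand-in, not a real decoder, which would have to parse prefixes, opcode maps, ModRM/SIB and displacement bytes (1 to 15 bytes in total).

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stub for illustration only: a real x86 length decoder
 * must parse prefixes, opcode maps, ModRM/SIB and displacement bytes. */
static size_t x86_insn_length(const uint8_t *bytes)
{
    (void)bytes;
    return 1; /* pretend every instruction is one byte long */
}

/* ARMv8-style fetch: every instruction is exactly 4 bytes, so the start
 * of instruction N is simply N * 4 -- boundaries can be found in parallel. */
static void fetch_fixed(const uint8_t *code, size_t len)
{
    for (size_t off = 0; off + 4 <= len; off += 4) {
        uint32_t insn;
        memcpy(&insn, code + off, sizeof insn);
        /* decode(insn) would go here */
        (void)insn;
    }
}

/* x86-style fetch: the offset of instruction N+1 is only known after
 * instruction N has been length-decoded -- a serial dependency that real
 * front-ends attack with extra pre-decode logic and markers in the I-cache. */
static void fetch_variable(const uint8_t *code, size_t len)
{
    size_t off = 0;
    while (off < len) {
        size_t n = x86_insn_length(code + off);
        /* decode(code + off, n) would go here */
        off += n;
    }
}

int main(void)
{
    uint8_t code[16] = {0};
    fetch_fixed(code, sizeof code);
    fetch_variable(code, sizeof code);
    return 0;
}
```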

It's true that the percentage of transistors spent on decoding has dwindled over the years. But the number of cores per die has increased significantly, and each core pays the decode overhead again. As a result, the x86 tax is not imaginary.

Hardware Accelerators

While we feel that the ARMv8 ISA is definitely a competitive advantage for the ARM server SoCs, the hardware accelerators are a big mystery: we have no idea how large the performance or power advantage is in real software. It might be spectacular, or it might be just another case of "offload only works in the rare case where all these conditions are met". Nevertheless, it is interesting to see how many different integrated accelerators the ARM server SoCs have.

Most of them are the usual suspects: IPsec and TCP offload engines and cryptographic accelerators. It will be interesting to see if the ARM ecosystem can offer more specialized devices that can really outperform the typical Intel offerings.

One IP block that got my attention was Cavium's regex accelerator. Regular expression accelerators specialize in pattern matching and can be very useful for search engines, network security, and data analytics. That seems to be exactly what we need for the current killer apps. But the devil is in the details: it will need software support, and preferably on a wide scale.
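
To illustrate the kind of work such an accelerator would offload, here is a minimal sketch using the standard POSIX regex API in C; the pattern and the log line are made up for the example, and the POSIX calls simply stand in for whatever API a hardware regex engine would expose. In a software IDS or analytics engine this matching loop runs on every packet or log line, which is exactly the inner loop a dedicated regex block is meant to take over.

```c
#include <regex.h>
#include <stdio.h>

int main(void)
{
    /* Hypothetical intrusion-detection style pattern: flag HTTP requests
     * that contain a "../" path traversal attempt. */
    const char *pattern = "GET[[:space:]]+[^[:space:]]*\\.\\./";
    const char *line    = "GET /static/../../etc/passwd HTTP/1.1";

    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        fprintf(stderr, "failed to compile pattern\n");
        return 1;
    }

    /* In software this regexec() call burns CPU cycles on every byte of
     * traffic; a regex accelerator runs the compiled automaton in hardware. */
    if (regexec(&re, line, 0, NULL, 0) == 0)
        printf("match: suspicious request\n");
    else
        printf("no match\n");

    regfree(&re);
    return 0;
}
```

This is also precisely the software support question raised above: the application has to be written, or patched, to hand this loop to the accelerator instead of the C library.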

Comments

  • hojnikb - Tuesday, December 16, 2014 - link

    Wow, I have never seen a motherboard that simple :)
  • CajunArson - Tuesday, December 16, 2014 - link

    OK you devote another huge block of text to the typical x86 complexity myth* followed by: Oh, but the ARM chips are superior because they have special-purpose processors that overcome their complete lack of performance (both raw & performance per watt).

    Uhm... WTF?? I need to have a proprietary, poorly documented add-on processor to make my software work well now? How is that a "standard"? How is requiring a proprietary add-on processor that's not part of any standard and requires boatloads of software cruft working in a "reduced instruction set architecture" exactly?

    I might as well take the AVX instruction set for modern x86... which is leagues ahead of anything that ARM has available, and say that x86 is now a "RISC" architecture because the AVX part of x86 is just as clean or cleaner than anything ARM has. I'll just conveniently forget about the rest of x86 just like the ARM guys conveniently forget about all the non-standard "application accelerators" that are required to actually make their chips compete with last-year's Atoms.

    * Maybe in a micro-controller setting where you are using a PIC or Arduino the x86 decoding is a real issue, but in a server? Please. Considering the only hard numbers you have show a 2013-model Atom beating a 2015-model ARM server processor, you'll have to try harder.
  • hlmcompany - Tuesday, December 16, 2014 - link

    The article describes ARM chips as becoming more competitive, but still lagging behind...not that they're superior.
  • Kevin G - Tuesday, December 16, 2014 - link

    The coprocessor idea is something that stems from mainframe philosophy. Historically, things like IO requests and encryption were always handled by coprocessors in this market.

    The reason coprocessors faded away outside of the mainframe market is that it was generally cheaper to do a software implementation. Now with power consumption being more critical than ever, coprocessors are seen as a means to lower overall platform power while increasing performance.

    Philosophically, there is nothing that would prevent the x86 line from doing so, and for the exact same reasons. In fact, with PCIe-based storage and NVMe on the horizon in servers, I can see Intel incorporating a coprocessor to do parity calculations for RAID 5/6 in their SoCs.
  • kepstin - Tuesday, December 16, 2014 - link

    Intel has already added some instructions in AVX and AVX2 that vastly improve the performance of software RAID 5 and 6; the Haswell chip in my laptop has the Linux software RAID implementation claiming 24 GiB/s for RAID 5 with AVX, and 23 GiB/s for RAID 6 with AVX2 (per core).
  • MrSpadge - Tuesday, December 16, 2014 - link

    Of course the additional power draw for more complex instruction decoding matters in servers: today they are driven by power efficiency! The transistors may not matter as much, but in a multi-core environment they add up. Taking AMD's quoted figure of "only 10% more transistors", one could place 11 RISC cores in the same area and for the same cost as 10 otherwise identical x86 cores. Johan said it perfectly with "the ISA is not a game changer, but it matters".

    And you completely misunderstood him regarding the accelerators. Intel is producing "CPUs for everyone" and hence provides only a few accelerators or special instructions. In the ARM ecosystem it's obvious that vendors are searching for niches and are willing to provide custom solutions for them - hence the chance is far higher that they provide some accelerator which might be game-changing for some applications.

    This doesn't mean the architecture has to rely entirely on them, nor does it mean they have to be undocumented. The accelerators do not even have to be faster than software solutions, as long as they're easy enough to work with and provide significant power savings. Intel is doing just that with special-purpose hardware in their own GPUs.

    And don't act as if much has changed in the Atom space since the 22 nm Silvermont cores appeared. It doesn't matter if it's from 2013 or 2015 - it's all just the same core.
  • OreoCookie - Tuesday, December 16, 2014 - link

    What's with all the unnecessary piss and vinegar?

    All CPU vendors rely increasingly on specialized silicon; newer Intel CPUs feature special crypto instructions (AES-NI) and Quick Sync, for instance. Adding special-purpose hardware to augment the system (in the past usually done for performance reasons) is quite old, just think of hardware RAID cards and video »accelerators« (which are now called GPUs). The reason that Intel doesn't add more and more of these is that they build general-purpose CPUs which are not optimized for a specific workload (the article gives a few examples). In other environments (servers, mobile) the workload is much more clearly defined, and you can indeed take advantage of accelerators.

    The biggest advantage of ARM CPUs is flexibility -- the ARM ecosystem is built on the idea of tailoring silicon to your demands. This is also a substantial reason why Intel's efforts in the mobile market have been lackluster. Recently, Synology announced a new professional NAS (the DS2015xs) which is ARM-based rather than Intel-based. Despite its slower CPU cores, the throughput of this thing is massive -- in part because it sports two (!) 10 GBit Ethernet ports out of the box. Vendors are looking for niches where ARM-based servers could gain a foothold, so they are trying a lot of things and seeing what sticks.
  • goop666666 - Saturday, December 20, 2014 - link

    LOL! Most of the comments here like this one seem to be written by people who think computers should all be like gaming machines or something.

    Here's a tip: no one cares about "complexity," "standardization," "RISC," or anything else you mention. All they care about in the target market for ARM server chips is price, performance, and power, and I mean ALL THREE.

    On this Intel cannot compete. They sell wildly overpriced legacy hardware propped up by massive R&D expenditures and they're wedded to that model. The rest of the industry is wedded to the new and cheap model. Just like how the industry moved to mobile devices and Intel stood still, this change will also wash over Intel while they sit still in denial.

    There's a reason why Intel stock has gone nowhere for years.
  • nlasky - Monday, December 22, 2014 - link

    Jan 8, 2010, Intel stock price $20.83. Dec 19, 2010, Intel stock price $36.37. If by "gone nowhere for years" you mean increased by 70%, I guess you would be correct. Intel can't compete because they are wedded to their model? They have a profit margin of 20% and an operating margin of 27%. They could easily cut prices to compete with any ARM offerings. Servers have been around forever, unlike the mobile computing platform. Intel has an even larger stranglehold on this industry than ARM has in the mobile space. Here's a tip - stop spewing a bunch of uninformed nonsense just to make an argument.
  • nlasky - Monday, December 22, 2014 - link

    *Dec 19, 2014
