Conclusions So Far

Of one thing we are sure: the "cheaper, smaller, higher volume option historically wins" is a very weak argument to make when claiming that ARM SoCs will overtake Intel in the server market. It is hard to make all of the puzzle pieces come together: performance, power, volume, and software. Low prices and volume are not enough. We would love to see some real competition in the server market, but Intel is a lot better positioned today to fend off attacks than the RISC players were back in the 90s.

The current ARM server SoCs are a lot more powerful than Calxeda's ECX-1000, but they do not face a hopelessly outdated Atom S1200 anymore. The Atom C2000 is a huge step forward and the Xeon E3 has continued to evolve in such a way that even eight of the best ARM cores cannot deliver more raw integer processing power than a quad-core E3 with SMT. Meanwhile, the Xeon-D will offer all the advantages of the high performance "Broadwell" architecture, the flexibility of Intel's Turbo Boost, Intel's excellent process technology, and the highly integrated Atom C2000 SoC in one very competitive package.

The first – albeit very rough – performance data indicates that the server ARMada is not ready (yet?) to take on the best Intel Xeons in a broad range of server applications, at least in terms of performance. However, the ARM challengers do have an opportunity. Despite the massive number of Intel SKUs, Intel's market segmentation is rather crude and assumes that all customers can easily be categorized into three (maybe four) large groups: For low budgets, get the low range Xeon E3 (e.g. E3-1220 v3). Pay a bit more and you get Hyper-Threading and higher clock speeds (E3-1240 v3). Pay slightly more and you get another speed bump. Pay much more and you get four memory channels. We'll throw in more cores and a larger cache as a bonus (Xeon E5).

What if I have a badly scaling HPC application (low core count) that needs a lot of memory bandwidth? There is no Xeon E3 with quad channel. What if I need massive amounts of memory but moderate processing power? The Xeon E3 only supports 32GB. What if my application needs lots of cores and bandwidth but does not benefit from large and slow LLC caches? There is no Xeon E5 for that; I can only choose one of the most expensive E5s. And these examples are not invented; applications like these exist in the real world and are not exotic exceptions. What if my application benefits from a certain hardware accelerator? Buy a few 100k of SoCs and we'll talk. Intel's market segmentation is based largely on the assumption that every need (I/O, caches, memory bandwidth, memory capacity) is proportional to processing power.

The ARM-based challengers have the potential to serve those "odd" but relatively large markets better. The cost to develop new SoCs is lower and ARMv8 has the inherent RISC advantage of spending fewer transistors on ISA complexity. This reduces the advantage Intel gets from its process technology leadership.

Cavium has a clear focus and targets the scale-out, telecom, and storage markets. We are very curious how the first chip which is specialized for "scale-out" applications will perform. It has been a long time since we have seen such a specialized SoC and it is crystal clear that performance will vary a lot depending on the application. Our first impression is that the chip will be ideal for running lots of network intensive virtual machines on top of a hypervisor, such as Xen or KVM.

AppliedMicro's X-Gene seems to target a much wider range of applications, attacking the Intel Xeon E3 and the fastest Atom C2000. The hardware accelerators and quad-channel memory should give it an edge in some server applications while staying close enough in others. Much will depend on how quickly the X-Gene 2 is available in real servers. The X-Gene 2 "ShadowCat" is already up and running, so we have high hopes.

Broadcom seems to have a similar approach. Broadcom is late but is a market leader with deep pockets and an impressive list of customers. The same is true for Qualcomm. But we need specs, not just broad and vague statements, before we dedicate more words to the server plans of Qualcomm.

AMD's Opteron A1100 is definitely betting on undercutting Intel's low-end Xeons in price and features. Everything about it screams "time to market, inexpensive but proven low power design". The more ambitious AMD ARM SoCs will come later, however, as the current A1100 is missing a crucial feature: a link to the Freedom Fabric. The network fabric is a critical feature as it allows OEMs to build a low power, high performance networked micro server cluster. It was the strongest point of the Calxeda based servers as it kept power per node low, offered very low latency networking, and lowered the investments in expensive network gear (Cisco et al.). AMD is a well known brand with the enterprise folks and has a lot of unique server/HPC IP.

Last but not least, many enterprises in the IT world including HP, Facebook and Google want to see more competition in the server market. So all ARM licensees can count on some goodwill to make it happen.

For our part, we have been preparing as well. We have developed several new benchmarks to test this new breed of servers. Hard numbers say more than just words, but you'll have to wait for part two of this series for those.

 

 


  • esterhasz - Thursday, December 18, 2014 - link

    But this is exactly why a wider array of machines based on their chips would make sense: the R&D cost is already spent anyways, since iPhone and iPad need chips, selling more units thus reduces R&D cost per unit. Economies of scale.

    I don't believe a MBA variant with ARM is down the road either, but the rumored iPad Pro could develop into something similar rather quickly.
  • OreoCookie - Tuesday, December 16, 2014 - link

    If you want to talk about ARM on the desktop, that's a whole other discussion, but one that most certainly needs to include price: if the price difference between a Broadwell-based Core M and a fictitious Apple A9X is $200~$230, then this changes the discussion completely. Two other factors are graphics performance (the Core M has »only« 1.3 billion transistors, the A8X ~2 billion, indicating that the mythical A9X may have faster graphics) and the fact that Apple controls the release schedule and can spec the SoC to meet its projected needs. To view this topic solely through the lens of CPU performance is myopic.
  • darkich - Friday, December 19, 2014 - link

    Your comparisons missed the picture spectacularly.
    A8X is a 20nm 2-4W TDP chip with a price that is probably around 70$.
    Top of the line Core M5Y70 is a 14nm 4.5 W TDP chip with a price of 270$.
    And it has a weaker GPU, btw. (raw performance). And it throttles massively, effectively giving only 50% of the benchmark performance.

    If you're going to compare that to an Apple chip, compare it to a 14nm A9X with custom derived PowerVR series 7 GPU,(scales up to 1,4 TFLOPS) vastly expanded memory controllers connected to a much faster RAM (compared to one in the iPad) upclocked to 2GHz, that are available at any time.
  • darkich - Friday, December 19, 2014 - link

    .. *with cores upclocked to about 2GHz
  • Flunk - Tuesday, December 16, 2014 - link

    Nintendo already sells ARM systems, the 3DS and the DS before it are both ARM-based. The PSVita is ARM too. I don't see an ARM Macbook Air anytime soon, they need a bigger and higher-clocking chip for that and it doesn't look like that's going to happen anytime soon.
  • Nintendo Maniac 64 - Tuesday, December 16, 2014 - link

    Even the Game Boy Advance used an ARM7 for its main CPU.
  • jjj - Tuesday, December 16, 2014 - link

    Obviously there are handhelds using ARM but the point was about bigger cores and clearly not handhelds.
  • DLoweinc - Tuesday, December 16, 2014 - link

    Don't quote Wikipedia, not suitable for this level of writing.
  • garbagedisposal - Tuesday, December 16, 2014 - link

    Says DLoweinc, master of knowledge and scholarly writing.
    In contrast to your childish and outdated opinion, Wikipedia is a perfectly valid source of information, go read about it and quit crying.
  • Daniel Egger - Tuesday, December 16, 2014 - link

    The problem really is that the custom solutions simply cannot compete with Intel on any level for general purpose computing (which the majority of applications are): not on performance/price, not on performance/power, and not even on features/price.

    For instance I can see a huge market for sub-Xeon (or Atom C) performance at a corresponding price -> not going to happen because everyone is targeting > Xeon performance at ridiculous prices because they're expecting the margin to be there; however, there are simply too many compromises to be made by the buyers, so that has to fail.

    Also I can see a huge demand for Atom C - Xeon performance at lower power consumption; however, no one seems to be really targeting this. All we get are Raspberry Pis and boards that are a bit beefier but still nowhere close to even Atom C. The new virtualisation techniques (Docker et al) opened a whole new can of possibilities for non-x86(_64) devices because virtualisation is suddenly possible and much more lightweight than ever before, but no one seems to want to jump on this opportunity.

    I'd really like to buy some affordable general purpose (BYOM/BYOS) hardware which has a little bit of oomph and takes little power which should be the powerful sides of any of the contenders but somehow all fail to deliver and I don't even see an attempt to change that.

    If I want mind-boggling performance at decent performance/price ratio with real virtualisation and 100% standard software compatibility there's no way around the high end Xeons (and maybe AMD iff they manage to get their asses back up) and none of the contenders is ever going to challenge that so they might as well stop trying.
