Software Support

Calxeda supports Ubuntu and Fedora, though any distribution built for the (32-bit) ARM Linux kernel should in theory be able to run on the EnergyCore SoCs. As for availability, prebuilt Highbank kernel images are already available in the Ubuntu ARM repository, and Calxeda has set up a PPA of its own to ease its kernel development.

The company has also joined Linaro—the non-profit organization aiming to bring the open source ecosystem to ARM SoCs.

The ARM Server CPU

A dual Xeon E5 or Opteron 6300 server has much more processing power than most of us need to run a single server application. That is why it is not uncommon to see 10, 20, or even more virtual machines running on top of such a machine. Extremely large databases and HPC applications are the notable exceptions, but in general, server buyers are rarely worried about whether or not the new server will be fast enough to run one application.

Returning to our Boston Viridis server, the whole idea behind the server is not to virtualize but to give each server application its own physical node. Each server node has one quad-core Cortex-A9 with 4MB of L2 cache and 4GB of RAM. With that being the case, the question "what can this server node cope with?" is a lot more relevant. We will show you a real-world load later in this review, but we thought it would be good to first characterize the performance profile of the EnergyCore ECX-1000 at 1.4GHz. We used four different benchmarks: Stream, 7z LZMA compression, 7z LZMA decompression, and building/compiling with make and gcc.
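To make the methodology concrete, here is a minimal sketch of how such a single-node run could be scripted. The binary names, the use of 7-Zip's built-in benchmark, the kernel tree being compiled, and the job count are illustrative assumptions, not the exact commands used for this review.

```python
#!/usr/bin/env python3
"""Minimal single-node benchmark driver (a sketch, not the review's actual harness)."""
import subprocess
import time

def run_timed(cmd):
    """Run a shell command and return its wall-clock time in seconds."""
    start = time.time()
    subprocess.run(cmd, shell=True, check=True)
    return time.time() - start

# Commands are illustrative: the STREAM binary name, the 7-Zip invocation, and the
# kernel tree used for the make/gcc test are assumptions, not the reviewed setup.
benchmarks = {
    # STREAM prints its own bandwidth numbers; we only record the wall-clock time
    "stream": "./stream_c.exe",
    # 7-Zip's built-in benchmark exercises both LZMA compression and decompression
    "7z-lzma": "7z b",
    # A kernel build stands in for the make/gcc test; -j4 matches the four A9 cores
    "make-gcc": "make -C linux-3.8 -j4 vmlinux",
}

for name, cmd in benchmarks.items():
    print(f"{name}: {run_timed(cmd):.1f} s")
```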

We compare the ECX-1000 (quad-core, 3.8-5W, 40nm) with an Intel Atom 230 (1.6GHz single-core plus Hyper-Threading, 4W TDP, 45nm), an Atom N450 (1.66GHz single-core + HTT, 5.5W TDP, 45nm), an Atom N2800 (1.86GHz dual-core + HTT, 6.5W TDP, 32nm), and an Intel Xeon E5-2650L (1.8-2.3GHz eight-core, 70W TDP, 32nm).

The best comparable Atom would be the Atom S1200, Intel's first micro-server chip. However, it was not yet available to us; we are actively trying to get Intel's latest Atom in house for testing and will update our numbers as soon as we can get an Atom S1200 system. The Atom N2800 should be very close to the S1200, as it has the same architecture, L2 cache size, and TDP, and runs at similar clock speeds. The Atom N2800 supports DDR3-1066 while Centerton will support DDR3-1333, but we have reason to believe (see below) that this won't matter.

The Atom 230/330 and N450 are old 45nm chips (2008-2010). Before you dismiss the Atom 230 and N450 as useless data points: the Atom architecture has not changed for years. Intel has lowered the power consumption, increased the clock speed, and integrated a (slightly) faster memory controller, but essentially the Atom 230 has the same core as the latest Atom N2000 series. As Anand puts it succinctly: "Atom is in dire need of an architecture update (something we'll get in 2013)."

So for now, the Atom 230 and N450 numbers give us a good way to evaluate how the improvements in the "uncore" impact server performance. It is also interesting to see where the ECX-1000 lands. Does it outperform the N2800, or is it just barely ahead of the older Atom cores?

 

Comments

  • tech4real - Thursday, March 14, 2013 - link

    Calxeda quotes 6W for the whole SoC. We don't know how much of that is used by all the uncore stuff. It's possible the A9 cores only burn around 800mW each. Still quite a gap to 1.25W.
  • Wilco1 - Thursday, March 14, 2013 - link

    Assuming the 800mW figure is accurate and the uncore power stays the same, then a node would go from 6W to 7.8W, i.e. 30% more power for 100% more performance. Or they could voltage scale down to 1.5GHz and get 65% more performance for 5% more power. While a 28nm A15 uses more power in both scenarios, it is also much faster, so perf/Watt is significantly better.
  • tech4real - Thursday, March 14, 2013 - link

    1. I guess we have to wait and see if it's really 2x perf from A9 to A15 in real tests. I personally wouldn't bet on that just yet.
    2. Most likely the uncore power will increase too. I don't think the larger memory bandwidth will come for free.
  • Wilco1 - Thursday, March 14, 2013 - link

    1. We already know A15 is 50-60% faster than A9 per clock (and often more, particularly floating point), so that gives ~2x gain from 1.4GHz to 1.8GHz.
    2. The uncore power will be scaling down with process while the higher bandwidth demand from A15 will increase DRAM power. Without detailed figures it's reasonable to assume these balance each other out.
  • tech4real - Thursday, March 14, 2013 - link

    Then let's wait to see Anand benchmark the future A15 systems.
    Also, since the real microserver battle is between the future A15 systems and 22nm Atom systems, I am eager to see how it plays out.
  • Th-z - Wednesday, March 13, 2013 - link

    Very interesting article, thanks! This really piques another curiosity: how do the latest IBM Power-based servers fare these days?
  • Flunk - Wednesday, March 13, 2013 - link

    It really doesn't sound like the price/performance is there. Also, the lack of Windows support makes it useless for those of us who run ASP.NET websites (like the company I work for).

    It's still nice to see companies trying something different from the standard strategy. Maybe this will be better in a few generations and take the web server market by storm. If we see a Windows Server ARM port, I could see considering it as an option.
  • skyroski - Wednesday, March 13, 2013 - link

    I agree your testing suite's methodology is fine, and you were testing with hosting providers in mind, fair enough.

    However, on the question of whether a standard Xeon or an ARM-based server would be better for serving a single site (which is the consideration for FB/Twitter/Google/Baidu etc., the companies that, as the media has led me to believe this past year, ARM partners are trying to sell this kit to), this test unfortunately cannot tell us.

    A quick search on Google about the performance impact of VMs yielded a thread in the VMware community forum by a vExpert/moderator that mentioned expecting about 90% of native performance, and frankly, no matter how small you think the performance impact of a VM may be, it is still using up CPU cycles to emulate hardware; that point will remain true no matter how efficient the hypervisor gets.

    Secondly, the overhead of running 24 physical copies of the OS + Apache + DB on a box that would otherwise be running a single copy of the OS + Apache + DB is total overkill (on that topic).

    It would be great if you could also test the Xeon's req/sec when running a single instance so we can see it from a different perspective. As of now, as I said, your test is skewed towards hosting providers who might invest in Calxeda to provide VPS alternatives. But to them (and their client base), the benefit of a VPS is its portability, which 24 physical ARM nodes aren't going to provide, so I don't see them considering it as an alternative solution anyway.
  • skyroski - Wednesday, March 13, 2013 - link

    I also want to ask if your Xeon test server's network adapter is capable of and was using Intel VT-c?
  • JohanAnandtech - Thursday, March 14, 2013 - link

    It was using VMDq/Netqueue (via VMXnet) but not SR-IOV/VT-c
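As a side note for readers following the tech4real/Wilco1 exchange above, the power and performance deltas they quote can be reproduced from their own stated assumptions (0.8W per A9 core, 1.25W per A15 core, a 6W node with a fixed uncore budget, and 50-60% higher per-clock performance for the A15). The sketch below simply reruns that arithmetic; none of the inputs are measured values.

```python
# Back-of-the-envelope check of the figures in the comment thread above.
# Every input is a commenter's assumption, not a measured value.
a9_core_w, a15_core_w = 0.8, 1.25        # claimed per-core power for A9 and A15
node_w = 6.0                             # Calxeda's quoted figure for the whole SoC
uncore_w = node_w - 4 * a9_core_w        # 2.8 W left for fabric, I/O and memory controller

a15_node_w = uncore_w + 4 * a15_core_w   # 2.8 + 5.0 = 7.8 W
power_delta = a15_node_w / node_w - 1    # ~30% more power

per_clock_gain = 1.55                    # "50-60% faster per clock"
perf_delta = per_clock_gain * 1.8 / 1.4 - 1   # ~100% more performance at 1.8 GHz vs 1.4 GHz

print(f"A15 node: {a15_node_w:.1f} W, +{power_delta:.0%} power, +{perf_delta:.0%} performance")
```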
