Finding a Good Fit

The previous benchmarks have shown that the first Calxeda server is not for the general IT market. As the slide below shows, Calxeda targets four kinds of workloads:

  • Web applications
  • Middle-tier applications
  • Offline analytics
  • Storage and file serving

 

For applications such as Memcache, the ECX-1000 1.4GHz lacks bandwidth and memory capacity. Once a Cortex-A15 based server is available, this can change quickly as performance will improve significantly and the amount of memory per CPU can be quadrupled to 16GB.

We did not test it yet, but our own experience tells us that the majority of the "scale out" applications are out of reach. Especially in the financial and risk modeling world, top performance and ultra low response times are prioritized.

Calxeda based Boston servers are already making inroads as storage servers. There is little doubt that a low power processing unit makes a lot of sense in a storage server.

That leaves the question whether or not Calxeda's latest server can make it in the web server and Content Delivery world. Calxeda claims 5W per server node, and no more than 250W for the complete server chassis with 24 server nodes. That's pretty cool, but currently there is another solution. Two octal-core Xeon E5 deliver no less 32 threads running on top of 16 very potent cores. Add a virtualization layer and you get tens of servers. The only limitation is typically the amount of RAM.

So assume you are a hosting provider. Which server do you use as your building block? You've got two choices:

The standard one, the Intel Xeon E5 server. The advantages are excellent performance whenever you need it, whether your application scales well with more threads or not. The Xeon can address up to 384GB of affordable RAM (16GB DIMMs). If that's not enough, 768GB is possible with more expensive LR-DIMMs.

Those are impressive specs, but what if most of your customers just want to host medium sized web sites, sites that are rich on content but rather low on processing requirements? Can the Boston Viridis server attract such users with a much lower power consumption? How far can you go with slicing and dicing the Xeon's monstruous performance into small virtual pieces? We decided to find out.

 

Integer Processing, gcc Our Real World Test
Comments Locked

99 Comments

View All Comments

  • tuxRoller - Tuesday, March 12, 2013 - link

    Why WOULD you expect DVFS to boost performance?
    You seem to think it slightly revelational that the scores are slightly lower (but perhaps statistically meaningless).
  • dig23 - Tuesday, March 12, 2013 - link

    On-demand seems fair choice to me, its what best you can do on this OSes. But I will be very interested to see energy efficiency numbers when DVFS working on swarm of ARM nodes...:)
  • tuxRoller - Tuesday, March 12, 2013 - link

    It's not cpu governor I'm talking about but DVFS in particular.
    There's bound to be some small amount of latency involved with the process.
    It's point isn't for best performance but energy efficiency thus why I made the comment in the first place.
  • JarredWalton - Tuesday, March 12, 2013 - link

    There's the potential for DVFS to optimize for better performance on a few cores while putting some of the other cores into a lower P-state, but I think that would be more for stuff like Turbo Boost/Turbo Core. It's also possible Johan is referring to the potential for the optimizations to simply improve performance in general.
  • CodyHall - Friday, March 15, 2013 - link

    Love my job, since I've been bringing in $5600… I sit at home, music playing while I work in front of my new iMac that I got now that I'm making it online.(Click Home information)
    http://goo.gl/9u8us
  • JohanAnandtech - Wednesday, March 13, 2013 - link

    Can you tell me where I got you confused? Because I write "This allowed us to make use of Dynamic Voltage and Frequency Scaling (DVFS, P-states) using the CPUfreq tool. First let's see if all these power saving tweaks have reduced the total throughput."

    So it should been clear that we are looking for a better performance/watt ratio. The interesting thing to note is that ARM benefits from p-states, and that Intel's excellent implementation of C-states makes p-states almost useless.
  • Twonky - Wednesday, March 13, 2013 - link

    For information about a year ago the following post on the Linkedin ARM Based Group gave a link to a M.Sc. thesis publishing figures on the performance/watt ratio for Cortex-A8 and Cortex-A9 based boards:
    www.linkedin.com/groups/Single-CortexA8-CortexA9-in-comparison-85447.S.84348310
  • AncientWisdom - Tuesday, March 12, 2013 - link

    Very interesting read, thanks!
  • staiaoman - Tuesday, March 12, 2013 - link

    Damn, Johan. As always- an incredible writeup. Interesting thought experiment to figure that an upper bound on damage to INTC server share might be found by simply looking at how much of the market is running applications like your web server here (where single-threaded performance isn't as important).

    Intel powering phones and ARM chips in servers...the end is nigh.
  • JohanAnandtech - Thursday, March 14, 2013 - link

    Thanks Staiaoman :-). I'll leave the though experiment to you :-)

Log in

Don't have an account? Sign up now