Benchmark Configuration

First of all, a big thanks to Wannes De Smet, who assisted me with the benchmarks. Below you can read the configuration details of our "real servers". The Atom machines are a mix of systems. The Atom 230 is part of a 1U server featuring a Pegatron IPX7A-ION motherboard with 4GB of DDR2-667. The N450 is found inside an ASUS EeePC netbook, and the Atom N2800 is part of Intel's DN2800MT Marshalltown mainboard. The N2800 system has 4GB of DDR3-1333, while the N450 netbook only has 1GB of DDR2-667.

Supermicro SYS-6027TR-D71FRF Xeon E5 server (2U Chassis)
CPU:          Two Intel Xeon processor E5-2660 (2.2GHz, 8c, 20MB L3, 95W)
              Two Intel Xeon processor E5-2650L (1.8GHz, 8c, 20MB L3, 70W)
RAM:          64/128GB (8/16x8GB) DDR3-1600 Samsung M393B1K70DH0-CK0
Motherboard:  X9DRT-HIBFF
Chipset:      Intel C600
BIOS version: R 1.1a
PSU:          PWS-1K28P-SQ 1280W 80 Plus Platinum

The Xeon E5 CPUs have four memory channels per CPU and support DDR3-1600, and thus our dual CPU configuration gets eight DIMMs for maximum bandwidth. Each core supports Hyper-Threading, so we're looking at 16 cores with 32 threads.
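To make the arithmetic behind those numbers explicit, here is a small sketch that derives the core, thread, and channel counts from the table above, plus the theoretical peak memory bandwidth. The 8 bytes per transfer is the standard 64-bit DDR3 channel width, which is an assumption not stated in the article, and the peak figure is a paper number, not a measurement.

```python
# Values taken from the Xeon E5 configuration described above.
SOCKETS = 2
CORES_PER_SOCKET = 8
THREADS_PER_CORE = 2          # Hyper-Threading
CHANNELS_PER_SOCKET = 4
TRANSFERS_PER_SEC = 1600e6    # DDR3-1600

# Assumption: standard 64-bit (8 byte) DDR3 channel width.
BYTES_PER_TRANSFER = 8


def xeon_totals():
    """Return (cores, threads, memory channels, theoretical peak GB/s)."""
    cores = SOCKETS * CORES_PER_SOCKET
    threads = cores * THREADS_PER_CORE
    channels = SOCKETS * CHANNELS_PER_SOCKET
    peak_gb_s = channels * TRANSFERS_PER_SEC * BYTES_PER_TRANSFER / 1e9
    return cores, threads, channels, peak_gb_s


cores, threads, channels, peak_gb_s = xeon_totals()
print(f"{cores} cores, {threads} threads, {channels} channels, "
      f"{peak_gb_s:.1f} GB/s theoretical peak")
```

With one DIMM per channel, the eight channels are all populated, which is why the eight-DIMM configuration is the one that maximizes bandwidth.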

Boston Viridis Server
CPU:              24x ECX-1000 4c Cortex-A9 1.4GHz
RAM:              24x Netlist 4GB (96GB) low-voltage ECC PC3L-10600W-9-10-ZZ DRAM
Motherboard:      6x EC-cards
Chipset:          none
Firmware version: ECX-1000-v2.1.5
PSU:              SuperMicro PWS-704P-1R 750Watt
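
For comparison with the Xeon numbers, the totals for the Boston Viridis can be worked out from the table in the same way. This is only a sketch: it assumes one 4GB DIMM per ECX-1000 node and an even spread of nodes over the six EC-cards, both of which are inferred from the counts in the table rather than stated outright.

```python
# Values taken from the Boston Viridis table above.
NODES = 24            # ECX-1000 SoCs
CORES_PER_NODE = 4    # quad-core Cortex-A9
EC_CARDS = 6

# Assumption: one 4GB DIMM per node (24 DIMMs for 24 nodes).
GB_PER_NODE = 4


def viridis_totals():
    """Return the aggregate core and memory figures for the chassis."""
    return {
        "cores": NODES * CORES_PER_NODE,
        "ram_gb": NODES * GB_PER_NODE,
        # Assumption: nodes are spread evenly over the EC-cards.
        "nodes_per_card": NODES // EC_CARDS,
        "gb_per_core": GB_PER_NODE / CORES_PER_NODE,
    }


for name, value in viridis_totals().items():
    print(f"{name}: {value}")
```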

Common Storage System

An iSCSI LIO Unified Target accesses a DataON DNS-1640 DAS. Inside the DAS we have set up eight Intel SSDSA2SH032G1GN (X25-E 32GB SLC) in RAID-0.
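
As a rough illustration of what RAID-0 across eight drives means, the sketch below maps a logical block to a drive and an offset on that drive, and computes the raw striped capacity. The chunk size of 128 blocks is a hypothetical value chosen for the example; the article does not state the stripe size we used.

```python
def raid0_locate(logical_block, n_disks=8, chunk_blocks=128):
    """Map a logical block to (disk index, block offset on that disk).

    chunk_blocks is a hypothetical stripe chunk size, not a value
    taken from the test setup.
    """
    chunk_index, within_chunk = divmod(logical_block, chunk_blocks)
    stripe_row, disk = divmod(chunk_index, n_disks)
    return disk, stripe_row * chunk_blocks + within_chunk


def raid0_capacity_gb(n_disks=8, disk_gb=32):
    """RAID-0 has no redundancy, so capacity is the sum of all drives."""
    return n_disks * disk_gb


print(raid0_locate(0))        # first chunk lands on the first drive
print(raid0_locate(128))      # next chunk lands on the second drive
print(raid0_capacity_gb())    # eight 32GB X25-E drives
```

Because consecutive chunks land on different drives, large sequential transfers are spread over all eight SSDs at once.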

Software Configuration

The Xeon E5 server runs VMware ESXi 5.1. All vmdks are thick provisioned, independent, and persistent. We chose the "Low Power" power policy because it enables C-states while the impact on performance is minimal. All other systems run Ubuntu 12.10 with the "ondemand" power management policy, which enables P-states on the Atom and Calxeda ECX-1000.
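
If you want to verify the governor on a Linux box before a run, the sketch below reads it from the standard cpufreq sysfs files. It is a minimal example, not the script we used: the helper names are hypothetical, and it assumes the kernel exposes `scaling_governor` under `/sys/devices/system/cpu`, which is only the case when a cpufreq driver is loaded.

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU that exposes cpufreq."""
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(cpu_root).glob(pattern)):
        cpu_name = gov_file.parent.parent.name   # e.g. "cpu0"
        governors[cpu_name] = gov_file.read_text().strip()
    return governors


def all_ondemand(governors):
    """True only if at least one CPU was found and all use 'ondemand'."""
    return bool(governors) and all(g == "ondemand" for g in governors.values())
```

Calling `all_ondemand(read_governors())` returns `False` both when a core uses a different governor and when no cpufreq information is found at all, so a misconfigured node does not slip through silently.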


99 Comments


  • JohanAnandtech - Wednesday, March 13, 2013 - link

    Ok, good question. I'll look into it, as I am definitely considering a follow-up.
  • skyroski - Wednesday, March 13, 2013 - link

    I make performance-oriented web apps for a living and I was looking forward to this performance test very much. However, I was quite disappointed at how you have done the "real world" test.

    If you're serving a single site you would never put a Xeon through the performance penalties of virtualisation, so I deem your real world results flawed/unusable.

    Basically, if I were to consider buying a Calxeda server tomorrow, I want to know if I can serve a site faster/better by using the "cluster in a box" solution which ARM's partners are going for or if a single Xeon server with standardised dedicated hardware will serve me and my businesses better.

    The other thing that I would have also tested is SSL request performance because Intel has AES-NI built in and I believe ARM has something similar? I would say the majority of requests today for a serious web app/site will be traffic using the SSL protocol, so that would also be one of those deciding factors I would look at.

    If I was a cloud host provider your comparison may contain some truth as their business model would be to presumably let each ARM node out as a VPS alternative, but that isn't what you were testing, were you?
  • JohanAnandtech - Wednesday, March 13, 2013 - link

    1. The single site: it is not meant to be an environment of one single site. The reason why we use the same site over and over again is that it makes it easier to interpret the results and more repeatable. Consider a hosting provider who hosts many similar - but not the same - LAMP sites.
    The repeatable part is the part that most people don't understand very well: we don't just hit the same URL over and over again. We perform real user interactions and randomize them in real-world patterns (like logging in first and then performing several real actions), and getting a repeatable benchmark out of that gets very complex.
    2. The SSL comment is definitely good feedback. We are currently writing the connection code for such SSL websites but also need to find one or more good examples. If your site is a good example, maybe we can use yours (even under NDA if necessary)?
    3. Lastly, the virtualization overhead of ESXi 5 is very small.
  • Kurge - Wednesday, March 13, 2013 - link

    You know, you can host multiple different LAMP sites on bare metal ;)
  • klmccaughey - Wednesday, March 13, 2013 - link

    It won't be LAMP sites any more though - take a trawl through something like the Linode forums to get an idea of what people are building. You are talking higher concurrency and more likely nginx.

    Someone made a valid comment about database sharding - for web apps this is much more likely as people try to make sure they have failover.

    Whilst initially very disappointed, if you imagine the refresh on the ARM cores over the next 2 years (and considering the rate of change due to the phone market) you might actually be looking at a beast of a machine in two or three iterations. Imagine if you could buy these off the shelf for under $10k: That feels to me like mission critical failover systems in a box. I can see this taking off in a couple of years.
  • klmccaughey - Wednesday, March 13, 2013 - link

    And kudos for the review - I look forward to the follow-up. This is a space that needs watching!
  • Silma - Thursday, March 14, 2013 - link

    True, but do you think Intel will stop product development for the next 3 years? In addition, who will have the best fabs then? My guess is Intel.
  • Krysto - Monday, March 18, 2013 - link

    I don't know how fast it actually is, but relative to the ARMv7 architecture, AES should be up to 10x faster on ARMv8.
  • kfreund - Wednesday, March 13, 2013 - link

    Nice job, Johan. Can't wait to see your next one; we will be sure to get you an A15 based system as soon as we get it out! Let the debates begin!
  • kfreund - Wednesday, March 13, 2013 - link

    Regarding Stream performance, this is a known limitation of A9; it just can't handle a lot of concurrent memory requests. A15 will nearly triple the memory bandwidth at the same DDR rate.
