Benchmark Configuration

First of all, a big thanks to Wannes De Smet, who assisted me the benchmarks. Below you can read the configuration details of our "real servers". The Atom machines are a mix of systems. The Atom 230 is part of a 1U server featuring a Pegatron IPX7A-ION motherboard with 4GB of DDR2-667. The N450 is found inside an ASUS EeePC netbook, and the Atom N2800 is part of Intel's DN2800MT Marshalltown mainboard. The latter has 4GB of DDR3-1333 while the former only has 1GB of DDR2-667.

Supermicro SYS-6027TR-D71FRF Xeon E5 server (2U Chassis)
CPU Two Intel Xeon processor E5-2660 (2.2GHz, 8c, 20MB L3, 95W)
Two Intel Xeon processor E5-2650L (1.8GHz, 8c, 20MB L3, 70W)
RAM 64/128GB (8/16x8GB) DDR3-1600 Samsung M393B1K70DH0-CK0
Motherboard X9DRT-HIBFF
Chipset Intel C600
BIOS version R 1.1a
PSU PWS-1K28P-SQ 1280W 80 Plus Platinum

The Xeon E5 CPUs have four memory channels per CPU and support DDR3-1600, and thus our dual CPU configuration gets eight DIMMs for maximum bandwidth. Each core supports Hyper-Threading, so we're looking at 16 cores with 32 threads.

Boston Viridis Server
CPU 24x ECX-1000 4c Cortex-A9 1.4GHz
RAM 24x Netlist 4GB (96GB) low-voltage ECC PC3L-10600W-9-10-ZZ DRAM
Motherboard 6x EC-cards
Chipset none
Firmware version ECX-1000-v2.1.5
PSU SuperMicro PWS-704P-1R 750Watt

Common Storage System

An iSCSI LIO Unified Target accesses a DataON DNS-1640 DAS. Inside the DAS we have set up eight Intel SSDSA2SH032G1GN (X25-E 32GB SLC) in RAID-0.

Software Configuration

The Xeon E5 server runs VMware ESXi 5.1. All vmdks use thick provisioning, independent, and persistent. The power policy is "Low Power". We chose the "Low Power" policy as this enables C-states while the impact on performance is minimal. All other systems use Ubuntu 12.10. The power management policy is "ondemand". This enables P-states on the Atom and Calxeda ECX-1000.

Software Support & The ARM Server CPU Measuring Bandwidth
Comments Locked

99 Comments

View All Comments

  • Gigaplex - Tuesday, March 12, 2013 - link

    I wouldn't call that a spectacular performance per watt ratio. It's a bit faster than the Xeon under a cherry picked benchmark (much slower under others), and is only marginally lower power. Best case it's an 80% improvement over Sandy Bridge with regards to performance per watt, and Atom wasn't represented. Considering all the hype, I was expecting something a little more... exciting. Ignoring Ivy Bridge improvements, Haswell isn't far off.
  • spronkey - Tuesday, March 12, 2013 - link

    Yeah... I agree. It also only seems to really come into its own in high concurrency. The Xeons idle quite similarly in terms of power - what happens if you compare it to more Xeon cores? It seems like on a per core basis, Intel still has the advantage on both fronts?
  • spronkey - Tuesday, March 12, 2013 - link

    I would also point out that the A15 has already been compared against Sandy and Ivy cores and come up short in performance per watt; so I'm very interested to see what the next step for these ARM node servers is.
  • JohanAnandtech - Wednesday, March 13, 2013 - link

    I warned against the hype in the first sentences. :-) ARM CPUs are still rather weak and not a good match for most applications. However, the fact that we could actually find a case where they do a lot better than the current Xeon systems was surprising to me.
  • wsw1982 - Wednesday, April 3, 2013 - link

    No, it should not surprise any people regarding how picky the use case is. I mean, I do think you can find a use case the ARM 11 output perform Xeon. E.g. Serving 1 web request per hour :)
  • LogOver - Tuesday, March 12, 2013 - link

    24 servers ran inside 24 VM's on Xeon server, while for ARM server you used the 24 physical server nodes... Hmm... Does not seems to me like apple to apple comparison. Why not to compare, for example, 16 physical nodes on both, xeon and arm servers?
  • haplo602 - Wednesday, March 13, 2013 - link

    And how do you slice the Xeon server into 16 physical nodes ? It does not support any kind of HW partitioning that I am aware of. On the other hand the Calxeda machine is a cluster by design. If you try 16 Xeon nodes you'll go through the roof with power.
  • Colin1497 - Wednesday, March 13, 2013 - link

    I think the question is this:

    Was 24 VM's optimal for the Xeon? Since we're visualizing the Xeon, why 24? Just because you had 24 ARM nodes? Would the Xeon done better with 4VM's? Or 16? Or 1000? 24 seems arbitrary.
  • JohanAnandtech - Wednesday, March 13, 2013 - link

    We tested with 16 as I briefly mentioned in the conclusion. The 2650L did 170 responses/s per VM, or about 40% better. Total Throughput = 2.7k/s, while with 24, 2.9 K/s. THe flexibility that the Xeon has to reduce the number of VMs if higher throughput is necessary is definitely an advantage, but the performance numbers are not that different with different VM configs.
  • Kurge - Wednesday, March 13, 2013 - link

    How about with 0 VM's? Just run it on the metal.

Log in

Don't have an account? Sign up now