Pricing

So how much does this Boston Viridis server cost? The official price for one Boston Viridis with 24 nodes at 1.4GHz and 96GB of RAM is $20,000. That is simply very expensive. A Dell R720 with dual 10 gigabit Ethernet, 96GB of RAM, and two Xeon E5-2650L CPUs is in the $8000 range; you could easily buy two Dell R720s and double your performance. The higher power bill of the Xeon E5 servers is in that case hardly an issue, unless you are severely power constrained. However, these systems are targeted at larger deployments.

Buy a whole rack of them and the price comes down to $352 per server node, or about $8500 per server. We have some experience with medium-quantity sales, and our best guess is that you typically get a 10 to 20% discount when you buy 20 of them. That would put the Xeon E5 server at around $6500-$7200 and the Boston Viridis at around $8500. Considering that you get an integrated switch (5x 10Gbit) and a lower power bill with the Boston Viridis, the difference is not that large anymore.
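
To make the comparison concrete, here is a quick back-of-the-envelope sketch of the cost math above (a Python sketch; the volume prices are our estimates, not vendor quotes):

    # Rough cost comparison using the numbers quoted above
    viridis_single = 20_000                    # one Boston Viridis (24 nodes, 96GB RAM), list price
    viridis_per_node_rack = 352                # per-node price at full-rack volume
    viridis_rack = viridis_per_node_rack * 24  # ~$8,450 per 2U chassis at rack volume

    xeon_single = 8_000                        # Dell R720, 2x E5-2650L, 96GB RAM, dual 10GbE
    xeon_volume = (6_500, 7_200)               # estimated range after a 10-20% volume discount

    print(f"Boston Viridis: ${viridis_single:,} (single) -> ${viridis_rack:,} (rack volume)")
    print(f"Dual Xeon E5:   ${xeon_single:,} (single) -> ${xeon_volume[0]:,}-${xeon_volume[1]:,} (volume)")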

Calxeda's Roadmap and Our Opinion

Let's be clear: most applications still run better on the Xeon E5. Our CPU benchmarks clearly indicate that any application that accesses memory frequently or that needs high per-thread integer processing power will run better on the Xeon E5. Compiling and installing software simply feels so much faster on the Xeon E5 that there is no need to benchmark it.

There's more: if your performance requirements are higher than what a quad-core Cortex-A9 can deliver, the Xeon E5 is a lot more flexible and a better choice in most cases. Scaling up is, after all, a lot easier than using load balancers and other complex software or hardware to scale out. Also, the management software of the Boston Viridis does the job, but Dell's DRAC, HP's iLO, and Supermicro's IPMI are more user friendly.

Calxeda is aware of all this; after all, they label their first "highbank" server architecture with the ECX-1000 SoC as targeted at the "early adopter". That is why we deliberately tested a scenario that is relevant to those potential early adopters: a cluster of web servers that is relatively network intensive, as it serves a lot of media files. This is one of the better scenarios for Calxeda, but not the best: we could imagine that a streaming server or storage server would be an even better fit. The latter in particular is catching on, and the storage version of the Boston Viridis sells well.

So on the one hand, no, the current Calxeda servers are no Intel Xeon killers (yet). On the other hand, we feel that Calxeda's ECX-1000 server node is revolutionary technology. When we ran 16 VMs (instead of 24), the dual low-power Xeon was capable of achieving the same performance per VM as the Calxeda server nodes. That this 24-node system could offer 50% more throughput at 10% lower power than one of the best Xeon machines available honestly surprised us. And 8W at the wall per server node, exactly what Calxeda claimed, is nothing short of remarkable, because it means that the 48 server node machine, which is also available, should be even more efficient, presumably because the shared chassis overhead is spread over twice as many nodes.
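
To put numbers on that result: 50% more throughput at 10% lower power works out to roughly 1.67 times the throughput per watt. A minimal sketch of that arithmetic, with the Xeon baseline normalized to 1.0 (placeholder units, not measured values):

    # Relative performance per watt: Viridis vs. the low-power Xeon baseline
    xeon_throughput, xeon_power = 1.00, 1.00       # baseline, normalized
    viridis_throughput = xeon_throughput * 1.50    # 50% more throughput in our web serving test
    viridis_power = xeon_power * 0.90              # at 10% lower power draw

    perf_per_watt_gain = (viridis_throughput / viridis_power) / (xeon_throughput / xeon_power)
    print(f"Throughput per watt vs. Xeon: {perf_per_watt_gain:.2f}x")   # ~1.67x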

To put that 8W number in perspective, the current Intel Atoms that offer similar performance need that kind of power for the SoC alone, and they are built on Intel's superior 32nm process technology. The next generation of ARM servers is already on the way and will probably hit the market in the third quarter of this year. The "Midway" SoC is based on a 28nm (TSMC) Cortex-A15 chip. A 28nm Cortex-A15 offers about 50% higher single-threaded integer performance at slightly higher power levels and can address up to 16GB of RAM. With that, it's safe to conclude that the next Calxeda server will be a good match for a much larger range of applications: memcached, larger web servers, and midrange database servers, for example. By then, virtualization will be available via KVM and Xen, but we think virtualization on ARM will only take off when the Cortex-A57 core with its 64-bit ARMv8 ISA hits the market in 2014.

Right now, the limited performance of the individual server nodes makes the Boston Viridis attractive mainly for web applications with lower CPU demands in power-constrained data centers. But the extremely low energy consumption and the rapidly increasing performance of ARM cores show great potential for Calxeda's technology. In the short term this is a niche market, but in another year or two this kind of approach could easily encroach on Intel's higher-end markets.

Comments
  • JohanAnandtech - Wednesday, March 13, 2013 - link

    Thanks!
  • SunLord - Wednesday, March 13, 2013 - link

    Hmm, if these didn't cost $20,000 they would make a nice front end for larger websites and forums, using less rack space and power. What setup using these would you use for AnandTech? Would you guys keep the Intel DB server?
  • Gunbuster - Wednesday, March 13, 2013 - link

    I just got a Dell R720xd decked out with 384GB and 4.3TB of storage for a hair over that price.
  • JohanAnandtech - Wednesday, March 13, 2013 - link

    Intel Xeons are still by far a better choice for relational databases that are very hard to split up (sharding is only a last resort).
  • zachj - Wednesday, March 13, 2013 - link

    I'm not sure I agree with the absolutism that seems implicit in your comment that Xeons are better for relational databases...I think there are cases where that won't be true.

    Database scale-out doesn't always require sharding...using any of a number of different off-the-shelf capabilities built right into most SQL engines, you can create multiple active replicas of your database. This is generally better-suited to workloads that aren't write-intensive, but both clustering and replication allow for writes. While this may seem like a quick-and-dirty solution that is architecturally "less good" than sharding, hardware is a lot cheaper than paying people to design a sharding solution and the dollars very often drive the conversation. As long as the database size isn't terribly large this can be a very cost-effective way to scale out a database.

    I would wager that the Anandtech website database (not the forum database) would probably be well-suited to this type of scale-out. You do waste some money on redundant storage but you more than make up for that cost by not having to pay a development team to implement sharding. If the comments section of the Anandtech website gets stored in the same underlying database, the size constraints and the write activity may appear to be incompatible with this approach, but I would in fact argue that comments don't require relational capabilities of SQL and would be more rightly stored as blobs in Hadoop or Azure Storage Tables. Then the Anandtech database is strictly articles and is both much more compact and almost entirely read-only (except for a few new articles per day).
  • rwei - Friday, March 15, 2013 - link

    To the best of my understanding, replication does well for scaling reads but doesn't do much for writes. I'd still imagine that this would work decently well with AnandTech, where I can't see the volume of writes being that large relative to the volume of reads.
  • Kurge - Wednesday, March 13, 2013 - link

    They would make a horrible front end for such websites. Just buy a single Xeon server and don't artificially limit it by using 24 VMs. Just run the app straight on the metal and it will perform massively better.
  • Oldboy1948 - Wednesday, March 13, 2013 - link

    Very interesting, Johan, as your tests often are!
    Interesting that the memory bandwidth is so much lower than anything from Intel. In fact the iPhone 5 looks much better...why? Only Intel has about the same results in compress and decompress.
  • JohanAnandtech - Wednesday, March 13, 2013 - link

    Where did you see the STREAM results on the A6? I might have missed it somewhere. The only ones I could find reported only 1 GB/s in Triad. http://www.anandtech.com/show/6298/analyzing-iphon... The quad ECX-1000 got 1.8 GB/s.
  • PCTC2 - Wednesday, March 13, 2013 - link

    Do you know what would be an interesting concept for a future version of these cluster-in-a-box systems? A solution like ScaleMP. ScaleMP is basically a reverse VM: a hypervisor on each server clusters together to run a single OS with an aggregation of all resources (cores, RAM, network, and disk). ScaleMP running on 4x dual-socket 8-core Xeon systems with 32GB RAM each results in a usable system with 64 cores and 128GB RAM, as if it were running natively on the hardware. This would be an interesting concept to transfer to the ARM space (if a form of hardware virtualization is ever designed). In a box like this, there would be 192 cores and 192GB of RAM available to a single Fedora instance. Cluster 2 of these together and suddenly there's a system with 384 cores and 384GB of RAM in 4U. Just some food for thought.
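
As an illustration of the replica-based scale-out zachj describes above (writes go to a single primary, reads are spread across read-only replicas), here is a minimal read/write-splitting sketch in Python. The host names and the psycopg2 driver are illustrative assumptions, not details from the article or the discussion:

    # Minimal read/write splitting sketch: writes hit the primary, reads rotate over replicas.
    # Hostnames, database name, and credentials are hypothetical placeholders.
    import itertools
    import psycopg2

    PRIMARY_DSN = "host=db-primary dbname=site user=app"
    REPLICA_DSNS = [f"host=db-replica{i} dbname=site user=app" for i in range(1, 4)]

    _replica_cycle = itertools.cycle(REPLICA_DSNS)

    def run_query(sql, params=(), write=False):
        """Route writes to the primary and reads to the next replica in the rotation."""
        dsn = PRIMARY_DSN if write else next(_replica_cycle)
        with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
            cur.execute(sql, params)
            return None if write else cur.fetchall()

    # Example usage:
    # articles = run_query("SELECT id, title FROM articles ORDER BY published DESC LIMIT 20")
    # run_query("INSERT INTO comments (article_id, body) VALUES (%s, %s)", (42, "..."), write=True)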
