It's a Cluster, Not a Server

When unpacking our Boston Viridis server, the first thing that stood out was the bright red front panel. That is Boston's way of telling us that we have the "Cloud Appliance" edition. The model with an orange bezel is intended to serve as a NAS appliance, purple stands for "web farm", and blue is more suited for a Hadoop cluster. Another observation is that the chassis looks similar to recent SuperMicro servers; it is indeed a barebones system filled with Calxeda hardware.

Behind the front panel we find 24 2.5” drive bays, which can be fitted with SATA disks. If we take a look at the back, we can find a standard 750W 80 Plus Gold PSU, a serial port, and four SFP connectors. Those connectors are each capable of 10Gbit speeds, using copper and/or fiber SFP(+) transceivers.

When we open up the chassis, we find somewhat less standard hardware. Mounted on the bottom is what you might call the motherboard, a large, mostly-empty PCB that contains the shared Ethernet components and a number of PCIe slots.

The 10Gb Ethernet Media Access Controller (MAC) is provided on the EnergyCore SoC, but in order to allow every node to communicate via the SFP ports, each node forwards its Ethernet traffic to one of the first four cards (the cards in slots 0-3). These nodes are connected via a XAUI interface to one of the two Vitesse VSC8488 XAUI-to-serial transceivers that in turn control two SFP modules each. Hidden behind an air duct is a Xilinx Spartan-6 FPGA, configured to act as chassis manager.
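To make the uplink path easier to picture, here is a minimal Python sketch of the layout described above: two XAUI-to-serial transceivers, each driving two SFP modules, fed by the cards in slots 0-3. The one-to-one pairing of slot number to SFP port, and the names `Uplink` and `uplink_for_slot`, are our own assumptions for illustration; the actual wiring order on the board is not something we verified.

```python
from typing import NamedTuple

NUM_TRANSCEIVERS = 2       # two Vitesse VSC8488 XAUI-to-serial transceivers
SFP_PER_TRANSCEIVER = 2    # each transceiver controls two SFP modules
UPLINK_SLOTS = range(4)    # cards in slots 0-3 carry the uplink traffic


class Uplink(NamedTuple):
    slot: int          # EnergyCard slot that owns the XAUI link
    transceiver: int   # index of the VSC8488 serving that link
    sfp_port: int      # index of the SFP connector at the back


def uplink_for_slot(slot: int) -> Uplink:
    """Hypothetical mapping (assumption): slot i feeds SFP port i."""
    if slot not in UPLINK_SLOTS:
        raise ValueError(f"slot {slot} has no direct uplink; only slots 0-3 do")
    return Uplink(slot=slot, transceiver=slot // SFP_PER_TRANSCEIVER, sfp_port=slot)


uplinks = [uplink_for_slot(s) for s in UPLINK_SLOTS]
for u in uplinks:
    print(f"slot {u.slot} -> transceiver {u.transceiver} -> SFP port {u.sfp_port}")
```

Whatever the real pairing is, the totals match the hardware: two transceivers times two modules gives the four SFP connectors at the back of the chassis.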

Each pair of PCIe slots contains what turns this chassis into a server cluster: an EnergyCard (EC). Each EnergyCard contains four SoCs, each with one DIMM slot. An EnergyCard thus contains four server nodes, with each node running on a quad-core ARM CPU.

The chassis can hold as many as 12 EnergyCards, so currently up to 48 server nodes. That limit is only imposed by physical space constraints, as the fabric supports up to 4096 nodes, leaving the potential for significant expansion if Calxeda maintains backwards compatibility with their existing ECs.

The system we received can hold only six ECs, as one EnergyCard slot is lost because of the SATA cabling. That gives us six ECs with four server nodes each, or 24 server nodes in total. Some creative effort has been made to provide air baffles that direct the air through the heat sinks on the ARM chips.

The air baffles are made of a finicky plastic-coated paper, glued together and fastened to the EC with plastic nails, which makes them difficult to remove from an EC by hand. Each EC can be freely placed on the motherboard, with the exception of the Slot 0 card, which needs a smaller baffle.

Every EnergyCard is thus fitted with four EnergyCore SoCs, each having access to one miniDIMM slot and four SATA connectors. In our configuration each miniDIMM slot was populated with a Netlist 4GB low-voltage (1.35V instead of 1.5V) ECC PC3L-10600W-9-10-ZZ DIMM. Every SoC provided was hooked up to a Samsung 256GB SSD (MZ7PC256HAFU, comparable to Samsung’s 310 Series consumer SSDs), filling up every disk slot in the chassis. We removed those SSDs and used our iSCSI SAN to boot the server nodes. This way it was easier to compare the system's power consumption with other servers.
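The node, core, and memory totals follow directly from the per-card figures, and a short Python sketch makes the arithmetic explicit. The `ChassisConfig` class is purely illustrative, and the 4GB-per-node default reflects our review unit; other configurations may ship with different DIMMs.

```python
from dataclasses import dataclass

SOCS_PER_CARD = 4   # four EnergyCore SoCs (server nodes) per EnergyCard
CORES_PER_SOC = 4   # each node runs on a quad-core ARM CPU


@dataclass(frozen=True)
class ChassisConfig:
    energy_cards: int
    dimm_gb: int = 4         # 4GB DIMM per node, as in our review unit (assumption elsewhere)
    ssds_per_node: int = 1   # one SSD per SoC, as delivered to us

    @property
    def nodes(self) -> int:
        return self.energy_cards * SOCS_PER_CARD

    @property
    def cores(self) -> int:
        return self.nodes * CORES_PER_SOC

    @property
    def memory_gb(self) -> int:
        return self.nodes * self.dimm_gb

    @property
    def ssds(self) -> int:
        return self.nodes * self.ssds_per_node


review_unit = ChassisConfig(energy_cards=6)
full_chassis = ChassisConfig(energy_cards=12)

for name, cfg in (("review unit", review_unit), ("full chassis", full_chassis)):
    print(f"{name}: {cfg.nodes} nodes, {cfg.cores} cores, {cfg.memory_gb} GB RAM")
print(f"review unit SSDs: {review_unit.ssds}")
```

With six cards this gives 24 nodes and 24 SSDs, which is exactly why every one of the 24 drive bays was occupied when the system arrived.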

Previous EC versions had a microSD slot per node at the back, but in our version it has been removed. The cards are topology-agnostic; each node is able to determine where it is placed. This enables you to address and manage nodes based on their position in the system.


102 Comments


  • JohanAnandtech - Wednesday, March 13, 2013 - link

    Thanks!
  • SunLord - Wednesday, March 13, 2013 - link

    Hmm if these didn't cost $20,000 they would make a nice front end for larger websites and forums using less rack space and power. What setup using these would you use for anandtech? Would you guys keep the intel DB server?
  • Gunbuster - Wednesday, March 13, 2013 - link

    I just got a Dell R720xd decked out with 384GB and 4.3TB of storage for a hair over that price.
  • JohanAnandtech - Wednesday, March 13, 2013 - link

    Intel Xeons are still by far a better choice for relational databases that are very hard to split up (sharding is only a last resort)
  • zachj - Wednesday, March 13, 2013 - link

    I'm not sure I agree with the absolutism that seems implicit in your comment that Xeons are better for relational databases...I think there are cases where that won't be true.

    Database scale-out doesn't always require sharding...using any of a number of different off-the-shelf capabilities built right into most SQL engines, you can create multiple active replicas of your database. This is generally better-suited to workloads that aren't write-intensive, but both clustering and replication allow for writes. While this may seem like a quick-and-dirty solution that is architecturally "less good" than sharding, hardware is a lot cheaper than paying people to design a sharding solution and the dollars very often drive the conversation. As long as the database size isn't terribly large this can be a very cost-effective way to scale out a database.

    I would wager that the Anandtech website database (not the forum database) would probably be well-suited to this type of scale-out. You do waste some money on redundant storage but you more than make up for that cost by not having to pay a development team to implement sharding. If the comments section of the Anandtech website gets stored in the same underlying database, the size constraints and the write activity may appear to be incompatible with this approach, but I would in fact argue that comments don't require relational capabilities of SQL and would be more rightly stored as blobs in Hadoop or Azure Storage Tables. Then the Anandtech database is strictly articles and is both much more compact and almost entirely read-only (except for a few new articles per day).
  • rwei - Friday, March 15, 2013 - link

    To the best of my understanding, replication does well for scaling reads but doesn't do much for writes. I'd still imagine that this would work decently well with AnandTech, where I can't see the volume of writes being that large relative to the volume of reads.
  • Kurge - Wednesday, March 13, 2013 - link

    They would make a horrible front end for such websites. Just buy a single Xeon server and don't artificially limit it by using 24 VMs. Just run the app straight on the metal and it will perform massively better.
  • Oldboy1948 - Wednesday, March 13, 2013 - link

    Very interesting Johan as your tests often are!
    Interesting that the memory bw is so much lower than anything from Intel. In fact iPhone 5 looks much better...why? Only Intel has about the same results in compress and decompress.
  • JohanAnandtech - Wednesday, March 13, 2013 - link

    Where did you see the stream results on the A6? I might have missed it somewhere. The only ones I could find reported only 1 GB/s in Triad. http://www.anandtech.com/show/6298/analyzing-iphon... The Quad ECX-1000 got 1.8 GB/s
  • PCTC2 - Wednesday, March 13, 2013 - link

    Do you know what would be an interesting concept for a future version of these cluster-in-a-box systems? A solution like ScaleMP. ScaleMP is basically a reverse VM. A hypervisor on each server clusters together to run a single OS with an aggregation of all resources (cores, RAM, network, and disk). ScaleMP running on 4x Dual-socket 8-core Xeon systems w/ 32GB RAM results in a usable system with 64-cores and 128GB RAM as if it was running natively on the hardware. This would be an interesting concept to transfer to the ARM space (if a form of hardware virtualization ever is designed). In a box like this, there would be 192 cores and 192GB of RAM available to a single Fedora instance. Cluster 2 of these together and suddenly there's a system with 384 cores and 384GB of RAM in 4U. Just some food for thought.
