Assessing Cavium's ThunderX2: The Arm Server Dream Realized At Last


by Johan De Gelas on May 23, 2018 9:00 AM EST

The ThunderX2 SKUs: 16 to 32 Cores

The SKU inside our test system was the ThunderX2 CN9980 2.2. This is the top SKU that is available right now, offering 32 cores at 2.2 GHz, which are able to further boost to 2.5 GHz.

According to Cavium's plans, many more SKUs will be available in the coming months. Cavium claims that a CN9980 at 2.5 GHz will be available soon, which would be capable of boosting to 3 GHz.

Cavium has listed all of their planned SKUs alongside the comparable Intel SKU. By Cavium's definition, a comparable Intel SKU is a chip that achieves the same SPECint Rate (2017) score under gcc as Cavium's SKU.

As you can see, Cavium considers our CN9980 2.2 to be comparable to the much more expensive 8164. For our testing we will compare it to the 8176, as that was the Intel SKU available to us. Not that it should matter much: the 8176 only has a 3% higher clockspeed and 2 additional cores (+7%) over the 8164. Note however that if Cavium's ThunderX2 can really compete with these Intel SKUs, they are offering the same performance at one third of the cost of the Intel SKUs.
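
As a rough sanity check of that last claim, the arithmetic can be sketched in a few lines of Python. Only the "same performance at one third of the cost" ratio comes from the text above. The function names and the 5% hardware share of a total budget are hypothetical values chosen for illustration, not figures from Cavium or from our testing.

```python
def perf_per_dollar_ratio(relative_performance: float, relative_cost: float) -> float:
    """How many times more performance per dollar chip A offers than chip B.

    relative_performance: A's performance divided by B's (1.0 means equal).
    relative_cost: A's price divided by B's (1/3 means one third of the cost).
    """
    if relative_cost <= 0:
        raise ValueError("relative_cost must be positive")
    return relative_performance / relative_cost


def total_budget_savings(hardware_share: float, relative_cost: float) -> float:
    """Fraction of the whole project budget saved by switching CPUs.

    hardware_share: fraction of the total budget spent on CPUs
                    (hypothetical input, not a figure from the article).
    relative_cost: new CPU price divided by old CPU price.
    """
    if not 0 <= hardware_share <= 1:
        raise ValueError("hardware_share must be between 0 and 1")
    return hardware_share * (1 - relative_cost)


# Claim from the article: equal performance at one third of the cost.
ratio = perf_per_dollar_ratio(relative_performance=1.0, relative_cost=1 / 3)
print(f"Performance per dollar advantage: {ratio:.1f}x")

# Hypothetical assumption: CPUs are 5% of the total project budget.
saved = total_budget_savings(hardware_share=0.05, relative_cost=1 / 3)
print(f"Share of total budget saved: {saved:.1%}")
```

The second function shows why the price advantage looks very different depending on how much of the overall budget the CPUs represent: a large saving on a small line item remains a small saving overall.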

97 Comments

Gunbuster - Wednesday, May 23, 2018 - link
Because it's hard to explain that the critical line-of-business software or database is having some unknown edge-case issue because you thought "look at me, I'm so smart" and saved 1% of the project cost using unproven, low-penetration hardware.
daanno2 - Wednesday, May 23, 2018 - link
I'm guessing you've never dealt with expensive enterprise software before. They are mostly licensed per-core, so getting the absolute best performance per core, even if the CPU is 2-3x more expensive, is worth it. At the end of the day, the CPUs might be <5% of the total cost.
SirPerro - Wednesday, May 23, 2018 - link
You can swallow a big risk if the benefit is 75% of the cost. Hey, it's definitely worth the try.

If your hardware makes up 5% of the cost, saving 3% of the total budget is not worth the risk of migration.
FunBunny2 - Thursday, May 24, 2018 - link
"You can swallow a big risk if the benefit is 75% of the cost. Hey, it's definitely worth the try."

given the EOL of today's machines, the amortization schedules must be draconian. only if a 'different' server pays off in dozens of months, not years, will it have a chance. to the extent that enterprise software is a C/C++ and *nix codebase, porting won't be onerous. but, I'm willing to guess, even Oracle code isn't all that parallel, so throwing a truckload of teeny CPUs at it won't necessarily work.
name99 - Thursday, May 24, 2018 - link
The bigger problem here is the massive uncertainty around the meaning of the word "server" and thus the target for these new ARM CPUs.
Some people seem to think "server" means primarily boxes that run SAP or Oracle, but I think it's clear that the ARM ecosystem has little interest in that, at least right now.

What's of much more interest is racks on racks of CPUs running commodity (LAMP) or homegrown software, i.e. data warehouses and HPC. I'm not even sure the Java benchmarks being run are of much interest to this market. The things that matter are the sorts of things Cloudflare was measuring when they tested Centriq: memcached, nginx, transforming one type of data into another (compression/decompression, encrypt/decrypt, transcode, ...) at massive throughput.
That's where I'd expect to see the big sales of the ARM "server" cores -- to Cloudflare, Baidu, Google, and so on.

Also, now that Marvell is in the game, it will be interesting to see the extent to which they pull this downward, into their traditional sorts of markets like network infrastructure and storage control (e.g. to go into network appliances and NAS boxes).
Ed469546 - Wednesday, June 13, 2018 - link
For some commercial software you pay per core. Intel had the best single-threaded performance, meaning lower license costs.

An interesting question is how the ThunderX2 cores are counted in this case: one core can run 4 threads.
andrewaggb - Wednesday, May 23, 2018 - link
I wonder what workloads they are targeting? High throughput with poor single-threaded results is somewhat limiting.
peevee - Wednesday, May 23, 2018 - link
Web app servers. VM servers. Hadoop/Spark nodes. All benefit more from having more threads running in parallel instead of each request waiting or switching contexts.

If you are concerned about single-thread performance on a 256-thread server (as a 2-CPU server with this CPU will provide) AT ALL, you chose outrageously wrong hardware for the task to begin with. Go buy a 2-core i3. Practically the only test in this article which matters is Critical jOPS (assuming the quality-of-service metric used was configured realistically).
GeekyMcGeekface - Friday, May 25, 2018 - link
I’m building a cluster now with a few hundred Raspberry Pis because scale-up is expensive and stupid. By distributing across a pool of clusters, I can handle far more memory bandwidth and compute. Consider: 100 Raspberry Pis have 400 64-bit cores and 100GB of RAM. Total cost $3500 + power, mounting and switches.

Running three clusters of those with Kubernetes, Couchbase and Azure Functions provides 1200 64-bit cores, about 100GB of extremely high performance storage, incredible failover and a map-reduce environment to die for.

Add some 64GB MicroSD cards and an object storage system to the cluster and there’s 12TB of cold storage (4TB when made redundant).

Pay a service fee to some sweatshop in the Eastern Bloc to do the labor-intensive bits and you can build a massively parallel, almost impossible to crash, CI/CD friendly, multi-tenant, infinitely scalable PaaS... for less than the cost of the RAM for a single one of the servers here.

The only expensive bits in the design are the Netscalers.

Oh... and the power footprint is about the same as one of these servers.

I honestly have no idea what I would use a server like these for in a new design.
jospoortvliet - Wednesday, May 30, 2018 - link
Single-core performance with your Pis is considerably lower, as is inter-core bandwidth. If your tasks require little inter-process communication you're good, but with highly interdependent compute it won't perform well. But for specific tasks, yes, it might be very cost effective.
