Big Data 101

Many of you have experienced this: you get a massive (text) file (a log of several weeks, some crawled web data) and you need to extract data from it. Loading it into a text editor will not help you. The editor will probably choke and crash, as it cannot handle hundreds of gigabytes of text. Even if it does not, repeated searches (via grep, for example) are neither fast nor a particularly scientific way to analyze what is hidden inside that enormous heap of data.

Importing and normalizing it into a SQL database via the typical Extract, Transform and Load (ETL) process is tedious and extremely slow; after all, SQL databases are built to handle limited amounts of structured data.

That is why Google created MapReduce: by splitting those massive amounts of data into smaller slices (mapping) and reducing (aggregating, calculating, counting) them to the results that matter to you, you avoid the rather sequential and slow query execution plans that have to evaluate the whole database to produce meaningful results. Combine this with a redundant, distributed filesystem (HDFS) that keeps the data close to the processing nodes, and the result is that you do not need to import data before you can use it, you do not need the ultimate SSD to load that much data quickly, and you can distribute your data mining over a large cluster.
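
To make the idea concrete, here is a minimal single-machine sketch of the map/shuffle/reduce pattern in Python (illustrative only, not Hadoop's actual Java API): every input record is mapped to key/value pairs, the pairs are grouped by key, and each group is reduced to an aggregate. Hadoop runs these same phases in parallel across a cluster, reading the slices from HDFS.

```python
from collections import defaultdict

def map_phase(lines):
    # "Map": turn every record into (key, value) pairs
    for line in lines:
        for word in line.split():
            yield (word, 1)

def reduce_phase(pairs):
    # "Shuffle" groups the pairs by key, "reduce" sums the counts per key
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

lines = ["error timeout", "error disk full"]   # stand-in for a huge log file
print(reduce_phase(map_phase(lines)))          # {'error': 2, 'timeout': 1, 'disk': 1, 'full': 1}
```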

I am of course talking about the grandfather of all Big Data crunching: Hadoop. However, Hadoop had two significant disadvantages: although it could crunch through terabytes of data where most other data mining systems collapsed, it was pretty slow the moment you had to go through iterative steps, as it wrote the intermediate results to disk. It was also pretty bad if you just wanted to launch a quick, simple query.

Apache Spark 1.5: The Ultimate Big Data Cruncher

This brings us to Spark. In addressing Hadoop's disadvantages, the members of UC Berkeley's AMPLab invented a method of keeping the slices of data that need the same kind of operations in memory (Resilient Distributed Datasets, or RDDs). As a result, Spark makes much better use of DRAM than MapReduce, and also avoids many write operations.
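
The practical difference shows up as soon as a dataset is reused. Below is a minimal PySpark sketch (assuming a Spark 1.5 setup; the HDFS path is hypothetical): the filtered RDD is cached in DRAM after the first action, so every later pass is served from memory instead of being re-read and re-filtered from disk, which is exactly what makes iterative jobs so much faster than classic MapReduce.

```python
from pyspark import SparkContext

sc = SparkContext(appName="cache_example")

# Hypothetical log file on HDFS; the RDD is only a recipe until an action runs
errors = (sc.textFile("hdfs:///logs/weblog.txt")
            .filter(lambda line: "ERROR" in line)
            .cache())                     # keep these slices in DRAM after first use

print(errors.count())                                    # first action: reads from disk, fills the cache
print(errors.filter(lambda l: "timeout" in l).count())   # second pass: served from memory
```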

But there is more: the Spark framework also includes machine learning / AI libraries that you can use inside your Scala or Python code. The resulting workload is a marriage of machine learning and data mining that is highly parallel and does not need the best I/O technology to crunch through hundreds of gigabytes of data. In summary, Spark absolutely craves more CPU power and better RAM technology.
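
As a small illustration of that marriage, the sketch below (again assuming Spark 1.5 with MLlib; the CSV path is hypothetical) parses the data with an ordinary RDD transformation and then clusters it with MLlib's KMeans, all inside one Python program. The heavy lifting is CPU-bound and runs in parallel over the cached, in-memory dataset.

```python
from pyspark import SparkContext
from pyspark.mllib.clustering import KMeans

sc = SparkContext(appName="mllib_example")

# Hypothetical CSV of numeric features, parsed with a plain RDD transformation
points = (sc.textFile("hdfs:///data/features.csv")
            .map(lambda line: [float(x) for x in line.split(",")])
            .cache())                     # KMeans is iterative, so keep the data in RAM

model = KMeans.train(points, k=5, maxIterations=20)   # parallel, CPU-bound training
print(model.clusterCenters)
```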

Meanwhile, according to Intel, this kind of big data technology is the top growth driver of enterprise compute demand in the next few years, with enterprise demand expected to grow by 67%. And Intel is not the only one that has seen the opportunity: IBM has a Spark Technology Center in San Francisco and launched "Insight Cloud Services", a cloud service built on top of Spark.

Intel now has a specialized Big Data Solutions group, led by Ananth Sankaranarayanan. This group spearheaded the development of the "Big Bench" benchmark, which was adopted by a TPC group as TPCx-BB. The benchmark result is expressed in BBQs per minute (BBQ = Big Bench Queries). (ed: yummy)

Among the contributors are Cloudera, Cisco, VMware and ... IBM and Huawei. Huawei is the parent company of HiSilicon, which designs ARM server processors, and IBM of course has POWER8. Needless to say, the new benchmark is going to be an important battleground which might decide whether or not Intel will remain the dominant enterprise CPU vendor. We have yet to evaluate TPCx-BB ourselves, but the next page gives you some hard benchmark numbers.

Comments

  • Kevin G - Thursday, March 31, 2016 - link

    Much like how Apple skipped Haswell-EP, they also skipped a generation of cards from AMD and nVidia. So even if Apple doesn't wait for new GPUs, there is certainly an update on the GPU side.

    The more interesting possibility would be if Apple were to go with Xeon D in the Mac Pro instead of Broadwell-EP. Apple would need a big PLX chip considering the number of lanes they'd want to use, but it is possible.
  • bill.rookard - Thursday, March 31, 2016 - link

    Another issue is that they're not under any pressure from any competition to really innovate. I don't even remember the last time I read anything about Opteron servers... let alone something about any NEW Opterons.
  • ComputerGuy2006 - Thursday, March 31, 2016 - link

    A sign of things to come for Broadwell-e?

    Seems like a tricky situation. Because Skylake-E will come with a new platform in 2017, while Broadwell-E isn't the fastest IPC and there are crazy rumors it might cost $1500 (lol Intel). We also have Zen later this year that might give good performance with a good cost/perf ratio.
  • extide - Thursday, March 31, 2016 - link

    Yeah, so Intel only gives us the LCC part for the -E platform, so we will see the 10-core SKU as the top. It will either be $1000 or $1500 ... so yeah, not sure how that will end up. Although there will be 8 and 6 core options that should be pretty affordable.

    Hopefully they do an 8 core part with 28 lanes for under $500, as THAT would be a great deal!
  • dragonsqrrl - Sunday, April 3, 2016 - link

    I'm hoping the 8 core SKU is around $600, the position the x930K traditionally occupies. What makes me a little worried is that there will be 4 SKUs instead of 3 this time (one 10 core, one 8 core, and two 6 core), and I'm not sure there's enough room under the $600 price point for two 6 core processors.
  • jasonelmore - Thursday, March 31, 2016 - link

    Can it run Star Citizen?
  • theduckofdeath - Thursday, March 31, 2016 - link

    A question we'll never get an answer to? :D
  • JohanAnandtech - Friday, April 1, 2016 - link

    It probably runs mostly on Xeons. Well, the back end that is :-)
  • extide - Thursday, March 31, 2016 - link

    BOOM, 454mm^2 on the world's best process. The "other" 14/16nm processes use bigger geometry than Intel's 14nm process.

    Now we just need those other guys to catch up so we can see 450+mm^2 GPUs!
  • Kevin G - Thursday, March 31, 2016 - link

    Intel still has plenty of room to increase die size. The largest chip they've produced was the Tukwila Itanium 2 at 699 mm^2. Granted, that was a 65 nm design, but Haswell-EX is a juggernaut at 662 mm^2 on Intel's more recent 22 nm process. Seems reasonable that Skylake-EX could go to 32 cores, as Intel has >200 mm^2 of reticle limit left.

    As for GPUs, they're also huge. nVidia's GM200 is 601 mm^2 and AMD's Fiji is 'only' 596 mm^2, both on a 28 nm process. TSMC's 20 nm process was skipped, so even using the looser 16 nm FinFET, GPUs will see a significant shrink compared to those high end chips.
