Scale-Out Big Data Benchmark: ElasticSearch

ElasticSearch is an open source, full-text search engine that can be run on a cluster relatively easily. It's basically like an open source version of Google Search that can be deployed in an enterprise. It should be one of the poster children of scale-out software and is one of the representatives of the so-called "Big Data" technologies. Thanks to Kirth Lammens, one of the talented researchers at my lab, we have developed a benchmark that searches through all the Wikipedia content (roughly 40GB). Elasticsearch is, like many Big Data technologies, built on Java.
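To give an idea of what a search benchmark of this kind involves, here is a minimal Python sketch. It is not the benchmark we actually ran: the index name "wikipedia", the field name "text", the localhost URL and the helper names are all assumptions made for illustration. It only relies on Elasticsearch's standard `_search` REST endpoint and a `match` query.

```python
import json
import time
import urllib.request


def build_match_query(terms, field="text", size=10):
    """Build the body of a full-text `match` query.

    The field name "text" is an assumption; use whatever field
    holds the article body in your own index.
    """
    return {"size": size, "query": {"match": {field: terms}}}


def search(base_url, index, body, timeout=30):
    """POST a query body to <base_url>/<index>/_search and return the parsed JSON."""
    req = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def run_benchmark(queries, send):
    """Send one match query per search phrase and return queries per second.

    `send` is any callable that takes a query body, so the loop can be
    exercised without a running cluster.
    """
    start = time.perf_counter()
    for terms in queries:
        send(build_match_query(terms))
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed if elapsed > 0 else float("inf")
```

Against a live node, this could be driven with something like `run_benchmark(phrases, lambda body: search("http://localhost:9200", "wikipedia", body))`. A real test would also issue queries from many concurrent clients, which this single-threaded loop does not attempt.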

We are not sure why, but installing IBM's JDK caused a lot of headaches. For some reason the JVM stopped working in the middle of our tests. We got the same behavior running Apache Spark. This could be a result of our lack of experience with the IBM JDK, or the fact that the Linux LE ecosystem is still young. To cut a long story short, we ended up using OpenJDK 8, which is part of the Ubuntu 15.04 distribution. OpenJDK is very similar to and based upon the same code as Oracle's HotSpot JDK.

We limited the systems to one socket to avoid the issues associated with garbage collection pauses and other scaling issues. There is a reason why many Java benchmarks on these massive machines use multiple JVMs.

Elastic Search

Although the POWER8 can probably perform a bit better with the IBM JDK, performance is in the same league as the best Xeons. As a further point of comparison, we also included the score of the Xeon D from our previous article.

146 Comments


  • Mondozai - Friday, November 6, 2015 - link

    That's too bad. Over 90% of the world population exists outside of it and even if you look at the HPC market, the vast majority of that is, too.

    The world doesn't revolve around you. Get out of your bubble.
  • bji - Friday, November 6, 2015 - link

    He never claimed the world revolved around him, he just made a true statement that may be worth consideration. Your response is unnecessarily hostile and annoying.

    I would expand Jtaylor1986's statement: I believe that most if not all native English speaking populations use commas for thousands grouping in numbers. Since this site is written in English, it might be worthwhile to stick to conventions of native English speakers.

    It's possible that there are many more non-native English speakers reading this site who would prefer dots instead of commas, but I doubt it. Only the site maintainers would know though.
  • Jtaylor1986 - Friday, November 6, 2015 - link

    You read my mind :)
  • mapesdhs - Tuesday, November 10, 2015 - link

    Talking to numerous people around Europe about tech stuff, I can't think of any nation from which someone used the decimal point in their emails instead of a comma in this context. I'd assumed the comma was standard for thousands groupings. So which non-US countries do use the point instead? Anyone know?
  • lmcd - Friday, November 6, 2015 - link

    Cool on the rest of the world part, but the period vs comma as delimiters in the world numeric system ARE backward. In language (universal, or nearly), a comma is used to denote a pause or minor break, and a period is used to denote the end of a complete thought or section. Apply that to numerics and you end up with the American way of doing it.

    ^my take
  • JohanAnandtech - Saturday, November 7, 2015 - link

    Just for the record, this was not an attempt to nag the US people. Just the mighty force of habit.
  • ZeDestructor - Saturday, November 7, 2015 - link

    For future use: just use a space for thousands separation (that's how I do it on anything that isn't limited to a 7seg-style display), and confuse readers by mixing commas and periods for decimals :P
  • tygrus - Sunday, November 8, 2015 - link

    I like to use a full stop for the decimal point, an apostrophe for the thousands separator, and a comma for separating items in a list, and I don't start a sentence with a digit.
    One list of numbers may be : 3'500'000, 45.08, 12'500.8, 9'500. Second list : 45'000, 15'000, 25'000. We use apostrophes when we contract words like don't so why not use it for contracting numbers where we would otherwise have the words thousand, millions, billions etc ?
  • mapesdhs - Tuesday, November 10, 2015 - link

    I have a headache in my eyeballs! :D
  • ws3 - Friday, November 6, 2015 - link

    North America is on the majority side on this issue. Asia, in particular, is almost completely on the side of using a dot as the decimal separator and a comma to put breaks in long numbers.

    Get with the program Europe. The world doesn't revolve around you!
