Web Server Performance

Websites based on the LAMP stack - Linux, Apache, MySQL, and PHP - are very popular. Few people write HTML/PHP code from scratch these days, so we turned to a Drupal 7.21-based site. The web server is Apache 2.4.7 and the database is MySQL 5.5.38, running on top of Ubuntu 14.04 LTS.

Drupal powers massive sites (e.g. The Economist and MTV Europe) and has a reputation for being a hardware resource hog. That is a price more and more developers happily pay for lowering the time to market of their work. We tested the Drupal website with our vApus stress testing framework and increased the number of connections from 5 to 300.

We report the maximum throughput achievable with 95% of requests being handled faster than 1000 ms.
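To make the metric concrete, here is a minimal sketch of how such a figure could be derived from raw measurements. It is not how vApus computes it: the nearest-rank percentile method, the function names, and the sample numbers are all assumptions made for illustration.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]

def max_throughput_within_limit(runs, limit_ms=1000.0, pct=95):
    """runs is a list of (throughput_rps, response_times_ms) pairs, one per
    load level. Returns the highest throughput whose pct-th percentile
    response time stays below limit_ms, or None if no run qualifies."""
    passing = [tput for tput, times in runs if percentile(times, pct) < limit_ms]
    return max(passing) if passing else None

# Hypothetical measurements, not taken from the article.
runs = [
    (100, [200] * 19 + [900]),       # 95th percentile: 200 ms
    (150, [300] * 19 + [2000]),      # 95th percentile: 300 ms
    (200, [400] * 18 + [1200] * 2),  # 95th percentile: 1200 ms, too slow
]
best = max_throughput_within_limit(runs)
```

Note that a single slow outlier (the 2000 ms request in the second run) does not disqualify a load level, which is the point of using a percentile instead of the maximum.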

Drupal Website

Let us be honest: the graph above is not telling you everything. The truth is that, on the Xeon D and Xeon E5, we ran into several other bottlenecks (OS and database related) before we could ever measure a 1000 ms 95th percentile response time. So the actual throughput at a 1 second response time is higher.

Basically, the performance of the Xeon D and Xeon E5 was too high for our current benchmark setup. Let us zoom in a bit to get a more accurate picture. The picture below shows you the 95th percentile of the response time (Y-axis) versus the number of concurrent requests/users (X-axis). We did not show the results of the Atom C2750 beyond 200 req/s to keep the graph readable.

We warm up the machine with 5 concurrent requests, but that is not enough for some servers. Notice that the response time of the Xeon D between 50 and 200 requests per second is lower than at 25 requests per second. So let us start our analysis at 50 requests per second.

The Xeon E3-1230L clock speed fluctuates between 1.8, 2.3 and 2.8 GHz. It is an amazing low power chip, but you pay a price: the 95th percentile never goes below 100 ms. The highly clocked Xeon E3s like the 1240 keep the response time below 100 ms unless your website is hit more than 100 times per second.

The Xeon D once again delivers astonishing performance. Unless the load is more than 200 concurrent requests per second, the server responds within 100 ms. There is more. Imagine that you want to keep your 95th percentile response time below half a second. With a previous generation Xeon E3, even the 80W chip will hit that limit at around 200-250 requests per second. The Xeon D sustains about 800 (!) requests per second (not shown on graph) before a small percentage of the users experience that response time. In other words, you can sustain up to 4 times as many hits with the Xeon D-1540 compared to the E3.
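The "up to 4 times" figure follows directly from the request rates quoted above. The small sketch below spells out the arithmetic; the helper name is made up for illustration, and the inputs are the approximate numbers from the text, not new measurements.

```python
def capacity_ratio(new_rps, old_rps):
    """How many times more requests per second the new system sustains."""
    return new_rps / old_rps

xeon_d_rps = 800           # approx. rate at which the Xeon D reaches 500 ms
e3_rps_range = (200, 250)  # approx. range at which the 80W Xeon E3 reaches 500 ms

best_case = capacity_ratio(xeon_d_rps, e3_rps_range[0])   # E3 limited at 200 req/s
worst_case = capacity_ratio(xeon_d_rps, e3_rps_range[1])  # E3 limited at 250 req/s
print(f"Xeon D sustains {worst_case:.1f}x to {best_case:.1f}x the E3 load")
```

So the advantage lies somewhere between roughly 3.2x and 4x, depending on where in the 200-250 range the E3 actually hits the half-second limit.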

90 Comments

  • zodiacfml - Tuesday, June 23, 2015 - link

    this is the reason why Intel focuses on mobile, it benefits their server cpus too.

    the 14nm process is the one to thank for these massive improvements. Samsung also has 14nm and the S6 Exynos is in similar achievement
  • Refuge - Tuesday, June 23, 2015 - link

    I disagree, the Exynos is no where close to a similar achievement.

    Granted it is doing better than Qualcomm's equivalent at the moment.

    But I'm also faster than a fat man with a broken leg running on a hot and humid day.
  • zodiacfml - Tuesday, June 23, 2015 - link

    Still, these 14nm SoCs are the best in their class as they pack more cores while using less power.
  • LukaP - Thursday, June 25, 2015 - link

    Just a note, Samsung's (and TSMC's 16nm FF(+)) process isn't really 16nm entirely. The interconnects are still 28nm, making it not nearly as dense as Intel's 14nm, as well as being more leaky. IIRC their density and leakage can be compared to Intel's 22nm TriGate in the times of Ivy Bridge
  • nils_ - Tuesday, June 23, 2015 - link

    Few questions:
    1. Why did you disable x2apic?
    2. Did the Large Page allocation in the Java Benchmark actually work? It can be a bit tricky some times and then falls back to 4KiB pages
    3. What were the JVM settings for elasticsearch?
  • JohanAnandtech - Thursday, June 25, 2015 - link

    1. Was out of the box disabled. I have to admit I did not check that option. Performance impact should be negligible though.
    2. I have not monitored that, but there was a performance impact if we disabled it.
    3. ES_heap_size = 20 G; otherwise standard ES settings
  • Daniel Egger - Tuesday, June 23, 2015 - link

    Wow, that is still quite pricey here. For the price of the SuperMicro tower you can actually get a 1U 2S Xeon E5 system with one socket equipped and some memory. I'd really love to replace my home server (running on Core i5 rather than Xeon E3 for efficiency reasons, those C chipset suck balls) with one of those systems if they can make them efficient and quiet.
  • hifiaudio2 - Tuesday, June 23, 2015 - link

    Two questions:

    1. How does the Xeon D compare to the c2700 series for a home NAS that will also serve as an Emby server and HDHR DVR (when that software is available). Could be one or two 1080p transcodes going on at the same time at most. Usually no transcoding if I am using Kodi or something that can natively play back the file, but for remote viewing or random uses over the network, some transcoding by Emby could be required -- if you are not familiar with Emby think of the same thing using Plex. So would the extra power of the Xeon D be of use to me, or is the 8 core c2750 plenty for the aforementioned use case?

    2. If I do go with this unit, which dimms specifically does it use? The Supermicro c2750 board takes laptop style dimms. What does this take?
  • JohanAnandtech - Tuesday, June 23, 2015 - link

    I can answer 2: see the picture here: http://www.anandtech.com/show/9185/intel-xeon-d-re... RDIMMs or UDIMMS (= basically "normal" DDR-4) will do.
  • hifiaudio2 - Tuesday, June 23, 2015 - link

    Thanks.. So this ram:?

    http://www.amazon.com/Crucial-PC4-2133-Registered-...

    And what is the SR x4 / DR x8 difference in the two choices for the 8gb sticks?
