CDN

Serving static files is a common task for web servers, but it is best offloaded from the application web servers to dedicated web servers that can be configured optimally for the job. Those servers, which together form the Content Delivery Network (CDN), are placed in networks closer to the end user, saving IP transport costs and improving response times. Since CDN servers should cache as many files as possible, they are ideal candidates for some high-capacity DIMM testing.

A big thanks to Wannes De Smet, my colleague at the Sizing Servers Lab (part of Howest University), for making the CDN test a reality.

Details of the CDN setup

We simulate our CDN test with one CDN server and three client machines on a 10 Gbit/s network. Each client machine simulates thousands of users requesting different files. The server runs Ubuntu 13.04 with Apache, which was chosen over NginX because it uses the file system cache whereas NginX does not. Apache has been tuned to provide maximum throughput.

Apache is set to use the event multi-processing module (MPM), configured as follows:

StartServers 1
MinSpareThreads 25
MaxSpareThreads 75
ThreadLimit 150
ThreadsPerChild 150
MaxClients 1500
MaxRequestsPerChild 0

Each client machine has an X540-T1 10 Gbit adapter connected to the same switch as the HP DL380 (see the benchmark configuration). To load balance traffic across the two bonded links, the balance-rr mode is selected for the bond. No network optimization has been carried out, as network conditions are optimal.

The requested static files come from a mirror of sourceforge.org's /a directory, containing 1.4 TB of data in 173,000 files. To model a real-world load on the CDN, two different usage patterns or workloads are executed simultaneously.

The first workload requests files from a predefined list of 3000 "hot" (frequently requested) files on the CDN with a combined concurrency of 50 (50 requests per second). These files have a high chance of residing in the cache. The second workload requests 500 random files from the CDN per concurrent user, with a maximum of 45 concurrent requests; it simulates users requesting older or less frequently accessed files.
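
To illustrate the request pattern, here is a minimal sketch of the two workloads in Python. The article does not name the actual load-generation tool, so the host name, file paths, and worker structure below are placeholders, not the lab's setup.

import random
import threading
import urllib.request

CDN_HOST = "http://cdn.example.com"                    # placeholder for the CDN server
HOT_FILES = [f"/a/hot/file{i}" for i in range(3000)]   # stands in for the 3000 "hot" files
ALL_FILES = [f"/a/file{i}" for i in range(173000)]     # stands in for the full 173,000-file mirror

def fetch(path):
    # Request one file and discard the body, as a CDN client would.
    try:
        with urllib.request.urlopen(CDN_HOST + path, timeout=30) as resp:
            while resp.read(1 << 16):
                pass
    except OSError:
        pass  # in a load test, errors are counted rather than fatal

def hot_worker():
    # Workload 1: keep requesting files from the predefined hot list.
    while True:
        fetch(random.choice(HOT_FILES))

def cold_worker():
    # Workload 2: request 500 random files from the whole mirror, then stop.
    for _ in range(500):
        fetch(random.choice(ALL_FILES))

# Combined concurrency of 50 on the hot list, up to 45 on random files.
workers = [threading.Thread(target=hot_worker, daemon=True) for _ in range(50)]
workers += [threading.Thread(target=cold_worker) for _ in range(45)]
for w in workers:
    w.start()
for w in workers[50:]:
    w.join()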

Test Run

An initial test run delivers all files straight off an iSCSI LUN. The LUN consists of a Linux bcache volume, with an Intel 720 SSD as the caching device and three 1TB magnetic disks in RAID-5 (handled by an Adaptec ASR72405 RAID adapter) as the backing device. Access is provided over a 10 Gbit SAN network. Serving these files fills the page cache, and thus memory; Linux will automatically use up to 95% of the available RAM for the page cache.
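
As a side note, bcache exposes its cache statistics through sysfs, so a short helper like the sketch below can show how much of the read traffic the SSD absorbs before falling back to the RAID-5 backing device. The device name and counter paths are assumptions based on the kernel's bcache documentation and may differ on other setups.

from pathlib import Path

# Hypothetical helper: print bcache cache statistics for /dev/bcache0.
STATS = Path("/sys/block/bcache0/bcache/stats_total")

for counter in ("cache_hits", "cache_misses", "cache_hit_ratio"):
    print(counter, (STATS / counter).read_text().strip())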

Subsequent runs are measured (both throughput and response time) and deliver as many files from memory as possible. How many depends on the amount of RAM installed and on how frequently a file is requested, as the page cache uses a Least Recently Used (LRU) algorithm to decide which pages remain in the cache.
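
Between runs, the fill level of the page cache can be tracked from /proc/meminfo. The short sketch below is a generic helper, not part of the benchmark itself; it reports how much of the installed RAM is currently used as cache.

# Report how much of RAM the page cache currently occupies, based on the
# MemTotal and Cached fields of /proc/meminfo (values are in kB).
values = {}
with open("/proc/meminfo") as f:
    for line in f:
        key, rest = line.split(":", 1)
        values[key] = int(rest.split()[0])

cached_mb = values["Cached"] / 1024
total_mb = values["MemTotal"] / 1024
print(f"Page cache: {cached_mb:.0f} MB "
      f"({100 * values['Cached'] / values['MemTotal']:.1f}% of {total_mb:.0f} MB)")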

Comments

  • slideruler - Thursday, December 19, 2013

    Am I the only one who's concerned about DDR4 in our future?

    Given that it's one-to-one, we'll lose the ability to stuff our motherboards with cheap sticks to get to a "reasonable" (>=128gig) amount of RAM... :(
  • just4U - Thursday, December 19, 2013

    You really shouldn't need more than 640kb.... :D
  • just4U - Thursday, December 19, 2013

    Seriously though... DDR3 prices have been going up. As near as I can tell, they're approximately 2.3X the cost of what they once were. Memory makers are doing the semi-happy dance these days and likely looking forward to the 5x pricing schemes of yesteryear.
  • MrSpadge - Friday, December 20, 2013

    They have to come up with something better than "1 DIMM per channel using the same amount of memory controllers" for servers.
  • theUsualBlah - Thursday, December 19, 2013

    The -Ofast flag for Open64 will relax ANSI and IEEE rules for calculations, whereas the GCC flags won't do that.

    Maybe that's the reason Open64 is faster.
  • JohanAnandtech - Friday, December 20, 2013

    Interesting comment. I ran with gcc, and with Opencc at -O2, -O3 and -Ofast. If the gcc binary is 100%, I get 110% with Opencc (-O2), 130% with -O3, and the same with -Ofast.
  • theUsualBlah - Friday, December 20, 2013

    Hmm, that's very interesting.

    I am guessing Open64 might be producing better code (at least) when it comes to memory operations. I gave up on Open64 a while back; maybe I should try it out again.

    Thanks!
  • GarethMojo - Friday, December 20, 2013

    The article is interesting, but alone it doesn't justify the expense for high-capacity LRDIMMs in a server. As server professionals, our goal is usually to maximise performance / cost for a specific role. In this example, I can't imagine that better performance (at a dramatically lower cost) would not be obtained by upgrading the storage pool instead. I'd love to see a comparison of increasing memory sizes vs adding more SSD caching, or combinations thereof.
  • JlHADJOE - Friday, December 20, 2013

    Depends on the size of your data set as well, I'd guess, and whether or not you can fit the entire thing in memory.

    If you can, and considering RAM is still orders of magnitude faster than SSDs, I imagine memory still wins out in terms of overall performance. If it's too large to fit in a reasonable amount of RAM, then yes, SSD caching would possibly be more cost effective.
  • MrSpadge - Friday, December 20, 2013

    One could argue that the storage optimization would be done for both memory configurations.
