CDN

Serving static files is a common task for web servers, but it is best to offload it from the application web servers to dedicated web servers that can be configured optimally. Those servers, which together form the Content Delivery Network (CDN), are placed in networks closer to the end user, saving IP transit costs and improving response times. These CDN servers need to cache as many files as they can, and are thus ideal candidates for some high-capacity DIMM testing.

A big thanks to Wannes De Smet, my colleague at the Sizing Servers Lab, part of Howest University, for making the CDN test a reality.

Details of the CDN setup

We simulate our CDN test with one CDN server and three client machines on a 10 Gbit/s network. Each client machine simulates thousands of users requesting different files. Our server runs Ubuntu 13.04 with Apache. Apache was chosen over NginX because it uses the file system cache, whereas NginX does not, and it has been tuned to provide maximum throughput.

Apache is set to use the event multi-processing module (MPM), configured as follows:

# Start with a single child process; the event MPM spawns more as needed.
StartServers 1
# Keep between 25 and 75 idle worker threads available at all times.
MinSpareThreads 25
MaxSpareThreads 75
# Each child process runs up to 150 worker threads.
ThreadLimit 150
ThreadsPerChild 150
# Serve at most 1500 simultaneous connections.
MaxClients 1500
# Never recycle child processes (0 = unlimited requests per child).
MaxRequestsPerChild 0

Each client machine has an X540-T1 10 Gbit/s adapter connected to the same switch as the HP DL380 (see benchmark configuration). To load balance traffic across the two links, the balance-rr (round-robin) mode is selected for the bond. No further network optimization has been carried out, as network conditions are optimal.
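
As a rough sketch, a balance-rr bond on Ubuntu 13.04 with the ifenslave package could be declared in /etc/network/interfaces along these lines; the interface names and addresses below are hypothetical, not the lab's actual settings:

# Hypothetical bond of two 10 Gbit/s ports in round-robin mode
auto bond0
iface bond0 inet static
    address 192.168.10.10
    netmask 255.255.255.0
    bond-mode balance-rr
    bond-miimon 100
    bond-slaves eth2 eth3

In this style of configuration the two physical ports would each carry a matching "bond-master bond0" stanza so they are enslaved to the bond at boot.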

The static files requested originate from sourceforge.org: a mirror of its /a directory, containing 1.4TB of data in 173,000 files. To model real-world load on the CDN, two different usage patterns or workloads are executed simultaneously.

The first workload requests a predefined list of 3000 files from the CDN with a combined concurrency of 50 (50 requests per second), resembling files that are "hot", i.e. requested frequently; these files have a high chance of residing in the cache. The second workload requests 500 random files from the CDN per concurrency level (with a maximum of 45 concurrent requests), simulating users that request older or less frequently accessed files.
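
As a rough illustration of how such a mixed load can be generated, the Python sketch below runs a hot workload and a random workload side by side against a CDN host; the host name, file lists, and request counts are hypothetical stand-ins, not the actual client software used in the lab:

import random
import threading
import urllib.request
from concurrent.futures import ThreadPoolExecutor

CDN = "http://cdn.example.test"                                  # hypothetical CDN server
HOT_FILES = ["/a/hot/file%04d.tar.gz" % i for i in range(3000)]  # predefined "hot" list
ALL_FILES = ["/a/file%06d.zip" % i for i in range(173000)]       # entire mirrored /a tree

def fetch(path):
    # Download one file and throw the body away; return the number of bytes read.
    with urllib.request.urlopen(CDN + path, timeout=30) as resp:
        return len(resp.read())

def run_workload(files, concurrency, total_requests):
    # Issue total_requests randomly chosen requests, at most `concurrency` at a time.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        paths = (random.choice(files) for _ in range(total_requests))
        return sum(pool.map(fetch, paths))

if __name__ == "__main__":
    # Hot workload (cache-friendly) and random workload (cache-hostile) run side by side.
    hot = threading.Thread(target=run_workload, args=(HOT_FILES, 50, 10000))
    cold = threading.Thread(target=run_workload, args=(ALL_FILES, 45, 500))
    hot.start(); cold.start()
    hot.join(); cold.join()

Two separate thread pools keep the hot and random request streams at their own concurrency levels, mirroring the two usage patterns described above.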

Test Run

An initial test run delivers all files straight off an iSCSI LUN. The LUN consists of a Linux bcache volume, with an Intel 720 SSD as the caching device and three 1TB magnetic disks in RAID-5 (handled by an Adaptec ASR72405 RAID adapter) as the backing device. Access is provided over a 10 Gbit/s SAN network. Serving these files fills the page cache, and thus memory: Linux will automatically use up to 95% of the available RAM for the page cache.
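
A quick way to verify how far the page cache has filled memory between runs is to read /proc/meminfo; here is a minimal Python sketch (the field names are standard Linux, the reporting format is just for illustration):

def meminfo():
    # Parse /proc/meminfo into a dict of values in kB.
    fields = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":")
            fields[key] = int(value.split()[0])
    return fields

if __name__ == "__main__":
    info = meminfo()
    total = info["MemTotal"]
    cached = info["Cached"] + info.get("Buffers", 0)
    print("Page cache: %.0f MB of %.0f MB RAM (%.1f%%)"
          % (cached / 1024, total / 1024, 100.0 * cached / total))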

Subsequent runs are measured (both throughput and response time) and deliver as many files from memory as possible. How many depends on the amount of RAM installed and on how frequently a file is requested, as the page cache uses a Least Recently Used algorithm to decide which pages remain in cache.
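
To illustrate the eviction behavior described above, here is a toy LRU cache in Python; note that the real Linux page cache uses an approximate two-list LRU, so this is only a conceptual model:

from collections import OrderedDict

class LRUCache:
    # Toy page-cache model: keeps the most recently used items, evicts the rest.
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def access(self, key, load):
        if key in self.pages:                 # cache hit: mark as most recently used
            self.pages.move_to_end(key)
            return self.pages[key]
        value = load(key)                     # cache miss: fetch from the backing store
        self.pages[key] = value
        if len(self.pages) > self.capacity:   # evict the least recently used entry
            self.pages.popitem(last=False)
        return value

A file that is still cached is served from memory; anything evicted falls back to load(), which in this setup corresponds to a read from the iSCSI LUN.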

Comments

  • JohanAnandtech - Friday, December 20, 2013 - link

    First of all, if your workload is read intensive, more RAM will almost always be much faster than any flash cache. Secondly, it greatly depends on your storage vendor whether adding more flash can be done at "dramatically lower cost". The tier-one vendors still charge an arm and a leg for flash cache, while the server vendors are working at much more competitive prices. I would say that in general it is cheaper and more efficient to optimize RAM caching versus optimizing your storage (unless you are write limited).
  • blaktron - Friday, December 20, 2013 - link

    Not only are you correct, but significantly so. Enterprise flash storage at decent densities is more costly PER GIG than DDR3. Not only that, but you need the 'cadillac' model SANs to support more than 2 SSDs. Not to mention fabric management is a lot more resource intensive and more prone to error.

    Right now, the best bet (like always) to get performance is to stuff your servers with memory and distribute your workload, because it's poor network architecture that creates bottlenecks in any environment where you need to stuff more than 256GB of RAM into a single box.
  • hoboville - Friday, December 20, 2013 - link

    Another thing about HPC is that, as long as a processor has enough RAM to work on its dataset on the CPU/GPU before it needs more data, the quantity of RAM is enough. Saving on RAM can let you buy more nodes, which gives you more performance capacity.
  • markhahn - Saturday, January 4, 2014 - link

    headline should have been: if you're serving static content, your main goal is to maximize ram per node. not exactly a shocker eh? in the real world, at least the HPC corner of it, 1G/core is pretty common, and 32G/core is absurd. hence, udimms are actually a good choice sometimes.
  • mr map - Monday, January 20, 2014 - link

    Very interesting article, Johan!

    I would very much like to know what specific memory model (brand, model number) you are referring to regarding the 32GB LRDIMM-1866 option.
    I have searched to no avail.
    Johan? / Anyone?
    Thank you in advance!
    / Tomas
  • Gasaraki88 - Thursday, January 30, 2014 - link

    A great article as always.
  • ShirleyBurnell - Tuesday, November 5, 2019 - link

    I don't know why people are still going after server hardware. I mean, it's the 21st century; now everything is on the cloud, where you have the ability to scale your server anytime you want. Hosting provider companies like AWS, DigitalOcean, Vultr hosting https://www.cloudways.com/en/vultr-hosting.php, etc. have made it very easy to rent your server.
