Professional Performance: Windows

Agisoft Photoscan – 2D to 3D Image Manipulation: link

Agisoft Photoscan creates 3D models from 2D images, a process which is very computationally expensive. The algorithm is split into four distinct phases, and different phases of the model reconstruction require either fast memory, fast IPC, more cores, or even OpenCL compute devices to hand. Agisoft supplied us with a special version of the software to script the process, where we take 50 images of a stately home and convert them into a medium-quality model. This benchmark typically takes around 15-20 minutes on a high-end PC on the CPU alone, with GPUs reducing the time.

Agisoft Photoscan 1.0.0

On paper, Photoscan should offer more opportunities for faster memory to make a difference. However, the most memory-dependent stage (stage 3) appears to be a small part of the overall calculation, and any gain was absorbed by the natural variation in the larger stages, giving at most a 1.1% difference between times.

Cinebench R15

Cinebench R15 - Single Thread

Cinebench R15 - MultiThread

Cinebench has historically been CPU dependent, and here it shows only a 2% difference between the JEDEC and peak results.
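The percentage figures quoted throughout this page are relative differences against a baseline such as the JEDEC result. As a minimal sketch of that arithmetic, the helper below computes a relative change in percent. The function name `percent_change` and the two scores are hypothetical values chosen for illustration, not measurements from this review.

```python
def percent_change(baseline: float, result: float) -> float:
    """Relative change of `result` versus `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (result - baseline) / baseline * 100.0


# Hypothetical scores for illustration only, NOT data from this review.
jedec_score = 1000.0
peak_score = 1020.0
print(f"{percent_change(jedec_score, peak_score):.1f}%")  # prints "2.0%"
```

For score-based tests such as Cinebench, a positive value means the faster kit did better. For time-based tests such as Photoscan, a negative value is the improvement.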

3D Particle Movement

3DPM is a self-penned benchmark, taking basic 3D movement algorithms used in Brownian Motion simulations and testing them for speed. High floating point performance, MHz and IPC win in the single thread version, whereas the multithread version has to handle the threads and loves more cores.
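To give a feel for the kind of workload described above, here is a rough single-threaded sketch of random 3D particle movement. It is not the 3DPM source: the function name `move_particles`, the particle and step counts, and the choice of direction-sampling method are all assumptions made for illustration.

```python
import math
import random
import time


def move_particles(n_particles: int, n_steps: int, seed: int = 0):
    """Move each particle one unit per step in a uniformly random 3D direction."""
    rng = random.Random(seed)
    positions = [[0.0, 0.0, 0.0] for _ in range(n_particles)]
    for _ in range(n_steps):
        for p in positions:
            # Uniform direction on the unit sphere: z uniform in [-1, 1],
            # azimuth uniform in [0, 2*pi).
            z = rng.uniform(-1.0, 1.0)
            phi = rng.uniform(0.0, 2.0 * math.pi)
            r = math.sqrt(1.0 - z * z)
            p[0] += r * math.cos(phi)
            p[1] += r * math.sin(phi)
            p[2] += z
    return positions


if __name__ == "__main__":
    # Hypothetical problem size, chosen only to keep the run short.
    n_particles, n_steps = 1000, 100
    start = time.perf_counter()
    move_particles(n_particles, n_steps)
    elapsed = time.perf_counter() - start
    print(f"{n_particles * n_steps / elapsed:,.0f} particle-steps per second")
```

The inner loop is dominated by floating point math and random number generation on a small working set, which is consistent with a test that rewards clock speed and IPC rather than memory bandwidth.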

3D Particle Movement: Single Threaded

3D Particle Movement: MultiThreaded

3DPM is also relatively memory agnostic for DDR4 on Haswell-E, showing that DDR4-2133 is good enough.

Professional Performance: Linux

Built around several freely available benchmarks for Linux, Linux-Bench is a project spearheaded by Patrick at ServeTheHome to streamline about a dozen of these tests in a single neat package run via a set of three commands using an Ubuntu 14.04 LiveCD. These tests include fluid dynamics used by NASA, ray-tracing, molecular modeling, and a scalable data structure server for web deployments. We run Linux-Bench and have chosen to report a select few of the tests that rely on CPU and DRAM speed.

C-Ray: link

C-Ray is a simple ray-tracing program that focuses almost exclusively on processor performance rather than DRAM access. The test in Linux-Bench renders a heavy, complex scene, offering a large, scalable scenario.
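The core operation in a simple ray tracer is testing a ray against scene geometry. The sketch below shows a standard ray-sphere intersection test. It is an illustration of the technique, not code taken from C-Ray, and the function name `ray_sphere_hit` and its argument layout are assumptions.

```python
import math


def ray_sphere_hit(origin, direction, center, radius):
    """Return the distance along the ray to the nearest hit, or None on a miss.

    `direction` is assumed to be a unit-length vector.
    """
    # Vector from the sphere center to the ray origin.
    oc = [o - c for o, c in zip(origin, center)]
    # Quadratic t^2 + b*t + c = 0 for |origin + t*direction - center| = radius.
    b = 2.0 * sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None  # the ray's line never touches the sphere
    sqrt_disc = math.sqrt(disc)
    # Try the nearer root first; only hits in front of the origin count.
    for t in ((-b - sqrt_disc) / 2.0, (-b + sqrt_disc) / 2.0):
        if t > 0.0:
            return t
    return None


# A ray fired along +z at a unit sphere centered 5 units away.
print(ray_sphere_hit((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))  # prints 4.0
```

Each test is a handful of multiplies, adds, and one square root on values that fit in registers, which helps explain why this kind of workload leans on the processor rather than on DRAM.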

Linux-Bench c-ray 1.1 (Hard)

Natural variation gives a 4% difference, although the faster and denser memory gave slower times.

NAMD, Scalable Molecular Dynamics: link

Developed by the Theoretical and Computational Biophysics Group at the University of Illinois at Urbana-Champaign, NAMD is a set of parallel molecular dynamics codes for extreme parallelization up to and beyond 200,000 cores. The reference paper detailing NAMD has over 4000 citations, and our testing runs a small simulation where the calculation steps per unit time is the output vector.

Linux-Bench NAMD Molecular Dynamics

NAMD showed little difference between our memory kits, peaking at 0.7% above JEDEC.

NPB, Fluid Dynamics: link

Aside from LINPACK, there are many other ways to benchmark supercomputers in terms of how effective they are for various types of mathematical processes. The NAS Parallel Benchmarks (NPB) are a set of small programs originally designed for NASA to test their supercomputers in terms of fluid dynamics simulations, useful for airflow reactions and design.

Linux-Bench NPB Fluid Dynamics

Although the 4x8 GB results went backwards, faster memory does make a slight difference in NPB, peaking at a 4.3% performance increase for the 3000+ memory kits.

Redis: link

Many online applications rely on key-value caches and data structure servers to operate. Redis is an open-source, scalable web technology with a strong developer base, but it also relies heavily on memory bandwidth as well as CPU performance.

Linux-Bench Redis Memory-Key Store, 100x

When tackling a high number of users, Redis performs up to 17% better using 2800+ memory, which is our best benchmark result for faster memory.

120 Comments

  • Tunnah - Thursday, February 5, 2015 - link

    Solid data I can use to stop myself being impulsive and upgrading my rig, thank you!

    Every now and again I get upgrade pangs, trying to justify it with numbers, and this article does a great job of showing what I already know - my system is fine, an upgrade will only show results on paper.

    *Doffs cap*
  • HiTechObsessed - Thursday, February 5, 2015 - link

    Just further proof that faster (more expensive) RAM doesn't do anything for gaming. I laugh when people buy Dominator Platinums for 2x or even 3x the cost of regular Corsair or G Skill for solely gaming rigs.
  • FlushedBubblyJock - Sunday, February 15, 2015 - link

    Despair not, one must understand that inside that stupid thick skull, and beneath that irritating idiot bragging because he's so stupid he doesn't know any better, the doofus is happy, because he is so thick and so easily parted with his less than adequate money supply.

    So bottom line is every time dummy sits down to game, his moron noggin gets all fired up and happy because ignorance in that case is bliss.
  • MrSpadge - Thursday, February 5, 2015 - link

    This calibration at boot slowing the process down 5-8s: can't the system save the proper values from the last boot and start optimization from this point on? Wouldn't those values change only slowly, e.g. when the module is aging or their amount is changed?
  • name99 - Thursday, February 5, 2015 - link

    I understand that the goal here is to test the PAIR of Haswell-E and DDR4.
    However, when it becomes practical, might I suggest that you try for a comparison of
    (easier) AMD and DDR-4
    (harder) one of the ARM server chips and DDR-4

    The reason I suggest this is that we all know that Intel, especially on Xeon, has the best cache+memory controller subsystem in the business, which, by design, means they're the least helped or hurt by changes to DIMM performance. Vendors whose memory subsystems are not as spectacular will likely see larger swings in performance, and it would be of interest to see how large those swings are (which, in a way, also tells us something about the gap between these vendors' memory subsystems and Intel).
  • MikeMurphy - Thursday, February 5, 2015 - link

    I'm floored that precise timings aren't built into the EEPROM for each system to use. Why is XMP even necessary with DDR4??
  • davidthemaster30 - Thursday, February 5, 2015 - link

    I would have liked to see DDR3 clocked to 2133 15-15-15 (like the JEDEC DDR4 spec) vs DDR4 at the same speeds in single, dual, triple and quad channel to see scaling from DDR3 to DDR4 and from the # of channels. Also in the DDR3 vs DDR4 page, the specs for DDR4 are "DDR4-2133 14-14-14 350 2T" but I'm pretty sure that 350 is supposed to be 35 ... and the speed of the DDR3 for those tests is not stated.
  • Ranger101 - Friday, February 6, 2015 - link

    A very detailed, well written article, but for me, somewhat academic as
    the conclusion in comparative memory articles always seems to be the
    same. "There are a few edge cases where upgrading to faster memory makes
    sense."
  • galta - Friday, February 6, 2015 - link

    Yes, because this is the only logical conclusion.
    Having said that, the community should probably stop discussing RAM, at least until we get to DDR9
  • menting - Friday, February 6, 2015 - link

    that means never discussing RAM again :)
