Readers of our motherboard review section will have noted the trend in modern motherboards to implement a form of MultiCore Enhancement / Acceleration / Turbo (read our report here).  This does several things: it gives better benchmark results at stock settings (not entirely needed if overclocking is an end-user goal) at the expense of extra heat and higher temperatures, and it is in essence an automatic overclock, which may be against what the user wants.  Our testing methodology is ‘out-of-the-box’, with the latest public BIOS installed and XMP enabled, and thus subject to the whims of this feature.  It is ultimately up to the motherboard manufacturer to take this risk – and manufacturers take risks in the setup of every product (think C-state settings, USB priority, DPC Latency / monitoring priority, memory subtimings at JEDEC).  Processor speed change is the clearly visible part of that risk, and if no overclocking is planned, the choice of motherboard will affect how fast that shiny new processor goes, which can be an important factor in the purchase.

Using our consumer level i7-4960X CPU, the P9X79-E WS does implement MultiCore Turbo when XMP is enabled.  This runs the CPU at a full 4.0 GHz under any loading.

Point Calculations - 3D Movement Algorithm Test

The algorithms in 3DPM employ either uniform random number generation or normal distribution random number generation, and vary in the amount of trigonometric operations, conditional statements, generation and rejection, fused operations, etc.  The benchmark runs through six algorithms for a specified number of particles and steps, and calculates the speed of each algorithm, then sums them all for a final score.  This is an example of a real world situation that a computational scientist may find themselves in, rather than a pure synthetic benchmark.  The benchmark is also parallel between the particles simulated, and we test the single thread performance as well as the multi-threaded performance.
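To give a feel for the kind of work described above, here is a minimal sketch in Python of one possible particle movement algorithm.  It is not the 3DPM source: the function names `random_unit_step` and `simulate` are hypothetical, and the direction sampling (uniform cos(theta) plus a uniform angle) is just one plausible choice that mixes uniform random numbers with trigonometric operations.

```python
import math
import random


def random_unit_step(rng):
    """Return a unit-length step in a uniformly random 3D direction."""
    z = rng.uniform(-1.0, 1.0)               # cos(theta), uniform on [-1, 1]
    phi = rng.uniform(0.0, 2.0 * math.pi)    # azimuthal angle
    r = math.sqrt(1.0 - z * z)               # radius of the circle at height z
    return (r * math.cos(phi), r * math.sin(phi), z)


def simulate(particles, steps, seed=0):
    """Move each particle `steps` times from the origin; return final positions.

    Particles are independent of one another, which is why this kind of
    workload parallelises across particles.
    """
    rng = random.Random(seed)
    positions = []
    for _ in range(particles):
        x = y = z = 0.0
        for _ in range(steps):
            dx, dy, dz = random_unit_step(rng)
            x += dx
            y += dy
            z += dz
        positions.append((x, y, z))
    return positions
```

A benchmark built on this idea would time `simulate` for a fixed particle and step count, once on a single thread and once with the particles split across threads.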

3D Particle Movement Single Threaded

3D Particle Movement MultiThreaded

The P9X79-E WS results come across as very efficient in our 3DPM ST test, with none of the issues we found with the RIVBE coming through.

Compression - WinRAR 4.2

With 64-bit WinRAR, we compress the set of files used in the USB speed tests. WinRAR x64 3.93 attempts to use multithreading when possible, and provides a good test for when a system has a variable threaded load.  WinRAR 4.2 does this a lot better! If a system has multiple speeds to invoke at different loading, the switching between those speeds will determine how well the system will do.

WinRAR 4.2

Image Manipulation - FastStone Image Viewer 4.2

FastStone Image Viewer is a free piece of software I have been using for quite a few years now.  It allows quick viewing of flat images, as well as resizing, changing color depth, adding simple text or simple filters.  It also has a bulk image conversion tool, which we use here.  The software currently operates only in single-thread mode, which should change in later versions of the software.  For this test, we convert a series of 170 files, of various resolutions, dimensions and types (of a total size of 163MB), all to the .gif format of 640x480 dimensions.

FastStone Image Viewer 4.2

Again, the P9X79-E WS blasts past the RIVBE here due to the turbo issue.

Video Conversion - Xilisoft Video Converter 7

With XVC, users can convert any type of normal video to any compatible format for smartphones, tablets and other devices.  By default, it uses all available threads on the system, and in the presence of appropriate graphics cards, can utilize CUDA for NVIDIA GPUs as well as AMD WinAPP for AMD GPUs.  For this test, we use a set of 33 HD videos, each lasting 30 seconds, and convert them from 1080p to an iPod H.264 video format using just the CPU.  The time taken to convert these videos gives us our result.

Xilisoft Video Converter 7

Rendering – PovRay 3.7

The Persistence of Vision RayTracer, or PovRay, is a freeware package for, as the name suggests, ray tracing.  It is a pure renderer, rather than modeling software, but the latest beta version contains a handy benchmark for stressing all processing threads on a platform. We have been using this test in motherboard reviews to test memory stability at various CPU speeds to good effect – if it passes the test, the IMC in the CPU is stable for a given CPU speed.  As a CPU test, it runs for approximately 2-3 minutes on high end platforms.

PovRay 3.7 Multithreaded Benchmark

Video Conversion - x264 HD Benchmark

The x264 HD Benchmark uses a common HD encoding tool to process an HD MPEG2 source at 1280x720 at 3963 Kbps.  This test represents a standardized result which can be compared across other reviews, and is dependent on both CPU power and memory speed.  The benchmark performs a 2-pass encode, and the results shown are the average of each pass performed four times.

x264 HD Benchmark Pass 1

x264 HD Benchmark Pass 2

Grid Solvers - Explicit Finite Difference

For any grid of regular nodes, the simplest way to calculate the next time step is to use the values of those around it.  This makes for easy mathematics and parallel simulation, as each node calculated is only dependent on the previous time step, not the nodes around it on the current calculated time step.  By choosing a regular grid, we reduce the levels of memory access required for irregular grids.  We test both 2D and 3D explicit finite difference simulations with 2^n nodes in each dimension, using OpenMP as the threading operator in single precision.  The grid is isotropic and the boundary conditions are sinks.  Values are floating point, with memory cache sizes and speeds playing a part in the overall score.
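As an illustration of the explicit scheme described above, the following Python sketch performs one time step of a 2D diffusion-style update on a square grid with sink (zero) boundaries.  The benchmark itself uses OpenMP in single precision; the function name `explicit_step` and the coefficient `alpha` (diffusivity times time step divided by grid spacing squared) are assumptions made for this example.

```python
def explicit_step(grid, alpha):
    """Advance a square 2D grid by one explicit finite difference step.

    Every new value depends only on the previous time step, so each node
    can be computed independently of the others (hence easy threading).
    Boundary nodes are held at zero, i.e. they act as sinks.
    """
    n = len(grid)
    new = [[0.0] * n for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            # Five-point stencil: the four neighbours minus four times the centre.
            laplacian = (grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]
                         - 4.0 * grid[i][j])
            new[i][j] = grid[i][j] + alpha * laplacian
    return new
```

For this 2D stencil the update is only stable when `alpha` is at most 0.25, which is the time step restriction that the implicit method relaxes.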

Explicit Finite Difference Grid Solver (2D)

Explicit Finite Difference Grid Solver (3D)

Interestingly, something seems to be holding back the 2D Explicit numbers.

Grid Solvers - Implicit Finite Difference + Alternating Direction Implicit Method

The implicit method takes a different approach to the explicit method – instead of considering one unknown in the new time step to be calculated from known elements in the previous time step, we consider that an old point can influence several new points by way of simultaneous equations.  This adds to the complexity of the simulation – the grid of nodes is solved as a series of rows and columns rather than points, reducing the parallel nature of the simulation by a dimension and drastically increasing the memory requirements of each thread.  The upside, as noted above, is the less stringent stability rules related to time steps and grid spacing.  For this we simulate a 2D grid of 2^n nodes in each dimension, using OpenMP in single precision.  Again our grid is isotropic with the boundaries acting as sinks. Values are floating point, with memory cache sizes and speeds playing a part in the overall score.
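The row-by-row solve can be sketched as follows.  This Python example handles a single row of interior nodes with sink boundaries by solving the resulting tridiagonal system of simultaneous equations with the Thomas algorithm; an alternating direction implicit solver would apply such a solve along rows and then along columns.  The function names and the coefficient `alpha` are assumptions for illustration, not the benchmark's code.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i], diag[i] and upper[i] are the coefficients of x[i-1], x[i]
    and x[i+1] in equation i; lower[0] and upper[-1] are never used in
    the result.
    """
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):                      # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def implicit_row_step(row, alpha):
    """One implicit step along a row of interior nodes.

    Solves (1 + 2*alpha)*u[i] - alpha*u[i-1] - alpha*u[i+1] = row[i],
    with the nodes just outside the row held at zero (sinks).
    """
    n = len(row)
    lower = [-alpha] * n
    upper = [-alpha] * n
    diag = [1.0 + 2.0 * alpha] * n
    return solve_tridiagonal(lower, diag, upper, row)
```

Each unknown in the row depends on its neighbours at the new time step, so the row has to be solved as a unit; the parallelism comes from handing different rows (or columns) to different threads.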

Implicit Finite Difference Grid Solver (2D)

Point Calculations - n-Body Simulation

When a series of heavy mass elements are in space, they interact with each other through the force of gravity.  Thus when a star cluster forms, the interaction of every large mass with every other large mass defines the speed at which these elements approach each other.  When dealing with millions and billions of stars on such a large scale, the movement of each of these stars can be simulated through the physical theorems that describe the interactions. The benchmark detects whether the processor is SSE2 or SSE4 capable, and implements the relevant code.  We run a simulation of 10240 particles of equal mass - the output for this code is in terms of GFLOPs, and the result recorded was the peak GFLOPs value.
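The all-pairs interaction described above can be sketched in a few lines of Python.  This is a direct summation of Newtonian gravity between equal masses and is only an illustration: the function name `accelerations`, the unit gravitational constant and the `softening` term (a common trick to avoid division by zero when two bodies coincide) are assumptions, and the benchmark's own code uses C++ AMP and SSE paths rather than anything shown here.

```python
import math


def accelerations(positions, mass=1.0, g=1.0, softening=0.0):
    """Return the gravitational acceleration on every body (direct summation).

    Every body interacts with every other body, so the cost grows with
    the square of the number of particles.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        xi, yi, zi = positions[i]
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            dz = positions[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + softening * softening
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            # a_i += G * m_j * (r_j - r_i) / |r_j - r_i|^3
            acc[i][0] += g * mass * dx * inv_r3
            acc[i][1] += g * mass * dy * inv_r3
            acc[i][2] += g * mass * dz * inv_r3
    return acc
```

A GFLOPs figure for such a kernel would come from counting the floating point operations in the inner loop, multiplying by the number of pair interactions and dividing by the elapsed time.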

n-body Simulation via C++ AMP


53 Comments


  • Hammerfist - Friday, January 10, 2014 - link

    What are the effects of PLX chip when using two or more R9 290 in crossfire ?
    We know that when doing AFR, the R9 290 and R9 290X use the PCIe lanes to move the frame around from one GPU to another.
    A frame time testing with two GPUs in different lanes will be very interesting .
    PCIe 1 - PCIe 2 -> Goes through the PLX chip and QS
    PCIe 2 - PCIe 3 -> Goes through QS only
    PCIe 1 - PCIe 5 -> Goes through both PLX chips
    PCIe 2 - PCIe 6 -> Goes through both PLX and QS chips
    and possibly more combinations.
    I am not saying that all possible combinations need to be tested, just two combinations to give us an idea of the latency involved is good enough, like
    1) PCIe1 - PCIe3 (only PLX)
    2) PCIe2 - PCIe6 (both PLX and QS)
  • Ian Cutress - Saturday, January 11, 2014 - link

    I did some PLX testing on various Z87 motherboards that use one of the chips, and the overall deficit over ideal routing was a 1-2% loss per PLX chip in the worst case scenario. This is better than the old NF200s, which had up to a 5-10% loss iirc? Of course with X79 it's a little different in that the CPU could go for an x16/x8/x8/x8 layout, and the question is whether going for x16/x16/x16/x16 would make a difference. While I don't have 290 cards to hand, I do have 7970s and now GTX 770s to do a small comparison in the future.
  • watersb - Saturday, January 11, 2014 - link

    Ian, thanks very much for this review.

    I am not a gamer, but my science and storage workloads are well met by Xeon workstations. The build-your-own route can make financial sense sometimes, depends on the job.

    Glad you are there checking it all out.
  • mapesdhs - Saturday, January 11, 2014 - link

    The main benefit of a DIY oc build is gaining access to the performance equivalent to an
    expensive high-core XEON on a lower budget. XEONs with lots of cores have much lower
    clocks, so a 6-core SB-E or IB-E at 4.7+ runs very well. There are tradeoffs of course,
    such as non-ECC RAM being used; this might rule out the idea for some tasks. Still, there's
    a lot of scope for building something fast without breaking the bank. If one needs a degree
    of reliability though then I guess just step back a step or two on the oc, say 4.5GHz, and/or
    go for top-end cooling by default such as an H110 + suitable case.

    Ian.
  • Pooyan - Saturday, January 11, 2014 - link

    Great article, Ian. Although I wish you focused more on workstation aspects of the motherboard, not gaming and stuff :D
    1. Do you know any motherboards from other manufacturers with similar specs?
    2. ASUS says it's a CEB motherboard. So the case has to be CEB as well? Or can it be E-ATX? Isn't that kinda small for it?
    Thanks again for the review.
  • mapesdhs - Saturday, January 11, 2014 - link

    The only other board I could find that came close in overall concept to ASUS' X79 WS
    series is Asrock's X79 Extreme 11. However, apart from being quite a bit more expensive,
    in the end I felt Asrock messed up a bit by not using a SAS controller with any onboard
    cache, which can spoil 4K performance. Given the board cost, I can't imagine why they
    didn't choose an equivalent LSI chip that had a 1GB cache or something, would have
    been much better. Maybe the added cost was just too much.

    Can't remember offhand about CEB vs. EATX; I think CEB means the board can be
    deeper as well as longer. Either way, fits fine in a HAF 932, though the case I'd
    recommend atm is an Aerocool X-Predator. Caveat: if one has to move a system
    around a lot, eg. transport to company sites, then choose a different case that has
    handles. Either way, for max expandability, use a 10-slot case.

    Ian.
  • Pooyan - Tuesday, January 14, 2014 - link

    I thought it only fits in a CEB case. That's why I was gonna get a Silverstone RV03, because that's the only CEB I could find. This is a great help for me. It means I have other options for the case. Thanks a lot!
  • mapesdhs - Tuesday, June 7, 2016 - link

    An old thread I know, but a minor update for anyone who finds this for some reason as I recently built an editing setup with a P9X79-E WS I managed to get for only 200 quid (fitted with an i7 3970X, Quadro 6000, GTX 580 3GB, etc.): now I'm using a Corsair C70 Military Green case, definitely better. More rear slots than the HAF 932, though I'm only using two NDS fans with the H110 (decided after several builds that four is unnecessary). The C70 has fewer front 5.25" bays than the 932, but using more SSDs, etc. has meant that's not an issue.

    Hoping to see if it's possible to boot from a 950 Pro soon...

    Ian.
  • Umbongo - Saturday, January 11, 2014 - link

    "Being a Workstation board, the P9X79-E WS is designed to accept any socket 2011 Xeon, as well as ECC memory – up to 64GB is listed on the specification sheet, although 16GB ECC DRAM modules are now available through Newegg for $210 each."

    The X79 chipset supports unbuffered ECC with a Xeon. 16GB DIMMs are not available as ECC unbuffered, only ECC registered. You need a C600 series chipset with a Xeon to use registered memory.
  • Ian Cutress - Saturday, January 11, 2014 - link

    Ah, I thought I had seen 16GB unregistered memory. Seems like I was mistaken (!)
