CPU Benchmarks

Point Calculations - 3D Movement Algorithm Test

The algorithms in 3DPM employ either uniform or normally distributed random number generation, and vary in the amount of trigonometric operations, conditional statements, generation and rejection, fused operations, etc. The benchmark runs through six algorithms for a specified number of particles and steps, calculates the speed of each algorithm, then sums them all for a final score. This is an example of a real world situation that a computational scientist may find themselves in, rather than a pure synthetic benchmark. The benchmark is also parallel across the simulated particles, and we test the single threaded performance as well as the multi-threaded performance.
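To give a feel for the kind of work involved, here is a minimal Python sketch of two ways to pick a random 3D direction and move particles along it. It is not the 3DPM source: the six algorithms are not listed here, and the function names, the unit step length and the seed handling are all assumptions made for illustration.

```python
import math
import random

def direction_by_rejection(rng):
    """Uniform random numbers with generation and rejection: draw points in
    the cube [-1, 1]^3 and keep the first one that lands inside the unit ball."""
    while True:
        x, y, z = (rng.uniform(-1.0, 1.0) for _ in range(3))
        r2 = x * x + y * y + z * z
        if 0.0 < r2 <= 1.0:
            r = math.sqrt(r2)
            return (x / r, y / r, z / r)

def direction_by_normals(rng):
    """Normally distributed random numbers: a normalized Gaussian triple
    points in a uniformly random direction, with no trigonometry needed."""
    while True:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        r = math.sqrt(x * x + y * y + z * z)
        if r > 0.0:
            return (x / r, y / r, z / r)

def move_particles(n_particles, n_steps, pick_direction, seed=0):
    """Move every particle one unit per step (assumed step length).
    Particles never interact, so the outer loop is the parallel dimension."""
    rng = random.Random(seed)
    positions = []
    for _ in range(n_particles):
        px = py = pz = 0.0
        for _ in range(n_steps):
            dx, dy, dz = pick_direction(rng)
            px, py, pz = px + dx, py + dy, pz + dz
        positions.append((px, py, pz))
    return positions
```

Timing `move_particles` once per direction routine and summing the resulting rates would mirror the "speed of each algorithm, then sum" scoring described above, at least in spirit.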

3D Particle Movement Single Threaded

3D Particle Movement MultiThreaded

As mentioned in previous reviews, this benchmark is written how most people would tackle the situation – using floating point numbers. This is also where Intel excels, compared to AMD’s decision to move more towards INT ops (such as hashing), which is typically linked to optimized code or normal OS behavior.

The 4770K comes in top in single threaded performance, showcasing the IPC gains of the new architecture.  This is also shown in multithreaded tests with MCT both off and on.

Compression - WinRAR x64 3.93 + WinRAR 4.2

With 64-bit WinRAR, we compress the set of files used in our motherboard USB speed tests. WinRAR x64 3.93 attempts to use multithreading when possible and provides a good test for when a system has variable threaded load. WinRAR 4.2 does this a lot better! If a system has multiple speeds to invoke at different loading, the switching between those speeds will determine how well the system will do.

WinRAR 3.93

WinRAR 4.2

Due to the late inclusion of 4.2, our results list for it is a little smaller than I would have hoped. But it is interesting to note that with the Core Parking updates, an FX-8350 overtakes an i5-2500K with MCT.

Image Manipulation - FastStone Image Viewer 4.2

FastStone Image Viewer is a free piece of software I have been using for quite a few years now. It allows quick viewing of flat images, as well as resizing, changing color depth, adding simple text or simple filters. It also has a bulk image conversion tool, which we use here. The software currently operates only in single-thread mode, which should change in later versions of the software. For this test, we convert a series of 170 files, of various resolutions, dimensions and types (of a total size of 163MB), all to the .gif format of 640x480 dimensions.

FastStone Image Viewer 4.2

In terms of pure single thread speed, it is worth noting the X6-1100T is leading the AMD pack, and the 4770K takes the top spot.

Video Conversion - Xilisoft Video Converter 7

With XVC, users can convert any type of normal video to any compatible format for smartphones, tablets and other devices. By default, it uses all available threads on the system, and in the presence of appropriate graphics cards, can utilize CUDA for NVIDIA GPUs as well as AMD WinAPP for AMD GPUs. For this test, we use a set of 33 HD videos, each lasting 30 seconds, and convert them from 1080p to an iPod H.264 video format using just the CPU. The time taken to convert these videos gives us our result.

Xilisoft Video Converter 7

XVC is a little odd in how it arranges its multicore processing. For our set of 33 videos, it will arrange them in batches of threads – so if we take the 8 thread FX-8350, it will arrange the videos into 4 batches of 8, and then a fifth batch of one. That final batch will only have one thread assigned to it (!), and will not get a full 8 threads worth of power. This is also why the 2x X5690 finishes in 6 seconds but the normal X5690 takes longer – you would expect a halving of time moving to two CPUs but XVC arranges the batches such that there is always one at the end that only gets a single thread.
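The batching arithmetic can be sketched in a few lines of Python. This only models the behavior as described above (one video per thread, batches filled in order); the function name is hypothetical and nothing here is taken from XVC itself.

```python
def batch_sizes(n_videos, n_threads):
    """Split n_videos into batches of at most n_threads, one video per thread."""
    full, rest = divmod(n_videos, n_threads)
    return [n_threads] * full + ([rest] if rest else [])

# 33 videos on an 8 thread CPU: four full batches, then a batch of one.
sizes = batch_sizes(33, 8)
print(sizes)
# Share of threads kept busy in the final batch.
print(sizes[-1] / 8)
```

For the 33 video set on 8 threads this gives `[8, 8, 8, 8, 1]`, so the last batch keeps only one thread in eight busy.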

Rendering – PovRay 3.7

The Persistence of Vision RayTracer, or PovRay, is a freeware package for, as the name suggests, ray tracing. It is a pure renderer, rather than modeling software, but the latest beta version contains a handy benchmark for stressing all processing threads on a platform. We have been using this test in motherboard reviews to test memory stability at various CPU speeds to good effect – if it passes the test, the IMC in the CPU is stable for a given CPU speed. As a CPU test, it runs for approximately 2-3 minutes on high end platforms.

PovRay 3.7 Multithreaded Benchmark

The SMP engine in PovRay is not perfect, though scaling up in CPUs gives almost a 2x effect. The results from this test are great – here we see an FX-8350 CPU below an i7-3770K (with MCT), until the Core Parking updates are applied, meaning the FX-8350 performs better!  The 4770K also has a chance to flex its compute muscles, performing almost as well as the 8-core Westmere CPU.

Video Conversion - x264 HD Benchmark

The x264 HD Benchmark uses a common HD encoding tool to process an HD MPEG2 source at 1280x720 at 3963 Kbps. This test represents a standardized result which can be compared across other reviews, and is dependent on both CPU power and memory speed. The benchmark performs a 2-pass encode, and the results shown are the average of each pass performed four times.

Explicit Finite Difference Grid Solver (2D)

Explicit Finite Difference Grid Solver (3D)

Grid Solvers - Explicit Finite Difference

For any grid of regular nodes, the simplest way to calculate the next time step is to use the values of those around it. This makes for easy mathematics and parallel simulation, as each node calculated is only dependent on the previous time step, not the nodes around it on the current calculated time step. By choosing a regular grid, we reduce the levels of memory access required for irregular grids. We test both 2D and 3D explicit finite difference simulations with 2^n nodes in each dimension, using OpenMP as the threading operator in single precision. The grid is isotropic and the boundary conditions are sinks. Values are floating point, with memory cache sizes and speeds playing a part in the overall score.
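As a rough illustration, one explicit time step on a 2D grid might look like the Python sketch below. The equation being solved is not named in the text, so a diffusion-style five-point stencil is assumed here; `explicit_step` and the coefficient `alpha` are hypothetical names, and the real benchmark uses OpenMP and single precision rather than plain Python lists.

```python
def explicit_step(grid, alpha):
    """One explicit time step of an assumed 2D diffusion-style update.

    alpha stands for D*dt/dx^2 (assumed form); this scheme needs
    alpha <= 0.25 to stay stable. Boundary nodes are sinks, held at zero.
    """
    n = len(grid)
    new = [[0.0] * n for _ in range(n)]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            # Every new value reads only the previous time step, so all
            # interior nodes can be updated independently (in parallel).
            new[i][j] = grid[i][j] + alpha * (
                grid[i - 1][j] + grid[i + 1][j]
                + grid[i][j - 1] + grid[i][j + 1]
                - 4.0 * grid[i][j]
            )
    return new

# Usage: a 2^3 by 2^3 grid with a single hot node in the middle.
n = 2 ** 3
grid = [[0.0] * n for _ in range(n)]
grid[4][4] = 1.0
grid = explicit_step(grid, alpha=0.2)
```

Because the update reads one array and writes another, the two loops can be split across threads without any locking, which is the property the paragraph above relies on.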

Explicit Finite Difference Grid Solver (2D)

Explicit Finite Difference Grid Solver (3D)

Grid solvers do love a fast processor and plenty of cache in order to store data. When moving up to 3D, it is harder to keep that data within the CPU and spending extra time coding in batches can help throughput. Our simulation takes a very naïve approach in code, using simple operations.

Grid Solvers - Implicit Finite Difference + Alternating Direction Implicit Method

The implicit method takes a different approach to the explicit method – instead of considering one unknown in the new time step to be calculated from known elements in the previous time step, we consider that an old point can influence several new points by way of simultaneous equations. This adds to the complexity of the simulation – the grid of nodes is solved as a series of rows and columns rather than points, reducing the parallel nature of the simulation by a dimension and drastically increasing the memory requirements of each thread. The upside, as noted above, is that the stability rules related to time steps and grid spacing are less stringent. For this we simulate a 2D grid of 2^n nodes in each dimension, using OpenMP in single precision. Again our grid is isotropic with the boundaries acting as sinks. Values are floating point, with memory cache sizes and speeds playing a part in the overall score.
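A minimal Python sketch of the row-then-column idea follows. It assumes a diffusion-style equation and the textbook Peaceman-Rachford form of ADI, solving each line with the Thomas algorithm; the function names and the coefficient `r` (standing for D*dt/(2*dx^2)) are illustrative assumptions, not the benchmark's own code.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system.
    lower[0] and upper[-1] lie outside the matrix and do not affect the result."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for k in range(1, n):
        denom = diag[k] - lower[k] * c[k - 1]
        c[k] = upper[k] / denom
        d[k] = (rhs[k] - lower[k] * d[k - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for k in range(n - 2, -1, -1):
        x[k] = d[k] - c[k] * x[k + 1]
    return x

def implicit_line(rhs, r):
    """Solve -r*x[k-1] + (1+2r)*x[k] - r*x[k+1] = rhs[k] along one line,
    with sink (zero) values just beyond both ends."""
    m = len(rhs)
    return solve_tridiagonal([-r] * m, [1.0 + 2.0 * r] * m, [-r] * m, rhs)

def adi_step(u, r):
    """One ADI time step on a square grid whose boundary nodes are sinks."""
    n = len(u)
    # Half step 1: implicit along each row, explicit across rows.
    half = [[0.0] * n for _ in range(n)]
    for i in range(1, n - 1):
        rhs = [u[i][j] + r * (u[i - 1][j] - 2.0 * u[i][j] + u[i + 1][j])
               for j in range(1, n - 1)]
        half[i][1:n - 1] = implicit_line(rhs, r)
    # Half step 2: implicit along each column, explicit across columns.
    new = [[0.0] * n for _ in range(n)]
    for j in range(1, n - 1):
        rhs = [half[i][j] + r * (half[i][j - 1] - 2.0 * half[i][j] + half[i][j + 1])
               for i in range(1, n - 1)]
        col = implicit_line(rhs, r)
        for i in range(1, n - 1):
            new[i][j] = col[i - 1]
    return new
```

Each row (or column) is one simultaneous system, so the unit of parallel work is a whole line rather than a single node, and every thread needs scratch arrays for its own tridiagonal solve. That matches the loss of one parallel dimension and the higher per-thread memory use described above.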

Implicit Finite Difference Grid Solver (2D)

2D Implicit is harsher than an Explicit calculation – each thread needs a lot more memory, which only ever grows as the size of the simulation increases.

Point Calculations - n-Body Simulation

When a series of heavy mass elements are in space, they interact with each other through the force of gravity. Thus when a star cluster forms, the interaction of every large mass with every other large mass defines the speed at which these elements approach each other. When dealing with millions and billions of stars on such a large scale, the movement of each of these stars can be simulated through the physical theorems that describe the interactions. The benchmark detects whether the processor is SSE2 or SSE4 capable, and implements the relevant code. We run a simulation of 10240 particles of equal mass - the output for this code is in terms of GFLOPs, and the result recorded was the peak GFLOPs value.
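The core of such a simulation is an all-pairs force loop, sketched below in plain Python. This is a generic direct-summation example rather than the benchmark's SSE code; the function name, the unit gravitational constant and the softening term (a common trick to avoid division by zero at tiny separations) are assumptions.

```python
import math

def accelerations(positions, mass=1.0, g=1.0, softening=1e-3):
    """Gravitational acceleration on every particle from every other particle.
    All particles share the same mass, as in the test described above."""
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        xi, yi, zi = positions[i]
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            dz = positions[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + softening * softening
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            # Newtonian attraction: a_i += G * m_j * (r_j - r_i) / |r|^3
            acc[i][0] += g * mass * dx * inv_r3
            acc[i][1] += g * mass * dy * inv_r3
            acc[i][2] += g * mass * dz * inv_r3
    return acc
```

With 10240 particles the inner body runs 10240 x 10239 times per step, roughly 105 million pair evaluations, which is why the result is naturally expressed in GFLOPs.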

n-body Simulation via C++ AMP

As we only look at base/SSE2/SSE4 depending on the processor (auto-detection), we don’t see full AVX numbers in terms of FLOPs.

116 Comments

  • majorleague - Wednesday, June 5, 2013

    Here is a youtube link showing 3dmark11 and windows index rating for the 4770k 3.5ghz Haswell. Not overclocked.
    This is apparently around 10-20fps slower than the 6800k in most games. And almost twice the price!!
    Youtube link:
    http://www.youtube.com/watch?v=k7Yo2A__1Xw
  • kilkennycat - Wednesday, June 5, 2013

    Quote:" The only way to go onto 3-way or 4-way SLI is via a PLX 8747 enabled motherboard, which greatly enhances the cost of a motherboard build. This should be kept in mind when dealing with the final results."

    The only way? X79 supports up to 4 8X channels of PCie 2/3.
    The 4-core 3820 overclocks readily and on a X79 board is a very small cost enhancement
    over a high-end non-PLX8747 1155-socket setup. Plus the upgrade benefit of stepping up to the 6-core 3930K if one wants to combine usage for professional multicore applications with gaming.
  • random2 - Wednesday, June 5, 2013

    "What we see is 30.73% of gamers running at 1080p, but 4.16% of gamers are above 1080p."

    So an article and benches are provided for the benefit of 4.16% of the gamers who might be running more pixels vs the 65% (almost 3 million) lions share of gamers that must be running at fewer pixels than found at 1080p. Very strange.
  • Dribble - Thursday, June 6, 2013

    Just to point out the blindingly obvious but who would spend big $$$ on a 1440p monitor and a top end gpu and then buy a low end budget cpu (A8-5600)...

    The realistic min recommendation is going to be a i3570K.
  • xineis - Thursday, June 6, 2013

    So, how would a 955BE perform compared to the CPUs on the test? From what I understand, I should just keep this CPU, as a new one is not going to make much of a difference?
  • Zoatebix - Friday, June 7, 2013

    Thank you for doing all this work. A great follow-up to the original!

    Could you please correct some charts on the CPU Benchmarks page, though? The "Video Conversion - x264 HD Benchmark" section is displaying the charts for the "Grid Solvers - Explicit Finite Difference" section.
  • Klimax - Saturday, June 8, 2013

    Frankly not best article. Resolution too high for GPU and then recommending CPU based on it. CPU, which will not provide performance needed for games. (Techreport showed that APU is not good idea when paired with real GPU; FPS might be in range, but latency is in hell)
  • JNo - Sunday, June 9, 2013

    Ian, I'm afraid I have to agree with some of the naysayers here. You've tried so hard to have clean *scientific* analysis that you've failed to see the wood for the trees. In actual fact I fear you've reached the opposite of a scientific conclusion *because* you only focussed on easily obtainable/reproducible results.

    Just because results for modern games are hard to obtain, doesn't mean you can ignore them despite it being a hard path to walk. I have 1440p but agree that it's not relevant to the vast majority and anyone affording a 1440p monitor won't care to save $40 on AMD A8 vs core i5. So you have to be *realistic* (as well as scientific).

    I know from a few years of international finance analysis that when doing an independent study, there is a chance you can come to a conclusion that flies in the face of the market or common opinion. You have to be *SO* careful when this happens and quadruple check what you have ended up with because 99% of the time, the market or 'hive mind' is correct and there is an error or misunderstanding in your own work. After all, the conglomerate conclusion of hundreds of often intelligent people is hardly likely to wrong, even if you are a smart guy. The chance that you have found the truth and that everyone else is wrong really is about 1% (yes it does happen but it is a once in a blue moon type of event).

    It might seem a huge hit to admit that much of your hard work was misdirected but it could save more pain in the long run to go back to the drawing board and consider what you are trying to achieve and how best to go about it. A very small sample of older titles at unpopular resolutions really could skew results to be misleading.
  • CiccioB - Wednesday, June 12, 2013

    I agree. However we have still to understand what was the thesis Ian wanted to demonstrate.
    If it was "AMD CPU don't have to appear so bad vs Intel" the strategy used for the demonstration is quite good.
    On the other hand, if it was "Let's see which is the best CPU for playing games" the strategy is a complete fail. And it still is partially the same if it were "Let's see which is the cheapest CPU to cope with a bottlenecked GPU", as those old games, but Civ5, all do not have any complex AI o scripts which are a CPU intensive task .
    If I were to judge this work as a homework I would evaluate it as F because it is intended for a small part of the market, using old benchmarks not valid today, incomplete (lack of FCAT) with a wrong setup (bottlenecking GPUs to evaluate CPU performances?).
    Wrong on all aspects but, unless said, the intent was to show that AMD CPU are just trailing Intel most expensive ones instead of being a complete generation behind. In this case evaluation can be a B, but becomes quite limited if we look at the represented market (is 3% of a market that is capable of spending well more that an average gamers a good target to demonstrate that they can spare few bucks using an otherwise castrated CPU?)

    For all these reasons I may say that this is one of the worst article I have ever read on this site. It show some incompetence or worse a bias.
  • Filiprino - Thursday, June 20, 2013

    It's cool that you test old CPUs, so we can see the improvement of CPU processing power over the years.
