CPU Tests: Simulation

Simulation and Science have a lot of overlap in the benchmarking world; for this review we separate them into two segments, mostly based on the utility of the resulting data. The benchmarks that fall under Science have a distinct use for the data they output, whereas the tests in our Simulation section act more like synthetics but at some level are still trying to simulate a given environment.

DigiCortex v1.35: link

DigiCortex is a pet project for the visualization of neuron and synapse activity in the brain. The software comes with a variety of benchmark modes, and we take the small benchmark which runs a 32k neuron/1.8B synapse simulation, similar to a small slug.

Results are given as a multiple of real-time simulation speed, so anything above a value of one is suitable for real-time work. The benchmark offers a 'no firing synapse' mode, which in essence measures DRAM and bus speed; however, we take the firing mode, which adds CPU work with every firing.

The software originally shipped with a benchmark that recorded the first few cycles and output a result. On fast multi-threaded processors this made the benchmark last less than a few seconds, while slow dual-core processors could be running for almost an hour. There is also the issue of DigiCortex starting with a base neuron/synapse map in ‘off mode’, giving a high result in the first few cycles as none of the nodes are currently active. We found that the performance settles down into a steady state after a while (when the model is actively in use), so we asked the author to allow for a ‘warm-up’ phase and for the benchmark to be the average over a second sample time.

For our test, we give the benchmark 20000 cycles to warm up and then take the data over the next 10000 cycles for the test; on a modern processor this takes 30 seconds and 150 seconds respectively. This is then repeated a minimum of 10 times, with the first three results rejected. Results are shown as a multiple of real-time calculation.
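The run-and-reject procedure above can be sketched in a few lines of Python. This is only an illustration: the function name `aggregate_runs` is hypothetical, and averaging the runs that remain is an assumption, since the text says only that the first three results are rejected.

```python
from statistics import mean

MIN_RUNS = 10      # "repeated a minimum of 10 times"
DISCARD_FIRST = 3  # "the first three results rejected"


def aggregate_runs(results, min_runs=MIN_RUNS, discard_first=DISCARD_FIRST):
    """Combine per-run scores after rejecting the earliest runs.

    `results` holds one real-time multiple per completed run, in run order.
    Taking the mean of the kept runs is an assumption made for this sketch;
    the article does not say how the remaining runs are combined.
    """
    if len(results) < min_runs:
        raise ValueError(f"need at least {min_runs} runs, got {len(results)}")
    kept = results[discard_first:]  # drop the first `discard_first` runs
    return mean(kept)
```

Rejecting the early runs serves the same purpose as the in-benchmark warm-up: it keeps results taken before the system reaches a steady state out of the final figure.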

(3-1) DigiCortex 1.35 (32k Neuron, 1.8B Synapse)

DigiCortex seems to fall into layers of performance, and the Core i7-5775C, with DDR3-1600, comes very close to the Core i7-6700K with DDR4-2133.

Dwarf Fortress 0.44.12: Link

Another long-standing request for our benchmark suite has been Dwarf Fortress, a popular management/roguelike indie video game, first launched in 2006 and still being regularly updated today, with a Steam launch planned for sometime in the future.

Emulating the ASCII interfaces of old, this title is a rather complex beast, which can generate environments subject to millennia of rule, famous faces, peasants, and key historical figures and events. The further you get into the game, depending on the size of the world, the slower it becomes as it has to simulate more famous people, more world events, and the natural way that humanoid creatures take over an environment. Like some kind of virus.

For our test we’re using DFMark. DFMark is a benchmark built by vorsgren on the Bay12Forums that gives two different modes built on DFHack: world generation and embark. These tests can be configured, but range anywhere from 3 minutes to several hours. After analyzing the test, we ended up going for three different world generation sizes:

  • Small, a 65x65 world with 250 years, 10 civilizations and 4 megabeasts
  • Medium, a 129x129 world with 550 years, 10 civilizations and 4 megabeasts
  • Large, a 257x257 world with 550 years, 40 civilizations and 10 megabeasts

DFMark outputs the time to run any given test, so this is what we use for the output. We loop the small test as many times as possible in 10 minutes, the medium test as many times as possible in 30 minutes, and the large test as many times as possible in an hour.
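As a rough sketch of the looping scheme described above, the Python below repeats a test until a time budget is used up and records each run's duration. The names `loop_for_budget` and `run_once` are hypothetical stand-ins, not part of DFMark or DFHack, and letting a run that starts inside the budget finish past it is an assumption.

```python
import time

# Time budgets from the text, in seconds.
BUDGETS = {"small": 10 * 60, "medium": 30 * 60, "large": 60 * 60}


def loop_for_budget(run_once, budget_seconds, clock=time.monotonic):
    """Call run_once() repeatedly until the time budget is used up.

    Returns the duration of each completed run in seconds. A run that
    starts inside the budget is allowed to finish (an assumption; the
    article does not say how the final run is handled).
    """
    durations = []
    start = clock()
    while clock() - start < budget_seconds:
        t0 = clock()
        run_once()  # hypothetical callable that launches one world generation
        durations.append(clock() - t0)
    return durations
```

The `clock` parameter exists only so the loop can be exercised with a fake timer instead of waiting out a real 10-minute budget.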

(3-2a) Dwarf Fortress 0.44.12 World Gen 65x65, 250 Yr

(3-2b) Dwarf Fortress 0.44.12 World Gen 129x129, 550 Yr

(3-2c) Dwarf Fortress 0.44.12 World Gen 257x257, 550 Yr

Here's where we start to see some of the benefits of the lower latency eDRAM out to 128 MB. That larger cache pushes both Broadwell parts very near to modern CPUs, putting all the older models down the list. This is something AMD's APUs aren't particularly good at, due to the very limited L3 cache in play.

Dolphin v5.0 Emulation: Link

Many emulators are bound by single-thread CPU performance, and general reports tended to suggest that Haswell provided a significant boost to emulator performance. This benchmark runs a Wii program that ray traces a complex 3D scene inside the Dolphin Wii emulator. Performance on this benchmark is a good proxy for the speed of Dolphin CPU emulation, which is an intensive single-core task using most aspects of a CPU. Results are given in seconds, where the Wii itself scores 1051 seconds.
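Because the result is a completion time, lower is better, and the 1051-second Wii figure gives a natural reference point. A minimal sketch of that conversion follows; the helper name `speedup_vs_wii` is hypothetical and is not something Dolphin itself reports.

```python
WII_SECONDS = 1051.0  # the Wii's own time for this render, per the text


def speedup_vs_wii(emulated_seconds):
    """Return how many times faster than a real Wii the emulated run was."""
    if emulated_seconds <= 0:
        raise ValueError("render time must be positive")
    return WII_SECONDS / emulated_seconds
```

A value above one means the emulated run finished the render faster than the original console did.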

(3-3) Dolphin 5.0 Render Test

Unfortunately Dolphin isn't a fan of the eDRAM versions.

120 Comments

  • bernstein - Monday, November 2, 2020

    GDDR6 would be ideally suited as an L4 CPU cache... it has >500GB/s throughput and relatively low cost...
  • e36Jeff - Monday, November 2, 2020

    Sure, if you build a 256-bit bus and somehow cram 8 GDDR6 chips onto the CPU package. You'd also be losing 30-40W of TDP to that.
    This is an application that HBM2 would be much better for. You can easily cram up to 4GB into the package with a much lower TDP impact and still get your 500+GB/s throughput. The biggest issue for this is going to be the impact of having to add in another memory controller and the associated die space and power that it eats up.
  • FreckledTrout - Monday, November 2, 2020

    This is also how I see it playing out. Certainly by the time Intel/AMD switch to using GAAFET, maybe before. You just need a couple die shrinks that bring densities up and power down.
  • bernstein - Monday, November 2, 2020

    scratch that, GDDR6 has much too high latency...
  • stanleyipkiss - Monday, November 2, 2020

    The 5775C was ahead of its time. Don't know why they didn't go down that rabbit hole (of increasing the size with each gen)
  • hecksagon - Monday, November 2, 2020

    Adding an extra 84mm2 of die area is a recipe for margin erosion, especially when the benefit is situational.
  • CrispySilicon - Monday, November 2, 2020

    Well, I use a 5775C for my main home PC (using it now) and it's more than that. Broadwell was designed for low power. It doesn't run well over 4GHz and it's not made to.

    My rig idles at about 800MHz, clocks up to 4GHz on all cores, 2GHz on the eDRAM, and 2GHz on DDR3L (overclocked 1866 HyperX Fury), yes, 3L, because THAT'S where the magic happens. Low power performance.

    I've also used TridentX 2400CL10 modules in it, not worth the higher voltage.

    I'm going to upgrade finally next year. CXL and DDR5 will finally retire this diamond in the rough.

    Retest with nothing in the BIOS changed except the eDRAM multiplier to 20 and see what happens.
  • Notmyusualid - Wednesday, November 4, 2020

    I usually run my Broadwell at 4.4GHz 24/7. However, I have a failed BIOS battery, so I'm using the m/b default 4.0GHz overclock settings today. I don't let mine idle at low speeds; it's High Performance mode only, and I only boot the desktop for gaming or Software Defined Radio, both of which want GHz.

    Memory is Vengeance LED 3200MHz (CL15 & only stable at 3000MHz, XMP is not stable either), and 32GB is currently installed.

    Given;
    C:\Windows\System32>winsat mem
    Windows System Assessment Tool
    > Running: Feature Enumeration ''
    > Run Time 00:00:00.00
    > Running: System memory performance assessment ''
    > Run Time 00:00:05.45
    > Memory Performance 54386.55 MB/s
    > Total Run Time 00:00:06.65

    I think that is why my Broadwell missed out on any eDRAM - it wasn't necessary.

    Dolphin runs about 35x seconds, as I remember it.

    6950X running cool in 2020...
  • MrCommunistGen - Monday, November 2, 2020

    HA. Epic timing. Just starting to read this now, but I recently built a system with a Broadwell-based Xeon E3 chip I got for cheap on eBay. Mostly just because I wanted to play with a chip that had eDRAM and the price of entry for an i5 or i7 has remained pretty high.

    This will be a very interesting read!
  • alufan - Monday, November 2, 2020

    News all day as long as it's about Intel, so it seems on here. Said it before and have seen nothing since to change my mind.
