CPU Tests: Simulation

Simulation and Science have a lot of overlap in the benchmarking world. For this distinction, we’re separating them into two segments, mostly based on the utility of the resulting data. The benchmarks that fall under Science have a distinct use for the data they output, whereas the benchmarks in our Simulation section act more like synthetics but at some level are still trying to simulate a given environment.

DigiCortex v1.35: link

DigiCortex is a pet project for the visualization of neuron and synapse activity in the brain. The software comes with a variety of benchmark modes, and we take the small benchmark which runs a 32k neuron/1.8B synapse simulation, similar to a small slug.

The results are given as a fraction of real-time simulation speed, so anything above a value of one is suitable for real-time work. The benchmark offers a 'no firing synapse' mode, which in essence tests DRAM and bus speed; however, we take the firing mode, which adds CPU work with every firing.

The software originally shipped with a benchmark that recorded the first few cycles and output a result. On fast multi-threaded processors this made the benchmark last less than a few seconds, while slow dual-core processors could be running for almost an hour. There is also the issue of DigiCortex starting with a base neuron/synapse map in ‘off mode’, giving a high result in the first few cycles as none of the nodes are currently active. We found that the performance settles down into a steady state after a while (when the model is actively in use), so we asked the author to allow for a ‘warm-up’ phase and for the benchmark to report the average over a second sample period.

For our test, we give the benchmark 20000 cycles to warm up and then take the data over the next 10000 cycles for the test; on a modern processor this takes 30 seconds and 150 seconds respectively. This is then repeated a minimum of 10 times, with the first three results rejected. Results are shown as a multiple of real-time calculation.
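The run-rejection step described above can be sketched as follows. This is a minimal illustration, not the actual test harness: it assumes each run yields a single real-time multiple and that the retained runs are combined with an arithmetic mean, which the text does not specify. The function names and sample values are hypothetical.

```python
from statistics import mean

MIN_RUNS = 10       # the test is repeated a minimum of 10 times
REJECTED_RUNS = 3   # the first three results are rejected


def summarize_runs(realtime_multiples):
    """Discard the first few runs and average the rest.

    Averaging the kept runs is an assumption; the article only states
    that the first three results are rejected.
    """
    if len(realtime_multiples) < MIN_RUNS:
        raise ValueError(
            f"need at least {MIN_RUNS} runs, got {len(realtime_multiples)}"
        )
    kept = realtime_multiples[REJECTED_RUNS:]
    return mean(kept)


def is_realtime_capable(multiple):
    """A value above one means the system can simulate in real time."""
    return multiple > 1.0


# Hypothetical results, for illustration only
runs = [2.0, 1.5, 1.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
score = summarize_runs(runs)
```

Rejecting the early runs serves the same purpose as the in-benchmark warm-up: it keeps start-up effects out of the reported figure.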

(3-1) DigiCortex 1.35 (32k Neuron, 1.8B Synapse)

 

Dwarf Fortress 0.44.12: Link

Another long-standing request for our benchmark suite has been Dwarf Fortress, a popular management/roguelike indie video game, first launched in 2006 and still being regularly updated today, aiming for a Steam launch sometime in the future.

Emulating the ASCII interfaces of old, this title is a rather complex beast, which can generate environments subject to millennia of rule, famous faces, peasants, and key historical figures and events. The further you get into the game, depending on the size of the world, the slower it becomes as it has to simulate more famous people, more world events, and the natural way that humanoid creatures take over an environment. Like some kind of virus.

For our test we’re using DFMark. DFMark is a benchmark built by vorsgren on the Bay12Forums that gives two different modes built on DFHack: world generation and embark. These tests can be configured, but range anywhere from 3 minutes to several hours. After analyzing the test, we ended up going for three different world generation sizes:

  • Small, a 65x65 world with 250 years, 10 civilizations and 4 megabeasts
  • Medium, a 129x129 world with 550 years, 10 civilizations and 4 megabeasts
  • Large, a 257x257 world with 550 years, 40 civilizations and 10 megabeasts

DFMark outputs the time to run any given test, so this is what we use for the output. We loop the small test as many times as possible in 10 minutes, the medium test as many times as possible in 30 minutes, and the large test as many times as possible in an hour.
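The looping scheme above can be sketched as a time-budgeted loop. This is a minimal illustration under stated assumptions: `run_once` is a hypothetical callable standing in for one DFMark world generation, and a run that starts before the budget expires is allowed to finish, which the text does not specify. Nothing here reflects DFMark's or DFHack's actual interface.

```python
import time

# Time budgets in seconds for each world size, as described above
BUDGETS = {"small": 10 * 60, "medium": 30 * 60, "large": 60 * 60}


def loop_within_budget(run_once, budget_seconds, clock=time.monotonic):
    """Call run_once repeatedly until the budget is used up.

    Returns the duration of every completed run. A run that begins
    inside the budget is allowed to finish (an assumption).
    """
    times = []
    start = clock()
    while clock() - start < budget_seconds:
        t0 = clock()
        run_once()
        times.append(clock() - t0)
    return times
```

The `clock` parameter exists only so the loop can be exercised with a fake clock instead of waiting for real minutes to pass.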

(3-2a) Dwarf Fortress 0.44.12 World Gen 65x65, 250 Yr

(3-2b) Dwarf Fortress 0.44.12 World Gen 129x129, 550 Yr

 

Dolphin v5.0 Emulation: Link

Many emulators are bound by single-thread CPU performance, and general reports tended to suggest that Haswell provided a significant boost to emulator performance. This benchmark runs a Wii program that ray traces a complex 3D scene inside the Dolphin Wii emulator. Performance on this benchmark is a good proxy for the speed of Dolphin CPU emulation, which is an intensive single-core task using most aspects of a CPU. Results are given in seconds, where the Wii itself scores 1051 seconds.
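Because the result is a time in seconds, lower is better, and the Wii's own 1051 seconds gives a convenient reference point. A minimal sketch of that comparison follows; the function name is hypothetical and the ratio is simply one way to read the numbers, not something the benchmark itself reports.

```python
WII_REFERENCE_SECONDS = 1051  # time the Wii itself takes, per the text


def speedup_over_wii(seconds):
    """Return how many times faster than the Wii a result is."""
    if seconds <= 0:
        raise ValueError("render time must be positive")
    return WII_REFERENCE_SECONDS / seconds
```

A result equal to the Wii's time gives a ratio of 1.0, and a result of half that time gives 2.0.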

(3-3) Dolphin 5.0 Render Test

 


229 Comments


  • schujj07 - Monday, May 17, 2021 - link

    You are comparing a gaming laptop against a high end professional laptop. First the 4900HS is a 35W CPU and the 10810U is a 15W CPU. If both laptops have equally sized batteries, the one with the lower TDP "should" have longer battery life. On top of that, the G14 has a 120Hz display and a dGPU. Both of those will pull extra power and the screen was specifically talked about in reviews of the laptop. Setting the screen to a 60Hz refresh rate instead of 120Hz significantly increased battery life. Finally the weird freezes are most likely due to the dual GPU design and switching between the iGPU and dGPU. Unless you are using so much RAM that you are page swapping.
  • Otritus - Wednesday, May 19, 2021 - link

    @schujj07 I have the same zephyrus laptop as morello159. I haven't experienced weird freezes when switching between gpus on mine, or my old laptop with an intel processor and nvidia gpu, so optimus working isn't likely to be causing the freezing. The random high power draw is a valid complaint though. I think the randomness is caused by Asus's turbo settings, which was mostly fixed by me modifying power limits and disabling turbo. But, the default experience is the processor randomly boosting ridiculously high when it should be in a near idle state and not clocking anywhere near as high. Like the chip randomly pushes all 8 cores to 3.8Ghz, when it should be running in the 1.4-1.7Ghz range.
  • bji - Monday, May 17, 2021 - link

    Why should I care AT ALL that one platform has been more stable *for you* (your words)? You are irrelevant. Just one piece of anecdotal data.
  • Calin - Tuesday, May 18, 2021 - link

    They are trading blows in performance, but AMD is doing that on 35W instead of 45W for Intel.
    For manufacturers that use the same chassis with Intel AND AMD processors, the Intel one will run hotter, be noisier and/or have lower battery life when working hard (I don't seem to find anything related to idle/low power consumption).
  • jenesuispasbavard - Monday, May 17, 2021 - link

    If you're planning on further testing, maybe using Intel XTU you can limit the PL1/PL2 to 45W and see how that performs?
  • jenesuispasbavard - Monday, May 17, 2021 - link

    Maybe I should scroll to page 2 before commenting on page 1...
  • vyor - Monday, May 17, 2021 - link

    I'm sorry, but your SPECFP2017 results are just wrong. There is no possible way that the 1185G7 is faster than the 11980HK by 2x in 503.bwaves

    That's just absurd, especially when every other test bar 3 shows the exact opposite results, and even of those that show similar results it isn't nearly to the same degree barring 549.fotonik, and that one has the 4900HS somehow being faster than the 5980HS.

    So no, your testing is just wrong and broken.
  • Otritus - Monday, May 17, 2021 - link

    The 1185G7 being twice as fast is a little questionable, and possibly the results for the Tiger Lake processors were switched accidentally.

    As to the 4900HS being faster than the 5980HS in one very specific subtest, I suppose companies have never released a new CPU architecture slower than the old one. That's why Bulldozer was well received for its incredible performance over Thuban. Rocket Lake was well loved for consistently beating Comet Lake and Zen 3 in gaming, with the 11900K always at the top of the chart. Broadwell-S of course isn't better than Skylake or competitive with Coffee Lake in gaming.
  • vyor - Monday, May 17, 2021 - link

    Except that Zen3 is consistently faster in almost every way.
  • Otritus - Monday, May 17, 2021 - link

    Keyword "almost"
    Zen 3 does not win every benchmark over Zen 2, just the vast majority of them due to superior clock speeds and IPC. In 1 very specific subtest, out of all the tests conducted, is it really unreasonable to see the older architecture get a win? The last time I can think of a new architecture winning every single benchmark was Conroe. Sandy Bridge might also get this title with workloads that didn't need more than 4 cores, but I don't exactly recall. Remember IPC is an average of performance at a given frequency, so if a few benchmarks have negative improvements in IPC, but most have large positive improvements, you can easily see a 20% IPC uplift.
