** = Results marked were performed with the original BIOS & boost behaviour as published on 7/7.

Benchmarking Performance: Web Tests

While more the focus of low-end and small form factor systems, web-based benchmarks are notoriously difficult to standardize. Modern web browsers update frequently, with no way to disable those updates, which makes it hard to maintain a common test platform. The fast-paced nature of browser development means that version numbers (and performance) can change from week to week. Despite this, web tests are often a good measure of user experience: much of today's office work revolves around web applications, particularly email and office apps, but also interfaces and development environments. Our web tests include some of the industry-standard suites, as well as a few popular but older tests.

We have also included our legacy benchmarks in this section, representing a stack of older code for popular benchmarks.

All of our benchmark results can also be found in our benchmark engine, Bench.

WebXPRT 3: Modern Real-World Web Tasks, including AI

The company behind the XPRT test suites, Principled Technologies, has recently released its latest web test, and rather than attach a year to the name has simply called it ‘3’. This latest test (as of when we started with the suite) builds upon and develops the ethos of previous versions: user interaction, office compute, graph generation, list sorting, HTML5, image manipulation, and it even goes as far as some AI testing.

For our benchmark, we run the standard test, which goes through the benchmark list seven times and provides a final result. We run this standard test four times and take an average.
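As a minimal sketch of how we aggregate these runs (the scores below are placeholders, not measured results):

```python
from statistics import mean

# Four complete WebXPRT 3 standard runs; each run already loops the
# internal workload list seven times and reports one overall score.
# These scores are placeholders, not measured results.
run_scores = [251, 248, 253, 250]

# The number we publish is the arithmetic mean of the four runs.
final_score = mean(run_scores)
print(f"Reported WebXPRT 3 score: {final_score:.0f}")
```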

Users can access the WebXPRT test at http://principledtechnologies.com/benchmarkxprt/webxprt/

WebXPRT 3 (2018)

WebXPRT 2015: HTML5 and JavaScript Web UX Testing

The older version of WebXPRT is the 2015 edition, which focuses on a slightly different set of web technologies and frameworks that are still in use today. It remains a relevant test, especially for users interacting with web applications that are not built on the latest frameworks, of which there are a lot. Web framework development moves quickly but with high turnover: frameworks are rapidly developed, built upon, and used, and then developers move on to the next one. Porting an application to a new framework is an arduous task, especially with rapid development cycles. This leaves a lot of applications ‘fixed in time’, and relevant to user experience for many years.

Similar to WebXPRT 3, the main benchmark is a sectional run repeated seven times, with a final score. We repeat the whole thing four times and average those final scores.

WebXPRT 2015

Speedometer 2: JavaScript Frameworks

Our newest web test is Speedometer 2, which is an aggregated test over a series of JavaScript frameworks doing three simple things: build a list, enable each item in the list, and remove the list. All the frameworks implement the same visual cues, but obviously apply them from different coding angles.

Our test goes through the list of frameworks and produces a final score indicative of ‘runs per minute’ (rpm), one of the benchmark’s internal metrics. We report this final score.
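To illustrate what the ‘rpm’ unit expresses, here is a minimal sketch with hypothetical iteration times; Speedometer’s real scoring pipeline involves more steps than this:

```python
from statistics import mean

# Hypothetical wall-clock times (in ms) for full passes over all of the
# JavaScript framework implementations (build list, enable items, remove).
iteration_times_ms = [612.4, 598.9, 605.2, 601.7, 609.0]

# "Runs per minute": how many full passes would fit into 60 seconds.
rpm = 60_000 / mean(iteration_times_ms)
print(f"~{rpm:.1f} runs per minute")
```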

Speedometer 2

Google Octane 2.0: Core Web Compute

A popular web test for several years, though now no longer updated, is Octane, developed by Google. Version 2.0 of the test performs the best part of two dozen compute-related tasks, such as regular expressions, cryptography, ray tracing, emulation, and Navier-Stokes physics calculations.

The test gives each sub-test a score and produces a geometric mean of the set as a final result. We run the full benchmark four times, and average the final results.
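To show why a geometric mean is used rather than a simple average, a minimal sketch with hypothetical sub-test scores; the geometric mean keeps one outlier sub-test from dominating the overall number:

```python
from math import prod

# Hypothetical Octane-style sub-test scores (higher is better).
subtest_scores = [42000, 38500, 55100, 29800, 47300]

# Geometric mean: the n-th root of the product of the n scores.
geomean = prod(subtest_scores) ** (1 / len(subtest_scores))
print(f"Overall score (geometric mean): {geomean:.0f}")
```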

Google Octane 2.0

Mozilla Kraken 1.1: Core Web Compute

Even older than Octane is Kraken, this time developed by Mozilla. It performs similar computational tasks, such as audio processing and image filtering. Kraken can produce highly variable results depending on the browser version, as it is a test that browsers are keenly optimized for.

The main benchmark runs through each of the sub-tests ten times and produces an average time to completion for each loop, given in milliseconds. We run the full benchmark four times and take an average of the time taken.

Mozilla Kraken 1.1

Web Tests Analysis

Overall, in the web tests, the new Ryzen 3900X and 3700X perform very well, with both chips showcasing quite large improvements over the 2700X.

We’re seeing quite an interesting match-up against Intel’s 9700K here, which is leading all of the benchmarks. The reason for this is that this SKU has SMT turned off. The single-threaded performance advantage of this is that the CPU core no longer has to share the µOP cache structure between two different threads, and instead has the whole capacity dedicated to one thread. Web workloads in particular are amongst the most instruction-pressure-heavy workloads out there, and they benefit greatly from turning SMT off on modern cores.

Whilst we didn’t yet have the time to test the new 3900X and 3700X with SMT off, AMD’s core and its op cache work the same way, statically partitioning the capacity between two threads. I’m pretty sure we’d see larger increases in the web benchmarks when turning off SMT as well, and we’ll be sure to revisit this particular point in the future.
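For readers who want to try the SMT-on/SMT-off comparison themselves, here is a minimal sketch using the Linux kernel’s global SMT switch (available since kernel 4.19); on our Windows test beds this is toggled in the BIOS instead:

```python
from pathlib import Path

# The kernel exposes a global SMT switch under sysfs. Reading "active"
# shows whether sibling threads are currently online; writing "off" to
# "control" parks them without a reboot. Changing it requires root.
SMT_ACTIVE = Path("/sys/devices/system/cpu/smt/active")
SMT_CONTROL = Path("/sys/devices/system/cpu/smt/control")

print("SMT active:", SMT_ACTIVE.read_text().strip())
# SMT_CONTROL.write_text("off")  # uncomment (as root) to disable SMT
```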

Comments

  • Daeros - Monday, July 15, 2019 - link

    The only mitigation for MDS is to disable Hyper-Threading. I feel like there would be a pretty significant performance penalty for this.
  • Irata - Sunday, July 7, 2019 - link

    Well, at least Ryzen 3000 CPUs were tested with the latest Windows build that includes the Ryzen optimizations, but tbh I find it a bit "lazy", at least, to not test Intel CPUs on the latest Windows release, which forces security updates that *do* affect performance negatively.

    This may or may not have changed the final results but would be more proper.
  • Oxford Guy - Sunday, July 7, 2019 - link

    Lazy doesn't even begin to describe it.
  • Irata - Sunday, July 7, 2019 - link

    Thing is I find this so completely unnecessary.

    Not criticising the review per se, but you see AT staff going wild on Twitter over people accusing them of bias, when simple things like testing both Intel and AMD systems on the same Windows version would be an easy way to protect themselves against criticism.

    It's the same as the budget CPU review where the Pentium Gold was recommended due to its price/performance, but many posters pointed out that it simply was not available anywhere for even near the suggested price, and AT failed to acknowledge that.

    Zombieload ? Never heard of it.

    This is what I mean by lazy - acknowledge these issues or at least give a logical reason why. This is much easier than being offended on Twitter. If you say why you did certain things, there is no reason to post "Because they crap over the comment sections with such vitriol; they're so incensed that we did XYZ, to the point where they're prepared to spend half an hour writing comments to that effect with the most condescending language," which basically comes down to saying "A ton of our readers are a*holes."

    Sure, PC-related comment sections can be extremely toxic, but doing things as properly as possible is a good way to safeguard against such comments, or at least make those complaining look like ignorant fools, rather than actually encouraging this.
  • John_M - Sunday, July 7, 2019 - link

    A good point and you made it very well and in a very civil way.
  • Ryan Smith - Monday, July 8, 2019 - link

    Thanks. I appreciate the feedback, as I know first hand it can sometimes be hard to write something useful.

    When AMD told us that there were important scheduler changes in 1903, Ian and I both groaned a bit. We're glad AMD is getting some much-needed attention from Microsoft with regards to thread scheduling. But we generally would avoid using such a fresh OS, after the disasters that were the 1803 and 1809 launches.

    And more to the point, the timeframe for this review didn't leave us nearly enough time to redo everything on 1903. With the AMD processors arriving on Wednesday, and with all the prep work required up to that, the best we could do in the time available was run the Ryzen 3000 parts on 1903, ensuring that we tested AMD's processor with the scheduler it was meant for. I had been pushing hard to try to get at least some of the most important stuff redone on 1903, but unfortunately that just didn't work out.

    Ultimately laziness definitely was not part of the reason for anything we did. Andrei and Gavin went above and beyond, giving up their weekends and family time in order to get this review done for today. As it stands, we're all beat, and the work week hasn't even started yet...

    (I'll also add that AnandTech is not a centralized operation; Ian is in London, I'm on the US west coast, etc. It brings us some great benefits, but it also means that we can't easily hand off hardware to other people to ramp up testing in a crunch period.)
  • RSAUser - Monday, July 8, 2019 - link

    But you already had the Intel processors beforehand, so you could have tested them on 1903 without having to wait for the Ryzen CPUs? Your argument is weird.
  • Daeros - Monday, July 15, 2019 - link

    Exactly. They knew that they needed to re-test the Intel and older Ryzen chips on 1903 to have a level, relevant playing field. Knowing that it would penalize Intel disproportionately to have all the mitigations 1903 bakes in, they simply chose not to.
  • Targon - Monday, July 8, 2019 - link

    Sorry, Ryan, but test beds are not your "daily drivers". With 1903 out for more than one month, a fresh install of 1903 (the Windows 10 Media Creation tool comes in handy), with the latest chipset and device drivers, it should have been possible to fully re-test the Intel platform with all the latest security patches, BIOS updates, etc. The Intel platform should have been set and re-benchmarked before the samples from AMD even showed up.

    It would have been good to see proper RAM used, because anyone who buys DDR4-3200 RAM with the intention of gaming would go with DDR4-3200 CL14 RAM, not the CL16 stuff that was used in the new Ryzen setup. The only reason I went CL16 with my Ryzen setup was because when pre-ordering Ryzen 7 in 2017, it wasn't known at the time how significant CL14 vs. CL16 RAM would be in terms of performance and stability (and even the ability to run at DDR4-3200 speeds).

    If I were doing reviews, I'd have DDR4-3200 in various flavors from the various systems being used. Taking the better stuff out of another system to do a proper test would be expected.
  • Ratman6161 - Thursday, July 11, 2019 - link

    "ho buys DDR4-3200 RAM with the intention of gaming would go with DDR4-3200CL14 RAM"

    Well, I can tell you who. First, I'll address "the intention of gaming": there are a lot of us who couldn't care less about games, and I am one of them. Second, even for those who do play games, if you need 32 GB of RAM (like I do), the difference in price on Newegg between CAS 16 and CAS 14 for a 2x16 kit is $115 (comparing RipJaws CAS 16 vs. Trident Z CAS 14 - both G.Skill, obviously). That's approaching double the price. So I sort of appreciate reviews that use the RAM I would actually buy. For gamers on a budget who either can't or don't want to spend the extra $115, or would rather put it toward a better video card, the cheaper RAM is a good trade-off.

    Finally, there are going to be a zillion reviews of these processors over the next few days and weeks. We don't necessarily need to get every single possible configuration covered on the first day :) Also, there are many other sites publishing reviews, so it's easy to find sites using different configurations. All in all, I don't know why people are being so harsh on this (and other) reviews. It's not like I paid to read it :)
