Analyzing Performance Per Dollar

While performance per watt is an interesting metric, workstation processors sit at the top of the stack when it comes to power consumption: the point of these processors (typically) is getting work done fast, and their users don't mind using a bit more power to get there. When it comes to designing a workstation-level system for an office, the person who signs off on the project is not going to look at performance per watt - they are going to look at performance per dollar. Does this purchase represent the best value for the environment it will be used in, and is it likely to give us the best return? The following graphs attempt to answer that question, and we've chosen a varied selection of tests, including variable-threaded loads.

For this analysis, we've taken all of the processors we have tested that Intel launched in 2017. This covers all of the consumer-level Kaby Lake-S Core i7 and i5 parts (except the T models), the high-end desktop Kaby Lake-X processors, and all the high-end Skylake-X desktop parts, represented in the graphs in blue under the 'Intel 2017' title. From AMD, we have all eleven Ryzen processors tested, shown in red.

Compile Performance Per Dollar

One of our popular benchmarks is our compile test. This takes a fixed version of Chromium v56 and runs it through the MSVC compiler with linking, as per the directions given by the Chromium developers. A typical test can run from 40 minutes to 3 hours depending on the processor, and it taxes the single-thread, multi-thread, and memory performance of the system. We've seen that it does not take particularly kindly to processors with victim caches, such as Skylake-X or AMD Ryzen, with limited scaling on this workload. Despite the per-core performance dip from Skylake-S to Skylake-X, the top processor still has the best absolute performance. Converting our data to the number of compiles per day per dollar gives the following graph:

The cheaper end of the graph is surprisingly high, dominated by Intel's dual-core Pentium parts (with Hyper-Threading) offered at very low prices. In the standard 'consumer' price range below $300, the mid-range Ryzen processors have a slight advantage but, beyond the Ryzen 7 1700, Intel has the performance per dollar advantage all the way out to $2000.
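As a quick illustration of how these per-dollar figures are derived, the sketch below converts a benchmark time into compiles per day and then divides by the list price. This is a minimal sketch assuming hypothetical compile times and prices; the actual chart uses our measured results and launch pricing.

    # Minimal sketch: turning a benchmark time into a performance-per-dollar figure.
    # The compile times and prices below are hypothetical placeholders, not our data.

    SECONDS_PER_DAY = 24 * 60 * 60

    def compiles_per_day(compile_seconds):
        """Number of full Chromium compiles that fit into 24 hours."""
        return SECONDS_PER_DAY / compile_seconds

    def perf_per_dollar(compile_seconds, price_usd):
        """Compiles per day, per dollar of list price."""
        return compiles_per_day(compile_seconds) / price_usd

    # e.g. a part that compiles in 60 minutes at $999 vs one at 75 minutes for $599
    for name, secs, price in [("cpu_a", 60 * 60, 999), ("cpu_b", 75 * 60, 599)]:
        print(f"{name}: {compiles_per_day(secs):.1f} compiles/day, "
              f"{perf_per_dollar(secs, price):.4f} compiles/day/$")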

Agisoft Performance Per Dollar

The Agisoft Photoscan software has been a key part of our performance testing for several years, demonstrating a real office workflow: taking 2D photographs and converting them into 3D models for archival purposes. This is a computationally interesting algorithm, involving multiple single-threaded and multi-threaded stages.

Due to the single-threaded elements of the algorithm, linear scaling is not observed as we ramp up through the core counts. At mid-range consumer pricing, the twelve-thread Ryzen 5 processors sit above the quad-thread Core i5 parts, but beyond $330 or so, as we move into Intel's quad-core offerings and above, the performance per dollar is solely on Intel's side.

Blender Performance Per Dollar

The Blender benchmark has been an interesting debate in the last few months, with the new architectures from Intel and AMD pushing updates into the code for faster rendering. Our test takes a well-used Blender release and one of the standard benchmarks (rather than anything vendor-specified). Results are given in renders of this benchmark per day per dollar.

Interestingly, AMD takes the peak PPD across the full range. At $999, where the competition is expected to be highest, AMD has over a 10% advantage. In the four-digit range, even though the PPD of Intel's processors is lower, the absolute performance is still better. For our Blender test, this translates to a few seconds over a 2-3 minute test.
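To show why a part can be ahead on absolute performance yet behind on PPD, here is a small worked example; the render times and prices in it are illustrative assumptions, not our measured data.

    # Illustrative only: a pricier part can render a few seconds faster per run
    # yet still post a lower renders-per-day-per-dollar figure.

    SECONDS_PER_DAY = 24 * 60 * 60

    def renders_per_day_per_dollar(render_seconds, price_usd):
        return (SECONDS_PER_DAY / render_seconds) / price_usd

    # Hypothetical: a $1999 part rendering in 150 s vs a $999 part at 158 s.
    print(renders_per_day_per_dollar(150, 1999))   # ~0.29 renders/day/$ (faster, lower PPD)
    print(renders_per_day_per_dollar(158, 999))    # ~0.55 renders/day/$ (8 s slower, higher PPD)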

Cinebench R15 nT Performance Per Dollar

Next we move into the pure multithreaded benchmarks, which represent a significant portion of workstation workloads. Here Intel might be at a disadvantage, with AMD offering more cores and more threads at each price point - Intel's IPC advantage will have to offset this in order to move ahead.

To lay some background here: AMD has been touting Cinebench R15 nT benchmark numbers since the launch of Zen, citing better PPD. Intel's rebuttal is that when you need the best results, its hardware still wins on absolute performance.

Corona Rendering Performance Per Dollar

Ray tracing is another example of lightweight threads taking advantage of more cores, higher frequency, higher IPC, and accelerated instructions with fast FP throughput. Referring back to Johan's EPYC review, which showed AMD's strong base FP performance, this should be an interesting comparison.

Similar to some of the previous graphs, the best PPD is held by Intel's low-end Pentium processors - however, these do not give the best overall throughput. In the mainstream price range, the Ryzen 5 1600 and 1600X post surprisingly strong peak results. In the $500-$800 range, Intel and AMD are about equal; however, at $999 the Threadripper is ahead of the Core i9. Again, in the $1500+ range, Intel offers the better overall throughput, despite the lower PPD.


152 Comments


  • CrazyHawk - Tuesday, September 26, 2017 - link

    "Intel also launched Xeon-W processors in the last couple of weeks."

    Just where can one purchase these mythical Xeon-W processors? There hasn't been a single peep about them since the "launch" week. I've only heard of two motherboards that will support them. They seem to be total vaporware. On Intel's own site, it says they were "Launched" in 3Q2017. Intel had better hurry up, 3Q will be up in 4 days!
  • samer1970 - Tuesday, September 26, 2017 - link

    I don't understand why Intel disables ECC on their i9 CPUs; they are losing low-budget workstation buyers who will 100% choose an AMD Threadripper over an Intel i9.

    Even if they are doing this to protect their Xeon chips, they can enable unbuffered ECC and not allow registered ECC on the i9 - problem solved. Unbuffered ECC has a size limitation, and people who want more RAM will go for Xeons.

    Remember that their i3 has ECC support, but only the i3...

    Intel, you are stupid.
  • vladx - Wednesday, September 27, 2017 - link

    Newsflash, these chips don't target "low budget workstation buyers". The golden rule is always: "If you can't afford it, you're not the target customer."
  • samer1970 - Wednesday, September 27, 2017 - link

    That's not a golden rule anymore with the Threadripper chips around. It is more like a "stupid rule"...

    They are allowing AMD to steal the low-budget workstation buyers by not offering them an alternative to choose from.
  • vladx - Wednesday, September 27, 2017 - link

    The "low budget workstation buyers" as you call them are a really insignificant percentage of an already really small piece of the huge pie of Intel customers.
  • samer1970 - Wednesday, September 27, 2017 - link

    Who told you so? Most engineering students at universities need one, and art students who render a lot as well. All these people will buy a Threadripper CPU and avoid Intel, because Intel Xeons are 50% more expensive.

    And I don't care about the percentage of Intel's pie... hundreds of thousands of students enter universities around the world each year. Low percentage or not, they are a lot...

    How much do you think a low-budget workstation costs? They start from $3000... and with Xeon pricing, it will be very difficult to add a lot of RAM, a good workstation card, and a fast SSD.
  • esi - Wednesday, September 27, 2017 - link

    What's the explanation for some of the low scores of the 7980XE on the SPECwpc benchmarks? Particularly Poisson, where the 6950X is 3.5X higher.
  • ZeDestructor - Wednesday, September 27, 2017 - link

    Most likely cache-related
  • esi - Wednesday, September 27, 2017 - link

    Maybe. But one that really makes no sense is the Dolphin 5.0 render test. How can the 7980XE take nearly twice as long as the 7960X?
  • esi - Wednesday, September 27, 2017 - link

    So I ran the Poisson benchmark on my 6950X. It uses all 10 cores (20 h/w threads), but can be configured to run in different ways: you can set the number of s/w threads per process. It then creates enough processes to ensure there's one s/w thread per h/w thread. Changing the s/w threads per process significantly affects the result:

    20 - 1.34
    10 - 2.5
    5 - 3.31
    4 - 3.47
    2 - 3.67
    1 - 0.19

    Each process only uses about 2.5MB of RAM. So the 1-thread-per-process case probably has a low result because it pushes RAM usage beyond the L3 cache, whereas the others should all fit in.

    Would be interesting to see what was used for the 7980/7960. Perhaps the unusual number of cores resulted in a less than optimal process/thread mapping.
