The Fastest for Serial Workloads

If you had asked ‘what makes the best processor’ fifteen years ago, the obvious answers were performance, power and price. As time has marched on, this list has grown to include integrated graphics, bandwidth, platform integration, platform upgradability, core-to-core latency, and of course, cores. Marching up from a single x86 core through to CPUs that carry 10 cores for consumers, 28 cores for enterprise and 72 cores for add-in cards makes the task of creating a perfect processor almost impossible: no single design can satisfy every one of those properties. Both AMD and Intel start from a basic building block (a single core) and then configure processors around that core, adding in more cores and connectivity, then binning to the right voltage/frequency and pricing appropriately. The end result is a stack of processors aimed at different segments of the market.

The pair of Kaby Lake-X processors targets one of the areas listed above more than any other: core performance. Placing the latest CPU microarchitecture on the newest high-end desktop platform leaves room at the top for more frequency, which leads to higher pure performance. As a byproduct these CPUs are power efficient, giving high performance per watt, and they sit in a platform with extensive IO options. Ultimately this is where the Kaby Lake-X customer will sit: someone who wants high single thread performance but is not after massive multi-core performance. This would typically cover the majority of gamers and enthusiasts, but not necessarily content creators.

The benefits in the benchmarks are clear against the nearest competition: these are the fastest CPUs to open a complex PDF, at the top for office work, and at the top for most web interactions by a noticeable amount.

The downsides show up in pure throughput workloads, such as neuron simulation, rendering and non-video encoding.

The parts in the middle are the ones to dissect, and these get interesting. Let me pull up a few graphs that illustrate this middle of the road position: Chromium Compilation, Agisoft Photoscan and WinRAR.

Office: Chromium Compile (v56)

System: Agisoft Photoscan 1.0 Total Time

Encoding: WinRAR 5.40

These three results show the Core i7-7740X performing above any AMD chips of similar price, but the Core i5-7640X performing below the Ryzen 7 and Ryzen 5 parts. This comes down to the workload in each of these benchmarks, and how the processor configurations affect it. All three of these real-world benchmarks are variable-thread workloads. Some elements are serialized and rely on high single-thread performance, while other elements are fully parallelizable and can take advantage of cores and threads (although extra threads do not always help). The benchmarks are ultimately limited by Amdahl’s Law, where single thread speed affects the whole test, but multiple threads only help the parallelizable parts. With sufficiently parallelizable code, it becomes a balance between the two.

So for the Core i7-7740X, up against the Ryzen 7 1700 at an equivalent price, the Core i7 has eight threads and the Ryzen 7 has sixteen, but the Core i7 has much higher single thread performance. In these benchmarks, that single thread advantage means that despite having half the cores and threads of the AMD part, the Core i7 can take the lead very easily.

But the Core i5-7640X has a different task. It has four cores, like the Core i7, but no hyperthreading, so it sits at four threads. Its direct competitor, the Ryzen 5 1600X, has six cores with simultaneous multithreading, leading to twelve threads. This gives the AMD processor a 3:1 advantage in threads, and each of these three benchmarks parallelizes well enough that the single thread performance of the Intel CPU cannot make up the difference. Moving from a 2:1 ratio with the Core i7 to a 3:1 ratio with the Core i5 is a turning point for ST performance compared to MT performance.
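To put rough numbers on that turning point, here is a minimal sketch of the Amdahl’s Law trade-off in Python. The 1.3x single thread advantage for the Intel parts is a purely illustrative assumption, not a measured figure from our benchmarks, and the model also assumes every hardware thread (including hyperthreads) counts as a full thread, which flatters both sides. The function names are hypothetical helpers written for this example.

```python
def run_time(parallel_fraction, threads, single_thread_speed):
    """Relative run time of a fixed job under Amdahl's Law.

    The serial part runs on one thread; the parallel part is split
    evenly across all threads. Lower is better.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return (serial_fraction + parallel_fraction / threads) / single_thread_speed


def break_even_fraction(threads_a, speed_a, threads_b, speed_b):
    """Parallel fraction at which chip A and chip B finish at the same time.

    Found by setting run_time(p, threads_a, speed_a) equal to
    run_time(p, threads_b, speed_b) and solving for p.
    """
    numerator = speed_a - speed_b
    denominator = (speed_a * (1.0 - 1.0 / threads_b)
                   - speed_b * (1.0 - 1.0 / threads_a))
    return numerator / denominator


# Assumption for illustration only: Intel is 1.3x faster per thread.
ASSUMED_INTEL_SPEED = 1.3
ASSUMED_AMD_SPEED = 1.0

# Core i7-7740X (8 threads) against a 16-thread Ryzen 7: a 2:1 ratio.
i7_vs_r7 = break_even_fraction(8, ASSUMED_INTEL_SPEED, 16, ASSUMED_AMD_SPEED)
# Core i5-7640X (4 threads) against a 12-thread Ryzen 5: a 3:1 ratio.
i5_vs_r5 = break_even_fraction(4, ASSUMED_INTEL_SPEED, 12, ASSUMED_AMD_SPEED)

print(f"2:1 ratio break-even parallel fraction: {i7_vs_r7:.2f}")
print(f"3:1 ratio break-even parallel fraction: {i5_vs_r5:.2f}")
```

Under these assumptions the Core i7 stays ahead until roughly 87% of the work is parallel, while the Core i5 falls behind once about 68% of the work is parallel. The exact figures move with the assumed per-thread advantage, but the direction matches the benchmark results: a wider thread deficit lowers the amount of parallelism needed for the competition to win.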

So with the X299 confusion, are these CPUs worth recommending?

When Kaby Lake-X first came out, a number of technology experts were confused by Intel’s plans. It made sense to launch the latest microarchitecture on the high-end desktop platform, although launching it in quad-core form was an idea out of left field, especially for a platform that is geared towards multiple cores, more memory, and more memory bandwidth. In that paradigm, Kaby Lake-X is an oddball processor design choice.

There are bigger factors at play however – if Intel launched 6-10 core parts on KBL, it would cannibalize their Skylake-X and Skylake-SP sales. Also, as we’ve seen with Skylake-X CPUs, those enterprise cores are now different to the consumer Skylake-S cores, with different cache structures and AVX-512. So if Intel had launched >4 cores on KBL-X, they would have likely had to scrap Skylake-X.

But that’s a slight tangent.

The Core i7-7740X appeals to users who want the fastest out-of-the-box single thread x86 processor on the market today. This means financial traders, gamers, and professionals working with serial code bases, or anyone with deep pockets that might think about upgrading to Skylake-X in the future. Enthusiast overclockers are likely to find the better binned CPUs fun as well.

That’s if you do not mind paying a premium for the X299 platform. For users who mind the cost, the Core i7-7700K is 98% of the way there on performance, offers the same functionality, and can save a hundred dollars on the motherboard. In the benchmarks where having more cores helped despite the high single thread performance, spending a little more on the Skylake-X six-core Core i7-7800X is beneficial: for example, Luxmark and POV-Ray scored +33% for the 7800X over the 7740X.

The Core i7-7740X makes a certain amount of sense for a number of niche scenarios. By contrast, the Core i5-7640X doesn’t make much sense at all. There’s still the benefit of high single-thread performance and some good gaming performance in older titles, but in the variable threaded workloads it loses to AMD’s processors, sometimes by as much as 45%. For a chip that comes in at $242, users should expect to pay about the same again on a motherboard, whereas either an AMD part or the Core i5-7600K can go in a $120 motherboard and still be overclocked.

There are only two scenarios I can see where the Core i5 adds up. The first is users who just want to get onto X299 now and upgrade later to a bigger CPU for quad-channel memory and more PCIe lanes. The second is professionals who know that their code cannot take advantage of hyperthreading and are happy with the performance. Perhaps in light of a hyperthreading bug (which is limited to minor niche edge cases), Intel felt a non-HT version was required.

In our recent CPU Buyers’ Guide we suggested the Core i7-7740X for anyone wanting a Peak VR experience, and we still stand by that statement. It has enough threads and the biggest grunt to take on VR and the majority of enthusiast gaming experiences, if a user has pockets big enough.

Recommendations for the new CPUs boil down to platform costs. They seem a minor upgrade over the Kaby Lake-K processors on the Z270 platform, which caters to a big audience with a more cost-sensitive structure for motherboards in mind.

176 Comments

  • iwod - Monday, July 24, 2017

    Intel has 10nm and 7nm by 2020 / 2021. Core Count is basically a solved problem, limited only by price.

    What we need is a substantial breakthrough in single thread performance. Maybe there are new materials that could bring us 10+ GHz. But those aren't even on the 5 year roadmap.
  • mapesdhs - Monday, July 24, 2017

    That's more down to better sw tech, which alas lags way behind. It needs skills that are largely not taught in current educational establishments.
  • wolfemane - Monday, July 24, 2017

    Under Handbrake testing, just above the first graph you state:
    "Low Quality/Resolution H264: He we transcode a 640x266 H264 rip of a 2 hour film, and change the encoding from Main profile to High profile, using the very-fast preset."

    I think you mean to say "HERE we transcode..."

    Great article overall. Thank you!
  • Ian Cutress - Monday, July 24, 2017

    Thanks, corrected :)
  • wolfemane - Monday, July 24, 2017

    I wish your team would finally add in an edit button to comments! :)

    On the last graph, ENCODING: Handbrake HEVC (4k), you don't list the 1800x, but it is present in the previous two graphs @ LQ and HQ. Was there an issue with the 1800x preventing 4k testing? Quite interested in its results if you have them.
  • Ian Cutress - Monday, July 24, 2017

    When I first did the HEVC testing for the Ryzen 7 review, there was a slight issue in it running and halfway through I had to change the script because the automation sometimes dropped a result (like the 1800X which I didn't notice until I was 2-3 CPUs down the line). I need to put the 1800X back on anyway for AGESA 1006, which will be in an upcoming article.
  • IanHagen - Monday, July 24, 2017

    One thing that has caught my eye for a while is how compile tests using GCC or clang show much better results on Ryzen compared to using Microsoft's VS compiler. Phoronix tests clearly show that. Thus, I cannot really believe yet in Ian's recurring explanation of Ryzen suffering from its victim L3 cache. After all, the 1800X beats the 7700K by a sizable margin when compiling the Linux kernel.

    Isn't Ryzen's relatively poor performance compiling Chromium due to idiosyncrasies of the VS compiler?
  • Ian Cutress - Monday, July 24, 2017

    The VS compiler seems to love L3 cache, then. The 1800X does have 2x threads and 2x cores over the 7700K, accounting for the difference. We saw a -17% drop going from SKL-S with its fully inclusive L3 to SKL-SP with a victim L3, clock for clock.

    Chromium was the best candidate for a scripted, consistent compile workflow I could roll into our new suite (and runs on Windows). Always open for suggestions that come with an ELI5.
  • ddriver - Monday, July 24, 2017

    So we are married to chromium, because it only compiles with msvc on windows?

    Or maybe because it is a shitty implementation that for some reason stacks well with intel's offerings?

    Pardon my ignorance, I've only been a multi-platform software developer for 8 years, but people who compile stuff a lot usually don't compile chromium all day.

    I'd say go GCC or Clang, because those are quality community-driven open source compilers that target a variety of platforms, unlike msvc. I mean if you really want to illustrate the usefulness of CPUs for software developers, which at this point is rather doubtful...
  • Ian Cutress - Monday, July 24, 2017

    Again, find me something I can rope into my benchmark suite with an ELI5 guide and I'll try and find time to look into it. The Chromium test took the best part of 2-3 days to get in a position where it was scripted and repeatable and fit with our workflow - any other options I examined weren't even close. I'm not a computer programmer by day either, hence the ELI5 - just years-old knowledge of using Commodore BASIC, batch files, and some C/C++/CUDA in VS.