The Ice Lake Benchmark Preview: Inside Intel's 10nm

Author: Dr. Ian Cutress

by Dr. Ian Cutress on August 1, 2019 9:00 AM EST

Section by Andrei Frumusanu

SPEC2017 and SPEC2006 Results (15W)

SPEC2017 and SPEC2006 are suites of standardized tests used to compare overall performance across different systems, architectures, microarchitectures, and setups. The code has to be compiled, and the results can then be submitted to an online database for comparison. The suites cover a range of integer and floating point workloads and can be heavily optimized for each CPU, so it is important to check how the benchmarks are being compiled and run.

We run the tests in a harness built through Windows Subsystem for Linux, developed by our own Andrei Frumusanu. WSL has some odd quirks, with one test not running due to WSL's fixed stack size, but it is good enough for like-for-like testing. SPEC2006 is deprecated in favor of 2017, but remains an interesting comparison point in our data. Because our scores aren’t official submissions, as per SPEC guidelines we have to declare them as internal estimates on our part.

For compilers, we use LLVM for both the C/C++ and Fortran tests; for Fortran we’re using the Flang compiler. The rationale for using LLVM over GCC is better cross-platform comparisons against platforms that only have LLVM support, as well as future articles where we’ll investigate this aspect more. We’re not considering closed-source compilers such as MSVC or ICC.

clang version 8.0.0-svn350067-1~exp1+0~20181226174230.701~1.gbp6019f2 (trunk)
clang version 7.0.1 (ssh://git@github.com/flang-compiler/flang-driver.git 24bd54da5c41af04838bbe7b68f830840d47fc03)

-Ofast -fomit-frame-pointer
-march=x86-64
-mtune=core-avx2
-mfma -mavx -mavx2

Our compiler flags are straightforward, with basic -Ofast and relevant ISA switches to allow for AVX2 instructions. Despite ICL supporting AVX-512, we have not currently enabled it in our builds, as it requires a much greater level of finesse with instruction packing. The best AVX-512 software uses hand-crafted intrinsics to provide the instructions, as per our 3DPM AVX-512 test later in the review.
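As a rough illustration of how the flags above combine into a single compiler invocation, here is a minimal Python sketch. Only the flag list comes from the article; the function name, the `clang` driver name, and the file names are hypothetical placeholders and do not describe the actual test harness.

```python
import shlex

# Flags exactly as listed in the article.
ARTICLE_FLAGS = [
    "-Ofast", "-fomit-frame-pointer",
    "-march=x86-64",
    "-mtune=core-avx2",
    "-mfma", "-mavx", "-mavx2",
]


def build_compile_command(source, output, compiler="clang"):
    """Return the argv list for compiling one source file to an object file.

    The compiler name and the source/output paths are illustrative
    assumptions; the article does not describe the harness invocation.
    """
    return [compiler, *ARTICLE_FLAGS, "-c", source, "-o", output]


if __name__ == "__main__":
    # Hypothetical file names, used only to show the resulting command line.
    print(shlex.join(build_compile_command("example.c", "example.o")))
```

Note that no AVX-512 switch appears in the list, which matches the statement that the builds stop at AVX2 and FMA.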

For these comparisons, we will be picking out CPUs from across our dataset to provide context. Note that some of these may be higher power processors.

SPECint2006

SPECint2006 Speed Estimated Scores

In SPECint2006, the one benchmark that really stands out beyond all the rest is 473.astar. Here the new Sunny Cove core is showcasing some exceptional IPC gains, nearly doubling the performance over the 8550U even though it’s clocked 100MHz lower. The benchmark is extremely sensitive to branch mispredictions, and the only conclusion we can reach to rationalise this increase is that the new branch predictors on Sunny Cove are doing an outstanding job and represent a massive improvement over Skylake.

456.hmmer and 464.h264ref are very execution bound and have the highest actual instructions per clock metrics in this suite. Here it’s very possible that Sunny Cove’s vastly increased out-of-order window is able to extract a lot more ILP out of the program and thus gain significant increases in IPC. It’s impressive that the 3.9GHz core here manages to match and outpace the 9900K’s 5GHz Skylake core.

Other benchmarks here, which are limited by other µarch characteristics, see varying increases depending on the workload. Sunny Cove’s doubled L2 cache should certainly help with workloads like 403.gcc and others. However, because we’re also memory latency limited on this platform, the increases aren’t quite as large as we’d expect from a desktop variant of ICL.

SPECfp2006(C/C++) Speed Estimated Scores

In SPECfp2006, Sunny Cove’s wider out-of-order window can again be seen in tests such as 453.povray, where the core is posting some impressive gains over the 8550U at similar clocks. 470.lbm is also heavy on both the instruction window and data stores, and the core’s doubled store bandwidth certainly helps here.

SPEC2006 Speed Estimated Total

Overall in SPEC2006, the new i7-1065G7 beats a similarly clocked i7-8550U by a hefty 29% in the int suite and 34% in the fp suite. Of course this performance gap will be a lot smaller against 9th gen mobile H-parts at higher clocks, but these are also higher TDP products.

The 1065G7 comes quite close to the fastest desktop parts, however it’s likely it’ll need a desktop memory subsystem in order to catch up in total peak absolute performance.
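For readers unfamiliar with how a SPEC suite total is formed: each test yields a ratio of a reference run time to the measured run time, and the suite score is the geometric mean of those ratios. The sketch below assumes our estimated totals follow that standard SPEC convention. All numbers in it are made up for illustration and are not results from this review, and the function names are hypothetical.

```python
import math


def spec_ratio(reference_seconds, measured_seconds):
    """Per-test ratio: reference run time divided by measured run time."""
    return reference_seconds / measured_seconds


def suite_score(ratios):
    """Geometric mean of per-test ratios (the SPEC convention for totals)."""
    if not ratios:
        raise ValueError("need at least one ratio")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def uplift_percent(new_score, old_score):
    """Relative gain of new_score over old_score, in percent."""
    return (new_score / old_score - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical per-test ratios, NOT measurements from the article.
    made_up_ratios = [20.0, 45.0, 30.0]
    total = suite_score(made_up_ratios)
    print(f"suite score: {total:.2f}")
    # Hypothetical comparison against another made-up total.
    print(f"uplift: {uplift_percent(38.7, total):.1f}%")
```

Because the total is a geometric mean, one outlier test such as 473.astar lifts the suite score less than it would lift an arithmetic average.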

SPEC2006 Speed Estimated Performance Per GHz

Performance per clock increases on the new Sunny Cove architecture are outstandingly good. IPC increases against the mobile Skylake are 33% and 38% in the integer and fp suites, though we also have to keep in mind that these figures go beyond just the Sunny Cove architecture and also include improvements from the new LPDDR4X memory controllers.

Against a 9900K, although apples and oranges, we’re seeing 13% and 14% IPC increases. These figures likely would be higher on an eventual desktop Sunny Cove part.
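The per-GHz figures can be sanity-checked from the rounded numbers quoted in the text. The sketch below assumes performance per GHz is simply the score divided by peak clock, takes 3.9GHz for the i7-1065G7 and 5GHz for the 9900K as stated above, and infers 4.0GHz for the i7-8550U from the "100MHz lower" remark. The function names are hypothetical, and because the inputs are rounded percentages, the outputs only approximate the charted values.

```python
def per_ghz_uplift_percent(score_ratio, new_ghz, old_ghz):
    """Per-GHz gain in percent, given new/old score ratio and peak clocks."""
    return (score_ratio * old_ghz / new_ghz - 1.0) * 100.0


def implied_score_ratio(per_ghz_ratio, new_ghz, old_ghz):
    """New/old total score ratio implied by a per-GHz ratio and peak clocks."""
    return per_ghz_ratio * new_ghz / old_ghz


if __name__ == "__main__":
    ICL_GHZ, KBL_R_GHZ, CFL_GHZ = 3.9, 4.0, 5.0  # 4.0 is inferred, see above

    # 29% (int) and 34% (fp) total uplift over the 8550U, from the text.
    for name, ratio in (("int", 1.29), ("fp", 1.34)):
        gain = per_ghz_uplift_percent(ratio, ICL_GHZ, KBL_R_GHZ)
        print(f"SPEC2006 {name}: about {gain:.1f}% per-GHz gain over the 8550U")

    # 13% (int) and 14% (fp) per-GHz gain over the 9900K, from the text.
    for name, ratio in (("int", 1.13), ("fp", 1.14)):
        rel = implied_score_ratio(ratio, ICL_GHZ, CFL_GHZ)
        print(f"SPEC2006 {name}: about {rel:.2f}x the 9900K's total score")
```

With these rounded inputs the 8550U comparison works out to roughly 32% and 37%, in line with the 33% and 38% quoted once rounding is taken into account, and the 9900K comparison implies the 15W part lands at a little under 90% of the desktop chip's total score.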

SPEC2017

SPECint2017 Rate-1 Estimated Scores

SPECfp2017 Rate-1 Estimated Scores

SPEC2017 Rate-1 Estimated Total

The SPEC2017 results look similar to the 2006 ones. Against the 8550U, we’re seeing large performance uplifts, with the new part landing just shy of the best desktop processors.

SPEC2017 Speed Estimated Performance Per GHz

Here the IPC increases also look extremely solid. In the SPECint2017 suite the Ice Lake part achieves a 14% increase over the 9900K, and we see a very impressive 21% increase in the fp suite.

Overall in the 2017 suite, we’re seeing a 19% increase in IPC over the 9900K, which roughly matches Intel’s advertised metric of 18% IPC increase.

261 Comments

View All Comments

Phynaz - Friday, August 2, 2019 - link
What? TDP doesn’t mean what you think it does.
Alexvrb - Monday, August 5, 2019 - link
I didn't feel like quoting the entire paragraph. But please DO elaborate. Then tell me how useful TDP is when they let OEMs set PL2 and Tau to... anything, really. You can take two "95W" processors and their power and thermals under load are radically different across a range of mainboards. This is reflected in mobile as well, where they let OEMs do pretty much whatever - the results aren't constrained by the processor no matter what the claimed TDP is. That doesn't even COUNT overclocking.

Meanwhile AMD chips don't hand over control to mainboards unless you ARE overclocking, which is how it SHOULD be.
Alistair - Friday, August 2, 2019 - link
I didn't see any discussion or comparison vs. the i7-9850H. Let's see a 28W TDP version of the 6 core i7-9850H put against these new chips. Same money, 50 percent more cores. Anyone in their right mind should be looking for an i7-9850H or 9750H laptop instead over these 10nm products. Where is the 6 core 10nm CPU? Don't buy a 4 cores laptop if you're looking for good performance in 2019-2020 imo.

If you want a 4 core laptop get a cheaper 14nm based laptop. If you want performance get a 6 core. I really really don't see the point in these products.
Alexvrb - Friday, August 2, 2019 - link
They gotta do *something* with all those 10nm wafers. Ian can't eat them all, and China said they don't want any more half-baked 10nm products after the last go-around. Maybe in 2020 we'll see 10nm++ and it will be as good as phase one 10nm was supposed to be.

But yeah, their current 10nm products are a bit disappointing outside of the fatter GPUs and better memory speeds. If you're using something with a dGPU there's little point vs their own 14++, it only starts to make sense if you want AMD-like iGPU performance with the latest Core processor design. Even then that's only limited to models with a high EU count (48+) as the 32 EU models just look meh.

They're going to have some stiff competition when 7nm Zen 2 APUs launch. I guess that's why they're attacking the low-power first, as AMD is still stuck on 12nm rehash Zen+ products for now.
InvidiousIgnoramus - Friday, August 2, 2019 - link
I still find it amusing that the architecture with "Ice" in its name has low clock speeds presumably from power/heat issues.
abufrejoval - Friday, August 2, 2019 - link
Great work! And kudos to AMD to make Intel work so much harder to get good news out!

Two die carrier layouts but the chips looking identical:

First of all, I assume that the bigger and square chip is essentially the North-Bridge in 14nm?

And the smaller rectangular one the CPU+iGPU?

And I guess at 64EU we are talking about more than 60% of die area going to iGPU while even at quad core and AES-512 the CPU + cache will be perhaps 30%?

Is there any HSA or GPGPU compute to 'pay' for that iGPU surface and power in professional workloads?

Or is it really just for gaming?

Am I also correct to assume that of the extra thermal budget in the 28Watt parts, none really goes to the CPU, only allows it to stay within the 15 Watt envelope while the iGPU is also running?

Are we talking different die layouts and sizes for dual/quad CPUs and 64/32 iGPU EUs or is it really all just binning, meaning that an Core i3-1000G1 is a chip where 70% surface area of an Core i7-1060G7 failed to make it?

Why am I thinking they are heading down a path without consumer value returns?

I got a Lenovo S730 i7-8565U or Whisky Lake recently for a little over €1000 and I got a couple of J5005 Atoms recently for a little over €100 (admittedly complete notebook vs. RAM less Mini-ITX mainboard). The difference in power is 15 vs 10 Watts.

Both are fairly competent 2D machines even at 4k. Both are terrible gaming machines, but I don't really think that ultrabook portable gaming performance is a selling point.

If I were free to choose CPU vs. GPU real-estate, I'd definitely go left, say 6 or 8 CPU cores or just higher sustained turbos and make do with the J5005's 18 iGPU EUs, because CPU power is what I profit from professionally.

For GPU, every € I spend gets me vastly more gaming experience in less mobile form factors, which is fine: I don't see how I could run in a game and outside without breaking my newest toy.
Sahrin - Friday, August 2, 2019 - link
$426 for a quad core in 2019. What a time to be alive.
eva02langley - Friday, August 2, 2019 - link
So basically... expensive, low yield, 4 cores, low frequency.

Outside of better IGPU, barely matching AMD offering, and AVX512, which is not even a matter for a 4 cores CPU, 10nm is an abysmal failure.
Phynaz - Friday, August 2, 2019 - link
So basically....you’re an imbecile
Korguz - Friday, August 2, 2019 - link
your one to talk phynaz, i guess you want to be stuck on quad cores in notebooks for ever ???
