Frequency Analysis: Cutting Back on AVX2 vs Kaby Lake

Analyzing a new CPU family as a mobile chip is relatively difficult. Here we have a platform that is very much hamstrung by its thermal settings and limitations. Not only that, the BIOS adjustments available for mobile platforms are woeful in comparison to what we can test on desktop. This applies to the Intel NUC that came to retail in December as well as the Lenovo Ideapad 330-15ICN that we have for testing.

The issue is that a 15W processor, even when built into a ‘35 W’ capable environment, might still hit thermal limits depending on the configuration. We’ve covered why Intel’s TDP often bears little relation to power consumption, and it comes down to the different power levels that a system defines. It can also depend a lot on how the individual chip performs: most processors have a range of valid voltage/power curves that are suitable for a given level of performance, and users could by chance get either a really good chip that stays cool or a bad chip that rides the thermal limits. Ideally we would have all comparison chips in a desktop-like environment, such as when we tested the ‘Customer Reference Board’ version of Broadwell, which came in a desktop-like design. Instead, we have to attach as big a cooling system as we can, along with extra fans, just in case. Otherwise chip-to-chip and thermal variations can affect performance.

For our testing, we chose Intel’s Core i3-8130U mobile processor as the nearest competition. This is a Kaby Lake dual core processor, which despite the higher number in its name uses the older 14nm process and the older Kaby Lake microarchitecture. This processor is a 15W part, like our Cannon Lake Core i3-8121U, with the same base frequency but a slightly higher turbo frequency. Ultimately this means that the older 14nm processor, on paper, should be more efficient than the chip built on Intel’s latest 10nm process. On top of this, the Core i3-8130U has active integrated graphics, while the Cannon Lake CPU does not.

Because both CPUs have turbo modes, it’s important to characterize the frequencies during testing. Here are the specifications and turbo tables for each processor:

Comparing Cannon Lake to Kaby Lake

AnandTech             10nm Cannon Lake    14nm Kaby Lake
                      Core i3-8121U       Core i3-8130U
Cores / Threads       2 / 4               2 / 4
Rated TDP             15 W                15 W
Base Frequency        2.2 GHz             2.2 GHz
Single Core Turbo     3.2 GHz             3.4 GHz
Dual Core Turbo       3.1 GHz             3.4 GHz
AVX2 Frequency        2.2 GHz             2.8 GHz
AVX512 Frequency      1.8 GHz             -

Based on our testing, the Cannon Lake processor loses frequency as the cores are loaded, and loses considerably more when AVX2 or AVX512 code is running. By comparison, the Kaby Lake part on Intel’s mature 14nm node keeps its full turbo with both cores loaded and only loses a few hundred MHz with AVX2. The Kaby Lake part does not support AVX512, which is one up for Cannon Lake.
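To put numbers on those drops, here is a minimal sketch that works out how much frequency each chip gives up relative to its dual core turbo. The frequencies are taken from the table above; the dictionary layout and the `drop_percent` helper are our own illustration, not part of any test tool.

```python
# Frequencies in GHz, copied from the comparison table above.
# None marks a mode the chip does not support.
FREQS = {
    "Core i3-8121U (Cannon Lake)": {"dual_core_turbo": 3.1, "avx2": 2.2, "avx512": 1.8},
    "Core i3-8130U (Kaby Lake)": {"dual_core_turbo": 3.4, "avx2": 2.8, "avx512": None},
}


def drop_percent(reference_ghz, loaded_ghz):
    """Percentage of frequency lost relative to the reference frequency."""
    return (reference_ghz - loaded_ghz) / reference_ghz * 100.0


for chip, freqs in FREQS.items():
    reference = freqs["dual_core_turbo"]
    for mode in ("avx2", "avx512"):
        loaded = freqs[mode]
        if loaded is None:
            print(f"{chip}: {mode} not supported")
            continue
        print(f"{chip}: {mode} runs {drop_percent(reference, loaded):.1f}% below dual core turbo")
```

On these figures the Cannon Lake part gives up roughly 29% of its dual core turbo under AVX2, against roughly 18% for Kaby Lake.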

The biggest discrepancy we observed for AVX2 was in our POV-Ray test.

Here the Kaby Lake processor sustains a much higher AVX2 frequency and finishes the test sooner, for 26% better performance. This doesn’t affect every test, as we’ll see in the next few pages, and in AVX-512 capable tests Cannon Lake goes above and beyond despite its low AVX-512 frequency. For example, at 2.2 GHz, the Kaby Lake chip scores 615 in our 3DPM test in AVX2 mode, whereas the Cannon Lake chip scores 3846 in AVX-512 mode, over 6x higher.
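The arithmetic behind those two figures can be checked with a short sketch. The 3DPM scores are the ones quoted above; the helper names are ours, and the conversion from a performance gain to run time saved assumes a higher-is-better score that scales inversely with run time.

```python
def speedup(new_score, baseline_score):
    """Ratio of two higher-is-better benchmark scores."""
    return new_score / baseline_score


def time_saved_percent(perf_gain_percent):
    """Run time saved for a given performance gain, assuming score ~ 1 / time."""
    return (1.0 - 1.0 / (1.0 + perf_gain_percent / 100.0)) * 100.0


# 3DPM: Cannon Lake in AVX-512 mode vs Kaby Lake in AVX2 mode.
print(f"3DPM ratio: {speedup(3846, 615):.2f}x")

# POV-Ray: what a 26% performance lead means in run time, under the assumption above.
print(f"A 26% lead is about {time_saved_percent(26):.1f}% less run time")
```

Note that a 26% higher score is not the same as 26% less time: under the inverse-time assumption it works out to about a fifth less.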

The system we are using for the Core i3-8130U is ASUS’ PN60 Mini-PC. This device is an ultra-compact mini-PC that measures 11.5cm square and under 5cm tall. It is just big enough for me to install our standard Crucial MX200 1TB SSD and 2x4GB of G.Skill DDR4-2400 SO-DIMMs.

For the Cannon Lake based Lenovo Ideapad 330-15ICN, we removed the low-end SSD and HDD that were shipped with the design and put in our own Crucial MX200 1TB and 2x4 GB DDR4 SO-DIMMs for testing. Unfortunately we can’t probe the exact frequency the memory is running at, nor the sub-timings, because of the nature of the system. However, the default SPD of the modules is DDR4-2400 17-17-17.

129 Comments


  • qcmadness - Saturday, January 26, 2019 - link

    I am more curious on the manufacturing node. Zen (14 / 12nm from GF) has 12 metal layers. Cannon Lake has 13 metal layers, with 3 quad-patterning and 2 dual patterning. How would these impact the yield and manufacturing time of production? I think the 3 quad-patterning process will hurt Intel in the long run.
  • KOneJ - Sunday, January 27, 2019 - link

    More short-run I would say actually. EUV is coming to simplify and homogenize matters. This is a patch job. Unfortunately, PL analysis and comparison is not an apples-to-apples issue as there are so many facets to implementation in various design stages. A broader perspective that encompasses the overall aspects and characteristics is more relevant IMHO. It's like comparing a high-pressure FI SOHC motor with a totally unrelated low-pressure FI electrically-spooling DOHC motor of similar displacement. While arguing minutiae about design choices is interesting to satisfy academic curiosity, it's ultimately the reliability, power-curve and efficiency that people care about. Processors are much the same. As a side note, I think it's the attention to all these facets and stages that has given Jim Keller such consistent success. Intel's shaping up for a promising long-term. The only question there is where RISC designs and AMD will be when the time comes. HSA is coming, but it will be difficult due to the inherent programming challenges. Am curious to see where things are in ten or fifteen years.
  • eastcoast_pete - Sunday, January 27, 2019 - link

    Good point and question! With the GPU functions apparently simply not compatible with Intel's 10 nm process, does anyone here know if any GPUs out there that use quad-patterning at all?
  • anonomouse - Sunday, January 27, 2019 - link

    @Ian or @Andrei Is dealII missing from the spec2006fp results table for some reason? Is this just a typo/oversight, or is there some reason it's being omitted?
  • KOneJ - Sunday, January 27, 2019 - link

    Great write up, but isn't this backwards on the third page?
    "a 2-input NAND logic cell is much smaller than a complex scan flip-flop logic cell"
    "90.78 MTr/mm^2 for NAND2 gates and 115.74 MTr/mm^2 for Scan Flip Flops"
    NAND cell is smaller than flip-flop cell, but there is more flip-flop than NAND in a square millimeter?
    Or am I missing something?
  • Rudde - Sunday, January 27, 2019 - link

    A NAND logic cell consists of 2 transistors, while a Scan flip flop logic cell can consist of a different count of transistors depending on where it is used. If I remember correctly, Intel uses 8, 10 and 12 transistor designs.
    That gives 45.39 million NAND cells per mm² (basically SRAM) and ~12 million flip-flop cells.

    The NAND cell is smaller because it consists of fewer transistors.
  • KOneJ - Sunday, January 27, 2019 - link

    It would be great if you guys could get a CNL sample in the hands of Agner Fog. He might be able to answer some of the micro-architecture questions through his tests.
  • dragosmp - Sunday, January 27, 2019 - link

    Awesome review, great in depth content and well explained. Considering the amount of work this entailed, it's clear why these reviews don't happen every day. Thanks
  • dragosmp - Sunday, January 27, 2019 - link

    I'll just add...many folks are saying AMD should kick arse. They should, but Intel has been in this situation before - they had messed up the 90nm process; probably not quite as bad as the chips to be unusable, but it opened the door to AMD and its Athlon 64. What did AMD do? Messed it up in turn with slow development and poor design choices. Hopefully they'll capitalize this time so that we get an actual duopoly, rather than the monopoly on performance we had since Intel's 65nm chips.
  • eva02langley - Sunday, January 27, 2019 - link

    Euh... You mean this...?

    https://www.youtube.com/watch?v=osSMJRyxG0k

    Anti-competitive tactics? They bought the OEM support to prevent competition.

    And, all lately, this came up...

    https://www.tomshardware.com/news/msi-ceo-intervie...

    "Relationship with Intel: Chiang told us that, given Intel's strong support during the shortage, it would be awkward to tell Intel if he chose to come out with an AMD-powered product. "It's very hard for us to tell them 'hey, we don't want to use 100 percent Intel,' because they give us very good support," he said. He did not, however, make any claims that Intel had pressured him or the company."

    Yeah right, Intel is winning because they have better tech... /sarcasm
