Frequency Analysis: Cutting Back on AVX2 vs Kaby Lake

Analyzing a new CPU family through a mobile chip is relatively difficult. Here we have a platform that is very much hamstrung by its thermal settings and limitations. Not only that, the BIOS adjustments available on mobile platforms are woeful in comparison to what we can test on desktop. This applies to the Intel NUC that came to retail in December as well as the Lenovo Ideapad 330-15ICN that we have for testing.

The issue is that a 15W processor, even when built into a ‘35 W’ capable environment, might still hit thermal limits depending on the configuration. We’ve covered why Intel’s TDP often bears little relation to power consumption, and it comes down to the different power levels that a system defines. It can also depend a lot on the individual chip: most processors have a range of valid voltage/power curves that are suitable for a given level of performance, and users could by chance get either a really good chip that stays cool or a bad chip that rides the thermal limits. Ideally we would have all comparison chips in a desktop-like environment, such as when we tested the ‘Customer Reference Board’ version of Broadwell. Instead, we have to attach as big a cooling system as we can, along with extra fans, just in case. Otherwise chip-to-chip and thermal variations can affect performance.

For our testing, we chose Intel’s Core i3-8130U mobile processor as the nearest competition. This is a Kaby Lake dual core processor which, despite the higher number in its name, uses the older 14nm process and the older Kaby Lake microarchitecture. This processor is a 15W part, like our Cannon Lake Core i3-8121U, with the same base frequency but a slightly higher turbo frequency. On paper, then, this older 14nm processor should be more efficient than the part built on Intel’s latest 10nm process, because it offers higher turbo clocks within the same 15W TDP. On top of this, the Core i3-8130U has active integrated graphics, while the Cannon Lake CPU does not.

Because both CPUs have turbo modes, it’s important to characterize the frequencies during testing. Here are the specifications and turbo tables for each processor:

Comparing Cannon Lake to Kaby Lake
10nm Cannon Lake Core i3-8121U | AnandTech         | 14nm Kaby Lake Core i3-8130U
2 / 4                          | Cores / Threads   | 2 / 4
15 W                           | Rated TDP         | 15 W
2.2 GHz                        | Base Frequency    | 2.2 GHz
3.2 GHz                        | Single Core Turbo | 3.4 GHz
3.1 GHz                        | Dual Core Turbo   | 3.4 GHz
2.2 GHz                        | AVX2 Frequency    | 2.8 GHz
1.8 GHz                        | AVX512 Frequency  | -

Based on our testing, the Cannon Lake processor loses a little frequency as more cores are loaded, and loses it severely under AVX2 and AVX512: it falls back to its 2.2 GHz base frequency for AVX2 and to 1.8 GHz for AVX512. The Kaby Lake part, on Intel’s mature 14nm node, keeps its full turbo on both cores and loses only 600 MHz with AVX2. It does not support AVX512, which is one point in Cannon Lake’s favor.
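To put numbers on those drops, the short Python sketch below computes how far each chip falls under AVX2 and AVX512. The frequencies are copied from the specification table. Using the dual core turbo as the reference point is an assumption made for illustration (AVX-heavy tests tend to load both cores), and the names `CHIPS`, `drop_percent` and `avx_drops` are hypothetical helpers, not part of any test suite.

```python
# Frequencies in GHz, copied from the specification table above.
# None marks an instruction set the chip does not support.
CHIPS = {
    "Core i3-8121U (Cannon Lake)": {"dual_core_turbo": 3.1, "avx2": 2.2, "avx512": 1.8},
    "Core i3-8130U (Kaby Lake)": {"dual_core_turbo": 3.4, "avx2": 2.8, "avx512": None},
}


def drop_percent(reference_ghz, loaded_ghz):
    """Return the frequency lost under load as a percentage of the reference."""
    return (reference_ghz - loaded_ghz) / reference_ghz * 100.0


def avx_drops(chips, reference="dual_core_turbo"):
    """Map each chip to its AVX2/AVX512 drop relative to the chosen reference.

    The choice of reference clock is an assumption of this sketch.
    """
    result = {}
    for name, freqs in chips.items():
        result[name] = {
            mode: None if freqs[mode] is None
            else drop_percent(freqs[reference], freqs[mode])
            for mode in ("avx2", "avx512")
        }
    return result


if __name__ == "__main__":
    for name, drops in avx_drops(CHIPS).items():
        for mode, pct in drops.items():
            label = "n/a" if pct is None else f"{pct:.1f}% below dual core turbo"
            print(f"{name:30s} {mode:7s} {label}")
```

With the table values, Cannon Lake gives up roughly 29% of its dual core turbo under AVX2 and roughly 42% under AVX512, while Kaby Lake gives up roughly 18% under AVX2.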

The biggest discrepancy we observed for AVX2 was in our POV-Ray test.

Here the Kaby Lake processor sustains a much higher AVX2 frequency and completes the test more quickly, for 26% better performance. This doesn’t affect every test, as we’ll see in the next few pages, and in AVX-512 capable tests the Cannon Lake goes above and beyond despite its low AVX-512 frequency. For example, at 2.2 GHz, the Kaby Lake chip scores 615 in our 3DPM test in AVX2 mode, whereas the Cannon Lake chip scores 3846 in AVX512 mode, over 6x higher.
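The two comparisons above are simple ratios, and a small Python sketch makes the arithmetic explicit. The 3DPM scores and the 26% figure come from the text. Treating performance as inversely proportional to runtime is an assumption of this sketch, and `score_ratio` and `relative_runtime` are hypothetical helper names.

```python
def score_ratio(new_score, baseline_score):
    """How many times higher new_score is than baseline_score."""
    return new_score / baseline_score


def relative_runtime(performance_gain_percent):
    """Fraction of the baseline runtime needed by a chip that is faster by
    the given percentage, assuming performance = work / time."""
    return 1.0 / (1.0 + performance_gain_percent / 100.0)


if __name__ == "__main__":
    # 3DPM: Cannon Lake in AVX512 mode vs Kaby Lake in AVX2 mode.
    print(f"3DPM ratio: {score_ratio(3846, 615):.2f}x")
    # POV-Ray: Kaby Lake is 26% faster than Cannon Lake.
    print(f"POV-Ray runtime vs Cannon Lake: {relative_runtime(26):.1%}")
```

Under that assumption, the 3DPM ratio works out to about 6.25x, and a 26% performance lead means the Kaby Lake chip needs about 79% of the Cannon Lake chip's time to finish the same POV-Ray workload.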

The system we are using for the Core i3-8130U is ASUS’ PN60 Mini-PC. This device is an ultra-compact mini-PC that measures 11.5cm square and under 5cm tall. It is just big enough for me to install our standard Crucial MX200 1TB SSD and 2x4GB of G.Skill DDR4-2400 SO-DIMMs.

For the Cannon Lake based Lenovo Ideapad 330-15ICN, we removed the low-end SSD and HDD that were shipped with the design and put in our own Crucial MX200 1TB and 2x4 GB DDR4 SO-DIMMs for testing. Unfortunately we can’t probe the exact frequency the memory is running at, nor the sub-timings, because of the nature of the system. However, the default SPD of the modules is DDR4-2400 17-17-17.
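For reference, the SPD figures translate into latency and bandwidth as in the Python sketch below. It assumes the modules actually run at their SPD default of DDR4-2400 CL17, which we could not confirm on the Lenovo system, and that each channel is the standard 64 bits wide. The dual-channel figure is likewise an assumption about how the two SO-DIMMs are wired, and the function names are hypothetical.

```python
def cas_latency_ns(cas_cycles, transfer_rate_mts):
    """CAS latency in nanoseconds.

    DDR transfers twice per clock, so the memory clock in MHz is half the
    transfer rate in MT/s, and one clock cycle lasts 2000 / rate nanoseconds.
    """
    return cas_cycles * 2000.0 / transfer_rate_mts


def peak_bandwidth_gbs(transfer_rate_mts, channels=1, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s for 64-bit (8-byte) channels."""
    return transfer_rate_mts * 1e6 * bus_bytes * channels / 1e9


if __name__ == "__main__":
    # Assumed: SPD default DDR4-2400 with CL17, as quoted in the text.
    print(f"CAS latency: {cas_latency_ns(17, 2400):.2f} ns")
    print(f"Peak, one channel:  {peak_bandwidth_gbs(2400):.1f} GB/s")
    # Assumed: both SO-DIMMs populate separate channels.
    print(f"Peak, two channels: {peak_bandwidth_gbs(2400, channels=2):.1f} GB/s")
```

These are theoretical ceilings; if the memory is running below DDR4-2400 or with looser timings, the real figures will be lower.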


129 Comments


  • Gondalf - Friday, January 25, 2019 - link

    For now they have nothing out in cpu departement, so i don't see any AMD bright year in front of us.
    I remember you we are already in 2019.
  • vegajf51 - Friday, January 25, 2019 - link

    Icelake Desktop 3q 2020, intel will have another 14nm refresh before then.
  • HStewart - Saturday, January 26, 2019 - link

    Intel is expected to release 10nm+ with Covey Lake by Christmas seasons. This canon lake chip is just a test chip.
  • pugster - Friday, January 25, 2019 - link

    Thanks for the review. While the performance is not great, what about the power consumption compared with the 8130U?
  • Yorgos - Friday, January 25, 2019 - link

    it's not great obviously when you are stuck at 2.2GHz, while the prev gen cpu with the same capabilities(except the avx) can go up to 3.4GHz.
    I bet the 8130 would've been faster even if configured at 10Watt TDP.
  • Yorgos - Friday, January 25, 2019 - link

    ...and before jumping on me about that "stuck at 2.2GHz" let me report this:
    in certain loads the locked freq is slower than the unlocked one.
    What does this mean? it most probably means that the unlocked freq makes the cpu run hot, throttle and then try to balance between temperature and consumption.

    and a subnote on this. I think Intel should stop pushing the AVX instructions. It doesn't work as intended, it's not needed in most cases, especially when you have to design 256bit buses for 512bit data transfer on a low power cpu. Also it takes a lot of space on the die, it taxes the cache buses and it's useless when you disable your igpu(which is a good SIMD machine but not hUMA) and you have a dGPU up all the time just rendering your desktop.
    They should try focusing on HSA/hUMA on their cpus+igpus instead of integrating wide SIMD instructions inside their cores.
  • 0ldman79 - Saturday, January 26, 2019 - link

    Thing is when AVX2 and AVX512 are used the performance increase can be rather massive.

    PCSX2, PS2 emulator, runs identically between my 3.9GHz Ivy Bridge Xeon (AVX) and my 2.8GHz i5 Skylake mobile (AVX2).

    AVX2 makes several games playable. You can choose your plugin and the AVX plugin cannot play Gran Turismo 4 @ 2.8GHz, the AVX2 plugin can.

    You may not find it useful, others do.
  • HStewart - Saturday, January 26, 2019 - link

    It would be interesting to see the emulator re-factor to work with AVX 512 - it would like be twice the speed of AVX 2
  • levizx - Sunday, January 27, 2019 - link

    Nope, even with the simplest data set where AVX512 can perform twice the speed of AVX2 per cycle, the frequency has to drop significantly (~30% on Xeon Gold 5120 for example), so the upper limit is more like 40% gain. And that's PURE AVX512 code, you won't get that in real life. Assuming 50% AVX2 and 50% AVX512 code - that's a very generous assumption for non-datacentre usage, you'll have a 5% net gain.
  • levizx - Sunday, January 27, 2019 - link

    5%~20% net gain, depending on how the scaling works.
