IPC Increases: Double L1 Data Cache, Better Branch Prediction

One of the biggest changes in the design is the L1 data cache, which doubles in size from 64 KB to 128 KB across the four cores (16 KB to 32 KB per core) while keeping the same efficiency. This is combined with a better prefetch pipeline and better branch prediction to reduce the number of cache misses. The L1 data cache is also now an 8-way associative design, but with the improved prediction it activates only the one segment required for an access and powers down the rest where possible. This includes removing extra data from 64-bit word constructions. Together with better clock gating and minor adjustments, this reduces power consumption by up to 2x. It is worth pointing out that doubling the L1 cache is not always easy: it needs to be close to the branch predictors and prefetch buffers in order to be effective, but it also requires space. AMD achieved this by using the high density libraries and by prioritizing the lower level cache. Another element is latency, which normally has to increase when a cache grows in size, although AMD did not elaborate on how this was handled.
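To make the "activate only one segment" idea concrete, here is a minimal sketch of how an 8-way cache is laid out and how many ways would be powered per access under a simple prediction model. The 32 KB slice size, the 64-byte line size, and the `prediction_hit_rate` parameter are assumptions for illustration; AMD did not describe the mechanism at this level of detail, and real hardware handles a wrong guess in more steps than this model does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheGeometry:
    """Toy description of a set-associative cache (all values hypothetical)."""
    size_bytes: int
    ways: int
    line_bytes: int

    @property
    def sets(self) -> int:
        # Capacity is split into sets, each holding `ways` lines.
        return self.size_bytes // (self.ways * self.line_bytes)

    def set_index(self, address: int) -> int:
        # The line number modulo the set count selects the set.
        return (address // self.line_bytes) % self.sets


def average_ways_activated(geometry: CacheGeometry, prediction_hit_rate: float) -> float:
    """Simplified model: one way is powered on a correct guess, all ways otherwise."""
    return prediction_hit_rate * 1 + (1 - prediction_hit_rate) * geometry.ways


# Assumed geometry: 32 KB, 8-way, 64-byte lines.
l1d = CacheGeometry(size_bytes=32 * 1024, ways=8, line_bytes=64)
print(l1d.sets)                           # number of sets in this toy layout
print(average_ways_activated(l1d, 0.9))   # ways powered per access at 90% accuracy
```

In this model, a perfect guess powers one way out of eight on every access, which is where most of the saving would come from; the 2x figure quoted by AMD also includes clock gating and other adjustments that the sketch does not cover.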

As listed above, the branch prediction benefits come from a 50% increase in the size of the branch target buffer (BTB). This allows the buffer to store more historic records of previous branches, increasing the likelihood of a successful prefetch if similar work is in motion. If this requires floating point data, the FP port can initiate the quicker flush required to loop data back into the next command. Adding instruction support is nothing unusual for a new core, though AVX2 is something a number of high end software packages will be interested in using in the future.
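A toy model shows why a larger BTB helps when a loop touches more branches than the buffer can hold. This is a sketch only: it uses a direct-mapped table, and the entry counts (8 versus 12, a 50% increase) and branch addresses are made up for illustration. Real BTBs are far larger and organized differently.

```python
class ToyBTB:
    """Direct-mapped branch target buffer: each slot holds (branch_pc, target)."""

    def __init__(self, entries: int):
        self.entries = entries
        self.table = [None] * entries

    def lookup(self, pc: int):
        slot = self.table[pc % self.entries]
        # A hit requires the stored branch address to match, not just the slot.
        if slot is not None and slot[0] == pc:
            return slot[1]
        return None

    def update(self, pc: int, target: int) -> None:
        self.table[pc % self.entries] = (pc, target)


def count_hits(btb: ToyBTB, branches, passes: int = 2) -> int:
    """Run the same branch sequence several times and count correct target lookups."""
    hits = 0
    for _ in range(passes):
        for pc, target in branches:
            if btb.lookup(pc) == target:
                hits += 1
            else:
                btb.update(pc, target)
    return hits


# Hypothetical loop body with 12 distinct branches.
branches = [(pc, pc + 100) for pc in range(12)]
small_hits = count_hits(ToyBTB(8), branches)
large_hits = count_hits(ToyBTB(12), branches)
print(small_hits, large_hits)
```

With 8 entries, four pairs of branches keep evicting each other, so only a few targets survive to the second pass; with 12 entries every branch keeps its slot.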

These changes, according to AMD, translate to a 4-15% higher IPC for Excavator in Carrizo compared to Steamroller in Kaveri. This is perhaps a little more than what we would normally expect from a generational increase (4-8% is more typical), but AMD likes to stress that this comes in addition to lower power consumption and a reduced die area. As a result, at the same power Carrizo can have both an IPC advantage and a frequency advantage.

As a result, AMD states that at the same power, Cinebench single threaded results will go up 40% and multithreaded results up 55%. The benefits shrink the further up the power band you go, however, as the higher density libraries perform slightly worse than Kaveri's at higher power.
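As a rough sanity check, the single threaded claim can be split into its IPC and frequency parts. The sketch below assumes that the Cinebench score scales as IPC multiplied by clock speed and that the quoted 4-15% IPC range applies to Cinebench; AMD stated neither, so the result is a back-of-the-envelope estimate rather than a specification.

```python
def implied_frequency_gain(perf_gain: float, ipc_gain: float) -> float:
    """Frequency uplift needed to reach perf_gain, assuming performance = IPC x frequency."""
    return (1 + perf_gain) / (1 + ipc_gain) - 1


# 40% single threaded uplift at the same power, with IPC gains of 15% and 4%.
low = implied_frequency_gain(0.40, 0.15)
high = implied_frequency_gain(0.40, 0.04)
print(f"{low:.1%} to {high:.1%}")  # prints "21.7% to 34.6%"
```

Under those assumptions, most of the same-power gain would have to come from higher clocks rather than from IPC, which fits the note that the advantage narrows further up the power band.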

137 Comments

  • VeixES - Wednesday, June 3, 2015

    Some OEM needs to pick this up fast.
    Carrizo based "NUC" device with HDMI2.0 output with more barebones approach than intel to reduce the cost of entry.
  • gostan - Wednesday, June 3, 2015

    Anandtech - AMD's marketing arm.
  • bloodypulp - Wednesday, June 3, 2015

    Dumbest thing I've heard yet today.
  • jabber - Thursday, June 4, 2015

    Everyone knows AMD has never had a marketing arm. That's why no one buys em.

    Seriously, the OEMs have moved on. Why bother with AMD when Intel sells because the average consumer has heard of Intel? Price doesn't come into it.
  • watzupken - Thursday, June 11, 2015

    To gostan, I find your comment above baseless and unconstructive to be honest. One article on AMD means AMD marketing arm. So what does that make you then?
  • l_d_allan - Wednesday, June 3, 2015

    My impression is that it will be difficult (almost impossible?) for AMD to compete with a 28nm part against Intel's 14nm parts.
    And I think the next "tick tock" from Intel will be 10nm. Or not?
  • Novacius - Wednesday, June 3, 2015

    It'll be a tick, codenamed Cannonlake. But i don't expect it before the end of 2016/beginning of 2017.
  • The_Assimilator - Wednesday, June 3, 2015

    Which will still be before AMD gets to 14nm.
  • cjs150 - Wednesday, June 3, 2015

    Finally AMD release a reasonably power efficient chip.

    At 15W this is perfect for a passively cooled HTPC with 4k capability built in. I appreciate the HTPC market is small, but AMD have something that potentially (will reserve judgment until it is out and tested) beats everything Intel have comprehensively.

    The problem for AMD will be that people like me already have a HTPC (in my case using i7-3770T which is overkill) and until the world moves to 4K there is no need to upgrade but if they produced something the size of Intel NUC but passively cooled I would be very tempted
  • watzupken - Wednesday, June 3, 2015

    I think this makes a very interesting APU. In fact, the most interesting APU from AMD to date. Unfortunately, it may not reach the shores from where I come from. It is either limited availability or the distros are not interested to carry in due to them expecting a low demand.
