Cache Improvements

The shared L1 instruction cache grew in size with Steamroller, although AMD isn’t telling us by how much. Bulldozer featured a 2-way 64KB L1 instruction cache, with each “core” using one of the ways. This approach gave Bulldozer less cache per core than previous designs, so the increase here makes a lot of sense. AMD claims the larger L1 can reduce i-cache misses by up to 30%. There’s no word on any possible changes to L1 d-cache sizes.

Although AMD doesn’t like to call it a cache, Steamroller now features a decoded micro-op queue. As x86 instructions are decoded into micro-ops, the address and decoded op are both stored in this queue. Should a fetch come in for an address that appears in the queue, Steamroller’s front end will power down the decode hardware and simply service the fetch request out of the micro-op queue. This is similar in nature to Sandy Bridge’s decoded uop cache, although it is likely smaller. AMD wasn’t willing to disclose how many micro-ops could fit in the queue, other than to say that it’s big enough to get a decent hit rate.
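
To make the lookup behaviour concrete, here is a minimal Python sketch of a decoded micro-op queue. It is a toy model, not AMD's design: the capacity, the oldest-first eviction policy, and the names MicroOpQueue, fetch and toy_decode are all assumptions made for illustration, since AMD has disclosed none of these details.

```python
from collections import OrderedDict


class MicroOpQueue:
    """Toy model of a decoded micro-op queue keyed by instruction address."""

    def __init__(self, capacity):
        self.capacity = capacity      # hypothetical; the real size is undisclosed
        self.entries = OrderedDict()  # address -> list of decoded micro-ops
        self.hits = 0                 # fetches served without using the decoders
        self.decodes = 0              # fetches that had to go through decode

    def fetch(self, address, decode):
        """Return the micro-ops for an address, decoding only on a miss."""
        if address in self.entries:
            self.hits += 1            # the decode hardware could stay powered down
            return self.entries[address]
        self.decodes += 1
        ops = decode(address)
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry (assumed policy)
        self.entries[address] = ops
        return ops


def toy_decode(address):
    """Stand-in for the x86 decoders: one placeholder micro-op per address."""
    return [f"uop@{address:#x}"]


# A small loop whose body fits in the queue: only the first pass needs decoding.
queue = MicroOpQueue(capacity=4)
loop_body = [0x100, 0x104, 0x108]
for _ in range(10):
    for addr in loop_body:
        queue.fetch(addr, toy_decode)

print(queue.decodes, queue.hits)
```

In this toy loop only the first pass through the three addresses uses the decoders; every later fetch is served from the queue, which is the case where the front end described above could keep its decode hardware powered down.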
 
The L1 to L2 interface has also been improved. Some queues have grown and logic is improved.
 
 
Finally, on the caching front, Steamroller introduces a dynamically resizable L2 cache. Based on workload and hit rate in the cache, a Steamroller module can choose to resize its L2 cache (powering down the unused slices) in steps of one quarter of its capacity. AMD believes this is a huge power win for mobile client applications such as video decode (not so much for servers), where the CPU only has to wake up for short periods of time to run minor tasks that don’t have large L2 footprints. The L2 cache accounts for a large chunk of AMD’s core leakage, so shutting half or more of it down can definitely help with battery life. The resized cache is no faster (same access latency); it just consumes less power.
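
The sizing decision can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not AMD's actual policy: the real hardware reacts to workload and hit rate, whereas this sketch uses a known working-set footprint as a stand-in. The 2048KB cache size, the rule that at least one quarter stays powered, and the idea that L2 leakage scales linearly with the number of powered slices are all hypothetical.

```python
import math

L2_QUARTERS = 4  # the cache resizes in quarter steps


def active_quarters(footprint_kb, l2_size_kb):
    """Smallest number of quarter slices that still covers the footprint.

    Assumes at least one quarter always stays powered (hypothetical).
    """
    quarter_kb = l2_size_kb / L2_QUARTERS
    needed = math.ceil(footprint_kb / quarter_kb)
    return max(1, min(L2_QUARTERS, needed))


def relative_l2_leakage(quarters):
    """L2 leakage relative to a fully powered cache.

    Assumes leakage is proportional to the number of powered slices.
    """
    return quarters / L2_QUARTERS


L2_SIZE_KB = 2048  # illustrative size only, not a disclosed Steamroller figure
for footprint_kb in (300, 1100, 5000):
    quarters = active_quarters(footprint_kb, L2_SIZE_KB)
    print(footprint_kb, quarters, relative_l2_leakage(quarters))
```

Under these assumptions a light task with a 300KB footprint would need only one quarter of the cache, while anything larger than the full cache simply keeps all four quarters powered.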
 
Steamroller brings no significant reduction in L2/L3 cache latencies. AMD says it has isolated the reason for the unusually high L3 latency in the Bulldozer architecture, but fixing it isn’t a top priority. Given that most consumers (read: notebooks) will only see L3-less processors (e.g. Llano, Trinity), and many server workloads are less sensitive to latency, AMD’s stance makes sense.
 

Looking Forward: High Density Libraries

 
This one falls into the reasons-we-bought-ATI column: future AMD CPU architectures will employ higher levels of design automation and new high density cell libraries, both heavily influenced by AMD’s GPU group. Automated place and route is already commonplace in AMD CPU designs, but AMD is going even further with this approach.
 
The methodology comes from AMD’s work in designing graphics cores, and we’ve already seen some of it used in AMD’s ‘cat cores (e.g. Bobcat). As an example, AMD demonstrated a 30% reduction in area and power consumption when these new automated procedures with high density libraries were applied to a 32nm Bulldozer FPU.

The power savings come from not having to route clocks and signals as far, while the area savings are a result of the computer-automated transistor placement/routing and higher density gate/logic libraries.
 
The tradeoff is peak frequency. These heavily automated designs won’t be able to clock as high as the older hand-drawn designs. AMD believes the sacrifice is worth it, however, because in power-constrained environments (e.g. a notebook) you won’t hit max frequency regardless, and you’ll instead see a 15-30% energy reduction per operation. AMD equates this with the power savings you’d get from a full process node improvement.
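
A quick bit of arithmetic shows what a 15-30% per-operation energy reduction means for a fixed energy budget such as a battery. The short Python sketch below is only a worked calculation; the function name ops_per_joule_gain is ours, and it assumes the quoted reduction applies uniformly to every operation.

```python
def ops_per_joule_gain(energy_reduction):
    """Factor by which operations per joule rise when energy per operation drops.

    If each operation costs (1 - r) of its former energy, the same energy
    budget pays for 1 / (1 - r) times as many operations.
    """
    return 1.0 / (1.0 - energy_reduction)


for reduction in (0.15, 0.30):
    gain = ops_per_joule_gain(reduction)
    print(f"{reduction:.0%} less energy per op -> {gain:.2f}x ops per joule")
```

Under that assumption, the quoted range works out to roughly 1.18x to 1.43x as much work from the same energy budget.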
 
We won’t see these new libraries and automated designs in Steamroller, but rather its successor in 2014: Excavator.
 

Final Words

 
Steamroller seems like a good evolutionary improvement to AMD’s Bulldozer and Piledriver architectures. While Piledriver focused more on improving power efficiency, Steamroller should make a bigger impact on performance.
 
The architecture is still slated to debut in 2013 on GlobalFoundries' 28nm bulk process. The improvements look good on paper, but the real question remains whether or not Steamroller will be enough to go up against Haswell.

126 Comments


  • Origin64 - Thursday, August 30, 2012 - link

    Just like Phenom II was what Phenom should've been, but by then it was too late. AMD is always a generation behind.
    In the notebook market this isn't much of a problem, Intel's even further behind there, but in the mid-end desktop chips it shows. Which is a shame, because mid-end Intel is way too expensive.

    Although I still insist that this Bulldozer/steamroller/whatever architecture will have its 15 months of fame when games start running on octocores.
  • Spunjji - Thursday, August 30, 2012 - link

    Pretty much what you said at the start there. God knows I miss AMD being competitive in the CPU market, but in anything but a "value" sense I don't see them bringing that game for another 2/3 years, if ever.
  • Dracconus - Friday, November 30, 2012 - link

    If you think that AMD has "always been a generation behind" then you're seriously mistaken. The AMD Athlon 64 series STOMPED the living HELL out of the Pentium 4 series processors and cost less. AMD WAS good at single threaded applications, but they started focusing on the growth abilities of the 64 bit architecture, and lost track of what would in the end be most important. You can't fault a company for looking to the future, and attempting to expand their horizons. Had software developers thought more about the future of hardware instead of the present limitations then things would have gone in AMD's favor CONSIDERABLY.

    Intel has good processors, we'll give them that. But where they have ALWAYS lacked is price-performance ratio. They don't scale, overclock, cool, or deal with heat as well, and up until the I5 series they BARELY managed to give two shits about power consumption.
    Yes, Intel is better for RAW performance, but quite frankly, how many average gamers are going to be able to afford a 2 thousand dollar processor just to play their favorite game in six years? NONE. How many enthusiasts...plenty.
    AMD serves a greater portion of the population, and they know it. They just got freaking lazy, and it started to show.
    They have a chance to pick it back up, and it's up to them to admit they slipped up, but don't get fooled. Even IF AMD slips, they'll still have budget minded consumers worried about price to performance ratios, and will ALWAYS have customers as long as they're in business solely because of the economic standstill the world is in.
  • yankeeDDL - Tuesday, August 28, 2012 - link

    Any idea when the first legitimate benchmarks could start to surface?
    The lack of competition in the CPU market is not healthy for users, that's for sure.
    I'd love to see AMD back in the game in other areas, in addition to Netbooks (with Brazos) and Value (Llano offers pretty good bang for the buck).
  • SpamHammer - Wednesday, August 29, 2012 - link

    I fail to see how it's been negative to users? Have you seen the cost to performance ratio of Intel's Sandy Bridge and Ivy Bridge chips?? I mean, seriously! When the Core i5 2500k came out, UN-overclocked, it was able to go toe-to-toe with the $1,000 Intel Core Extreme from the generation before! All this from a chip that runs $220?? That's insane value!

    The value is only furthered when you take into account its low thermal output, and its high overhead for overclocking. I have mine OC'd to 4.1GHz, and that's not even beginning to stretch it. I've seen them OC'd to 4.5GHz regularly on air cooling. This isn't "hard" or "the exception to the rule"; it is the norm for these chips. And that's just Sandy Bridge! Ivy Bridge offers a 10-15% improvement right out of the box!

    Hell, the Core i3 2100, running *only* $120, despite being just a dual-core chip, is able to easily wipe the floor with even AMD's octo-core Bulldozer and Piledriver chips, in just about every gaming and synthetic benchmark, despite that chip costing nearly twice as much!
    It powers my brother's gaming PC, and he's able to run Battlefield 3 on Ultra at 1600x900 (his monitor's resolution) with 50+FPS!
    Thanks to all this "non competitive consumer screwing" you're preaching about, I was able to build his entire rig, sans monitor, for $469 shipped!
    I mean, you couldn't ask for a better time to buy new PC gear!
  • thehat2k5 - Wednesday, August 29, 2012 - link

    "Battlefield 3 on Ultra at 1600x900 (his monitor's resolution) with 50+FPS!" " for $469 shipped!" Yeah right. Maybe if there was a 75% off sale on video cards where you bought it. BF3 on ultra requires at least a $500 video card, regardless of how much you cheap out on a current cpu.
    Hell, if you came into my shop with a budget of $469 shipped, i have 7 employees that will laugh at you and kindly hand you a business card from Best Buy with two letters on the back....HP.
    that said, the best bang for the buck gaming cpu is the AMD FX4100 for about $140. Why go weak i3 dual core when you can go mid range quad from AMD for $20 more. I like your fairy tale, almost as much as I like some of the ones in the bible.
  • StevoLincolnite - Wednesday, August 29, 2012 - link

    A $500 video card just for Battlefield 3? Seriously? Lol? With that kind of ignorance, I would never want to buy from your shop.
  • thehat2k5 - Wednesday, August 29, 2012 - link

    If you want it running on Ultra in the middle of a firefight at min. 60fps, you bet! Considering our customers are buying 21.5" LCD's with resolutions of 1980x1050 as a minimum.
    Even our customers would laugh at the claim of BF3 on Ultra for $469. Sorry guys, i'm no AMD "fanboy", but around here we call a spade a spade. This dude's claim is fantasy based on a bath salts hallucination.
  • taltamir - Wednesday, August 29, 2012 - link

    Your customers are buying 1980x1050 resolution monitors but he explicitly stated his brother is running on a 1600x900
    Also he said 50FPS+ not 60FPS steady like you are claiming.
  • thehat2k5 - Wednesday, August 29, 2012 - link

    I would like to personally see that running in Ultra at 50fps, even on a 1600x900. If i can sell BF3 Ultra desktops for under $500, i'm going to open a few more stores and put Best Buy's computer department out of business lol
