Cache and Memory Performance

I mentioned earlier that cache latencies are higher in order to accommodate the larger caches (8MB L2 + 8MB L3) as well as the high frequency design. We turned to our old friend cachemem to measure these latencies in clocks:

Cache/Memory Latency Comparison (in clocks)
CPU                                 L1    L2    L3    Main Memory
AMD FX-8150 (3.6GHz)                4     21    65    195
AMD Phenom II X4 975 BE (3.6GHz)    3     15    59    182
AMD Phenom II X6 1100T (3.3GHz)     3     14    55    157
Intel Core i5 2500K (3.3GHz)        4     11    25    148

Cache latencies are up significantly across the board, which is to be expected given the increase in pipeline depth as well as cache size. But is Bulldozer able to overcome the increase through higher clocks? To find out we have to convert latency in clocks to latency in nanoseconds:

Memory Latency

We disable turbo in order to get predictable clock speeds, which lets us accurately calculate memory latency in ns. The FX-8150 at 3.6GHz has a longer trip down memory lane than its predecessor, also at 3.6GHz. The higher latency caches play a role in this as they are necessary to help drive AMD's frequency up. What happens if we turn turbo on and peg the FX-8150 at 3.9GHz? Memory latency goes down. Bulldozer still isn't able to get to main memory as quickly as Sandy Bridge, but thanks to Turbo Core it's able to do so better than the outgoing Phenom II.
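The clocks-to-nanoseconds conversion itself is simple arithmetic: one cycle at f GHz lasts 1/f ns, so latency in ns is clocks divided by clock speed in GHz. Here is a minimal sketch in Python that applies it to the table above. The names `LATENCY_CLOCKS`, `LEVELS`, `clocks_to_ns` and `latencies_ns` are ours for illustration and are not part of cachemem, and the results are derived only from the table's clock counts at the listed fixed clock speeds, so they may not match the charted values exactly.

```python
# Latencies in clocks, copied from the comparison table above.
# Each entry: (clock speed in GHz, L1, L2, L3, main memory).
LATENCY_CLOCKS = {
    "AMD FX-8150": (3.6, 4, 21, 65, 195),
    "AMD Phenom II X4 975 BE": (3.6, 3, 15, 59, 182),
    "AMD Phenom II X6 1100T": (3.3, 3, 14, 55, 157),
    "Intel Core i5 2500K": (3.3, 4, 11, 25, 148),
}

LEVELS = ("L1", "L2", "L3", "Main Memory")


def clocks_to_ns(clocks, ghz):
    """One cycle at `ghz` GHz lasts 1/ghz ns, so latency_ns = clocks / ghz."""
    return clocks / ghz


def latencies_ns(table=LATENCY_CLOCKS):
    """Return {cpu: {level: latency in ns}} for every CPU in the table."""
    result = {}
    for cpu, (ghz, *clocks) in table.items():
        result[cpu] = {lvl: clocks_to_ns(c, ghz) for lvl, c in zip(LEVELS, clocks)}
    return result


if __name__ == "__main__":
    for cpu, levels in latencies_ns().items():
        row = "  ".join(f"{lvl}: {ns:6.2f} ns" for lvl, ns in levels.items())
        print(f"{cpu:<26}{row}")
```

With the table's numbers, 195 clocks at 3.6GHz works out to roughly 54 ns of main memory latency for the FX-8150, against about 51 ns for the Phenom II X4 975 BE at the same clock. The sketch does not model Turbo Core: clock counts measured at 3.6GHz cannot simply be divided by 3.9GHz, because the number of core clocks a memory access takes changes with core frequency.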

L3 Cache Latency

L3 access latency is effectively a wash compared to the Phenom II thanks to the higher clock speeds enabled by Turbo Core. Latencies haven't really improved though, and Bulldozer has a long way to go before it reaches Sandy Bridge access latencies.


428 Comments


  • Kristian Vättö - Wednesday, October 12, 2011 - link

    I'm happy that I went with i5-2500K. Performance, especially in gaming, seems to be pretty horrible.
  • ckryan - Wednesday, October 12, 2011 - link

    I was just going to say the same thing. I was all about AMD last year, but early this year I picked up an i5 2500K and was blown away by efficiency and performance even in a hobbled H67. Once I bought a proper P67, it was on. It's not that Bulldozer is terrible (because it isn't); Sandy Bridge is just a "phenom". If SB had just been a little faster than Lynnfield, it would still be fast. But it's a big leap to SB, and it's certainly the best value. AMD has Bulldozer, an inconsistent performer that is better in some areas and worse in others, but has a hard time competing with its own forebear. It's still an unusual product that some people will really benefit from and some won't. The demise of the Phenom II can't come soon enough for AMD, as some people will look at the benchmarks and conclude that a super cheap X4 955BE is a much better value than BD. I hope it isn't seen that way, but it's not a difficult conclusion to reach. Perhaps BD is more forward looking, and the other octocore will be cheaper than the 8150 so it's a better value. I'd really like to see the performance of the 4- and 6- before making judgement.

    It's still technically a win, but it's a Pyrrhic victory.
  • ogreslayer - Wednesday, October 12, 2011 - link

    I tell friends that exact thing all the time. Phenoms are great CPUs but switch to Nehalem or Sandy Bridge and the speed is noticeably different. At equal clocks Core 2 Quads are as fast or faster.

    Bulldozer ends up with a lot of issues fanboys refused to see even though Anandtech and other sites did bring it up in previews. I guess it was just hope and an understandable disbelief that AMD would be behind for a decade till the next architecture. We can start at clockspeed, but only being dual-channel is not helping memory bandwidth. I don't think there is enough L3 and they most definitely should have a short pipeline to crush through processes. They need a 1.4 to 1.6 in CBmarks, or what is the point of the modules?

    The module philosophy is probably close to the future of x86 but I imagine seeing Intel keeping HT enabled on the high-end SKUs. Also I think both of them want to switch FP calculation over to GPUs.
  • slickr - Wednesday, October 12, 2011 - link

    Yeah I agree. To me Bulldozer comes like 1 year late.

    It's just not competitive enough, and the fact that you have to sacrifice single-threaded performance for multithreaded, when even the multithreaded isn't that good and loses to the 2600K, is just sad.

    They needed to win big with Bulldozer and they failed hard!
  • retrospooty - Wednesday, October 12, 2011 - link

    Ya, it seems to be a pattern lately with the last few AMD architectures.

    1. Hype up the CPU as the next big thing
    2. Release is delayed
    3. Once released, benchmarks are severely underwhelming
  • JasperJanssen - Wednesday, October 12, 2011 - link

    4. Immediately start hyping up the next release as the salvation of all.
  • GatorLord - Thursday, October 20, 2011 - link

    It looks to me like BD is the CPU beta bug sponge for Trinity and beyond. Everybody these days releases a beta before the money launch.

    Hence the B3 stepping...and probably a few more now that a capable fab is onboard with TSMC. BD is not a CPU like we're used to...it's an APU/HPC engine designed to drive code and a Cayman class GPU at 28nm and lots of GHz...I get it now.

    Also, the whole massive cache and 2B transistors, 800M dedicated to I/O, thing (SB uses 995M total) finally makes sense when you realize that this chip was designed to pump many smaller GPGPU caches full of raw data to process and combine all the outputs quickly.

    Apparently GPUs compute very fast, but have slow fetch latencies and the best way to overcome that is by having their caches continuously and rapidly filled...like from the CPU with the big cache and I/O machine on the same chip...how smart...and convenient...and fast.

    Can you say 'OpenCL'?
  • jleach1 - Friday, October 21, 2011 - link

    I don't see how this can be considered an APU. This product isn't being marketed as an HPC proc., and I don't see the benefit of this architecture design in GPGPU environments at all.

    It's sad...I've always given major kudos to AMD. Back in the days of the Athlon's prime, it was awesome to see David stomping Goliath.

    But AMD has dropped the ball continuously since then. Thuban was nice, but it might as well be considered a fluke, seeing as AMD took a worthy architecture (Thuban) and ditched it for what's widely considered as a joke.

    And the phrase "AMD dropped the ball" is an understatement.

    They've ultimately failed. They haven't competed with Intel in years. They...have...failed. After Thuban came out I was starting to think that the fact that they competed for years on price and clock speed alone was a fluke, and just a blip on the radar. Now I see it the opposite way...it seems that AMD merely puts out good processors every once in a while...and only on accident.
  • medi01 - Wednesday, October 12, 2011 - link

    Well, if anand didn't badmouth AMD's GPUs on top of CPUs, we would see less "fanboys" complaining about anand's bias.
  • vol7ron - Wednesday, October 12, 2011 - link

    By badmouth do you mean objectively tell the truth? Do you blame PCMark or FutureMark for any of that? Perhaps if all the tests just said that AMD was clearly better, it wouldn't be badmouthing anymore.
