The Bulldozer Review: AMD FX-8150 Tested
by Anand Lal Shimpi on October 12, 2011 1:27 AM EST
Cache and Memory Performance
I mentioned earlier that cache latencies are higher in order to accommodate the larger caches (8MB L2 + 8MB L3) as well as the high frequency design. We turned to our old friend cachemem to measure these latencies in clocks:
Cache/Memory Latency Comparison (in clocks)
CPU | L1 | L2 | L3 | Main Memory
AMD FX-8150 (3.6GHz) | 4 | 21 | 65 | 195
AMD Phenom II X4 975 BE (3.6GHz) | 3 | 15 | 59 | 182
AMD Phenom II X6 1100T (3.3GHz) | 3 | 14 | 55 | 157
Intel Core i5 2500K (3.3GHz) | 4 | 11 | 25 | 148
Cache latencies are up significantly across the board, which is to be expected given the increase in pipeline depth as well as cache size. But is Bulldozer able to overcome the increase through higher clocks? To find out, we have to convert latency in clocks to latency in nanoseconds.
We disable turbo in order to get predictable clock speeds, which lets us accurately calculate memory latency in ns. The FX-8150 at 3.6GHz has a longer trip down memory lane than its predecessor, also at 3.6GHz. The higher latency caches play a role in this as they are necessary to help drive AMD's frequency up. What happens if we turn turbo on and peg the FX-8150 at 3.9GHz? Memory latency goes down. Bulldozer still isn't able to get to main memory as quickly as Sandy Bridge, but thanks to Turbo Core it's able to do so better than the outgoing Phenom II.
L3 access latency is effectively a wash compared to the Phenom II thanks to the higher clock speeds enabled by Turbo Core. Latencies haven't really improved though, and Bulldozer has a long way to go before it reaches Sandy Bridge access latencies.
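The clocks-to-nanoseconds conversion is simple arithmetic: one cycle at f GHz lasts 1/f ns, so latency in ns is cycles divided by the clock in GHz. The sketch below applies that to the table above. It assumes each cycle count was measured at the listed clock with turbo disabled; the names `CPUS`, `cycles_to_ns`, and `latency_table_ns` are made up for illustration, and no turbo (3.9GHz) figures are included because the table does not give cycle counts for that case.

```python
# Latencies in clock cycles, copied from the table above.
# Each entry: (clock in GHz, {cache level: latency in cycles}).
CPUS = {
    "AMD FX-8150": (3.6, {"L1": 4, "L2": 21, "L3": 65, "Main Memory": 195}),
    "AMD Phenom II X4 975 BE": (3.6, {"L1": 3, "L2": 15, "L3": 59, "Main Memory": 182}),
    "AMD Phenom II X6 1100T": (3.3, {"L1": 3, "L2": 14, "L3": 55, "Main Memory": 157}),
    "Intel Core i5 2500K": (3.3, {"L1": 4, "L2": 11, "L3": 25, "Main Memory": 148}),
}


def cycles_to_ns(cycles, clock_ghz):
    """One cycle at f GHz lasts 1/f ns, so latency_ns = cycles / f."""
    return cycles / clock_ghz


def latency_table_ns(cpus):
    """Convert every cycle count to nanoseconds, rounded to one decimal."""
    return {
        name: {level: round(cycles_to_ns(cycles, ghz), 1)
               for level, cycles in levels.items()}
        for name, (ghz, levels) in cpus.items()
    }


if __name__ == "__main__":
    for name, row in latency_table_ns(CPUS).items():
        print(name, row)
```

Under that assumption, the FX-8150's 195 clocks at 3.6GHz work out to roughly 54.2 ns to main memory, against roughly 50.6 ns for the Phenom II X4 975 BE at the same clock, which matches the longer trip described above.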
430 Comments
nofumble62 - Thursday, October 13, 2011 - link
Crappy building block will mean crappy building.
richaron - Friday, October 14, 2011 - link
At first I was pissed off by being strung along for this pile of tripe. After sleeping on it, I am not completely giving up on this SERVER CHIP:
1) FX is a performance moniker, scratch stupid amount of cache & crank clock
2) I'm sure these numpties can get single thread up to thuban levels
3) Patch windows scheduler ffs
Fix those (relatively simple) things & it will kick ass. But it means most enthusiasts won't be spending money on AMD for a while yet.
7Enigma - Friday, October 14, 2011 - link
Biggest problem for a server chip is the load power levels. It just doesn't compete on that benchmark, one that is VERY important for a server environment from a cost/heat standpoint.
Let's hope that's just a crappy leaky chip due to manufacturing, but it's too early to tell.
richaron - Friday, October 14, 2011 - link
I've worked in a 'server environment'. of course power consumption is an issue. at the lower clock speeds & considering multithread performance, this is already a good/great contender. virtual servers & scientific computing this is already a winnar.
with a few (hardware & software) tweaks it could be a GREAT pc chip in the long term.
ryansh - Friday, October 14, 2011 - link
Anyone have a BETA copy of WIN8 to see if BD's performance increases like AMD says it will?
silverblue - Friday, October 14, 2011 - link
There's benchmarks here and there but nothing to say it'll improve performance more than 10% across the board. In any case, the competition also benefits from Windows 8, so it's still not a sign of AMD closing any sort of gap in a tangible fashion.
Pipperox - Friday, October 14, 2011 - link
But Bulldozer is different.
Windows 7 scheduler does not have a clue about its "modules" and "cores".
So for example it may find it perfectly legit to schedule 2 FP intensive threads to the same module.
Instead this will result in reduced performance on Bulldozer.
Also one may want to schedule two integer threads which share the same memory space to the same module, instead of 2 different modules.
This way the two threads can share the same L2 cache, instead of having to go to the L3 which would increase latency.
All of the above does not apply to Thuban; to a lesser degree it applies to Sandy Bridge, but Windows 7 scheduler is already aware of Sandy Bridge's architecture.
nirmv - Saturday, October 15, 2011 - link
Pipperox, It's not different than Intel's Hyper Threading.
Pipperox - Sunday, October 16, 2011 - link
It is, although they're similar concepts.
Let's make an example: you have 2 integer threads working on the same address space (for example two parallel threads working in the same process).
All cores are idle.
What is the best scheduling for a Hyperthreading cpu?
You schedule each thread to a different core, so that they can enjoy full execution resources.
What is best on Bulldozer?
You schedule them to the SAME module.
This is because the execution resources are split in a BD module, so there would be no advantage to scheduling the threads to different modules.
HOWEVER if the 2 threads are on the same module, they can share the L2 cache instead of the L3 cache on BD, so they enjoy lower memory latency and higher bandwidth.
There are cases where the above is not true, of course.
But my example shows that optimal scheduling for Hyperthreading can be SUB-optimal on Bulldozer.
Hence the need for a Bulldozer-aware scheduler in Windows 8.
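The two placement policies from the example above can be sketched as a small function. This is only an illustration of the idea, not how the Windows scheduler works: `place_two_threads` is a made-up helper, and the core numbering in `FX_8150` (cores 0 and 1 in the first module, 2 and 3 in the second, and so on) is an assumption that real hardware or operating systems may not follow.

```python
def place_two_threads(modules, share_cache):
    """Pick cores for two threads on an otherwise idle CPU.

    modules: list of tuples of core ids, one tuple per module (or per
             physical core with its hardware threads). Hypothetical format.
    share_cache: True puts both threads in one module so they share its L2;
                 False spreads them over two modules.
    """
    if share_cache:
        for cores in modules:
            if len(cores) >= 2:
                return cores[0], cores[1]
        raise ValueError("no module with two cores")
    if len(modules) < 2:
        raise ValueError("need at least two modules to spread threads")
    return modules[0][0], modules[1][0]


# Assumed numbering for a four-module, eight-core part.
FX_8150 = [(0, 1), (2, 3), (4, 5), (6, 7)]

if __name__ == "__main__":
    print("same module:", place_two_threads(FX_8150, share_cache=True))
    print("spread out: ", place_two_threads(FX_8150, share_cache=False))
```

On Linux, the chosen core ids could then be passed to something like `os.sched_setaffinity` to pin each thread, though whether packing or spreading wins depends on the workload, as noted above.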
Regs - Friday, October 14, 2011 - link
AMD needs a 40-50% performance gain and they're not going to see it using windows 8. What AMD needs is...actually I have no clue what they need. I've never been so dumbfounded about a product that makes no sense and has no position in the market.