Gaming Performance

AMD clearly states in its reviewer's guide that CPU-bound gaming performance isn't going to be a strong point of the FX architecture, likely due to its poor single-threaded performance. However, it is useful to look at both CPU- and GPU-bound scenarios to paint an accurate picture of how well a CPU handles game workloads, as well as what sort of performance you can expect in present-day titles.

Civilization V

Civ V's lateGameView benchmark presents us with two separate scores: average frame rate for the entire test as well as a no-render score that only looks at CPU performance.

Civilization V—1680 x 1050—DX11 High Quality

While we're GPU bound in the full render score, AMD's platform appears to have a bit of an advantage here. We've seen this in the past, where one platform will hold an advantage over another in a GPU-bound scenario, and it's always tough to explain. Within each family, however, there is no advantage to a faster CPU; everything is just GPU bound.

Civilization V—1680 x 1050—DX11 High Quality

Looking at the no render score, the CPU standings are pretty much as we'd expect. The FX-8150 is thankfully a bit faster than its predecessors, but it still falls behind Sandy Bridge.

Crysis: Warhead

Crysis Warhead Assault Benchmark—1680 x 1050 Mainstream DX10 64-bit

In CPU-bound environments in Crysis: Warhead, the FX-8150 is actually slower than the old Phenom II. Sandy Bridge continues to be far ahead.

Dawn of War II

Dawn of War II—1680 x 1050—Ultra Settings

We see similar results under Dawn of War II. Lightly threaded performance is simply not a strength of AMD's FX series, and as a result even the old Phenom II X6 pulls ahead.

DiRT 3

We ran two DiRT 3 benchmarks to get an idea of both CPU-bound and GPU-bound performance. First, the CPU-bound settings:

DiRT 3—Aspen Benchmark—1024 x 768 Low Quality

The FX-8150 doesn't do so well here, again falling behind the Phenom IIs. Under more real-world, GPU-bound settings, however, Bulldozer looks just fine:

DiRT 3—Aspen Benchmark—1920 x 1200 High Quality

Dragon Age

Dragon Age Origins—1680 x 1050—Max Settings (no AA/Vsync)

Dragon Age is another CPU-bound title; here the FX-8150 falls behind once again.

Metro 2033

Metro 2033 is pretty rough on frame rates even at lower resolutions, but with more of a GPU bottleneck in play the FX-8150 equals the performance of the 2500K:

Metro 2033 Frontline Benchmark—1024 x 768—DX11 High Quality

Metro 2033 Frontline Benchmark—1920 x 1200—DX11 High Quality

Rage vt_benchmark

While id's long-awaited Rage doesn't exactly have the best benchmarking abilities, there is one unique aspect of the game that we can test: MegaTexture. MegaTexture works by dynamically pulling texture data from disk and constructing texture tiles for the engine to use, a major component in allowing id's developers to uniquely texture the game world. However, because of the heavy use of unique textures (id says the original game assets are over 1TB), id needed to get creative in compressing the game's textures to make them fit within the roughly 20GB the game was allotted.

The result is that Rage doesn't store textures in a GPU-usable format such as DXTC/S3TC, instead storing them in an even more compressed format (JPEG XR), as S3TC maxes out at a 6:1 compression ratio. As a consequence, whenever you load a texture, Rage needs to transcode it from its storage codec to S3TC on the fly. This is a constant process throughout the entire game, and this transcoding is a significant burden on the CPU.
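For reference, that 6:1 figure falls straight out of the DXT1 block layout: each 4x4 pixel block is stored in 8 bytes, or half a byte per pixel, versus 3 bytes per pixel for uncompressed 24-bit RGB. Here's a quick back-of-the-envelope check (our own sketch, not anything from id's code):

```cpp
// Why S3TC alone can't hold Rage's ~1TB of source art: DXT1 (the most
// compact S3TC variant for opaque textures) packs each 4x4 pixel block
// into 8 bytes -- a fixed 0.5 bytes/pixel, i.e. 6:1 versus 24-bit RGB.
#include <cstdio>

int main() {
    const int w = 4096, h = 4096;  // one large texture page
    const long long rgb_bytes  = 1LL * w * h * 3;             // 24-bit RGB
    const long long dxt1_bytes = 1LL * (w / 4) * (h / 4) * 8; // 8 B per 4x4 block
    std::printf("RGB:  %lld bytes\n", rgb_bytes);
    std::printf("DXT1: %lld bytes (%.1f:1 compression)\n", dxt1_bytes,
                (double)rgb_bytes / (double)dxt1_bytes);
    return 0;
}
```

JPEG XR can compress far more aggressively than that fixed ratio, which is why id chose it for storage despite the CPU cost of transcoding at load time.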

The benchmark: vt_benchmark flushes the transcoded texture cache and then times how long it takes to transcode all the textures needed for the current scene, from 1 thread to X threads. Thus when you run vt_benchmark 8, for example, it will benchmark from 1 to 8 threads (the default appears to depend on the CPU you have). Since transcoding is done by the CPU, this is a pure CPU benchmark. I present the best-case transcode time at the maximum number of concurrent threads each CPU can handle:

Rage vt_benchmark—1920 x 1200

The FX-8150 does very well here, but so does the Phenom II X6 1100T. Both are faster than Intel's 2500K, but not quite as good as the 2600K. If you want to see how performance scales with thread count, check out the chart below.
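The methodology itself is simple enough to mimic. Below is a minimal sketch (ours, not id's vt_benchmark code; transcode_tile_stub just burns CPU time in place of a real JPEG XR to S3TC pass) of the same sweep: run a fixed batch of transcode jobs at 1 thread, then 2, and so on up to the hardware's limit, timing each pass:

```cpp
// vt_benchmark-style sweep (sketch): time a fixed batch of CPU-bound
// "transcode" jobs as the worker-thread count climbs from 1 to N.
#include <chrono>
#include <cmath>
#include <cstdio>
#include <thread>
#include <vector>

// Stand-in for decoding a JPEG XR tile and re-encoding it to S3TC.
// volatile keeps the compiler from optimizing the busy-work away.
static void transcode_tile_stub() {
    volatile double acc = 0.0;
    for (int i = 1; i <= 200000; ++i) acc = acc + std::sqrt((double)i);
}

// Run `total_tiles` stub jobs split across `threads` workers; return seconds.
static double time_batch(int threads, int total_tiles) {
    const auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        // Split the tile batch as evenly as possible across workers.
        const int share = total_tiles / threads + (t < total_tiles % threads ? 1 : 0);
        pool.emplace_back([share] {
            for (int i = 0; i < share; ++i) transcode_tile_stub();
        });
    }
    for (auto& th : pool) th.join();
    return std::chrono::duration<double>(std::chrono::steady_clock::now() - start).count();
}

int main() {
    const int tiles = 256;  // fixed "scene" workload
    const unsigned hw = std::thread::hardware_concurrency();
    const int max_threads = hw ? (int)hw : 8;
    for (int n = 1; n <= max_threads; ++n)  // the 1..N sweep vt_benchmark runs
        std::printf("%2d thread(s): %.3f s\n", n, time_batch(n, tiles));
    return 0;
}
```

On a chip that scales perfectly, each doubling of threads should roughly halve the batch time; embarrassingly parallel integer-heavy work like this is exactly where the FX-8150's eight cores get to stretch their legs.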

Starcraft 2

Starcraft 2

Starcraft 2 has traditionally done very well on Intel architectures, and Bulldozer is no exception to that rule.

World of Warcraft

World of Warcraft

430 Comments


  • vectorm12 - Wednesday, October 12, 2011 - link

    As both AMD and Intel now use dedicated hardware for AES, I feel simply testing AES performance isn't enough. A benchmark of AES+Twofish+Serpent, or at least AES+Serpent, would serve as a more telling benchmark at this point. Don't get me wrong, I love that you guys even run a benchmark related to encryption, but it needs to be updated.

    About BD, I'm also extremely bummed out that it didn't turn out better than this. Of course there might be room for improvements with patches/CPU drivers for Windows 7 etc., but considering the TDP, transistor count and everything else, this is a huge loss for AMD.

    I'm still interested in seeing how the Opteron versions will perform in specific tasks, as the architecture itself seems really interesting. Someone obviously spent a lot of time thinking this design through and I'd like to believe there's at least one particular workload where BD can actually flex its muscles for real.
  • fri2219 - Wednesday, October 12, 2011 - link

    Since Bulldozer wasn't created with 3D shooters in mind, it would have been nice to see some financial/engineering/scientific benchmarks instead. Anandtech used to differentiate itself from the kiddie sites by providing that sort of analysis. I guess things change, like my RSS subscription to Anandtech articles will.

    That said, the power consumption numbers pretty much say everything I need to know about the CPU series- the constraint on almost all HPC is power, not SPECint or peak flops.
  • chillmelt - Wednesday, October 12, 2011 - link

    Unfortunately a huge majority of the enthusiast market are gamers. If you truly want productivity benchmarks, then wait for the server chips. FX CPUs aren't marketed as such, but they do perform like one.

    With that said, the FX lineup is a decent multi-threading powerhouse, and not a flop in that respect.

    Read tomshardware's review for more benchmarks.
  • Malih - Wednesday, October 12, 2011 - link

    Well, I guess there'll be follow-up posts.

    "My sample actually arrived less than 12 hours ago, so expect a follow up with performance analysis later this week."
  • lagrande - Wednesday, October 12, 2011 - link

    I'm not an AMD fanboy, but the reason AMD gave is pretty reasonable. Thread locality is an important factor in the Bulldozer architecture, primarily because the memory latency at the cache level is pretty high. If the OS can't schedule threads properly to the correct cores, there will be a lot of inter-core data movement, and problems like false sharing can become more apparent.
  • GatorLord - Wednesday, October 12, 2011 - link

    While on one hand as a PC user and builder...and really wanting to build a BD based mindblower, I'm a little disappointed...OK, more than a little...by these results. On the other hand, as an MBA and investor in AMD, I see the big picture and have to reluctantly agree...and hopefully profit.

    If you have constrained and finite capabilities in both design and manufacture (GloFo needs its butt kicked), you maximize along a marginal ROI track. Right now that means server chips, to support the growing and lucrative cloud, data warehouse, HPC, and corporate server markets, and the growing Fusion space integrating modest x86 with robust video on low-wattage single chips. You end up with exactly what we have here: BD (a server chip rebranded) and Llano, with plans to improve both with descendants.

    In highway terms it would be akin to building semis and commuter cars. This is the high performance forum, and while the Ferrari Enzos are cool and badass, it's hard to fault AMD for the approach. After all, when you're on the road today, you'll see a bunch of semis and commuter cars...it's economics. Performance sells magazines, utility sells products.

    BD must be a killer server beast, because Cray (you know Cray, right?) just got a $97M contract from Oak Ridge NL about a month or so after taking possession of the first box of BD-based server chips. I think Cray knows a thing or two about making computers haul butt.

    Now we'll see if any of that translates into the client space...
  • MossySF - Wednesday, October 12, 2011 - link

    I'll agree with this. We have a ton of servers -- both Intel and AMD. More integer cores are better. FPU? Games? 3D? Media encoding? Who cares. Hyper-threading does nothing when you peg all cores with VMs running at full blast. For example, we have one configuration where we run 4 VMs on a Phenom II X4 at 3GHz and it performs roughly the same as our 4-core i7 at 2.8GHz. If we add a 5th VM, both slow down equally, showing that there are simply no free resources in the CPU pipeline for hyper-threading to steal.

    So where the Bulldozer platform is extra good is for cheap/disposable/uniform VM hosts running Linux. Instead of one mega-expensive quad Xeon costing $100K, you have 10 x 1U Bulldozers that can handle 8 VMs each at full utilization without speed degradation, for $10K. In addition, you'd probably run something like CentOS (or RHEL), whose default packages are not compiled with Intel's uber-compiler, so many of the +25% gains you see in benchmarks here don't exist at all in the Linux world.

    The most disappointing part though (which I mentioned previously) is the lack of speed improvement for the chipset. The first bottleneck when adding more VMs is CPU cores, but the second is disk bandwidth. If you have disk-intensive VMs, you need a separate hard drive for each VM to avoid HD seek latency killing performance. But putting 8 HDs in a 1U is impossible, so you need 2U/4U servers taking up too much rackspace.

    The answer of course is a fast SSD ... 500MB/s with 0 latency can be split off to separate VMs with a linear degradation, versus exponential for HDs. But the SB950 chipset at 2GB/s bidirectional can only handle 2 fast SSDs. So 1000MB/s divided by 8 VMs reading at full blast is 125MB/s per VM -- which is regular SATA3 HD speed. Double that to 4GB/s and you can easily put 4 x 2.5" SATA3 SSDs in a 1U, delivering 250MB/s to each VM. Now we're back to at least 2nd-generation SSD performance.

    (Note, all the Intel chipsets also max out at 2GB/s bidirectional and stuffing a super expensive raid controller in a 1U is not cost effective.)
  • GatorLord - Wednesday, October 12, 2011 - link

    Great analysis...I'm not a server guy and can hardly keep up with the average 15-year-old on desktop jargon and theory, but it seems that the bigger cache would mitigate the roundtrips to disk in the conditions you describe. I guess that's why they left that fat L3 cache on the die...assuming Interlagos and Zambezi are really closer than cousins...more like siblings.

    Great financial case...that I get. I heard a joke the other day that went something like "Whenever they say it's not about the money, it's about the money". It's always about the money... :)
  • Macabre215 - Wednesday, October 12, 2011 - link

    This is reminiscent of the Phenom I launch, minus the TLB bug. You have a chip that barely outperforms its predecessor and at times performs a little worse. AMD might be able to make a Phenom II-like product out of Bulldozer, but I think it's too late. They needed to come out of the gate strong with this one.

    Right now I'm on a Phenom II and will be upgrading to Sandy Bridge soon. I'm done with AMD on the desktop front; a platform which is probably a dead one in the next ten to twenty years anyway. AMD should just stick to the server market and mobile platforms for CPUs as that's where they have a dog in the hunt.

    BTW, this is a disgrace to the FX name.
  • Iketh - Wednesday, October 12, 2011 - link

    I understand why AMD execs resigned in the past 2 years... can you imagine what it must have looked like then? "Nah, we've actually gotten slower per thread, and will need 4GHz+ to compete now..."
