What’s in a Benchmark? This is a pertinent question that all users need to ask themselves, because if you don’t know what a benchmark actually tests and how that relates to the real world, the scores are meaningless. Today, AMD announced that it is resigning from BAPCo after a long-standing dispute over the weighting of scores within the SYSmark suite. AMD specifically references SYSmark 2012 (SM12), but there have been complaints in the past, and the latest release is apparently the proverbial straw that broke the camel’s back.

You can read more about the decision on Chief Marketing Officer (CMO) Nigel Dessau’s blog, but this announcement comes at an interesting time since BAPCo just shipped us copies of the final SM12 release. We haven’t had a chance to run the suite yet, and we’ll still have a look at the results and see how AMD and Intel platforms compare at some point, but it looks like we have a foregone conclusion: Intel will come out ahead. What we really need to examine is why Intel gets a better score.

If you’ve been reading AnandTech for any length of time, you’ll know that we place a lot more weight on real-world benchmarks than on synthetic tests, but certain tasks can be very difficult to test in a meaningful way. How do you measure everyday tasks like surfing the web when most CPUs are 95% idle performing that task? When we really look at the market right now, in many cases we can conclude that just about any current computer will be fast enough for 90% of users. If you want to surf the Internet, write email, work in Office applications, watch some movies, listen to music, etc., you can do that on anything from a lowly AMD Brazos netbook to a hex-core monster system. Yes, we did leave out Atom, because there are certain areas where it falls short: certain movie formats prove to be too much for the current Atom platform, particularly if you’re looking at HD H.264 content (e.g. YouTube and Hulu).

Reading through AMD’s announcement and Nigel’s blog, it’s pretty clear what AMD is after: they want the GPU to play a more prominent role in measurements of overall system performance. On the one hand, we could say that AMD is simply trying to get benchmarks to favor their APUs, since Brazos and Llano easily surpass the Intel competition when it comes to graphics and video prowess. This would certainly be true, but then we also have to consider what users are actually doing with their PCs. SYSmark has always included a variety of tests, and knowing how fast your computer is at Excel can certainly be useful. However, AMD claims that a disproportionate weight is given to some tests, mentioning optical character recognition and file compression activities in particular.
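
To see why weighting matters so much, here is a minimal sketch in Python. It assumes a composite score computed as a weighted geometric mean of per-test scores, which is a common approach for benchmark suites but is not confirmed as BAPCo's method. The two systems, the test names, the scores, and the weights are all invented for illustration and are not SYSmark results.

```python
import math


def composite(scores, weights):
    """Weighted geometric mean of per-test scores (assumed scoring model)."""
    total_weight = sum(weights.values())
    log_sum = sum(weights[test] * math.log(scores[test]) for test in weights)
    return math.exp(log_sum / total_weight)


# Hypothetical per-test scores relative to a reference system at 100.
# System A stands in for a CPU-strong design, System B for a GPU-strong one.
system_a = {"spreadsheet": 130, "ocr": 150, "compression": 140,
            "video": 80, "browser": 90}
system_b = {"spreadsheet": 100, "ocr": 95, "compression": 100,
            "video": 150, "browser": 140}

# Two hypothetical weighting schemes over the same five tests.
equal_weights = {test: 1 for test in system_a}
cpu_heavy_weights = {"spreadsheet": 1, "ocr": 3, "compression": 3,
                     "video": 1, "browser": 1}

for label, weights in (("equal", equal_weights),
                       ("cpu-heavy", cpu_heavy_weights)):
    a = composite(system_a, weights)
    b = composite(system_b, weights)
    leader = "A" if a > b else "B"
    print(f"{label:>9}: A={a:.1f}  B={b:.1f}  leader={leader}")
```

With these made-up numbers, the two systems land close together under equal weights, with B slightly ahead, while tripling the weight on OCR and compression puts A clearly in front. The per-test results never change; only the weights do, which is why the weighting scheme is worth arguing about.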

We don’t have the full SM12 whitepaper yet, but we can look at the list of applications that are tested, and a few things immediately stand out. There are two web browsers in the list, but both versions are now outdated. Internet Explorer 8 has been replaced by Internet Explorer 9, and Firefox 3.6 is replaced by Firefox 4.0—with Firefox 5 just around the corner. Without newer browsers, HTML5 is basically untested by SM12, and while we understand that SM12 has been in development for a while, for something calling itself 2012 to include mostly 2010 applications feels out of place. Considering IE9 and FF4 both shift to GPU-accelerated engines, AMD would certainly have benefited from the use of the latest versions. The remaining applications look reasonable, but again we have no information on weighting of scores, so we’ll have to see how the results pan out.

Ultimately, the main thing to take away from all of this is that, just like the PCMark, 3DMark, Cinebench, SunSpider, etc. benchmarks we routinely refer to, SYSmark 2012 is merely one more tool to analyze system performance. It will be interesting to see how other elements—like the presence or lack of an SSD—impact the score. In our opinion most users would benefit far more from running something like Llano with an SSD as opposed to Sandy Bridge with an HDD, so the CPU/GPU/APU are not the only factors, but it still depends on your intended use. If you’re running a server, obviously the demands placed on the system will be far different from the average home computer. Multimedia professionals that spend a lot of time in Adobe Photoshop and/or Premiere likewise have different needs.

Is AMD right? Is heterogeneous computing (i.e. the CPU and GPU working together) more important now than raw CPU performance, or is SYSmark 2012 merely proving what we already know: Sandy Bridge is really fast? Let us know what you think, but as always, when you’re looking at benchmark charts, take a minute to think about what the bars actually represent. The full news release is below, but again you can find substantially more detail in Dessau’s blog.

Update: It turns out AMD is not the only party to have left the BAPCo consortium recently. We've just confirmed with NVIDIA that they have also left the BAPCo consortium. No reason was given.

Update 2: BAPCo has released a statement in return. The consortium notes that AMD approved 80% of the development milestones and that AMD was never threatened with expulsion. The full statement is attached below.

Update 3: We've finally gotten official confirmation (as rumored earlier) that VIA has also left the consortium. They have sent a short statement to SemiAccurate, which we have included below. The basis of their complaint is much the same as AMD's: they don't consider SYSmark 2012 to reflect real world usage.


AMD Will Not Endorse SYSmark 2012 Benchmark

— AMD Separates from Association with Industry Group BAPCo —

SUNNYVALE, Calif. — 21, 2011 — AMD (NYSE: AMD) today announced that it will not endorse the SYSmark 2012 Benchmark (SM2012), which is published by BAPCo (Business Applications Performance Corporation). Along with the withdrawal of support, AMD has resigned from the BAPCo organization.

“Technology is evolving at an incredible pace, and customers need clear and reliable measurements to understand the expected performance and value of their systems,” said Nigel Dessau, senior vice president and Chief Marketing Officer at AMD. “AMD does not believe SM2012 achieves this objective. Hence AMD cannot endorse or support SM2012 or remain part of the BAPCo consortium.”

AMD will only endorse benchmarks based on real-world computing models and software applications, and which provide useful and relevant information. AMD believes benchmarks should be constructed to provide unbiased results and be transparent to customers making decisions based on those results. Currently, AMD is evaluating other benchmarking alternatives, including encouraging the creation of an industry consortium to establish an open benchmark to measure overall system performance.

AMD encourages anyone wanting more details about the construction and scoring methodology of the SM2012 benchmark to contact BAPCo. For more details on AMD’s decision to exit BAPCo, please read AMD’s Executive Blog authored by Nigel Dessau.


BAPCo® Reaffirms Open Development Process For SYSmark® 2012

SAN MATEO, Calif.—(BUSINESS WIRE)—Business Applications Performance Corporation (BAPCo®) is a non-profit consortium made up of many of the leaders in the high tech field, including Dell, Hewlett-Packard, Hitachi, Intel, Lenovo, Microsoft, Samsung, Seagate, Sony, Toshiba and ARCintuition. For nearly 20 years BAPCo has provided real world application based benchmarks which are used by organizations worldwide. SYSmark® 2012 is the latest release of the premiere application based performance benchmark. Applications used in SYSmark 2012 were selected based on market research and include Microsoft Office, Adobe Creative Suite, Adobe Acrobat, WinZip, Autodesk AutoCAD and 3ds Max, and others.

Advanced Micro Devices (AMD) was, until recently, a long standing member of BAPCo. We welcomed AMD’s full participation in the two year development cycle of SYSmark 2012, AMD’s leadership role in creating the development process that BAPCo uses today and in providing expert resources for developing the workload contents. Each member in BAPCo gets one vote on any proposals made by member companies. AMD voted in support of over 80% of the SYSmark 2012 development milestones, and were supported by BAPCo in 100% of the SYSmark 2012 proposals they put forward to the consortium.

BAPCo also notes for the record that, contrary to the false assertion by AMD, BAPCo never threatened AMD with expulsion from the consortium, despite previous violations of its obligations to BAPCo under the consortium member agreement.

BAPCo is disappointed that a former member of the consortium has chosen once more to violate the confidentiality agreement they signed, in an attempt to dissuade customers from using SYSmark to assess the performance of their systems. BAPCo believes the performance measured in each of the six scenarios in SYSmark 2012, which is based on the research of its membership, fairly reflects the performance that users will see when fully utilizing the included applications.


VIA's Statement About Leaving The BAPCo Consortium

VIA today confirmed reports that we have tendered our resignation to BAPCo. We strongly believe that the benchmarking applications tests developed for SYSmark 2012 and EEcoMark 2.0 do not accurately reflect real world PC usage scenarios and workloads and therefore feel we can no longer remain as a member of the organization.

We hope that the industry can adopt a much more open and transparent process for developing fair and objective benchmarks that accurately measure real world PC performance and are committed to working with companies that share our vision.

116 Comments

  • Targon - Tuesday, June 21, 2011 - link

    Firefox 4 has been around for a while, so any NEW benchmark should use it, not Firefox 3.6. IE 9 is also the most recent version, so should be used, not IE 8. The fact that these are very common and popular browsers and both use GPU acceleration would have been a benefit to AMD, and if most people only care about web browsing and perhaps some word processing, it makes sense to use what most people are using.

    Then you have Flash, and as much as some people may hate it, I expect that upwards of 90 percent of people on a Windows platform will have Flash installed and enabled. Flash is also GPU accelerated, meaning Intel would look REALLY bad using Firefox 4 with Flash content if that was a huge part of your benchmark. I bet Intel would complain if web browsing tests heavily weighted Firefox 4+Flash performance and it showed Intel was slower.
  • dustwalker13 - Tuesday, June 21, 2011 - link

    but Nvidia and VIA as well, leaving only Intel in the Industry Group BAPCo making it a group of one ...

    If this is true, you should update your article since 3 of 4 companies dropping out does imply somethings really fishy with sysmark, while AMD dropping out against Intel does only imply AMD is afraid of the results of the new version of the tests for their new cpu-designs.
  • jjj - Tuesday, June 21, 2011 - link

    The most interesting part of this news is that Nvidia and VIA left too so it's not just AMD going crazy.
    Anyway,this benchmark is pointless,it doesn't take into consideration just the CPU or all the hardware (from USB ports to screen and Wi-Fi),it's unclear what kind of usage patterns are factored in and it gained no traction in the community.
    Not very sure why it is being used in CPU reviews when it's not supposed to reflect CPU perf anyway.If it was any good it could be used in pre-built systems reviews but it's not so why bother.
    Since i mentioned Wi-Fi testing,maybe you guys could include that in notebook reviews and maybe HTPC tests in all APU and GPU reviews( since for many their main PC is also their HTPC).
  • BLaber - Tuesday, June 21, 2011 - link

    Hope Anandtech can also stop using this Benchmark now , its useless really.
  • Mr Perfect - Tuesday, June 21, 2011 - link

    I never read that part of the articles anyway, same with 3DMark. The number of theoretical marks something gets just doesn't help in the real world.
  • arthur449 - Thursday, June 23, 2011 - link

    I agree. The entire system benchmarks are much less relevant to my interests than the specific application tests. We do a lot of encoding using x.264, so the comparative AnandTech 2nd pass tests are a goldmine for us.

    These Borg collective benchmarks would be much more useful from a financial point of view if they stopped trying to collect all the results into one trademarked point value. They already conduct a number of tests using hardware traces of professional software that is financially unreasonable for an average consumer or hardware reviewer to own. If they make these individual tests center stage and give us results in seconds or data / operations per period of time then buying Borg Benchmark 2012 ("End of the World Edition!") for $100-150 is a steal compared to even a single student license of whatever Autodesk is punishing its customers with. I would rather read about individual program scores than see some giant meaningless consumer-friendly score.

    Likewise, I value the reviewer's words when explaining the charts more than the charts themselves. If it's a page of Borg benchmark charts and a paragraph at the bottom, I skip to the bottom. If the reviewer remarks that a particular product scored well in a certain test, I go back up and look at the chart.
  • Gigantopithecus - Tuesday, June 21, 2011 - link

    I think a better comparison than 'whiny baseball fans' (...?) is human intelligence. As computers continue to develop, there are more and more facets to their performance. Like human IQ, one single number like SYSmark doesn't really reflect what a computer is capable of doing.

    AMD's CPUs can't touch Intel's in terms of sheer compute performance, period. Bulldozer might change that, but for now, it's a fact. However, as Jarred alluded, even the $60 AMD Athlon II X2 250 is sufficient for most desktop computer user's needs. Think of it like this: you're running a company and need employees. If Intel is the applicant with an IQ of 140 and AMD is the applicant with an IQ of 100, who do you hire? For most positions, you're going to hire the person who is 'good enough' - you can pay them less, and they'll get the job done.

    Obviously computers aren't as nuanced as people, but AMD does in fact field some real strengths compared to Intel, especially in terms of pricing. For example, the applicant with an IQ 140 might also have some glaring personality flaw, and the definition of average applicant might be fun to talk with at lunch. But you don't see those important considerations in one number like IQ. SYSmark does not accurately represent AMD's dominance in graphics, which is an important aspect of the whole computing experience.

    AMD is not 'whining' about SYSmark because it has less powerful chips. Its problem with SYSmark is that it's an increasingly meaningless single number that doesn't reflect how CPUs, and now APUs, have developed in the last few years. Hell most salespeople don't really understand benchmarks, and even many of AnandTech's forum users place bizarre emphasis on certain benchmarks. Of course AMD is not going to support a benchmark that they (rightly) believe doesn't paint an accurate picture for consumers. I build a lot of systems for the average user: internet browsing, email, office productivity, Facebook games, maybe some streaming HD content. An AMD Athlon II and an SSD cost about the same as the least expensive Core i3 and a mechanical hard disk. The AMD + SSD combo is a MUCH better experience for the average user because the system just feels snappy and immediately responsive, whereas the i3 system with its platter-based drive is sluggish by comparison.

    IMHO, Intel continues to deftly leverage its strengths: manufacturing and compute power. AMD can never hope to beat or even rival Intel at its own game, and therefore chose a different path: address the weak links in the whole system. Right now, that's graphics.

    Finally, if you honestly expect a benchmark to be unbiased when its President is the 'head of benchmarking' or whatever at Intel, I've a bridge you might be interested in buying. :P
  • Spazweasel - Tuesday, June 21, 2011 - link

    If Intel is the only one left in BAPCo, and their chief benchmarking guy runs BAPCo, how can anyone say that SysMark can be trusted? That there isn't a huge conflict of interest that BAPCo is neck-deep in? No Via. No AMD. No nVidia. What's left? SysMark is now just a rubber-stamp.

    SysMark just became a waste of tester time and website bandwidth. Jarred, Anand, et al., just don't bother with it. You've got a limited amount of time to run benchmarks, and you need to choose for credibility. SysMark no longer has that credibility. I'm sure you can come up with something that's much more vendor-agnostic.
  • frozentundra123456 - Tuesday, June 21, 2011 - link

    Another case of whining from AMD because their CPUs do not measure up. There are plenty of other tests to measure GPU performance and AMD would be favored in those. Just put out a competitive CPU for goodness sake!!!
  • Germanicus - Tuesday, June 21, 2011 - link

    Then I guess Nvidia and VIA are whining too, but perhaps you wouldn't know that since Anandtech neglected to title the story properly.
