What’s in a Benchmark? This is a pertinent question that all users need to ask themselves, because if you don’t know what a benchmark actually tests and how that relates to the real world, the scores are meaningless. Today, AMD has announced that they are resigning from BAPCo following a long-standing dispute over the weighting of scores within the SYSmark suite. AMD specifically references SYSmark 2012 (SM12), but there have been complaints in the past and the latest release is apparently the proverbial straw that broke the camel’s back.

You can read more about the decision on Chief Marketing Officer (CMO) Nigel Dessau’s blog, but this announcement comes at an interesting time since BAPCo just shipped us copies of the final SM12 release. We haven’t had a chance to run the suite yet, and we’ll still have a look at the results and see how AMD and Intel platforms compare at some point, but it looks like we have a foregone conclusion: Intel will come out ahead. What we really need to examine is why Intel gets a better score.

If you’ve been reading AnandTech for any length of time, you’ll know that we place a lot more weight on real-world benchmarks than on synthetic tests, but certain tasks can be very difficult to test in a meaningful way. How do you measure everyday tasks like surfing the web when most CPUs are 95% idle performing that task? When we really look at the market right now, in many cases we can conclude that just about any current computer will be fast enough for 90% of users. If you want to surf the Internet, write email, work in Office applications, watch some movies, listen to music, etc., you can do that on anything from a lowly AMD Brazos netbook to a hex-core monster system. Yes, we did leave out Atom, because there are certain areas where it falls short: specifically, certain movie formats prove to be too much for the current Atom platform, particularly if you’re looking at HD H.264 content (e.g. YouTube and Hulu).

Reading through AMD’s announcement and Nigel’s blog, it’s pretty clear what AMD is after: they want the GPU to play a more prominent role in measurements of overall system performance. On the one hand, we could say that AMD is simply trying to get benchmarks to favor their APUs, since Brazos and Llano easily surpass the Intel competition when it comes to graphics and video prowess. This would certainly be true, but then we also have to consider what users are actually doing with their PCs. SYSmark has always included a variety of tests, and certainly knowing how fast your computer is with regard to Excel performance can be useful. However, AMD claims that a disproportionate weight is given to some tests, with mention of optical character recognition and file compression activities in particular.
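
To see why weighting matters so much, here is a minimal sketch of how a composite score can flip depending on the weights chosen. Everything in it is an assumption for illustration: the test names, the scores, the weights, and the use of a weighted geometric mean are all made up, and we are not claiming this is how SYSmark 2012 computes its ratings.

```python
import math

def composite(scores, weights):
    """Weighted geometric mean of per-test scores.

    Hypothetical scoring rule for illustration only; the weights do not
    need to sum to 1 because they are normalized here.
    """
    total_weight = sum(weights[name] for name in scores)
    log_sum = sum(weights[name] * math.log(scores[name]) for name in scores)
    return math.exp(log_sum / total_weight)

# Made-up relative scores (baseline = 100), not measurements of real hardware.
system_a = {"office": 130, "compression": 140, "video": 90, "web_gpu": 80}
system_b = {"office": 100, "compression": 100, "video": 120, "web_gpu": 140}

# Two hypothetical weighting schemes applied to the same four tests.
cpu_heavy = {"office": 3, "compression": 3, "video": 1, "web_gpu": 1}
gpu_heavy = {"office": 1, "compression": 1, "video": 3, "web_gpu": 3}

for label, weights in (("CPU-heavy", cpu_heavy), ("GPU-heavy", gpu_heavy)):
    a = composite(system_a, weights)
    b = composite(system_b, weights)
    winner = "A" if a > b else "B"
    print(f"{label} weights: A={a:.1f}  B={b:.1f}  winner={winner}")
```

With these invented numbers, system A comes out ahead under the CPU-heavy weights and system B comes out ahead under the GPU-heavy weights, even though the individual test results never change. That is the crux of the dispute: the same raw results can tell two different stories depending on who decides the weights.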

We don’t have the full SM12 whitepaper yet, but we can look at the list of applications that are tested, and a few things immediately stand out. There are two web browsers in the list, but both versions are now outdated. Internet Explorer 8 has been replaced by Internet Explorer 9, and Firefox 3.6 has been replaced by Firefox 4.0, with Firefox 5 just around the corner. Without newer browsers, HTML5 is basically untested by SM12, and while we understand that SM12 has been in development for a while, it feels out of place for something calling itself 2012 to include mostly 2010 applications. Considering IE9 and FF4 both shift to GPU-accelerated engines, AMD would certainly have benefited from the use of the latest versions. The remaining applications look reasonable, but again we have no information on weighting of scores, so we’ll have to see how the results pan out.

Ultimately, the main thing to take away from all of this is that, just like the PCMark, 3DMark, Cinebench, SunSpider, etc. benchmarks we routinely refer to, SYSmark 2012 is merely one more tool to analyze system performance. It will be interesting to see how other elements, like the presence or lack of an SSD, impact the score. In our opinion most users would benefit far more from running something like Llano with an SSD as opposed to Sandy Bridge with an HDD, so the CPU/GPU/APU are not the only factors, but it still depends on your intended use. If you’re running a server, obviously the demands placed on the system will be far different from the average home computer. Multimedia professionals who spend a lot of time in Adobe Photoshop and/or Premiere likewise have different needs.

Is AMD right? Is heterogeneous computing (e.g. CPU and GPU working together) more important now than raw CPU performance, or is SM12 merely proving what we already know: Sandy Bridge is really fast? Let us know what you think, but as always, when you’re looking at benchmark charts, take a minute to think about what the bars actually represent. The full news release is below, but again you can find substantially more detail in Dessau’s blog.

Update: It turns out AMD is not the only party to have left the BAPCo consortium recently. We've just confirmed with NVIDIA that they have also left the BAPCo consortium. No reason was given.

Update 2: BAPCo has released a statement in return. The consortium notes that AMD approved 80% of the development milestones and that AMD was never threatened with expulsion. The full statement is attached below.

Update 3: We've finally gotten official confirmation (as rumored earlier) that VIA has also left the consortium. They have sent a short statement to SemiAccurate which we have included below. The basis of their complaints is much the same as AMD's: they don't consider SYSmark 2012 to reflect real-world usage.


AMD Will Not Endorse SYSmark 2012 Benchmark

— AMD Separates from Association with Industry Group BAPCo —

SUNNYVALE, Calif. — 21, 2011 — AMD (NYSE: AMD) today announced that it will not endorse the SYSmark 2012 Benchmark (SM2012), which is published by BAPCo (Business Applications Performance Corporation). Along with the withdrawal of support, AMD has resigned from the BAPCo organization.

“Technology is evolving at an incredible pace, and customers need clear and reliable measurements to understand the expected performance and value of their systems,” said Nigel Dessau, senior vice president and Chief Marketing Officer at AMD. “AMD does not believe SM2012 achieves this objective. Hence AMD cannot endorse or support SM2012 or remain part of the BAPCo consortium.”

AMD will only endorse benchmarks based on real-world computing models and software applications, and which provide useful and relevant information. AMD believes benchmarks should be constructed to provide unbiased results and be transparent to customers making decisions based on those results. Currently, AMD is evaluating other benchmarking alternatives, including encouraging the creation of an industry consortium to establish an open benchmark to measure overall system performance.

AMD encourages anyone wanting more details about the construction and scoring methodology of the SM2012 benchmark to contact BAPCo. For more details on AMD’s decision to exit BAPCo, please read AMD’s Executive Blog authored by Nigel Dessau.


BAPCo® Reaffirms Open Development Process For SYSmark® 2012

SAN MATEO, Calif.—(BUSINESS WIRE)—Business Applications Performance Corporation (BAPCo®) is a non-profit consortium made up of many of the leaders in the high tech field, including Dell, Hewlett-Packard, Hitachi, Intel, Lenovo, Microsoft, Samsung, Seagate, Sony, Toshiba and ARCintuition. For nearly 20 years BAPCo has provided real world application based benchmarks which are used by organizations worldwide. SYSmark® 2012 is the latest release of the premiere application based performance benchmark. Applications used in SYSmark 2012 were selected based on market research and include Microsoft Office, Adobe Creative Suite, Adobe Acrobat, WinZip, Autodesk AutoCAD and 3ds Max, and others.

Advanced Micro Devices (AMD) was, until recently, a long standing member of BAPCo. We welcomed AMD’s full participation in the two year development cycle of SYSmark 2012, AMD’s leadership role in creating the development process that BAPCo uses today and in providing expert resources for developing the workload contents. Each member in BAPCo gets one vote on any proposals made by member companies. AMD voted in support of over 80% of the SYSmark 2012 development milestones, and were supported by BAPCo in 100% of the SYSmark 2012 proposals they put forward to the consortium.

BAPCo also notes for the record that, contrary to the false assertion by AMD, BAPCo never threatened AMD with expulsion from the consortium, despite previous violations of its obligations to BAPCo under the consortium member agreement.

BAPCo is disappointed that a former member of the consortium has chosen once more to violate the confidentiality agreement they signed, in an attempt to dissuade customers from using SYSmark to assess the performance of their systems. BAPCo believes the performance measured in each of the six scenarios in SYSmark 2012, which is based on the research of its membership, fairly reflects the performance that users will see when fully utilizing the included applications.


VIA's Statement About Leaving The BAPCo Consortium

VIA today confirmed reports that we have tendered our resignation to BAPCo. We strongly believe that the benchmarking applications tests developed for SYSmark 2012 and EEcoMark 2.0 do not accurately reflect real world PC usage scenarios and workloads and therefore feel we can no longer remain as a member of the organization.

We hope that the industry can adopt a much more open and transparent process for developing fair and objective benchmarks that accurately measure real world PC performance and are committed to working with companies that share our vision.

116 Comments

  • yyrkoon - Wednesday, June 22, 2011

    I do not really know why you, or anyone, uses SYSmark or any of these garbage synthetic benchmarks anyway. It's about the furthest thing away from real world performance ALWAYS. You may as well pick some arbitrary numbers, and then go use Sandra to claim you've got the biggest eManhood on the planet. HDtune, HD tach etc makes more sense, but even with these, they are NOT a good indicator of real world performance either. So again . . .

    In the end, the results only offer an obscure result that only tends to obfuscate the real outcome. So in other words, someone is making themselves important ( and rich ) by providing totally useless system information to others.

    Gaming benchmarks make sense because it can be an important factor for someone buying equipment. Video encoding/trans-coding is important, because again; Different hardware can produce vastly different performance levels. Even Photoshop benchmarks are important to many. But who here runs an app called Systemmark other than for bench marking ? No one . ..
  • krumme - Wednesday, June 22, 2011

    Exactly, sysmark measures something nobody needs.
    Gaming and video encoding is what matters today, and it can be measured in real test.

    If Anand chooses to use sysmark and whatever like it, it will be bad for consumers, and a small step to keep Intel away from making the most useful products even if they have the best technology and development resources in the world. An idiotic outcome for the consumers.

    Use tools like the HD benchmark suite you developed. It was a bit heavy on the 4k random write - wonder why - and lol later on post Intel G2 for that, but at least it reflects real usage, and is a valuable tool for the consumers to select an SSD.
  • Targon - Wednesday, June 22, 2011

    Gaming and video encoding? Really? Gaming is something that many read Anandtech and other sites to check performance for, but video encoding is something that really does not take up all that much computer time for MOST people. Web browsing benchmarks should probably be up near the top when it comes to overall importance, with watching video content(from different sources) being way up there as well.

    I can see "time to open and extract files from an archive" being up there as well since if you download drivers, they DO take a fair amount of time to extract. Creation of archives should be taken into account too, but not as much as extracting content. For much of this of course, picking good hard drives and systems with a good SATA controller comes into play.....except, AMD has SATA 3 and Intel does not at this point in the latest chipsets. I could see Intel taking second place to file copy operations if you take the latest AMD chipsets compared to Intel chipsets as a result.
  • Veroxious - Wednesday, June 22, 2011

    Honestly speaking, IMO Anandtech is NOT showing any bias against AMD... anyone thinking that did not read the article properly and/or is nitpicking. After reading the entire article I have to agree that AMD has done the right thing. BAPCo in its current state is just useless and skewed towards Intel.

    And before someone accuses me of being an AMD fanboy, don't bother. Of the 4 computers I own only one is AMD based. That's simply because for the roles they play in my life and pricing in my neck of the woods the Intel systems offer me more bang for the buck period. I honestly believe that Intel is or was guilty of anti-competitive behaviour and underhanded dishonest business practices in the OEM/corporate space and should have paid a bigger price. It's not like market share can be regained overnight although currently Intel deserve to be dominant in the corporate space due to their obviously superior current CPUs.

    In fact I am already planning on upgrading my gaming rig to the Sandy Bridge platform (although Ivy Bridge has to be considered) and I will be buying whatever graphics card/s represent the best bang for the buck at the time of purchase whether it be red or green since bang for the buck is the most important consideration for me seeing that I have 3 computers constantly in use in my household (excluding laptops). Of course most people in my situation will opt for medium level hardware, i.e. just a little more than I require to be adequate for the next 3-4 years.

    That said, benchmarks from sites such as Anandtech are an important tool for me when speccing a new PC, BUT are ONE of many considerations in that process. Only a fool would base their entire spec on a specific benchmark.

    The point that many people seem to miss is that other than showing one the top performing hardware it also allows us to see other hardware's relative performance, which is what I need when speccing most PCs - i.e. how close in performance is an i5-2300 to an i7-2600 or Phenom II 955 BE in a specific task like encoding for instance. That is what I use to spec a system suitable to the main tasks it will perform.
  • stancilmor - Wednesday, June 22, 2011

    People ask which computer to buy and I respond with tell me what you plan to use it for?
    Both AMD and Intel are right. On AMD's point people really do need more graphics power. And on Intel's point a little dedicated logic (i.e. quicksync) goes a long way.

    While difficult to impossible, benchmarks should somehow be normalized to performance per watt. Clearly not all cases benefit from that, because sometimes raw performance really does matter. Or maybe performance per dollar is a better indicator.

    To give an example, I game on a system with a lowly Pentium dual core 1.8GHz with 1MB of cache and an NVIDIA 8800GT 512 at 1920x1200 with all the eye candy (dx9) turned up. Sure, it could benefit from a faster CPU, but my system is over 24fps sustained...to me if movies are good enough at that speed, then so are games. And as was pointed out in the article, most of us need an SSD. I've certainly noticed that when my system is lagging, the hard drive light is solid on; it's not my measly CPU above 80% or my 2GB of system memory all used up.

    Maybe for AMD and Intel the benchmarks should gauge performance per profit or cost to manufacture, because that would be more relevant to them.

    Ultimately I want to know raw performance and performance per dollar for the applications that I run
  • alpha754293 - Wednesday, June 22, 2011

    Well, part of the problem is how the benchmarks are performed.

    For example, 3DMark, SYSmark, etc. are all pre-packaged benchmarks that are run in more or less the same way.

    However, as I've also mentioned, how those are compiled can have a HUGE impact on performance.

    Like, if you take the LINPACK CPU floating point benchmark, and compile it 100 different ways, you're going to get 100 different results. So then that raises the question "what's the actual performance of my system?" And then add to that, the complexity of how each CPU actually executes what it is being asked to do. I'm sure if you profile the LINPACK application while it is running, how it runs will also affect performance.

    And that's just a very simple example.

    Now you go with some of those aforementioned benchmarks; and the differences can compound and the result can be polar and diametrically opposed.

    And then with the programs that I've mentioned that I use to benchmark my systems (and to some extent, Anandtech has run the Fluent benchmark), not knowing how those programs work (because they're not canned benchmarks) can affect the results too.

    LS-DYNA has conducted testing and research that if you specify a pfile for the MPP decomposition -- it can reduce your runtime by 33%! That's nothing to sneeze at.
  • ash9 - Wednesday, June 22, 2011

    you can do that on anything from a lowly AMD Brazos netbook to a hex-core monster system. Yes, we did leave out Atom, because there are certain areas where it falls short

    wouldn't that make the Atom, lowly

    asH
  • ash9 - Wednesday, June 22, 2011

    That means Intel is the only semiconductor manufacturer left.

    Same as it ever was.
  • Lolimaster - Wednesday, June 22, 2011

    VIA also confirms it left BAPCo.

    http://semiaccurate.com/2011/06/22/via-confirms-it...
  • Roy2001 - Thursday, June 23, 2011

    They are all losers, right?