SPECjbb2005

SPECjbb2005 from SPEC (Standard Performance Evaluation Corporation) evaluates the performance of server-side Java by emulating a three-tier client/server system with emphasis on the middle tier. Instead of relying on a potentially disk-intensive database system, SPECjbb uses tables of objects implemented with Java Collections. A longer description can be found here.

Again, it is not our objective to show the best possible scores. Very few people will take the time to fully tune the JVM and accept the risk that some of the ultra-aggressive optimizations backfire. So we tested with decent but fairly generic tuning that we could use on all systems. The JVM is Sun's version 1.5.0_08, which allows us to compare scores with previous results.

We tested SPECjbb2005 with four application instances. Using numactl, a clever utility written by Andi Kleen, we were able to bind each Java instance to a specific node on our Tyan server. We didn't bind instances to CPUs on the Intel platforms (though it is possible with taskset), as that gives worse performance. The parameters in bold show the actual JVM optimizations.

On the Opteron we used:
numactl --cpunodebind=$node --membind=$node -- java -cp jbb.jar:check.jar -Xms2g -Xmx2g -Xmn1g -Xss128K -XX:+AggressiveOpts -XX:+UseParallelOldGC -XX:+UseParallelGC spec.jbb.JBBmain -propfile SPECjbb.props -id $x
On the Xeons we used:
java -classpath jbb.jar:check.jar -Xms2g -Xmx2g -Xmn1g -Xss128K -XX:+AggressiveOpts -XX:+UseParallelOldGC -XX:+UseParallelGC spec.jbb.JBBmain -propfile SPECjbb.props -id $x
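The Opteron command above leaves open how $node and $x were assigned. A minimal dry-run sketch of one possible assignment follows; the node count (NODES=2), the instance ids 1 to 4, and the round-robin mapping are assumptions for illustration, not the settings used in the article. It only prints the commands, it does not start any JVM.

```shell
#!/bin/sh
# Dry-run sketch: print one numactl launch line per SPECjbb2005 instance.
# ASSUMPTIONS (not from the article): 2 NUMA nodes, instance ids 1..4,
# instances spread round-robin over the nodes.
NODES=${NODES:-2}
INSTANCES=${INSTANCES:-4}

# JVM options copied from the commands shown in the article.
JVM_OPTS="-Xms2g -Xmx2g -Xmn1g -Xss128K -XX:+AggressiveOpts -XX:+UseParallelOldGC -XX:+UseParallelGC"

# build_cmd NODE ID: print the command line for one instance.
build_cmd() {
    printf 'numactl --cpunodebind=%s --membind=%s -- java -cp jbb.jar:check.jar %s spec.jbb.JBBmain -propfile SPECjbb.props -id %s\n' \
        "$1" "$1" "$JVM_OPTS" "$2"
}

# launch_all: print the command for every instance, mapping id -> node.
launch_all() {
    i=1
    while [ "$i" -le "$INSTANCES" ]; do
        build_cmd "$(( (i - 1) % NODES ))" "$i"
        i=$((i + 1))
    done
}

launch_all
```

To actually start the instances, each printed line would be run in the background (with a trailing &) and followed by a single wait. Binding both CPU and memory to the same node is what keeps each JVM's heap in local memory.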
Below you can find the final score that SPECjbb2005 reports, which is an average of the last four runs.


[Chart: SPECjbb2005, 4 instances]

The impact of binding each instance to a specific node is less dramatic than what we have seen before, but it still matters: the Opteron scored only 42254 without numactl. The Opterons are in a neck-and-neck race with the dual-core Xeons. As this kind of transactional Java application depends quite a bit on the memory interface, the slightly lower integer power of the Opteron is hidden by its faster memory access. Nevertheless, it is the Xeon E5345 that wins this race, as the use of four instances allows the Xeon 53xx to scale well.


30 Comments

  • piroroadkill - Tuesday, August 7, 2007 - link

    it is a car analogy
  • Gul Westfale - Monday, August 6, 2007 - link

    good analogy there, except that mustangs (and various other cars) use pickup truck engines for cost reasons. large trucks use larger engines (often diesels) because they offer considerably more torque at much lower RPM than a smaller gasoline engine; and thus provide more pulling power.
  • Gul Westfale - Monday, August 6, 2007 - link

    these are not regular consumer cpus, but intended for use in commercial servers and workstations. they and their motherboards cost more because they support features such as multiple sockets (so in addition to having multiple cores on one chip you can also have multiple chips on one motherboard).

  • yyrkoon - Monday, August 6, 2007 - link

    quote:

    Intel has a clear lead in the rendering market. If you are rendering complex high resolutions images, the quad core Xeon is clearly the best choice.


    they win 1 of 2 tests, and it is clear they are the winner ? Why ? Because they won the software rendering also ? Anyone interested enough in rendering, and HAVING to have this sort of hardware for it is NOT going to bother with software . . .

    This means your conclusion on this point is incorrect, and in which case, it boils down to which application the rendering machine is going to do.

    Man you guys come to the weirdest conclusions based on your own data, and I am not even the first to notice/mention this sort of thing . . .
  • JohanAnandtech - Monday, August 6, 2007 - link

    The Quadcore wins all high resolution rendering tests. Where do you see the DC Opterons win against the Quadcore Intel in high resolution rendering? Show me a rendering engine where a 3 GHz K8 DC core is faster in high resolution rendering than a 2.33 GHz Quadcore. All decent rendering engines that are used in the real world will more or less show the same picture.

    In fact, the "rendering performance" situation will get worse for the K8 as SSE-2 tuning gets more common. All Intel CPUs since Core and all AMD CPUs since Barcelona will show (or are already showing) a large performance boost from using better SSE-2 code.
  • yyrkoon - Monday, August 6, 2007 - link

    Ok, I see now with the graphs 'lower is better' on 3ds max, I missed that with the tables, which is actually what I meant this morning 'table obfuscation'. I personally do not mind tables, but when the data is not in a uniform spot, it confuses/makes it harder to read at a glance.

    Anyhow, I was tired when I posted this morning, cranky, and was overly harsh I think. However it *is* much easier for me personally to read the graphs at a glance (I cannot speak for everyone though).
  • yyrkoon - Monday, August 6, 2007 - link

    Oh, and while on the subject, you guys here at anandtech have lately mastered the art of graph obfuscation. Is it really THAT hard leaving items in the same rows / columns for different tests ? Are we trying to confuse the results, or is there some other reason this happens, and has gone completely over my head ?
  • JohanAnandtech - Monday, August 6, 2007 - link

    The only reason is that until very recently I didn't master the graphing engine. I got some weird error messages and gave up. But I have found the error, and you should see some nice graphs which don't obfuscate...
  • Spoelie - Monday, August 6, 2007 - link

    the gif on page 2 is non-looping, so after a very quick jump from 1ghz -> 2.8ghz (why??) -> 3.2ghz , it stays put on the 3.2ghz image. If reading the article, by the time the reader sees the image, it's already 5 minutes on the last image and staying there, making it for all intents and purposes a static image instead of an animated one

    :)
  • JohanAnandtech - Monday, August 6, 2007 - link

    Thanks, fixed that. The reason to show 2.8 GHz is that, for example, SPECjbb and other applications sometimes don't completely stress the CPU, and then the CPU dynamically goes back to 2.8 GHz. These are simply the 3 stages I saw the most, and found the most interesting to show.
