Java Server Performance

The SPECjbb 2013 benchmark has "a usage model based on a world-wide supermarket company with an IT infrastructure that handles a mix of point-of-sale requests, online purchases, and data-mining operations." It uses the latest Java 7 features and makes use of XML, compressed communication, and messaging with security.

Benchmark architecture diagram

We tested with four groups of transaction injectors and back-ends. We applied a relatively basic tuning to mimic real-world use. Our first run was done with a low amount of memory:

"-server -Xmx4G -Xms4G -Xmn2G -XX:+AlwaysPreTouch -XX:+UseLargePages"

With these settings, the benchmark takes about 20-27GB of RAM. In our second run, we doubled the amount of memory to see if more memory (64GB vs. 32GB) can boost performance even more:

"-server -Xmx8G -Xms8G -Xmn4G -XX:+AlwaysPreTouch -XX:+UseLargePages"

With these settings, the benchmark takes about 43-57GB of RAM. The first metric is basically maximum throughput.

SPECJBB 2013-Multi max-jOPS

Assigning more memory to your Java VMs than what you have available is of course a bad idea, but now we have some numbers you can use to convince you coworkers of this fact. Although the Atom C2750 and X-Gene perform a little better thanks to fact that they can address twice as much RAM, they are nowhere near the performance of a Xeon E3-1230L if the latter is configured properly.

The Critical-jOPS metric is a throughput metric under response time constraint.

SPECJBB 2013-Multi Critical-jOPS

The Xeon E3 Haswell core continues to outperform the Atom by a tangible margin. The X-Gene fails to compete with the Intel SoCs. The conclusion is pretty simple: Java applications run best on a Haswell or Ivy Bridge core.

SPECJBB®2013 is a registered trademark of the Standard Performance Evaluation Corporation (SPEC).
Multi-Threaded Integer Performance Website Performance: Drupal 7.21
Comments Locked

47 Comments

View All Comments

  • JohanAnandtech - Tuesday, March 10, 2015 - link

    Are you sure this is up to date? gcc tells me -march=native is not supported.
  • JohanAnandtech - Tuesday, March 10, 2015 - link

    Update. march=native does not work. I have tried -march=armv8-a but does not do much (it is probably the default). O3 makes the biggest difference. Omit it and you get 5.7 GB/s. With -O3, I am at 18 GB/s and more (stream m400)
  • Alone-in-the-net - Tuesday, March 10, 2015 - link

    Apologies. For AArch64 the only is "armv8-a", for intel, -march=native sets it to use the one for your CPU.
    https://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/AArch...
    https://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/i386-...
    From version 4.9.x and above of GCC, you can really start to add tuning for the CPU.
    https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gcc/AArch...
    -mtune=name
    Specify the name of the target processor for which GCC should tune the performance of the code. Permissible values for this option are: ‘generic’, ‘cortex-a53’, ‘cortex-a57’.
    Additionally, this option can specify that GCC should tune the performance of the code for a big.LITTLE system. The only permissible value is ‘cortex-a57.cortex-a53’.

    Where none of -mtune=, -mcpu= or -march= are specified, the code will be tuned to perform well across a range of target processors.
  • Alone-in-the-net - Tuesday, March 10, 2015 - link

    Also support for the XGene1 as a compilation target is only from GCC5.
    https://gcc.gnu.org/gcc-5/changes.html
    Support has been added for the following processors (GCC identifiers in parentheses): ARM Cortex-A72 (cortex-a72) and initial support for its big.LITTLE combination with the ARM Cortex-A53 (cortex-a72.cortex-a53), Cavium ThunderX (thunderx), Applied Micro X-Gene 1 (xgene1). The GCC identifiers can be used as arguments to the -mcpu or -mtune options, for example: -mcpu=xgene1
  • The_Assimilator - Monday, March 9, 2015 - link

    So AMD, how's that bet on ARM you made looking now?
  • extide - Monday, March 9, 2015 - link

    Don't count them out yet. I really wish that intel didn't abandon ARM for the Atom, I bet they could come out with a sweet armv8 core if they had to, and on their process it would be sweet.
  • BlueBlazer - Monday, March 9, 2015 - link

    That AMD Opteron A1100 looking more like abandonware as more time passes on, and that was like 8 months ago. Until now not a single real world deployment nor was used in any of AMD's own SeaMicro servers. Currently available as development kit with a rather steep price tag.
  • tuxRoller - Monday, March 9, 2015 - link

    You REALLY should be using GCC 5. that includes many improvements for the armv8 isa. I'd suggest grabbing a nightly of Fedora 22, but Ubuntu 15.04 may be using gcc5 as well.
  • Wilco1 - Monday, March 9, 2015 - link

    Agreed, nobody doing anything on AArch64 should contemplate using GCC4.8. Even 4.9 is way out of date. GCC5.0 with latest GLIBC gives major speedups across the board.
  • JohanAnandtech - Tuesday, March 10, 2015 - link

    "Way out of date?" We tried out 4.9.2, which has been released on October 30th 2014. That is about 4 months old. https://www.gnu.org/software/gcc/releases.html. Latest version is 4.8.4, 5.0 has not even been released AFAIK.

Log in

Don't have an account? Sign up now