Enterprise Storage Bench - Oracle Swingbench

We begin with a popular benchmark from our server reviews: Oracle Swingbench. This is a pretty typical OLTP workload that targets servers under a light to medium load of 100 - 150 concurrent users. The database size is fairly small at 10GB; the workload, however, is absolutely brutal.

Swingbench consists of over 1.28 million read IOs and 3.55 million writes. The read/write ratio by GB transferred is nearly 1:1, which, given the IO counts, means the average read is considerably larger than the average write. Parallelism in this workload comes through aggregating IOs, as 88% of the operations in this benchmark are 8KB or smaller. This test is actually something we use in our CPU reviews, so its average queue depth is only 1.33.
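Those two figures pin down the workload's shape more than they might appear to. Here's a quick back-of-the-envelope check in Python; the per-write sizes used below are illustrative assumptions, only the IO counts and the roughly 1:1 GB split come from the trace:

```python
# Known from the trace: IO counts, and reads and writes moving
# roughly the same number of GB in total.
reads, writes = 1_280_000, 3_550_000

# Equal GB on both sides means the average read must be
# writes/reads times larger than the average write.
size_ratio = writes / reads
print(f"average read ~= {size_ratio:.2f}x the average write")  # ~2.77x

# Illustrative assumption only: if the typical write is a 1-1.5KB
# redo-log entry, the implied average read lands in the 3-4KB range,
# consistent with 88% of all operations being 8KB or smaller.
for write_kb in (1.0, 1.5):
    print(f"{write_kb}KB write -> ~{write_kb * size_ratio:.1f}KB average read")
```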

Oracle Swingbench - Average Data Rate

The S3700's only performance blemish is here in our Swingbench test. Like many other new drives we've looked at (e.g. Intel's SSD 910 and Micron's P320h), the S3700's performance here is a regression compared to previous drives. I asked Intel about this, and it appears that unaligned, smaller-than-4KB accesses are slower on the new controller than on the outgoing 710. My guess is that given how common 4KB accesses are, most controller vendors picked 4KB as the optimization point for their next-gen drives. I'm not entirely sure how many enterprise applications fall into this behavior pattern, but it's worth pointing out in case you have a unique workload that features a lot of < 4KB unaligned accesses.

Update: I have some more clarification as to what's going on here. There are two components to the Swingbench test we're running: the database itself and the redo log. The redo log stores all changes made to the database, which allows the database to be reconstructed in the event of a failure. In a well-designed deployment these two would live on separate storage systems, but in order to increase IO we combined both on a single drive for this test. Accesses to the DB end up being 8KB and random in nature, a definite strong suit of the S3700 as we've already shown. The redo log, however, consists of a bunch of 1KB - 1.5KB, QD1, sequential accesses. The S3700, like many of the newer controllers we've tested, isn't optimized for low-queue-depth, sub-4KB, sequential workloads like this.
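If you want to see how your own hardware behaves under this kind of access, here's a minimal Python sketch of the redo-log pattern described above: 1 - 1.5KB sequential appends at a queue depth of 1. The file path, entry sizes, and iteration count are illustrative assumptions, not Swingbench's actual parameters:

```python
import os, time, random

# Sketch of the redo-log-style pattern: small (1-1.5KB) sequential
# appends at QD1. O_SYNC makes each write complete before the next
# one is issued, which is what keeps the effective queue depth at 1.
PATH = "/mnt/testdrive/redo_probe.bin"  # hypothetical mount point
ITERATIONS = 10_000

fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o644)
start = time.perf_counter()
offset = 0
for _ in range(ITERATIONS):
    entry = os.urandom(random.randint(1024, 1536))  # 1-1.5KB redo entry
    os.pwrite(fd, entry, offset)  # sequential: each write lands after the last
    offset += len(entry)          # arbitrary entry sizes leave offsets unaligned
elapsed = time.perf_counter() - start
os.close(fd)
print(f"{ITERATIONS / elapsed:.0f} IOPS at QD1, sub-4KB sequential")
```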

Remember that with 25nm NAND, the page size grew to 8KB. Having tons of small IOs like this creates extra tracking overhead, which can require additional DRAM to maintain full performance. Intel views this type of scenario as unlikely (apparently the latest versions of Oracle allow you to force the redo log to use 4KB sectors instead of 512B sectors, which would make these transfers 8KB - 12KB in size), and thus simply didn't optimize for it in the S3700's firmware. Intel did confirm that, should customer feedback indicate this is a fairly likely scenario, it would be willing to consider alternate solutions to improve performance here, but at this point it seems like it doesn't matter for the vast majority of use cases.
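To make the sector-size arithmetic concrete, here's a short sketch of the padding involved. How Oracle batches redo entries per log flush isn't modeled here; the payload sizes are illustrative assumptions:

```python
import math

def transfer_size(payload_bytes: int, sector_bytes: int) -> int:
    """Round a redo write up to a whole number of sectors."""
    return math.ceil(payload_bytes / sector_bytes) * sector_bytes

# With 512B sectors, a 1-1.5KB redo entry goes out as a small,
# potentially unaligned write. With 4KB sectors every write is padded
# to a 4KB multiple, so nothing sub-4KB or unaligned ever hits the drive.
for payload in (1024, 1536, 7168, 11264):  # single entries and larger flushes
    print(f"{payload:>6}B payload: "
          f"512B sectors -> {transfer_size(payload, 512):>6}B, "
          f"4KB sectors -> {transfer_size(payload, 4096):>6}B")
```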

Oracle Swingbench - Disk Busy Time

Oracle Swingbench - Average Service Time

Comments

  • Hans Hagberg - Monday, November 12, 2012 - link

    An enterprise storage review today is not really complete without an array of 15K mechanical disks for comparison. That is still what is used for performance in most cases, and it is what we are up against when trying to justify SSDs in existing configurations.

    And for completeness, please throw in PCI-based SSD storage as well. Such storage always comes up in discussions around SSDs, but there is too little independent test data available to make decisions.

    Another question when reading the review: what test system was being used? I couldn't find this information.

    Also - enterprise storage is most often fronted by high-end controllers with lots of cache. It would be interesting to see an analysis of how that impacts the different drives and their consistency. Will the consistency be equalized by a big controller with cache in front of the drives?

    The Swingbench anomaly is unfortunate because database servers are probably the primary application for large-scale SSD deployments. It would be nice if the anomaly could be sorted out so we could see what the units can do. Normally, if you care about enterprise performance, you are careful with alignment and separation of storage (data, logs, etc.), so I agree with the Intel statement on this. Changing the benchmark would invalidate the old test data, so I'm not sure how to fix it without starting over.

    The review format and test case selection are excellent. Just give us some more data points.
    I would go as far as to say I would pay good money to read the review if the above were included.
  • Sb1 - Tuesday, November 13, 2012 - link

    "An enterprise storage review today is not really complete without an array of 15K mechanical disks for comparison."
    ... "And for completeness, please throw in PCI-based SSD storage as well."

    I __fully__ agree with Hans Hagberg.

    I thought this was a good article, but it would be an excellent one with both of these.

    Still, keep up the good work.
  • Troff - Wednesday, November 14, 2012 - link

    I agree as far as PCI-based SSDs go, but I see no point in including the 15K mechanical drive array for the same reason you don't see velocipedes in car reviews.
  • ilkhan - Tuesday, November 13, 2012 - link

    So what I see here is that for an enterprise server drive, go with this Intel. For a desktop drive, this Intel or a Samsung 840 Pro; for a laptop drive, the Samsung 840 Pro is best.

    That about sum it up?
  • korbendallas - Friday, November 16, 2012 - link

    Instead of average and max latency figures, I would love to see percentiles: 50%, 90%, 99%, and 99.9%, for instance. If you look at Intel's claims for these drives, they're given as percentiles too.

    If your distribution does not follow a bell curve, which is the case for many of the SSDs you are testing, the average is useless. And as you already know (and why you didn't include it before now), the max is useless too.
  • dananski - Saturday, November 17, 2012 - link

    I'd really like to see more graphs like the ones on "Consistent Performance: A Reality" showing how much variation drives can have in instantaneous IOPS. These really do a great job of showing exactly what Intel has fixed and I can see the benefit in some enterprise situations. A millisecond hiccup is an eternity for the CPU waiting for that data.

    Personally I'd now like to know:
    * How much of a problem can this be on consumer drives, where sustained random IO is less common?
    * Is this test a good way to characterise the microstutter problem for a particular drive?
    * How badly are drives with uneven IOPS distributions affected by RAID? (I know this was touched on briefly in the webcast with Intel)
  • junky77 - Sunday, November 18, 2012 - link

    What about the consistency of current consumer SSDs?
  • virtualstorage - Tuesday, March 12, 2013 - link

    I see the test results go up to 2000 seconds. With an enterprise array, there will be continuous IO in a 24/7 production environment. What is the performance behavior of the Intel SSD DC S3700 under continuous IO over many hours?
  • damnintel - Wednesday, March 13, 2013 - link

    heyyyy check this out damnintel dot com
  • rayoflight - Sunday, October 6, 2013 - link

    Got two of these. Both of them failed after approx. 30 boot-ups. They aren't recognised anymore by the BIOS, or as external hard drives on a different system; it's as if they are completely dead. Faulty batch? Or do they "lock up"? Anyone had this problem?
