Power Management

Idle power management for SSDs can be surprisingly complicated, especially for NVMe drives. But it is also vitally important for any battery-powered system. Real-world client storage workloads leave SSDs idle most of the time, so idle behavior is a big factor in how battery-friendly a drive is. Power draw when idle isn't the only thing that matters; how quickly a drive can enter or wake up from a low-power state can have a big impact on how effective its power management is.

For SATA SSDs, the host system doesn't have a lot of say in how the drive manages power. Using the SATA Aggressive Link Power Management (ALPM) feature to mostly power down the SATA connection is usually sufficient to put a drive to sleep. But the lowest-power sleep state supported by SATA devices (DevSleep) requires extra signalling on a pin that's part of the SATA power connector. This means that DevSleep is in practice only supported on laptops, and our desktop testbeds cannot use or measure this sleep state.

NVMe includes numerous features pertaining to power and thermal management. Most of them are optional in the NVMe spec, but there's a common subset supported by most consumer SSDs. NVMe drives can support several different power states, including multiple active and multiple idle power states. The drive's firmware provides information about its capabilities to the host system:

Samsung 980 PRO NVMe Power States
Controller: Samsung Elpis    Firmware: 1B2QGXA7

Power State   Maximum Power   Active/Idle   Entry Latency   Exit Latency
PS 0          8.49 W          Active        -               -
PS 1          4.48 W          Active        -               0.2 ms
PS 2          3.18 W          Active        -               1.0 ms
PS 3          40 mW           Idle          2.0 ms          1.2 ms
PS 4          5 mW            Idle          0.5 ms          9.5 ms
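Power state tables like the one above come straight from the drive's identify data. On Linux, they can be dumped with the nvme-cli tool; a minimal sketch, assuming nvme-cli is installed and the drive sits at the hypothetical path /dev/nvme0:

```shell
dev=/dev/nvme0
if command -v nvme >/dev/null 2>&1 && [ -e "$dev" ]; then
    # The "ps" lines report each power state's maximum power (mp),
    # entry latency (enlat) and exit latency (exlat) in microseconds.
    nvme id-ctrl "$dev" | grep -A1 '^ps '
else
    echo "nvme-cli or $dev not present; skipping"
fi
```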

When a drive and the host OS both support the Autonomous Power State Transition (APST) feature in NVMe 1.1 or later, the host system can give the drive a set of rules for how long it should wait while idle before dropping down to a lower-power state. Operating systems choose these delays based on the power state entry and exit latencies claimed by the drive, and on other configuration information about the system's overall tolerance for increased disk access times.
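When the OS has enabled APST, the transition rules it programmed can be read back from the drive; APST is feature ID 0x0c in the NVMe spec. A hedged sketch, again assuming nvme-cli and a drive at the hypothetical path /dev/nvme0:

```shell
dev=/dev/nvme0
if command -v nvme >/dev/null 2>&1 && [ -e "$dev" ]; then
    # -H decodes the APST table: the idle time threshold for each
    # transition and the target power state it drops into.
    nvme get-feature "$dev" -f 0x0c -H
else
    echo "nvme-cli or $dev not present; skipping"
fi
```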

One common problem with the NVMe APST feature is that the NVMe spec doesn't really say anything about how APST interacts with PCIe Active State Power Management (ASPM). SSD vendors tend to assume that, e.g., a system which configures the drive to use its deepest idle state will fully support PCIe ASPM. Most of the time, things work out, but it's also possible to end up with a drive that goes to sleep and never wakes up, or a drive that falls back to its highest power state if anything goes wrong when it tries to go to sleep.

Using our Coffee Lake testbed that has fully functional PCIe power management, we test SSD power in three states. Active idle is when the drive is not using any externally-configurable power management features: SATA or PCIe link power management is disabled, and NVMe APST is off. We're now using a more reliable and broadly-compatible method for disabling APST through the Linux kernel rather than directly poking the drive's registers. This means that some drives will probably end up showing higher active idle power draw than we have previously measured.
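The Linux kernel exposes APST control through a standard nvme_core module parameter: setting the maximum acceptable exit latency to zero tells the driver never to program an APST table, keeping the drive at active idle. A sketch of the kernel command line change (the GRUB config path and regeneration command vary by distro):

```shell
# /etc/default/grub (path varies by distro): a latency budget of 0
# means no idle power state qualifies, so APST stays disabled.
GRUB_CMDLINE_LINUX_DEFAULT="quiet nvme_core.default_ps_max_latency_us=0"
# After editing, regenerate the config and reboot, e.g.: sudo update-grub
```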

Even though there are many combinations of power management settings and power states that can be used with a typical consumer NVMe SSD, we condense it down to just two low-power configurations to test. What we call "Desktop Idle" uses the features that are almost always available and working on desktop platforms, even if they're off by default. This includes enabling SATA ALPM, NVMe APST, and PCIe ASPM.
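On a Linux testbed, the ALPM and ASPM parts of this configuration map to a pair of standard sysfs knobs; a minimal sketch (root is required to actually write them, and the writes are skipped harmlessly if the knobs are absent):

```shell
# SATA ALPM: allow each SATA link to enter its lowest-power states.
for f in /sys/class/scsi_host/host*/link_power_management_policy; do
    if [ -w "$f" ]; then
        echo min_power > "$f"
    fi
done
# PCIe ASPM: prefer power saving over link latency.
policy=/sys/module/pcie_aspm/parameters/policy
if [ -w "$policy" ]; then
    echo powersave > "$policy"
fi
```

NVMe APST needs no extra step here: recent Linux kernels enable it by default when the drive supports it.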

Next, we have the "Laptop Idle" state, with all the power-saving features fully enabled. For SATA SSDs, this would include DevSleep, so we can't fairly measure their Laptop Idle power draw on our desktop testbeds. For NVMe SSDs, this includes enabling PCIe L1 substates.

[Charts: Idle Power Consumption - No PM; Idle Power Consumption - Desktop; Idle Power Consumption - Laptop]

Accurately measuring the time it takes for a drive to enter a low-power state is tricky, but measuring the time taken to wake up is straightforward. We run a synthetic test that performs a single 4kB random read once every 10 seconds. When power management features are disabled and the drive stays in its active idle state, the random read latency will be determined mainly by the speed of the NAND flash. When the drive is in the Desktop Idle or Laptop Idle state, it will go to sleep between each random read, so we can repeatedly sample the time taken to wake up and perform a random read. The difference between this time and the random read latency from the drive in the active idle state is due almost entirely to the overhead of waking up the drive from a sleep state, and this difference is what we report as a drive's wake-up latency.
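A crude sketch of one such latency sample, using dd for the single 4kB read (the device path is hypothetical, and the real test issues direct random reads at 10-second intervals rather than the shortened fixed read shown here):

```shell
# One latency sample: a single 4 kB direct read, timed in nanoseconds.
target=/dev/nvme0n1
sleep 1                        # stand-in for the 10 s idle interval
t0=$(date +%s%N)
dd if="$target" of=/dev/null bs=4096 count=1 iflag=direct 2>/dev/null || true
t1=$(date +%s%N)
lat_us=$(( (t1 - t0) / 1000 ))
echo "read latency: ${lat_us} us"
```

Repeating this with power management on and off, the reported wake-up latency is the difference between the two sets of samples.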

[Chart: Idle Wake-Up Latency]

Conclusions

In this article we hope we've given you an insight into how much goes into testing a modern solid state storage drive - something more than just running CrystalDiskMark and finding peak sequential speeds! The new suite is not only more in-depth, but also we've streamlined it somewhat for automation, enabling fewer sleepless nights as deadlines loom on the horizon (or put another way, more reviews to come). We're obviously keen to take on additional feedback with the testing, so please leave a comment below.

Comments

  • IanCutress - Tuesday, February 2, 2021 - link

    Some people see sequential reads at QD128 as the 'holy' unified metric ;)
  • Samuel Vimes - Monday, February 1, 2021 - link

    Great to see this updated test suite.
    Would love to see popular SSDs like Corsair MX500, BX500 and Samsung 970 EVO Plus incorporated as reference points.

    Also, a note to put numbers in perspective (e.g., "in this test, a 10% difference is/isn't significant")—like in sound measurements where we know +10 dB is twice the perceived sound and we have dB(A), in SSD measurements what amount of difference matters in different scenarii (data loaders, gamers, office use...)?
  • oRAirwolf - Monday, February 1, 2021 - link

    Very nice. Also nice to see some additional validation of the SK Hynix Gold P31 results. My only complaint with that SSD, which I installed in my XPS 17 9700, is that it does not support hardware encryption with bitlocker. There is definitely a significant performance penalty when testing performance in Crystal Disk Mark with software encryption enabled and disabled. Sad times.
  • svan1971 - Monday, February 1, 2021 - link

    where is the fastest 4.0 M.2 ? The Sabrent Rocket 4 Plus ?
  • Beaver M. - Monday, February 1, 2021 - link

    Deleting critical comments now, are we?
  • Ryan Smith - Tuesday, February 2, 2021 - link

    No comments have been deleted. I only delete them in the most egregious of circumstances, and never for being critical.
  • Deicidium369 - Tuesday, February 2, 2021 - link

    Thank, I needed a good laugh - I guess contradicting Ian Cutress is egregious - not in this article - but the badly botched Xe HPC write up.

    Slippery slope leading to Tom's Hardware level of protecting the fee fees of "editors".
  • Martin84a - Sunday, February 7, 2021 - link

    Well one of your writers certainly do. I pointed out that it's incorrect to write 2TB, 2MB, 2MHz, 2mm etc. and that it's incorrect to write 2Mb when you mean 2 MB (megabyte), and that I was surprised to see such a lack of consistency on Anandtech. *Deleted*.
  • Martin84a - Sunday, February 7, 2021 - link

    For those not getting it, the ISO standard, IEC and writing it as a proper SI unit all specify that the numerical value always precedes the unit, and that is always used to separate the unit from the number.
  • Martin84a - Tuesday, February 9, 2021 - link

    *a space is always used.
