Solid state drive/NVMe
NVM Express (NVMe) is a specification for accessing SSDs attached through the PCI Express bus. As a logical device interface, NVM Express has been designed from the ground up, capitalizing on the low latency and parallelism of PCI Express SSDs, and mirroring the parallelism of contemporary CPUs, platforms and applications.
Installation
The Linux NVMe driver has been included in the kernel since version 3.3. NVMe devices should show up under /dev/nvme*.
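As a quick check that the driver has picked up a drive, the device nodes can be listed and sorted by kind. This is a minimal sketch: the function name list_nvme_nodes is made up for this example, and the classification relies on the usual kernel naming (nvme0 for a controller, nvme0n1 for a namespace, nvme0n1p1 for a partition).

```shell
#!/bin/sh
# list_nvme_nodes [DIR]: print every nvme* entry in DIR (default /dev)
# together with a guess at what it is, based only on its name.
list_nvme_nodes() {
    dir=${1:-/dev}
    for node in "$dir"/nvme*; do
        [ -e "$node" ] || continue          # glob matched nothing
        name=${node##*/}
        case $name in
            nvme[0-9]*n[0-9]*p[0-9]*) kind=partition ;;
            nvme[0-9]*n[0-9]*)        kind=namespace ;;
            nvme[0-9]*)               kind=controller ;;
            *)                        kind=other ;;
        esac
        printf '%s %s\n' "$name" "$kind"
    done
}
# Usage: list_nvme_nodes
```

An empty result means that no NVMe device node exists, in which case the kernel log is the next place to look.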
Extra userspace NVMe tools can be found in nvme-cli or nvme-cli-git (AUR).
See Solid State Drives for supported filesystems, maximizing performance, minimizing disk reads/writes, etc.
Management
List all attached NVMe SSDs with name, serial number, size and LBA format:
# nvme list
List information about a drive and features it supports in a human-friendly way:
# nvme id-ctrl -H /dev/nvme0
List information about a namespace and features it supports:
# nvme id-ns /dev/nvme0n1
Output the NVMe error log page:
# nvme error-log /dev/nvme0
Delete a namespace:
# nvme delete-ns /dev/nvme0n1
Create a new namespace, e.g. a smaller namespace to overprovision an SSD for improved endurance, performance, and latency:
# nvme create-ns /dev/nvme0
See nvme help and nvme(1) for a list of all commands along with a terse description.
SMART
Output the NVMe SMART log page for health status, temp, endurance, and more:
# nvme smart-log /dev/nvme0
Use the -H option to output even more information, e.g. nvme smart-log -H /dev/nvme0.
NVMe support was added to smartmontools in version 6.5 (available since May 2016 in the official repositories).
Currently implemented features (as taken from the wiki):
- Basic information about controller name, firmware, capacity (smartctl -i)
- Controller and namespace capabilities (smartctl -c)
- SMART overall-health self-assessment test result and warnings (smartctl -H)
- NVMe SMART attributes (smartctl -A)
- NVMe error log (smartctl -l error[,NUM])
- Ability to fetch any NVMe log (smartctl -l nvmelog,N,SIZE)
- The smartd daemon tracks health (-H), error count (-l error) and temperature (-W DIFF,INFO,CRIT)
See S.M.A.R.T. and the official wiki entry for more information, and see this article for contextual information about the output.
Secure erase
See Solid state drive/Memory cell clearing#NVMe drive.
Firmware update
Generic
Firmware can be managed using nvme-cli. To display available slots and check whether Slot 1 is read only:
# nvme fw-log /dev/nvme0
Firmware Log for device:nvme0
afi  : 0x11
frs1 : 0x32303132345a3553 (S5Z42102)
frs2 : 0x32303132345a3553 (S5Z42102)
# nvme id-ctrl /dev/nvme0 -H | grep Firmware
[0:0] : 0 Firmware Slot 1 Read/Write
Download and commit firmware to the specified slot. In the example below, firmware is first committed without activation (-a 0). Next, an existing image is activated (-a 2). Refer to the NVMe specification for details on firmware commit action values.
# nvme fw-download -f S5Z42_fw_S5Z42105.bin /dev/nvme0
Firmware download success
# nvme fw-commit -s 2 -a 0 /dev/nvme0
Success committing firmware action:0 slot:2
# nvme fw-log /dev/nvme0
Firmware Log for device:nvme0
afi  : 0x21
frs1 : 0x32303132345a3553 (S5Z42102)
frs2 : 0x35303132345a3553 (S5Z42105)
# nvme fw-commit -s 2 -a 2 /dev/nvme0
Success committing firmware action:2 slot:2
Finally, reset the controller to load the new firmware:
# echo 1 > /sys/class/nvme/nvme0/reset_controller
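The afi and frsN fields in the fw-log output can also be decoded by hand. This is a minimal sketch, assuming the layout defined in the NVMe specification: afi bits 2:0 hold the active slot, bits 6:4 hold the slot to be activated at the next controller reset, and each frsN value holds eight ASCII characters with the least significant byte first. The function names decode_afi and decode_frs are made up for this example.

```shell
#!/bin/sh
# decode_afi VALUE: split the Active Firmware Info byte into its two slot fields.
decode_afi() {
    afi=$(( $1 ))
    printf 'active slot: %d, slot at next reset: %d\n' \
        "$(( afi & 7 ))" "$(( (afi >> 4) & 7 ))"
}

# decode_frs VALUE: turn an frsN number into the firmware revision string.
decode_frs() {
    hex=${1#0x}
    # pad to an even number of digits so the loop always consumes whole bytes
    [ $(( ${#hex} % 2 )) -eq 0 ] || hex=0$hex
    out=
    while [ -n "$hex" ]; do
        byte=${hex%"${hex#??}"}     # first two hex digits
        hex=${hex#??}
        # prepend: the most significant byte is the last character
        out="$(printf "\\$(printf '%03o' "$(( 0x$byte ))")")$out"
    done
    printf '%s\n' "$out"
}
# Usage: decode_afi 0x21; decode_frs 0x35303132345a3553
```

With the values from the second log above, 0x21 decodes to slot 1 active and slot 2 pending, which matches a commit to slot 2 that has not yet taken effect.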
Intel
"The Intel® Memory and Storage Tool (Intel® MAS) is a drive management tool for Intel® SSDs and Intel® Optane™ Memory devices, supported on Windows*, Linux*, and ESXi*. [...] Use this tool to manage PCIe*-/NVMe*- and SATA-based Client and Datacenter Intel® SSD devices and update to the latest firmware."[2]
Install intel-mas-cli-tool (AUR) and check whether your drive has an update available:
# intelmas show -intelssd
- Intel SSD 660p Series redacted -

Capacity : 512.11 GB
CurrentPercent : Property not found
DevicePath : /dev/nvme0n1
DeviceStatus : Healthy
Firmware : 002C
FirmwareUpdateAvailable : 004C
Index : 0
MaximumLBA : 1000215215
ModelNumber : INTEL SSDPEKNW512G8
ProductFamily : Intel SSD 660p Series
SMARTEnabled : True
SectorDataSize : 512
SerialNumber : redacted
If so, execute the load command as follows:
# intelmas load -intelssd 0
WARNING! You have selected to update the drives firmware!
Proceed with the update? (Y|N): Y
Checking for firmware update...

- Intel SSD 660p Series redacted -

Status : Firmware updated successfully. Please reboot the system.
Kingston
Kingston does not provide separate firmware downloads on its website, instead referring users to a Windows-only utility. Firmware files appear to use a predictable naming scheme based on the firmware revision:
https://media.kingston.com/support/downloads/S5Z42105.zip
Then proceed with the generic flashing instructions.
Performance
Sector size
See Solid state drive#Native sector size.
Discards
Discards are disabled by default on typical setups that use ext4 and LVM, but other file systems might need discards to be disabled explicitly.
Intel, as one device manufacturer, recommends against enabling discards at the file system level and suggests using the periodic TRIM method or running fstrim manually instead.[3]
Airflow
NVMe SSDs are known to be affected by high operating temperatures and will throttle performance over certain thresholds.[4]
Testing
Raw device performance tests can be run with hdparm:
# hdparm -Tt --direct /dev/nvme0n1
Power Saving (APST)
To check NVMe power states, install nvme-cli or nvme-cli-git (AUR), and run nvme get-feature /dev/nvme[0-9] -f 0x0c -H:
# nvme get-feature /dev/nvme0 -f 0x0c -H
get-feature:0xc (Autonomous Power State Transition), Current value:0x000001
Autonomous Power State Transition Enable (APSTE): Enabled
Auto PST Entries .................
...
When APST is enabled the output should contain "Autonomous Power State Transition Enable (APSTE): Enabled" and there should be non-zero entries in the table below indicating the idle time before transitioning into each of the available states.
If APST is enabled but no non-zero states appear in the table, the latencies might be too high for any states to be enabled by default. The output of nvme id-ctrl /dev/nvme[0-9] (as the root user) should show the available non-operational power states of the NVMe controller. If the total latency of any state (enlat + exlat) is greater than 25000 µs (25 ms), you must pass a value at least that high as the default_ps_max_latency_us parameter for the nvme_core kernel module. This should enable APST and make the table in nvme get-feature (as the root user) show the entries.
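The latency sum can be computed from the id-ctrl output instead of by hand. This is a minimal sketch that assumes each power state is printed on a line of the form "ps N : ... non-operational enlat:X exlat:Y ..."; that layout is an assumption about nvme-cli output and may differ between versions, and the function name apst_total_latencies is made up for this example.

```shell
#!/bin/sh
# apst_total_latencies: read `nvme id-ctrl` output on stdin and print
# "STATE TOTAL_US" for every non-operational power state, where TOTAL_US
# is enlat + exlat in microseconds.
apst_total_latencies() {
    awk '
        /non-operational/ {
            en = ex = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^enlat:[0-9]+$/) en = substr($i, 7)
                if ($i ~ /^exlat:[0-9]+$/) ex = substr($i, 7)
            }
            # skip lines where either field is missing
            if (en != "" && ex != "") print $2, en + ex
        }'
}
# Usage (as root): nvme id-ctrl /dev/nvme0 | apst_total_latencies
```

Any state whose total exceeds 25000 is a candidate for raising default_ps_max_latency_us as described above.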
Troubleshooting
Controller failure due to broken APST support
Some NVMe devices may exhibit issues related to power saving (APST). This is a known issue for the Kingston A2000[5] as of firmware S5Z42105 and has previously been reported on Samsung NVMe drives (Linux v4.10).[6][7]
A failure renders the device unusable until system reset, with kernel logs similar to:
nvme nvme0: I/O 566 QID 7 timeout, aborting
nvme nvme0: I/O 989 QID 1 timeout, aborting
nvme nvme0: I/O 990 QID 1 timeout, aborting
nvme nvme0: I/O 840 QID 6 timeout, reset controller
nvme nvme0: I/O 24 QID 0 timeout, reset controller
nvme nvme0: Device not ready; aborting reset, CSTS=0x1
...
nvme nvme0: Device not ready; aborting reset, CSTS=0x1
nvme nvme0: Device not ready; aborting reset, CSTS=0x1
nvme nvme0: failed to set APST feature (-19)
As a workaround, add the kernel parameter nvme_core.default_ps_max_latency_us=0 to completely disable APST, or set a custom threshold to disable specific states.
Since March 2021, a firmware update 9 from Kingston has been available. As Kingston only supports Windows, downloads for Linux can be found via heise.de or GitHub. As long as the kernel workaround is in place, the firmware update is not expected to make much difference, because the deepest power saving states are not reached anyway.
# smartctl -a /dev/nvme0
Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     9.00W       -        -    0  0  0  0        0       0
 1 +     4.60W       -        -    1  1  1  1        0       0
 2 +     3.80W       -        -    2  2  2  2        0       0
 3 -   0.0450W       -        -    3  3  3  3     2000    2000
 4 -   0.0040W       -        -    4  4  4  4    15000   15000
The value passed is the maximum exit latency (Ex_Lat). For example, to disable PS4 set nvme_core.default_ps_max_latency_us=2000.
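To see which states a given threshold would leave available, the Supported Power States table can be filtered. This is a minimal sketch, assuming the table has one state per line with the column order shown above (state number first, "-" in the Op column for non-operational states, Ex_Lat last) and that a state stays available when its Ex_Lat does not exceed the threshold, as stated above. The function name apst_exit_check is made up for this example.

```shell
#!/bin/sh
# apst_exit_check MAX_US: read the smartctl power state table on stdin and
# report, for each non-operational state, whether its exit latency fits
# within MAX_US microseconds.
apst_exit_check() {
    awk -v max="$1" '
        $1 ~ /^[0-9]+$/ && $2 == "-" {
            verdict = ($NF + 0 <= max + 0) ? "allowed" : "disabled"
            printf "PS%d exit latency %d us: %s\n", $1, $NF, verdict
        }'
}
# Usage (as root): smartctl -a /dev/nvme0 | apst_exit_check 2000
```

For the table above and a threshold of 2000, PS3 is reported as allowed and PS4 as disabled.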
Controller failure due to broken suspend support
Some users (for example, see Laptop/HP) have reported suspend failures with certain NVMe drives. As above, the failure renders the device inoperable until system reset, with kernel messages such as:
nvme nvme0: Device not ready; aborting reset, CSTS=0x3
nvme nvme0: Removing after probe failure status: -19
As a workaround, add the kernel parameter iommu=soft to use a software replacement for the hardware IOMMU. (For further details, see this documentation.) This has the potential to cause some slight processing overhead.
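After rebooting, it can be worth confirming that a workaround from this section actually reached the kernel. This is a minimal sketch that looks the parameter up in /proc/cmdline; the function name kernel_param is made up for this example, and it prints only the first match, so it does not model how the kernel treats repeated parameters.

```shell
#!/bin/sh
# kernel_param NAME [FILE]: print the value of NAME=... from the kernel
# command line (FILE defaults to /proc/cmdline). Returns non-zero when the
# parameter is absent.
kernel_param() {
    awk -v name="$1" '
        {
            for (i = 1; i <= NF; i++)
                if (index($i, name "=") == 1) {
                    print substr($i, length(name) + 2)
                    found = 1
                    exit
                }
        }
        END { exit !found }' "${2:-/proc/cmdline}"
}
# Usage: kernel_param nvme_core.default_ps_max_latency_us
#        kernel_param iommu
```

A non-zero exit status means the boot loader configuration was not updated or the machine was booted from a different entry.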