TrueNAS Core: Miscellaneous power tweaks
This continues the series of posts on TrueNAS Core idle power optimization. The previous opus discussed SpeedShift, available on Broadwell and newer processors, which relieves the OS from micromanaging power states and achieves significantly better power savings than tools like powerd can. In this post we discuss a few minor tweaks that bring diminishing returns, but if every watt counts, they are worth exploring.
In no particular order
- PCIE: Active-State Power Management (ASPM)
- Interrupt moderation
- Hard drives: EPC (Extended Power Conditions)
- References
PCIE: Active-State Power Management (ASPM)
Run pciconf -lcv (list capabilities verbose). Look for ASPM disabled in the properties of the interesting devices, such as your network controller, HBA card, and any other consumer that is infrequently used or under low load and can benefit from power management. Save the output for easier comparison later.
If you are using a server motherboard, ASPM will likely be disabled in the BIOS by default. You will need to hunt down the ASPM enablement setting somewhere in the PCIE Advanced configuration and set it to Enable (if the other option is Auto) or Auto (if the other option is Disable) [1]:
PCIe/PCI/PnP Configuration
--------------------------------------------------------
ASPM Support [Auto]
Boot into the OS, and verify that it’s enabled there as well:
% sysctl hw.pci.enable_aspm
hw.pci.enable_aspm: 1
Re-run pciconf -lcv and check the status of the ASPM flag on the devices of interest.
For example, I witnessed the following change for my network adapter:
ASPM disabled:
igc0@pci0:6:0:0: class=0x020000 rev=0x03 hdr=0x00 vendor=0x8086 device=0x15f3 subvendor=0x8086 subdevice=0x0000
vendor = 'Intel Corporation'
device = 'Ethernet Controller I225-V'
...
cap 10[a0] = PCI-Express 2 endpoint max data 256(512) FLR NS
max read 512
link x1(x1) speed 5.0(5.0) ASPM disabled(L1)
...
ASPM enabled:
igc0@pci0:6:0:0: class=0x020000 rev=0x03 hdr=0x00 vendor=0x8086 device=0x15f3 subvendor=0x8086 subdevice=0x0000
vendor = 'Intel Corporation'
device = 'Ethernet Controller I225-V'
...
cap 10[a0] = PCI-Express 2 endpoint max data 256(512) FLR NS
max read 512
link x1(x1) speed 5.0(5.0) ASPM L1(L1)
...
Some devices optimized for high performance, such as P3600 SSDs or old HBAs, may ignore ASPM. I’ve read reports that the newer 9500 series Broadcom HBA not only consumes half the power of the older devices, but also supports ASPM. This remains to be confirmed.
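Comparing two full pciconf dumps by eye gets tedious once there are more than a handful of devices. Below is a minimal sketch of a filter that condenses the dump to one line per device. The name aspm_summary is my own invention (it is not part of FreeBSD or TrueNAS), and it assumes the link line has the same shape as in the samples above, with the state being the token right after ASPM.

```sh
# aspm_summary: hypothetical helper, not a FreeBSD or TrueNAS tool.
# Reads `pciconf -lcv` output on stdin and prints "<device>: ASPM <state>"
# for every line that carries an ASPM field.
aspm_summary() {
    awk '
        # Device header lines look like "igc0@pci0:6:0:0: class=...".
        $1 ~ /@pci/ { split($1, parts, "@"); dev = parts[1] }
        # Link lines look like "link x1(x1) speed 5.0(5.0) ASPM disabled(L1)".
        {
            for (i = 1; i < NF; i++)
                if ($i == "ASPM") print dev ": ASPM " $(i + 1)
        }
    '
}

# Usage on the NAS itself (file name is just an example):
#   pciconf -lcv | aspm_summary > aspm-before.txt
```

Saving one summary before the BIOS change and one after makes a plain diff of the two files enough to see which devices picked up L1.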
Interrupt moderation
Excessively frequent interrupts can prevent the CPU from sleeping peacefully. Run vmstat -i to see the interrupt rate.
For example, I noticed that my network adapter was generating several thousand interrupts per second, which was quite bonkers.
The Intel driver allows specifying the minimum delay between interrupts firing, thus bundling them together:
sysctl dev.igc.0.rx_int_delay=250
sysctl dev.igc.0.tx_int_delay=250
Setting the delay to a quarter of a millisecond won’t have any measurable impact on the home server performance, but will help reduce CPU wake-ups:
irq78: igc0:rxq0 37951168 1946
irq79: igc0:rxq1 9028857 463
irq80: igc0:rxq2 17285084 887
irq81: igc0:rxq3 22961332 1178
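One caveat when judging the effect: as far as I can tell, the rate column of a plain vmstat -i is an average over the whole uptime, so it barely moves right after a tweak. A rough sketch that instead diffs two saved snapshots and divides by the time between them is shown here. The function name irq_delta_rates and the file names are hypothetical, and it assumes the usual three-part layout of name, total, rate.

```sh
# irq_delta_rates: hypothetical helper.
#   $1 = earlier `vmstat -i` snapshot, $2 = later snapshot,
#   $3 = seconds elapsed between the two.
# Prints "<interrupt name> <interrupts per second over that window>".
irq_delta_rates() {
    awk -v secs="$3" '
        # The name is every field except the last two (total and rate).
        function key(   i, k) {
            k = $1
            for (i = 2; i <= NF - 2; i++) k = k " " $i
            return k
        }
        # Skip the header and anything without a numeric total column.
        NF < 3 || $(NF - 1) !~ /^[0-9]+$/ { next }
        { k = key() }
        # First file: remember the totals.
        NR == FNR { before[k] = $(NF - 1); next }
        # Second file: print the per-second difference.
        k in before { printf "%s %d\n", k, ($(NF - 1) - before[k]) / secs }
    ' "$1" "$2"
}

# Usage on the NAS itself:
#   vmstat -i > irq-1.txt; sleep 60; vmstat -i > irq-2.txt
#   irq_delta_rates irq-1.txt irq-2.txt 60
```

Running it once before and once after changing the delay sysctls gives two comparable numbers per queue.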
Hard drives: EPC (Extended Power Conditions)
TrueNAS Core provides a UI to specify Advanced Power Management and Acoustic levels, but they are silently ignored by modern-ish drives.
For example, on the fairly popular Seagate Exos X20 series drive:
% camcontrol identify da0 | grep 'power\|model'
device model ST20000NM007D-3DJ103
power management yes yes
advanced power management no no
power-up in Standby yes no
extended power conditions yes yes
APM is not supported, but EPC is, and it is enabled by default. We can query the drive state:
% camcontrol epc da0 -c status
APM: NOT Supported, NOT Enabled
EPC: Supported, Enabled
Low Power Standby NOT Supported
Set EPC Power Source NOT Supported
Current power state: Idle_a(0x81)
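With a dozen drives it is handy to reduce that report to just the state name. A small sketch, assuming the status output always contains a Current power state line shaped like the one above; epc_state is a made-up name, not a camcontrol feature.

```sh
# epc_state: hypothetical helper.
# Reads `camcontrol epc <dev> -c status` output on stdin and prints only
# the name of the current power state, e.g. "Idle_a".
epc_state() {
    sed -n 's/^[[:space:]]*Current power state:[[:space:]]*\([^(]*\)(.*$/\1/p'
}

# Usage on the NAS itself (adjust the drive list to your system):
#   for d in da0 da1; do
#       echo "$d: $(camcontrol epc "$d" -c status | epc_state)"
#   done
```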
The specific power states are vendor-specific; for example, Seagate calls them “PowerChoice Technology Profiles” [2].
From the paper:
Idle_A
- Disables most of the servo system, reduces processor and channel power consumption
- Discs rotating at full speed (7,200 RPM)
Idle_B
- Disables most of the servo system, reduces processor and channel power consumption
- Heads are unloaded to drive ramp.
- Discs rotating at full speed (7,200 RPM)
…
Depending on how active your server is, you may not want drives to spin down at all, or even to park heads: the number of head load/unload cycles is a limited resource. Drive firmware relies on pre-defined timeouts and other logic that should prevent excessive wear, but it may still be a good idea to capture the Load_Cycle_Count parameter twice, a day apart, and compare:
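# Print the SMART Load_Cycle_Count raw value for da0..da11 (brace expansion needs bash or zsh).
for d in /dev/da{0..11}; do
    echo "$d completed $(sudo smartctl -a "$d" | grep Load_Cycle_Count | sed -E 's/.*[^0-9]([0-9]+)$/\1/') load/unload cycles"
done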
If an excessive number of cycles is observed, consider disabling that state. Refer to the Server Fault discussion [3] for details.
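To make the day-apart comparison less manual, the output of the loop can be saved to a file each time and the two files diffed. A minimal sketch under that assumption follows. The function name load_cycle_delta and the file names are hypothetical, and it expects lines in exactly the "<device> completed <count> load/unload cycles" format printed by the loop.

```sh
# load_cycle_delta: hypothetical helper.
#   $1 = earlier snapshot, $2 = later snapshot, both saved from the loop above.
# Prints "<device> <cycles added between the two snapshots>".
load_cycle_delta() {
    awk '
        # Ignore lines without a numeric count (e.g. a missing drive).
        $2 != "completed" || $3 !~ /^[0-9]+$/ { next }
        # First file: remember the count per device.
        NR == FNR { before[$1] = $3; next }
        # Second file: print the growth since the first snapshot.
        $1 in before { print $1, $3 - before[$1] }
    ' "$1" "$2"
}

# Usage:
#   load_cycle_delta cycles-monday.txt cycles-tuesday.txt
```

What counts as excessive depends on the load/unload rating in the drive's datasheet, so I would compare the daily growth against that figure rather than against a fixed number.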
References
- Heed the advice in your motherboard manual: "Warning: Enabling ASPM support may cause some PCI-E devices to fail!" After enabling ASPM, watch the system for a few days for stability.
- Seagate® PowerChoice™ technology paper