Supermicro fan settings
Source: https://blog.pcfe.net/hugo/posts/2018-08-14-epyc-ipmi-fans/

set fan thresholds on my Super Micro H11DSi-NT

The Super Micro remote management webUI is nice and even offers KVM with HTML5. But I much prefer using vendor-agnostic IPMI directly instead of a vendor-specific tool like iDRAC, iLO, etc.

These are my notes on setting fan thresholds via IPMI for my H11DSi-NT motherboard.

why not control fans with software

you may ask yourself. Well, some people use this Perl script on GitHub. But I want fan control to stay under the control of the motherboard.

note on IPMI reset

If you see no change after writing the new thresholds, reset your baseboard management controller (BMC), either by removing all power from the machine or by issuing a BMC reset (you can do this via ipmitool or via the BMC webUI).
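
A minimal sketch of the ipmitool route, assuming the same connection options used in the rest of these notes. The `IPMI` variable and the `bmc_reset_cmd` helper are names made up for this example; the helper only prints the command so it can be inspected before it is run.

```shell
# Connection options as used throughout these notes (adjust host, user, password file).
IPMI="ipmitool -H supermicro-bmc -U ADMIN -f $HOME/.ipmi-supermicro-bmc -I lanplus"

# Hypothetical helper: print the BMC reset command instead of running it.
# $1 is the reset type, "cold" (default) or "warm".
bmc_reset_cmd() {
    kind="${1:-cold}"
    case "$kind" in
        cold|warm) echo "$IPMI mc reset $kind" ;;
        *) echo "usage: bmc_reset_cmd [cold|warm]" >&2; return 1 ;;
    esac
}

# Print the command; to actually run it: eval "$(bmc_reset_cmd cold)"
bmc_reset_cmd cold
```

Expect the BMC to be unreachable for a short while after a reset, so wait a moment before reading the sensors again.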

guessing at IPMI

Super Micro keeps all IPMI documentation here, but as of 2018-08-17 the H11 platform is not listed. The H11 platform seems to use an AST2500 from Aspeed, like the X11 platform does. To be sure, I opened a case with Super Micro asking which documentation on that page applies to my motherboard.

update 2018-08-20

Super Micro confirmed that, to set the PWM fan duty cycle, the H11 platform uses the same raw command as the X11 platform.
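
The raw command itself is not quoted in these notes. The form usually cited in community posts for X10/X11 boards is `raw 0x30 0x70 0x66 0x01 <zone> <duty>`; treat those bytes, the zone numbering and the `fan_duty_cmd` helper below as assumptions to verify against Super Micro documentation before sending anything to a BMC. The sketch only builds the argument string and runs nothing.

```shell
# Build (but do not run) the raw command commonly quoted for X10/X11 boards.
# ASSUMPTION: the bytes 0x30 0x70 0x66 0x01 <zone> <duty> come from community
# posts, not from these notes; verify them before use.
# $1 = zone (assumed: 0 = CPU zone, 1 = peripheral zone)
# $2 = duty cycle in percent (0-100)
fan_duty_cmd() {
    zone="$1"; pct="$2"
    case "$zone" in
        0|1) ;;
        *) echo "zone must be 0 or 1" >&2; return 1 ;;
    esac
    if [ "$pct" -lt 0 ] || [ "$pct" -gt 100 ]; then
        echo "duty cycle must be 0-100" >&2; return 1
    fi
    # The duty cycle is passed as a hex byte, so 50 % becomes 0x32.
    printf 'raw 0x30 0x70 0x66 0x01 0x%02x 0x%02x\n' "$zone" "$pct"
}

fan_duty_cmd 0 50   # prints: raw 0x30 0x70 0x66 0x01 0x00 0x32
```

The printed string would be appended to the usual `ipmitool -H ... -I lanplus` invocation.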

guess at temperature zones

The motherboard manual says

FANA, FANB, FAN1~FAN6: 4-pin System/CPU Fan Headers

Since most search engine hits I saw say that Super Micro boards use FAN1-n for CPU temperature control and FANA-z for peripheral temperature control, I assume FAN1 through FAN6 are the ones that respond to CPU temperature changes.

used fan mode

I set the fan mode to Optimal Speed so that the BMC controls the CPU zone with a low target speed and fixes the peripheral zone at a low speed. See Supermicro X9/X10/X11 Fan Speed Control for details.

original fan assertion thresholds

The original fan thresholds were:

[pcfe@workstation ~]$ ipmitool -H supermicro-bmc -U ADMIN -f ~/.ipmi-supermicro-bmc -I lanplus sensor get FAN6
Locating sensor record...
Sensor ID              : FAN6 (0x46)
 Entity ID             : 29.6
 Sensor Type (Threshold)  : Fan
 Sensor Reading        : 1100 (+/- 0) RPM
 Status                : ok
 Lower Non-Recoverable : 300.000
 Lower Critical        : 500.000
 Lower Non-Critical    : 700.000
 Upper Non-Critical    : 25300.000
 Upper Critical        : 25400.000
 Upper Non-Recoverable : 25500.000
 Positive Hysteresis   : 100.000
 Negative Hysteresis   : 100.000
 Assertion Events      :
 Assertions Enabled    : lcr- lnr- ucr+ unr+
 Deassertions Enabled  : lcr- lnr- ucr+ unr+

adjusting fan thresholds

I found the default Lower Non-Recoverable speed of the fans not to my liking, especially since the low-noise replacement fans ended up pulsating, which is both annoying and noisy.

Noctua fans

Since Noctua fans have served me well in the past, and with little noise, I connected six NF-S12A PWM fans and an NH-U14S TR4-SP3 CPU cooler to the PWM headers of the motherboard.

With the default thresholds, the H11DSi-NT would spin up one or more fans every few seconds and I’d see entries in the SEL.

[pcfe@workstation ~]$ ipmitool -H supermicro-bmc -U ADMIN -f ~/.ipmi-supermicro-bmc -I lanplus sel list
[...]
[...] Lower Critical going low - Assertion
[...] Lower Non-Recoverable going low - Assertion
[...] Lower Non-Recoverable going low - Deassertion
[...] Lower Critical going low - Deassertion
[...]

This means that the Noctua fans were spinning below the Lower Non-Recoverable (lnr) value. This lnr assertion would make the board, as expected, spin the fans up to 1200 – 1400 RPM. But maximum fan speed is what I try to avoid by using PWM fans.

What's more, after a few seconds the fans would spin down again, and a short while later the cycle would start anew.
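
To get a feel for how often this cycle repeats, the SEL text can be tallied per threshold. This is a minimal sketch: `count_low_assertions` is a made-up helper name, it matches on substrings only, and the sample input is the elided excerpt shown above, so real `sel list` lines may need a different pattern.

```shell
# Count "going low" assertions per threshold in SEL text read from stdin.
# Deassertion events are skipped.
count_low_assertions() {
    awk '
        /Deassert/ { next }                          # ignore deassertions
        /Lower Non-Recoverable going low/ { lnr++ }
        /Lower Critical going low/        { lcr++ }
        /Lower Non-Critical going low/    { lnc++ }
        END { printf "lnr=%d lcr=%d lnc=%d\n", lnr, lcr, lnc }
    '
}

# Sample input copied from the excerpt above; prints: lnr=1 lcr=1 lnc=0
count_low_assertions <<'EOF'
[...] Lower Critical going low - Assertion
[...] Lower Non-Recoverable going low - Assertion
[...] Lower Non-Recoverable going low - Deassertion
[...] Lower Critical going low - Deassertion
EOF
```

Against a live BMC the same helper would be fed from a pipe, for example `ipmitool ... sel list | count_low_assertions`.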

Obviously, the thresholds need adjusting. Time to check the Noctua specs and write new lower (and also upper) values to my BMC.

set lower thresholds

Since the fans I use all advertise "Stops at 0% PWM", I went with the following values:

  • Lower Non-Recoverable threshold to 0
  • Lower Critical threshold to 100
  • Lower Non-Critical threshold to 200
[pcfe@workstation ~]$ for i in 1 2 3 4 5 6 A B; do ipmitool -H supermicro-bmc \
                      -U ADMIN -f ~/.ipmi-supermicro-bmc -I lanplus \
                      sensor thresh FAN${i} lower 0 100 200;done
Locating sensor record 'FAN1'...
Setting sensor "FAN1" Lower Non-Recoverable threshold to 0.000
Setting sensor "FAN1" Lower Critical threshold to 100.000
Setting sensor "FAN1" Lower Non-Critical threshold to 200.000
Locating sensor record 'FAN2'...
Setting sensor "FAN2" Lower Non-Recoverable threshold to 0.000
Setting sensor "FAN2" Lower Critical threshold to 100.000
Setting sensor "FAN2" Lower Non-Critical threshold to 200.000
Locating sensor record 'FAN3'...
Setting sensor "FAN3" Lower Non-Recoverable threshold to 0.000
Setting sensor "FAN3" Lower Critical threshold to 100.000
Setting sensor "FAN3" Lower Non-Critical threshold to 200.000
Locating sensor record 'FAN4'...
Setting sensor "FAN4" Lower Non-Recoverable threshold to 0.000
Setting sensor "FAN4" Lower Critical threshold to 100.000
Setting sensor "FAN4" Lower Non-Critical threshold to 200.000
Locating sensor record 'FAN5'...
Setting sensor "FAN5" Lower Non-Recoverable threshold to 0.000
Setting sensor "FAN5" Lower Critical threshold to 100.000
Setting sensor "FAN5" Lower Non-Critical threshold to 200.000
Locating sensor record 'FAN6'...
Setting sensor "FAN6" Lower Non-Recoverable threshold to 0.000
Setting sensor "FAN6" Lower Critical threshold to 100.000
Setting sensor "FAN6" Lower Non-Critical threshold to 200.000
Locating sensor record 'FANA'...
Setting sensor "FANA" Lower Non-Recoverable threshold to 0.000
Setting sensor "FANA" Lower Critical threshold to 100.000
Setting sensor "FANA" Lower Non-Critical threshold to 200.000
Locating sensor record 'FANB'...
Setting sensor "FANB" Lower Non-Recoverable threshold to 0.000
Setting sensor "FANB" Lower Critical threshold to 100.000
Setting sensor "FANB" Lower Non-Critical threshold to 200.000

The lowest actual speed I have seen for these fans is 300 RPM, so I set the lower limits below that value to avoid alerts in normal operation.
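
That reasoning can be written down as a small sanity check. A minimal sketch, with `lower_thresholds_ok` being a hypothetical helper name: it succeeds only if the three lower thresholds are in ascending order and the highest of them is still below the slowest speed observed.

```shell
# Succeeds if lnr <= lcr <= lnc and lnc is below the lowest RPM observed.
# $1 = lowest observed RPM, $2 $3 $4 = lnr lcr lnc
lower_thresholds_ok() {
    [ "$2" -le "$3" ] && [ "$3" -le "$4" ] && [ "$4" -lt "$1" ]
}

# New values from these notes: passes.
if lower_thresholds_ok 300 0 100 200; then
    echo "0/100/200 leaves room below 300 RPM"
fi

# Factory defaults (300/500/700): fails, a fan idling at 300 RPM trips them.
if ! lower_thresholds_ok 300 300 500 700; then
    echo "300/500/700 is too high for a fan idling at 300 RPM"
fi
```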

determine upper thresholds

for Noctua NF-S12A PWM

Max speed of the NF-S12A PWM fan is 1200 RPM (900 RPM with the low-noise adapter) ±10%. These fans are on connectors FAN1, FAN3, FAN4, FAN6, FANA and FANB.

for Noctua NF-A15 PWM

Max speed of the NF-A15 PWM fan on the NH-U14S TR4-SP3 cooler is 1500 RPM (1200 RPM with the low-noise adapter) ±10%. It’s on connector FAN5.

no FAN2

No fan is connected to the FAN2 header.
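
One way to turn a rated maximum into the three upper thresholds is sketched below. The rule is an assumption, not something these notes state: take the rated maximum plus the 10% tolerance, round up to the next multiple of 100 RPM for Upper Non-Critical, then add 100 and 200 for Upper Critical and Upper Non-Recoverable. It happens to reproduce the values used in the next section; `upper_thresholds` is a made-up helper name.

```shell
# Suggest "unc uc unr" from a fan's rated maximum RPM.
# ASSUMPTION: max + 10 % tolerance, rounded up to the next 100 RPM,
# then +100 and +200. This rule is inferred, not taken from the notes.
upper_thresholds() {
    max="$1"
    tol=$(( max * 110 / 100 ))            # rated maximum plus 10 %
    unc=$(( (tol + 99) / 100 * 100 ))     # round up to a multiple of 100
    echo "$unc $(( unc + 100 )) $(( unc + 200 ))"
}

upper_thresholds 1200   # NF-S12A PWM, prints: 1400 1500 1600
upper_thresholds 1500   # NF-A15 PWM,  prints: 1700 1800 1900
```

The output is in the order `ipmitool sensor thresh <sensor> upper` expects: Upper Non-Critical, Upper Critical, Upper Non-Recoverable.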

set upper thresholds

I set my upper thresholds as follows:

[pcfe@workstation ~]$ ipmitool -H supermicro-bmc -U ADMIN -f ~/.ipmi-supermicro-bmc -I lanplus \
                      sensor thresh FAN5 upper 1700 1800 1900
Locating sensor record 'FAN5'...
Setting sensor "FAN5" Upper Non-Critical threshold to 1700.000
Setting sensor "FAN5" Upper Critical threshold to 1800.000
Setting sensor "FAN5" Upper Non-Recoverable threshold to 1900.000
[pcfe@workstation ~]$ for i in 1 3 4 6 A B; do ipmitool -H supermicro-bmc -U ADMIN \
                      -f ~/.ipmi-supermicro-bmc -I lanplus \
                      sensor thresh FAN${i} upper 1400 1500 1600;done
Locating sensor record 'FAN1'...
Setting sensor "FAN1" Upper Non-Critical threshold to 1400.000
Setting sensor "FAN1" Upper Critical threshold to 1500.000
Setting sensor "FAN1" Upper Non-Recoverable threshold to 1600.000
Locating sensor record 'FAN3'...
Setting sensor "FAN3" Upper Non-Critical threshold to 1400.000
Setting sensor "FAN3" Upper Critical threshold to 1500.000
Setting sensor "FAN3" Upper Non-Recoverable threshold to 1600.000
Locating sensor record 'FAN4'...
Setting sensor "FAN4" Upper Non-Critical threshold to 1400.000
Setting sensor "FAN4" Upper Critical threshold to 1500.000
Setting sensor "FAN4" Upper Non-Recoverable threshold to 1600.000
Locating sensor record 'FAN6'...
Setting sensor "FAN6" Upper Non-Critical threshold to 1400.000
Setting sensor "FAN6" Upper Critical threshold to 1500.000
Setting sensor "FAN6" Upper Non-Recoverable threshold to 1600.000
Locating sensor record 'FANA'...
Setting sensor "FANA" Upper Non-Critical threshold to 1400.000
Setting sensor "FANA" Upper Critical threshold to 1500.000
Setting sensor "FANA" Upper Non-Recoverable threshold to 1600.000
Locating sensor record 'FANB'...
Setting sensor "FANB" Upper Non-Critical threshold to 1400.000
Setting sensor "FANB" Upper Critical threshold to 1500.000
Setting sensor "FANB" Upper Non-Recoverable threshold to 1600.000

verify fan sensor values

Checking the values with

[pcfe@workstation ~]$ ipmitool -H supermicro-bmc -U ADMIN -f ~/.ipmi-supermicro-bmc -I lanplus sensor list | grep ^FAN

gives me

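name | value   | units | state | lnr   | lc      | lnc     | unc      | uc       | unr
FAN1 | 400.000 | RPM   | ok    | 0.000 | 100.000 | 200.000 | 1400.000 | 1500.000 | 1600.000
FAN2 | na      |       | na    | na    | na      | na      | na       | na       | na
FAN3 | 500.000 | RPM   | ok    | 0.000 | 100.000 | 200.000 | 1400.000 | 1500.000 | 1600.000
FAN4 | 400.000 | RPM   | ok    | 0.000 | 100.000 | 200.000 | 1400.000 | 1500.000 | 1600.000
FAN5 | 500.000 | RPM   | ok    | 0.000 | 100.000 | 200.000 | 1700.000 | 1800.000 | 1900.000
FAN6 | 500.000 | RPM   | ok    | 0.000 | 100.000 | 200.000 | 1400.000 | 1500.000 | 1600.000
FANA | 500.000 | RPM   | ok    | 0.000 | 100.000 | 200.000 | 1400.000 | 1500.000 | 1600.000
FANB | 500.000 | RPM   | ok    | 0.000 | 100.000 | 200.000 | 1400.000 | 1500.000 | 1600.000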
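
Rather than eyeballing the list, the fan lines can be filtered for anything whose state is not `ok`. A minimal sketch, assuming the pipe-separated columns that `ipmitool sensor list` prints, with the state in the fourth column; `fans_not_ok` is a made-up helper name and the sample input is a two-line stand-in for real output.

```shell
# Read `sensor list` output on stdin and print fans whose state is not "ok".
# Fields are split on "|" and surrounding blanks are trimmed.
fans_not_ok() {
    awk -F'|' '
        /^FAN/ {
            name = $1; state = $4
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", state)
            if (state != "ok") print name, state
        }
    '
}

# Stand-in input; prints: FAN2 na
fans_not_ok <<'EOF'
FAN1 | 400.000 | RPM | ok | 0.000 | 100.000 | 200.000 | 1400.000 | 1500.000 | 1600.000
FAN2 | na      |     | na | na    | na      | na      | na       | na       | na
EOF
```

With nothing connected to FAN2, an `na` line for that header is expected and not a fault.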

verify current temperature

Current temperatures were obtained with:

[pcfe@workstation ~]$ ipmitool -H supermicro-bmc -U ADMIN -f ~/.ipmi-supermicro-bmc -I lanplus sensor list | grep -i temp | grep -v na

the result was

name            | value  | units     | state | lnr   | lc    | lnc    | unc    | uc      | unr
CPU1 Temp       | 25.000 | degrees C | ok    | 5.000 | 5.000 | 10.000 | 95.000 | 100.000 | 100.000
System Temp     | 29.000 | degrees C | ok    | 5.000 | 5.000 | 10.000 | 80.000 | 85.000  | 90.000
Peripheral Temp | 48.000 | degrees C | ok    | 5.000 | 5.000 | 10.000 | 80.000 | 85.000  | 90.000
MB_10G Temp     | 68.000 | degrees C | ok    | 5.000 | 5.000 | 10.000 | 95.000 | 100.000 | 105.000
VRMCpu1 Temp    | 37.000 | degrees C | ok    | 5.000 | 5.000 | 10.000 | 95.000 | 100.000 | 105.000
VRMSoc1 Temp    | 40.000 | degrees C | ok    | 5.000 | 5.000 | 10.000 | 95.000 | 100.000 | 105.000
VRMP1ABCD Temp  | 45.000 | degrees C | ok    | 5.000 | 5.000 | 10.000 | 95.000 | 100.000 | 105.000
VRMP1EFGH Temp  | 42.000 | degrees C | ok    | 5.000 | 5.000 | 10.000 | 95.000 | 100.000 | 105.000
P1-DIMMB1 Temp  | 37.000 | degrees C | ok    | 5.000 | 5.000 | 10.000 | 80.000 | 85.000  | 90.000
P1-DIMMD1 Temp  | 39.000 | degrees C | ok    | 5.000 | 5.000 | 10.000 | 80.000 | 85.000  | 90.000
P1-DIMMF1 Temp  | 34.000 | degrees C | ok    | 5.000 | 5.000 | 10.000 | 80.000 | 85.000  | 90.000
P1-DIMMH1 Temp  | 33.000 | degrees C | ok    | 5.000 | 5.000 | 10.000 | 80.000 | 85.000  | 90.000

Everything except MB_10G Temp is nice and cool when the box is idle. That temperature, while higher, is still well below worrisome levels (Upper Non-Critical, unc).
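
The margin to Upper Non-Critical can also be computed per sensor. A minimal sketch, assuming the pipe-separated columns that `ipmitool sensor list` prints, with the reading in the second column and unc in the eighth; `temp_headroom` is a made-up helper name and the sample input repeats two of the readings above.

```shell
# Read `sensor list` output on stdin; for each temperature sensor that has a
# reading, print the margin (unc - value) in degrees C.
temp_headroom() {
    awk -F'|' '
        tolower($1) ~ /temp/ && $2 !~ /na/ {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            printf "%s: %d\n", name, $8 - $2
        }
    '
}

# Prints:
#   CPU1 Temp: 70
#   MB_10G Temp: 27
temp_headroom <<'EOF'
CPU1 Temp   | 25.000 | degrees C | ok | 5.000 | 5.000 | 10.000 | 95.000 | 100.000 | 100.000
MB_10G Temp | 68.000 | degrees C | ok | 5.000 | 5.000 | 10.000 | 95.000 | 100.000 | 105.000
EOF
```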
