I have an S2600CP motherboard with two Xeon E5-2660 processors on it. It's in a non-Intel chassis (a Chenbro 1U case).
For some reason, the thermal margin sensors on the motherboard are not being recognized. This leads to the BMC throwing a warning and the fans being stuck at 100% all the time.
I've updated the BIOS and BMC to the latest version, as well as updated the FRU, and reprogrammed/validated the SDR configuration so the fans are set up properly, and they're correctly detected in the IPMI sensor readings dialog.
What can cause the thermal margin sensors to stop registering? All the thermal margin sensors are showing as "Unknown" with a reading of "Not Available".
This is doubly confusing, as Linux can see the temperature sensors in the CPUs without issue, using the "coretemp-isa-0000" and "coretemp-isa-0001" interfaces.
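For anyone who wants to cross-check the CPU temperatures from the OS side without going through the BMC, here is a rough Python sketch that reads the coretemp values straight from sysfs. It assumes the usual Linux hwmon layout (a `name` file plus `temp*_input` files in millidegrees under `/sys/class/hwmon/hwmon*`); the function name `read_coretemp` is just something I made up for illustration, and the layout may differ on older kernels.

```python
from pathlib import Path


def read_coretemp(hwmon_root="/sys/class/hwmon"):
    """Return {hwmon_device: {label: temp_in_celsius}} for coretemp devices.

    Assumes the standard hwmon sysfs layout; devices whose `name` file is
    not "coretemp" are skipped.
    """
    readings = {}
    for dev in sorted(Path(hwmon_root).glob("hwmon*")):
        name_file = dev / "name"
        if not name_file.is_file() or name_file.read_text().strip() != "coretemp":
            continue
        temps = {}
        for inp in sorted(dev.glob("temp*_input")):
            # temp1_input pairs with temp1_label when a label is exposed.
            label_file = dev / inp.name.replace("_input", "_label")
            label = label_file.read_text().strip() if label_file.is_file() else inp.name
            # sysfs reports millidegrees Celsius.
            temps[label] = int(inp.read_text().strip()) / 1000.0
        readings[dev.name] = temps
    return readings


if __name__ == "__main__":
    for chip, temps in read_coretemp().items():
        for label, celsius in temps.items():
            print(f"{chip} {label}: {celsius:.1f} C")
```

On a dual-socket board you would expect two coretemp devices, one per CPU package, which should correspond to the two interfaces mentioned above.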
I wound up editing the sensor data record file to remove the thermal margin sensors, which seems to prevent the BMC from freaking out and setting the fans to 100%. I also had to modify the "Sensor Unavailable Control Value" PWM settings to reduce the fall-back fan speed. I also changed the sensor used to drive the fan speeds (and the sensor ramp end-points), so the system now uses the "BB P2 VR Temp" value to modulate the fans. This isn't ideal, but it seems to track the actual CPU temperatures FAR more closely than the "BB EDGE Temp" sensor, which is the sensor normally used for fan speed control (it's basically intake air temp, and therefore completely useless).
OTOH, I can now run through the SDR update procedure without even having to look at the command prompt, so that's something (albeit of dubious value).
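To illustrate what the ramp end-points and the "Sensor Unavailable Control Value" are doing conceptually, here is a small Python sketch of a linear temperature-to-duty-cycle ramp with a fall-back value. This is only my mental model, not the BMC's actual algorithm (which may well be stepped or table-driven), and the function name, parameter names, and every number in it are hypothetical rather than taken from the SDR.

```python
def fan_duty(temp_c, ramp_start_c, ramp_end_c, min_duty, max_duty, fallback_duty):
    """Map a sensor temperature to a fan PWM duty cycle (percent).

    temp_c is None when the sensor is unavailable, in which case the
    fall-back duty is used. All values here are illustrative assumptions.
    """
    if temp_c is None:
        # Sensor not readable: use the fixed fall-back value.
        return fallback_duty
    if temp_c <= ramp_start_c:
        return min_duty
    if temp_c >= ramp_end_c:
        return max_duty
    # Linear interpolation between the two ramp end-points.
    fraction = (temp_c - ramp_start_c) / (ramp_end_c - ramp_start_c)
    return min_duty + fraction * (max_duty - min_duty)


if __name__ == "__main__":
    # Hypothetical ramp: 20% at or below 40 C, 100% at or above 80 C.
    for reading in (None, 30, 60, 90):
        print(reading, fan_duty(reading, 40, 80, 20, 100, 50))
```

In this model, an unreadable sensor pins the fans at the fall-back value, which would explain why lowering that value mattered once the thermal margin sensors stopped reporting.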
Doing some testing, even with every core maximally loaded, my top core temperature is 71°C, which is acceptable. I'd still like to understand why the motherboard is not seeing or properly detecting the thermal margin sensors. I'd guess that the SDR for the E5-2660 v1 CPUs I'm using is slightly different than the one for the v2 and later CPUs, though all the documentation specifies that the motherboard supports "E5-2600 series" processors without any rev number.
Intel: Can you please look into this?
Could you post your .sdr file? I also have a Chenbro 1U case and experienced the same situation with the fans stuck at 100%. I edited mine as well down to about 25%, but am still tweaking it because I'm not sure that it has the most efficient settings. My fans run around 3000-3500 rpm and the system temps are about 40-60°C, but I only have 2 VMs running currently.
Ok, here's the SDR file.
It currently only adjusts the CPU fans based on the VR temp. This is because I've rehoused my S2600CP in an RPC-432 case, and I'm using chassis fans that are large and slow enough that their noise doesn't bother me.
It should be possible to fiddle with the fan domains to put all of them in the CPU domain, but I'm not exactly sure what you'd need to change to do that.