JMcCo12
Beginner
1,015 Views

Random fan spin-up on R1304SPOSHBNR

I recently purchased three S1200SPOR servers, and all three have the same issue: at random intervals, the fans spin up to maximum for a second or two before settling back down. The intervals range from a few minutes to an hour and do not correlate with server load.

I have upgraded the firmware to S1200SP.86B.03.01.0026.092720170729, checked for thermal trips, and examined the SEL, but none of that helped. However, while watching the IPMI sensors, I did notice one correlation: the fan spin-ups seem to coincide with the sensors being unable to read "P1 Therm Margin". My conclusion was that the system is occasionally unable to read the power supply temperature, so it spins up the fans "just in case"; on the next reading it gets the temperature correctly and returns the fans to normal.
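One way to confirm this correlation without watching the console is to log each poll and flag the ones where the sensor has no reading. Below is a minimal sketch in Python, assuming the pipe-delimited layout printed by `ipmitool sensor`, where an unreadable value shows up as `na`. The sample text is illustrative only (not captured from these servers), and `unreadable_sensors` is a hypothetical helper name.

```python
def unreadable_sensors(sensor_text, names=("P1 Therm Margin",)):
    """Return the watched sensor names whose reading is missing ('na')."""
    missing = []
    for line in sensor_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, reading = fields[0], fields[1]
        if name in names and reading.lower() == "na":
            missing.append(name)
    return missing


# Illustrative sample only; real output has more columns and more sensors.
SAMPLE = """\
BB Inlet Temp    | 24.000     | degrees C  | ok
P1 Therm Margin  | na         | degrees C  | na
System Fan 1     | 5880.000   | RPM        | ok
"""

print(unreadable_sensors(SAMPLE))
```

In practice the text would come from running `ipmitool sensor` on a timer, and each flagged poll would be written out with a timestamp so it can be compared against the times the fans were heard spinning up.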

I checked the sensor wire and it is secure. (Fiddling with it doesn't trigger the fan response either.) I might conclude a hardware fault, but this is happening on all three systems, each of which was purchased at a different time over the last few months.

I also found this thread: but the information isn't any help since the SEL doesn't report any errors, the problem is intermittent, I don't have a RAID controller in them, and the standard checks didn't reveal anything promising.

If anyone has any ideas on where to go next, I would appreciate the input.

Thanks!

System info dump (all three set up the same):

OS: CentOS 7

# dmidecode
Getting SMBIOS data from sysfs.
SMBIOS 2.7 present.
64 structures occupying 3770 bytes.
Table at 0x80630000.

Handle 0x0006, DMI type 0, 24 bytes
BIOS Information
    Vendor: Intel Corporation
    Version: S1200SP.86B.03.01.0026.092720170729
    Release Date: 09/27/2017
    Address: 0xF0000
    Runtime Size: 64 kB
    ROM Size: 16384 kB
    Characteristics:
        PCI is supported
        PNP is supported
        BIOS is upgradeable
        BIOS shadowing is allowed
        Boot from CD is supported
        Selectable boot is supported
        EDD is supported
        5.25"/1.2 MB floppy services are supported (int 13h)
        3.5"/720 kB floppy services are supported (int 13h)
        3.5"/2.88 MB floppy services are supported (int 13h)
        Print screen service is supported (int 5h)
        8042 keyboard services are supported (int 9h)
        Serial services are supported (int 14h)
        Printer services are supported (int 17h)
        CGA/mono video services are supported (int 10h)
        ACPI is supported
        USB legacy is supported
        LS-120 boot is supported
        ATAPI Zip drive boot is supported
        BIOS boot specification is supported
        Function key-initiated network boot is supported
        Targeted content distribution is supported
        UEFI is supported
    BIOS Revision: 0.0
    Firmware Revision: 0.0

Handle 0x0007, DMI type 1, 27 bytes
System Information
    Manufacturer: Intel Corporation
    Product Name: S1200SP
    Version: R1304SPOSHBNR
    Serial Number: QSCD74200212
    UUID: 71C4BAF6-8AB0-E711-AB21-A4BF0128AAE2
    Wake-up Type: Power Switch
    SKU Number: SKU Number
    Family: Family

Handle 0x0008, DMI type 2, 17 bytes
Base Board Information
    Manufacturer: Intel Corporation
    Product Name: S1200SP
    Version: H57534-260
    Serial Number: QSSA74100091
    Asset Tag: Base Board Asset Tag
    Features:
        Board is a hosting board
        Board is replaceable
    Location In Chassis: Part Component
    Chassis Handle: 0x0000
    Type: Motherboard
    Contained Object Handles: 0

Handle 0x001D, DMI type 4, 48 bytes
Processor Information
    Socket Designation: CPU 1
    Type: Central Processor
    Family: Xeon
    Manufacturer: Intel(R) Corporation
    ID: E9 06 09 00 FF FB EB BF
    Signature: Type 0, Family 6, Model 158, Stepping 9
    Flags:
        FPU (Floating-point unit on-chip)
        VME (Virtual mode extension)
        DE (Debugging extension)
        PSE (Page size extension)
        TSC (Time stamp counter)
        MSR (Model specific registers)
        PAE (Physical address extension)
        MCE (Machine check exception)
        CX8 (CMPXCHG8 instruction supported)
        APIC (On-chip APIC hardware supported)
        SEP (Fast system call)
        MTRR (Memory type range registers)
        PGE (Page global enable)
        MCA (Machine check architecture)
        CMOV (Conditional move instruction supported)
        PAT (Page attribute table)
        PSE-36 (36-bit page size extension)
        CLFSH (CLFLUSH instruction supported)
        DS (Debug store)
        ACPI (ACPI supported)
        MMX (MMX technology supported)
        FXSR (FXSAVE and FXSTOR instructions supported)
        SSE (Streaming SIMD extensions)
        SSE2 (Streaming SIMD extensions 2)
        SS (Self-snoop)
        HTT (Multi-threading)
        TM (Thermal monitor supported)
        PBE (Pending break enabled)
    Version: Intel(R) Xeon(R) CPU E3-1220 v6 @ 3.00GHz
    Voltage: 0.9 V
    External Clock: 100 MHz
    Max Speed: 4200 MHz
    Current Speed: 3000 MHz
    Status: Populated, Enabled
    Upgrade: Other
    L1 Cache Handle: 0x001A
    L2 Cache Handle: 0x001B
    L3 Cache Handle: 0x001C
    Serial Number: To Be Filled By O.E.M.
    Asset Tag: To Be Filled By O.E.M.
    Part Number: To Be Filled By O.E.M.
    Core Count: 4
    Core Enabled: 4
    Thread Count: 4
    Characteristics:
        64-bit capable
        Multi-Core
        Execute Protection
        Enhanced Virtualization
        Power/Performance Control

Handle 0x000E, DMI type 17, 40 bytes
Memory Device
    Array Handle: 0x001E
    Error Information Handle: Not Provid...

17 Replies
idata
Community Manager
44 Views

Hello John,

Thank you for contacting Intel Technical Support.

To help us understand the issue, please download the System Information Retrieval Utility from https://downloadcenter.intel.com/downloads/eula/26991/System-Information-Retrieval-Utility-SysInfo-?... then uncompress the .zip file onto a USB key and boot from it. After that, run the file sysinfo.efi. This utility will collect log files and system information that we expect will help us figure out what the problem with the server fans is.

Best regards,

Franklin S.

Intel Technical Support.
JMcCo12
Beginner
44 Views

Hi Franklin, thank you for replying. Here are the logs you requested.

For context, the critical events listed in the SEL on the 13th were from me troubleshooting the issue. There have been numerous fan spin-up events since then.

JMcCo12
Beginner
44 Views

Because a video is worth a thousand pictures, here is a (bad) recording of the SDR sensors during a fan event. Watch the "P1 Therm Margin" sensor at the bottom and listen for the fans to spin up.
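The same check can be done on logged data rather than by eye. Here is a rough sketch in Python, assuming each poll has been recorded as a tuple of (timestamp in seconds, "P1 Therm Margin" reading or None when unreadable, fan RPM). The RPM threshold, the time window, and the sample values are made-up illustrations, not measurements from these systems.

```python
def spikes_explained_by_gaps(samples, rpm_threshold=10000, window=5.0):
    """Split fan spikes into those near an unreadable margin reading and the rest.

    samples: iterable of (timestamp_seconds, margin_or_None, fan_rpm).
    """
    samples = list(samples)
    gap_times = [t for t, margin, _ in samples if margin is None]
    explained, unexplained = [], []
    for t, _, rpm in samples:
        if rpm < rpm_threshold:
            continue  # not a spin-up
        near_gap = any(abs(t - g) <= window for g in gap_times)
        (explained if near_gap else unexplained).append(t)
    return explained, unexplained


# Hypothetical log: one spike right after a missed reading, one without.
LOG = [
    (0.0, -45.0, 5900),
    (2.0, None, 5900),
    (4.0, -44.0, 14000),
    (6.0, -45.0, 6000),
    (60.0, -45.0, 14200),
]

explained, unexplained = spikes_explained_by_gaps(LOG)
print("near a missed reading:", explained)
print("no missed reading nearby:", unexplained)
```

If every spike lands in the first list over a few hours of logging, that supports the idea that the spin-ups follow failed sensor reads; spikes in the second list would point to some other trigger.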

idata
Community Manager
44 Views

Hello John,

Apparently it is expected for the fans to go to full speed when the temperature levels are unreadable. Please check this article: https://www.intel.com/content/www/us/en/support/articles/000021781/server-products/server-boards.htm... It indicates that the problem could be associated with the use of a non-validated memory module. The same article also links to "Review how to troubleshoot a loud fan scenario", which could be of help, but there is still a high chance that using a validated memory module will help.

Best regards,

Franklin S.

Intel Technical Support.

idata
Community Manager
44 Views

Hello John,

I just wanted to check if you had any news on troubleshooting the fan spin up issue that you were experiencing. Thank you.

Best regards,

Franklin S.

Intel Technical Support.

JMcCo12
Beginner
44 Views

Hi Franklin,

I was out for the Christmas holiday, so I haven't had a chance to work on this in the last couple of weeks. Thanks for following up, though.

I had found those links you provided, but I had discounted them since the symptoms weren't quite the same; namely, the reported SEL entries weren't there. I will see what I can do to get some validated memory, but that may or may not be an option for me. (Budget issues, not distribution issues.)

John.

idata
Community Manager
44 Views

John,

I hope you had a great holiday season. If you could run a test with validated memory, that would be great. I will keep investigating to see whether I can find more information on the specific error message you are getting, but all the sources we checked here point to the same conclusion about validated memory.

Best regards,

Franklin S.

Intel Technical Support.

idata
Community Manager
44 Views

Hello John,

I checked here and unfortunately there is not much that we can do until the system is tested with validated memory. Thank you.

Best regards,

Franklin S.

Intel Technical Support.

idata
Community Manager
44 Views

Hello John,

Since the best advice we can offer you in this particular case is the use of validated memory, we will proceed to close this case. If you have any additional concerns or questions in the future, please contact us.

Best regards,

Franklin S.

Intel Technical Support.

JBeat
Beginner
44 Views

Validated memory? Really? That's the best you got?

idata
Community Manager
44 Views

Hello,

At this link https://serverconfigurator.intel.com/exalt/RequestManager?ServletNumber=1&dynamicUser=Y&localinfo=0&... you can find a tool that helps you configure compatible hardware for your server. First select the S1200SPOR board and then the memory category to find the list of validated memory modules.

Here is a spreadsheet with the same information, in case the website is confusing.

Best regards,

Franklin S.

Intel Technical Support.

JBeat
Beginner
44 Views

I already have compatible memory...

Now what?

idata
Community Manager
44 Views

Hello jcbeaty,

For further confirmation, would you please show the output of the command "sudo dmidecode --type 17"?

Best regards,

Franklin S.

Intel Technical Support.
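
For anyone comparing installed DIMMs against the validated memory list, the type 17 output can be reduced to the fields that matter. This is a minimal sketch in Python, assuming the usual "Key: Value" layout of `dmidecode --type 17`. The manufacturer, part number, and locator values in the sample are placeholders, and `parse_memory_devices` is a hypothetical helper name.

```python
def parse_memory_devices(dmi_text):
    """Parse `dmidecode --type 17` text into one dict per Memory Device."""
    devices, current = [], None
    for raw in dmi_text.splitlines():
        line = raw.strip()
        if line == "Memory Device":
            current = {}
            devices.append(current)
        elif line.startswith("Handle "):
            current = None  # a new structure header ends the previous record
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return devices


# Placeholder values only; not taken from the servers in this thread.
SAMPLE = """\
Handle 0x000E, DMI type 17, 40 bytes
Memory Device
    Size: 8192 MB
    Locator: DIMM_A1
    Manufacturer: ExampleVendor
    Part Number: EXAMPLE-PART-0001

Handle 0x0010, DMI type 17, 40 bytes
Memory Device
    Size: No Module Installed
    Locator: DIMM_A2
"""

for dev in parse_memory_devices(SAMPLE):
    if dev.get("Size") != "No Module Installed":
        print(dev["Locator"], dev.get("Manufacturer"), dev.get("Part Number"))
```

The printed part numbers are what would be looked up in the configurator tool or the spreadsheet mentioned earlier in the thread.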

 

JBeat
Beginner
44 Views

Are you guys going to try and help us out here or just continuously make us jump through hoops until we give up?

idata
Community Manager
44 Views

Hi jcbeaty,

We need more information to help diagnose what could be happening to your system. Please download the System Information Retrieval Utility from https://downloadcenter.intel.com/downloads/eula/26991/System-Information-Retrieval-Utility-SysInfo-%... then uncompress the .zip file onto a USB key and boot from it. After that, run the file sysinfo.efi. This utility will collect log files and system information, which we need you to share with us afterwards.

Thank you,

Franklin S.

Intel Technical Support.

idata
Community Manager
44 Views

Hi jcbeaty,

This is a quick follow up to see if you were able to gather your server logs so that we can figure out if there are any additional configurations that could be adjusted in order to solve your problem.

Thank you,

Franklin S.

Intel Technical Support.
idata
Community Manager
44 Views

Hi jcbeaty,

Please do not hesitate to contact us if you need further assistance.

Thank you,

Franklin S.

Intel Technical Support.