Beginner

DIMM Thermal Margin Error on Intel S2600CW

Hi,

We're running VMware ESXi 6.5 on an Intel S2600CW server with two Xeon E5-2620 v3 CPUs, and we've been seeing this error. I tried to open a case online, but there weren't any good topic options to choose from. I've updated the firmware to the latest available as of 1/29/19 and the error still occurs. What can I do to fix this, or is it a motherboard fault?

10 Replies
Moderator

Hello CMart31,

Thank you for joining the community. Are you seeing this error from the BMC console? We need more information about this issue. Could you please run the following tool and gather a sysinfo log: https://downloadcenter.intel.com/download/26991/System-Information-Retrieval-Utility-SysInfo-?produc... This will tell us the exact moment when the error happened. Could you also provide the part numbers of the memory modules?

Looking forward to your updates.

Jose A.
Intel Customer Support Technician
Under Contract to Intel Corporation
Beginner

Hi Jose,

Thanks for the response! I'm gathering this info now.

Beginner

sysinfo_log.txt: https://pastebin.com/3tBgMRff

pci_log: https://pastebin.com/rYjDWRgc

 

The fan warnings appear because I took out the 2nd CPU and the 2nd fan. I also noticed the DIMM Thermal Margins aren't listed, because I've been testing with different DIMMs installed.

We have two 32GB DIMMs (currently taken out) with part number M386A4G40DM0-CPB (label: Samsung 32GB 4DRx4 PC4-2133P)

and another two 32GB DIMMs:

   Manufacturer: "Hynix"
   Serial: "91B8702B"
   Part Number: "HMA84GL7AMR4N-TF"

@Jose_Intel let me know if you need anything else.

Beginner

One final update: it looks like VMware throws a DIMM thermal warning whenever a bank is empty. I saw it for 2, 3, and 4, since I only have RAM in A1 and B1 for the first CPU.

Moderator

Hello CMart31,

Both memory modules are validated for this server board, so the modules should not be the issue here. As you may have noticed, the DIMM thermal error seems to come from the OS and not from the BMC, which is the OS-independent monitoring tool. I suspect this is a false error or an OS driver bug. Please allow us some time to analyze the attached logs, and I will get back to you as soon as I have updates.

Regards,
Jose A.
Intel Customer Support Technician
Under Contract to Intel Corporation
Beginner

Hi @Jose_Intel, after some further testing, it seems that if every bank of RAM is populated with a stick of RAM, the DIMM Thermal Margins report correctly in VMware with no warnings. I find this odd because I thought we wanted dual-channel RAM (so two slots used in one bank for each CPU, with the other banks left empty). Is this normal?

Moderator

Hello CMart31,

We checked the sysinfo log and found no errors related to the RAM thermal margin. What we did find is that the empty slots report negative values; the thermal margin is the temperature "distance" remaining before the defined temperature threshold is reached. We suspect that your OS may treat those values as an error and raise a false positive. We suggest checking with the OS developer directly to ask how it acquires the monitoring data and whether the alarms can be disabled or adjusted.

Regards,
Jose A.
Intel Customer Support Technician
Under Contract to Intel Corporation
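To illustrate the suspected false positive, here is a minimal Python sketch of a population-aware check: margin sensors that cover empty slots are skipped, and only sensors with DIMMs installed are compared against a limit. Everything in it is an assumption made for illustration: the `MarginReading` type, the `sensors_to_alarm` function, the sensor labels, the sample values, and the sign convention (negative means below the threshold). It does not describe how ESXi or the BMC actually evaluates these sensors.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MarginReading:
    sensor: str      # hypothetical sensor label
    populated: bool  # True if a DIMM is installed in the slots this sensor covers
    margin_c: float  # assumed convention: degrees C relative to the threshold,
                     # negative means the DIMM is below the threshold


def sensors_to_alarm(readings: List[MarginReading], limit_c: float = 0.0) -> List[str]:
    """Return labels of populated sensors whose margin has reached limit_c.

    Sensors covering empty slots are skipped entirely, whatever they report.
    """
    return [r.sensor for r in readings if r.populated and r.margin_c >= limit_c]


# Invented sample data: only the first sensor covers installed DIMMs.
sample = [
    MarginReading("margin-1", populated=True, margin_c=-40.0),
    MarginReading("margin-2", populated=False, margin_c=-128.0),
    MarginReading("margin-3", populated=False, margin_c=-128.0),
    MarginReading("margin-4", populated=False, margin_c=-128.0),
]

print(sensors_to_alarm(sample))
```

With this kind of filter, readings from unpopulated slots never raise an alarm, which would match the behaviour seen once every bank has a DIMM installed.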
Moderator

Hello CMart31,

Do you have any updates, questions, or comments regarding this issue? Please do not hesitate to contact us. If you consider the issue resolved, please let us know so we can mark this thread as resolved.

Regards,
Jose A.
Intel Customer Support Technician
Under Contract to Intel Corporation
Moderator

Hello CMart31,

Do you have any further details, updates, questions, or comments regarding this issue? This thread will be marked as resolved automatically in the next 48 hours if there is no activity.

Regards,
Jose A.
Intel Customer Support Technician
Under Contract to Intel Corporation
Moderator

Hello CMart31,

We will proceed to mark this thread as resolved. If you have further issues or questions, just go ahead and create a new topic.

Jose A.
Intel Customer Support Technician
Under Contract to Intel Corporation