Server Products
Data Center Products including boards, integrated systems, Intel® Xeon® Processors, and RAID Storage

Memory thermal (DIMM Thrm Mrgn) warnings/errors on VMware

RSmir
Beginner
2,031 Views

Good afternoon!

I have a problem similar to one that was already discussed here. I did not find an answer there, so I am asking again.

Remote Management Module key: Installed
Device (BMC) Available: Yes
BMC FW Build Time: 2018-06-07 11:48:53
BIOS ID: SE5C610.86B.01.01.0027.071020182329
BMC FW Rev: 1.53.11210
Boot FW Rev: 1.07
SDR Package Version: SDR Package 1.17
Mgmt Engine (ME) FW Rev: 03.01.03.050
Baseboard Serial Number: BQTP61800321

Sensors in the BMC web console:

VRD Hot All deasserted OK 0x0000
DIMM Thrm Mrgn 1 Normal OK -48 degrees C
DIMM Thrm Mrgn 2 Normal Unknown Not Available
DIMM Thrm Mrgn 3 Normal OK -51 degrees C
DIMM Thrm Mrgn 4 Normal Unknown Not Available
Agg Therm Mgn 1 Normal OK -7 degrees C

Constantly, roughly every 5 minutes, vSphere (VMware) shows a message about the DIMM temperature.

Please help me get rid of it.
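The sensor listing above can also be checked mechanically. What follows is a minimal Python sketch, assuming each line has the layout `<name> <status> <health> <value>` exactly as pasted from the BMC web console. The names `SensorReading`, `parse_line`, and `LINE_RE` are made up for this illustration and are not part of any Intel or VMware tool. It only flags the margin sensors that report no reading; it does not explain why they report none.

```python
import re
from typing import NamedTuple, Optional


class SensorReading(NamedTuple):
    name: str
    status: str
    health: str
    margin_c: Optional[int]  # None when the console shows "Not Available"


# Assumed layout, inferred from the pasted listing (not from Intel documentation):
#   "<name> <status> <health> <value>"
LINE_RE = re.compile(
    r"^(?P<name>(?:DIMM Thrm Mrgn|Agg Therm Mgn) \d+)\s+"
    r"(?P<status>\w+)\s+(?P<health>\w+)\s+(?P<value>.+)$"
)


def parse_line(line: str) -> Optional[SensorReading]:
    """Parse one thermal-margin line; return None for any other sensor."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    num = re.match(r"(-?\d+) degrees C$", m.group("value"))
    margin = int(num.group(1)) if num else None
    return SensorReading(m.group("name"), m.group("status"), m.group("health"), margin)


SAMPLE = """\
VRD Hot All deasserted OK 0x0000
DIMM Thrm Mrgn 1 Normal OK -48 degrees C
DIMM Thrm Mrgn 2 Normal Unknown Not Available
DIMM Thrm Mrgn 3 Normal OK -51 degrees C
DIMM Thrm Mrgn 4 Normal Unknown Not Available
Agg Therm Mgn 1 Normal OK -7 degrees C
"""

readings = [r for r in map(parse_line, SAMPLE.splitlines()) if r is not None]
unavailable = [r.name for r in readings if r.margin_c is None]
print(unavailable)  # prints ['DIMM Thrm Mrgn 2', 'DIMM Thrm Mrgn 4']
```

On this listing the sketch singles out DIMM Thrm Mrgn 2 and 4, the same two sensors that show "Unknown / Not Available" in the console.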

27 Replies
idata
Community Manager
304 Views

Hello rs-link,

I hope you are doing well.

To help us assist you better, could you confirm the motherboard model?

Regards,
Charlie_Intel

 

RSmir
Beginner
304 Views

Processed Tags

FRU & SDR Update Package for Intel(R) Server Board S2600TPx (S2600TP_117)
Copyright (c) 2017 Intel Corporation
Intel(R) Server Board S2600TPR detected
Auto-detecting chassis model and attached hardware.
This may take up to 1 minute to complete.
NODE Presence is present
Detected Hardware
********************************************************************************
Intel(R) Server Chassis H2000G Product Family
Intel(R) Server Board S2600TPR
Intel(R) Xeon(R) Processor E5-2600 in socket 1
Intel(R) Xeon(R) Processor E5-2600 in socket 2
Baseboard FRU Device
1600 Watt Power Supply Module 1
Power Supply Module 1 FRU Device
1600 Watt Power Supply Module 2
Power Supply Module 2 FRU Device
Redundant Power Supply Configuration
HSBP is a 2U 12 slot 3.5 inch HDD Backplane
IO Module FRU Storage Device
IO Module Temperature Device

idata
Community Manager
304 Views

Hello rs-link,

Are you able to restart the server?

Could you perform a CMOS clear and a system event log clear, just for testing purposes?

Regards,
Charlie_Intel

 

 

RSmir
Beginner
304 Views

Good afternoon!

Yes, of course I will. I'll be in the server room tomorrow and will write back then.

Thank you.

idata
Community Manager
304 Views

Hello rs-link,

Thank you, let us know how it goes.

Regards,
Charlie_Intel
RSmir
Beginner
304 Views

I followed your instructions, and a day has passed with everything running normally. vSphere no longer shows warnings about the memory temperature. Tomorrow I'll put the compute modules back into production, and we'll see.

Many thanks for the advice.

Best wishes.
RSmir
Beginner
304 Views

Good afternoon!

Unfortunately, the messages have come back. What confuses me is this:

DIMM Thrm Mrgn 2 and DIMM Thrm Mrgn 4

RSmir
Beginner
304 Views

To complete the picture, here is the message from vSphere:

idata
Community Manager
304 Views

Hello rs-link,

Is there a way you can share the sysinfo logs from the server with us so that we can investigate this in depth?

From the last image, what we can see is a failing sensor, but we would like to be sure about it.

Regards,
Charlie_Intel

 

 

RSmir
Beginner
304 Views

Good evening!

I am attaching the report generated for developers, as an archive file with logs. I hope that helps.

Regards
idata
Community Manager
304 Views

Hello rs-link,

Thanks for the file. Unfortunately, it is encrypted and we were not able to open it.

Also, the log we need is the sysinfo log. You can collect it by following this link: https://www.intel.com/content/www/us/en/support/articles/000023940/server-products/server-boards.htm...

Please let us know if you need further assistance to download the logs.

Regards,
Charlie_Intel
RSmir
Beginner
304 Views

The irony is that it was not I who encrypted that file, but your hardware. Anyway, attached are the logs collected by sysinfo. I really hope they help. I look forward to your reply.

I apologize in advance for my English.

Regards
idata
Community Manager
304 Views

Hello rs-link,

Do not worry.

Thank you for the file. Are you able to swap memory modules 1 and 2, clear the CMOS again, and then send us the new logs?

Regards,
Charlie_Intel
RSmir
Beginner
304 Views

Good afternoon!

The memory modules were swapped as follows:

DIMM A1 <---> F1
DIMM B1 <---> E1

That is, crosswise.

The CMOS was reset.

Logs are attached.

Thank you in advance.
idata
Community Manager
304 Views

Hello rs-link,

Thank you for the information. Let us look into it, and we will get back to you.

Regards,
Charlie V.
RSmir
Beginner
304 Views

Good afternoon!

How is it going with my problem? You have not forgotten about me?
idata
Community Manager
304 Views

 

Hello rs-link,

Please do not worry. We are just waiting for a second-level response and will get back to you as soon as possible. Thank you.

Regards,
Charlie_Intel

 

idata
Community Manager
304 Views

Hello rs-link,

Thank you for your patience. These values are not memory temperatures but actual margins or thresholds.

What I did notice in the log is a processor IERR, which is an internal processor error. Two minutes after it, I see warning events for DIMM Thrm Mrgn failures.

These could be related to the processor, since the processors include the memory controller.
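The correlation described above (DIMM Thrm Mrgn warnings appearing shortly after a processor IERR) can be checked against an exported event log. Below is a rough Python sketch under stated assumptions: the event records, their timestamps, and their descriptions are hand-written placeholders, not entries from the poster's logs, and `events_following_ierr` is a hypothetical helper name. A real export would first need to be parsed into `(timestamp, sensor, description)` tuples.

```python
from datetime import datetime, timedelta

# Hypothetical, hand-written records for illustration only:
# (timestamp, sensor name, event description).
EVENTS = [
    (datetime(2018, 9, 2, 10, 0, 0), "Processor", "IERR asserted"),
    (datetime(2018, 9, 2, 10, 2, 0), "DIMM Thrm Mrgn 2", "warning event"),
    (datetime(2018, 9, 2, 10, 2, 5), "DIMM Thrm Mrgn 4", "warning event"),
    (datetime(2018, 9, 2, 15, 30, 0), "DIMM Thrm Mrgn 2", "warning event"),
]


def events_following_ierr(events, window=timedelta(minutes=5)):
    """Return (timestamp, sensor) for DIMM Thrm Mrgn events that occur
    within `window` after any processor IERR event."""
    ierr_times = [ts for ts, _, desc in events if "IERR" in desc]
    return [
        (ts, sensor)
        for ts, sensor, _ in events
        if sensor.startswith("DIMM Thrm Mrgn")
        and any(timedelta(0) <= ts - t0 <= window for t0 in ierr_times)
    ]


followers = events_following_ierr(EVENTS)
for ts, sensor in followers:
    print(ts.isoformat(), sensor)
```

With these placeholder records, only the two warnings logged within five minutes of the IERR are reported; the later one is left out. The five-minute window is an arbitrary choice and would need tuning against the real log.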

 

 

Now, the entries are from 09/02, and I don't see them happening anymore.

Could you confirm whether the issues are still showing?

I would like you to check the newer logs. Is that possible?

Regards,
Charlie_Intel
idata
Community Manager
304 Views

Hello rs-link,

Do you need any further assistance?

Regards,
Charlie_Intel
idata
Community Manager
167 Views

Hello rs-link,

Any update?

Regards,
Charlie_Intel