Hi all,
Maybe someone has an idea regarding a "VRD Hot" assertion. My system is an S2600STBR with a Xeon Silver 4210 and 4x Kingston KSM26RD8/16HDI. The system runs well and I see no issues, except that the "System Status LED" is blinking amber (about once per second). It is caused by the VRD Hot sensor: the SEL says "CPU2, DIMM Channel 1/2", but CPU2 and its DIMM slots are not even populated!
One can touch all heatsinks on the board; they are not even warm (and definitely not hot).
Anyhow, I installed 2 additional chassis fans in the server case, but that did not change anything.
I already updated the system from the initial firmware to the latest BIOS/BMC package (02.01.0014), but no change.
Well, I thought populating CPU2 and DIMM Channel 1/2 might change things, but it doesn't. CPU2 and the DIMMs on Channel 1/2 are recognized and run without issues.
I tried resetting CMOS (by jumper on the S2600STBR) and also the BMC "Reset to factory" option; no luck.
It is the same when resetting the BMC (via ipmitool mc reset cold).
Any idea how to remedy that? If you need more information, just ask.
Just for completeness: the power supply unit can deliver 750 W; the CPU1, CPU2, and mainboard power connectors are attached, but the additional 4-pin "12V aux power" is not. I hope this is not the source of my trouble.
Thanks in advance, Ulrich
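For anyone following along, the checks mentioned in this post (reading the SEL, listing sensor state, cold-resetting the BMC) can be scripted. Below is a minimal Python sketch, assuming ipmitool is installed and on the PATH. The helper names `ipmitool_args`, `CHECKS`, and `run_check` are made up for illustration and are not part of any Intel tool. `sel elist`, `sdr elist`, and `mc reset cold` are standard ipmitool subcommands; the `-I lanplus -H ... -U ... -E` options select remote access, with `-E` reading the password from the `IPMI_PASSWORD` environment variable.

```python
import shutil
import subprocess


def ipmitool_args(*subcommand, host=None, user=None):
    """Build an ipmitool argument list.

    Without a host the local BMC interface is used; with a host the
    command goes over the network (lanplus). Hypothetical helper.
    """
    args = ["ipmitool"]
    if host is not None:
        args += ["-I", "lanplus", "-H", host]
        if user is not None:
            # -E: take the password from the IPMI_PASSWORD environment variable
            args += ["-U", user, "-E"]
    return args + list(subcommand)


# The checks discussed in the post, keyed by an illustrative label.
CHECKS = {
    "event_log": ("sel", "elist"),             # System Event Log, decoded
    "sensors": ("sdr", "elist"),               # sensor readings and states
    "bmc_cold_reset": ("mc", "reset", "cold"), # restarts the BMC itself
}


def run_check(name, **remote):
    """Run one of the CHECKS and return its text output."""
    if shutil.which("ipmitool") is None:
        raise RuntimeError("ipmitool not found on PATH")
    result = subprocess.run(
        ipmitool_args(*CHECKS[name], **remote),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For example, `run_check("event_log")` would return the decoded SEL from the local BMC, which can then be searched for the VRD Hot entries. Note that `bmc_cold_reset` interrupts BMC access for a short while.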
Hello UlrichP,
Thank you for your patience and time; we are investigating this with the engineering team. In the meantime, I'd like to confirm that you tried swapping the DIMMs on Channels 1/2 and got the same results. Is that correct?
Regards,
Paul R.
Intel Customer Support Technician
For firmware updates and troubleshooting tips, visit:
https://intel.com/support/serverbios
Hi Paul,
Well, actually not. As it is really difficult or even impossible these days to buy memory modules that are listed in the HCL for the S2600STBR, there has been no chance to test it yet.
So I am still waiting for the opportunity to grab modules from an existing (fully functional, error-free) system. But this might take a while.
What I did was swap the 4 modules between CPU1 and CPU2, but there was no change at all. As the CPU1 memory isn't flagged by the VRD Hot sensor, I thought the message would change to "CPU1 - DIMM channel 1/2", but nothing like this happened. It keeps saying the reason is "CPU2 - DIMM channel 1/2", like a fixed message.
Regards, Uli
Hello UlrichP,
Thank you for all the information provided.
Please allow us to review the details you have shared with us. We will share an update soon.
Regards,
Paul R.
Intel Customer Support Technician
Hello UlrichP,
Thank you for your patience and time. We are still investigating; can you please provide the SysInfo logs for our investigation? Use the following tool to retrieve them:
Regards,
Paul R.
Intel Customer Support Technician
Hi Paul,
Of course, please find the logs attached.
I used SysInfo tool version 15.0.3 and ran it from the UEFI shell.
If you need anything else or additional logs, just say the word.
Regards, Uli
Hello UlrichP,
Thank you for all the information provided.
Please allow us to review the details you have shared with us. We will share an update soon.
Regards,
Paul R.
Intel Customer Support Technician
Hello UlrichP,
Thank you for your patience and time. After going over the latest SysInfo logs, we cannot see any issues with the temperatures reported on the board. All temperature levels are within the design specifications.
As a last option, I would suggest checking the Fan1 cables and connections, making sure everything is working properly and there is no apparent heat.
Please let me know the outcome and if possible provide a new set of fresh logs with the alert.
Regards,
Paul R.
Intel Customer Support Technician
Hi Paul,
I checked Fan 1 (and the other 4, i.e. FAN1 to FAN5): all are fine (they spin) and are properly connected.
When updating the SDR file, all fans are properly detected. So I just swapped the cable connections of FAN1 and FAN2.
But that does not change the amber light. I also checked the CPU fans and replaced both; no change either.
One observation: the CPU fans always run at full speed (noisy!) since I got the new Intel case.
Prior to that they were running at a much lower speed (rpm).
I also had the system re-detect all changes by loading BIOS defaults (F9) and clearing CMOS (jumper).
No change. The only thing: the SEL log and SysInfo output are now dated 01.01.2020, but they are fresh and from today
(because the NTP settings were cleared by the CMOS reset).
I added a short video showing the green and amber LEDs blinking. You probably remember that I reported the green LED near the BMC chip...
And once again a debug log for your engineers.
To be frank: I've got the impression that we are not even close to a solution, are we?
Please keep me posted about the outcome.
Regards, Uli
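The 01.01.2020 timestamps mentioned above suggest that a clock fell back to a default after the CMOS clear. If it is the BMC's SEL clock that is off, ipmitool can read and set it with `sel time get` and `sel time set "MM/DD/YYYY HH:MM:SS"` (24-hour format). Below is a minimal Python sketch that only builds those command lines; `SEL_TIME_GET` and `sel_time_set_args` are names made up for illustration. Whether setting the SEL clock by hand is the right fix on this board, rather than re-entering the NTP settings in the BMC, is an assumption that the thread does not confirm.

```python
from datetime import datetime

# Command line that reads the current SEL clock from the local BMC.
SEL_TIME_GET = ["ipmitool", "sel", "time", "get"]


def sel_time_set_args(when):
    """Return the ipmitool command line that sets the SEL clock to `when`.

    ipmitool expects the time string as "MM/DD/YYYY HH:MM:SS".
    Hypothetical helper; it builds the arguments and runs nothing.
    """
    stamp = when.strftime("%m/%d/%Y %H:%M:%S")
    return ["ipmitool", "sel", "time", "set", stamp]


if __name__ == "__main__":
    # Print the command that would set the SEL clock to the current local time.
    print(" ".join(sel_time_set_args(datetime.now())))
```

The resulting list can be passed to `subprocess.run` on a machine with ipmitool installed; new SEL entries would then carry the corrected time, while existing entries keep their old timestamps.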
Hello UlrichP,
Thank you for all the information provided.
Please allow us to review the details, I will keep you posted.
Regards,
Paul R.
Intel Customer Support Technician
Hello UlrichP,
Thank you very much for your patience and time. Based on our review of the logs, the temperatures and fan RPMs look normal, so this has to be a sensor failure, since the server is not showing any actual issue in these areas.
Therefore, we would like to replace the board. We will create an internal case, and I will send you an email requesting all the information needed to proceed.
Regards,
Paul R.
Intel Customer Support Technician
Hi Paul,
Thanks a lot. I just replied to your mail regarding Intel Customer Support Case #: 05372027
and provided the information you were asking for.
Regards,
Uli
Hi Paul,
Just to let you know: yesterday I received my replacement board (not a new one, but functional).
What shall I say: everything is working, even in a non-Intel case. Case solved and closed.
Thanks for your patience and support.
One more happy customer...
Regards, Ulrich
