SC2600CO random reboot

GMigu1
Beginner

Dear all, I have an SC2600CP board in a server with 2 Xeon CPUs and 196GB of RAM.

This machine is used as a calculation node in a cluster environment, alongside other machines that have almost the same configuration.

A few days ago it started to reboot for no apparent reason.

To try to identify the problem, I checked all the DIMM slots and all the memory modules, looking for a faulty one. I tested all of them but could not find an error.

Then I checked the SEL logs:

1 | 09/17/2017 | 16:38:10 | Event Logging Disabled # 0x07 | Log area reset/cleared | Asserted
2 | 09/17/2017 | 17:16:55 | Power Unit # 0x01 | Failure detected | Asserted
3 | 09/17/2017 | 17:16:56 | Power Unit # 0x01 | Power off/down | Asserted
4 | 09/17/2017 | 17:17:01 | Power Unit # 0x01 | Power off/down | Deasserted
5 | 09/17/2017 | 17:17:01 | Power Unit # 0x01 | Failure detected | Deasserted
6 | 09/17/2017 | 17:17:02 | Power Unit # 0x01 | Power off/down | Asserted
7 | 09/17/2017 | 17:17:07 | Power Unit # 0x01 | Power off/down | Deasserted
8 | 09/17/2017 | 17:17:13 | Fan # 0x32 | Lower Non-critical going low | Deasserted
9 | 09/17/2017 | 17:17:13 | Fan # 0x32 | Lower Critical going low | Deasserted
a | 09/17/2017 | 17:17:13 | Fan # 0x32 | Lower Non-critical going low | Deasserted
b | 09/17/2017 | 17:17:13 | Fan # 0x32 | Lower Critical going low | Deasserted
c | 09/17/2017 | 17:17:24 | Fan # 0x32 | Lower Non-critical going low | Asserted
d | 09/17/2017 | 17:17:24 | Fan # 0x32 | Lower Critical going low | Asserted
e | 09/17/2017 | 17:17:31 | System Event # 0x83 | Timestamp Clock Sync | Asserted
f | 09/17/2017 | 17:17:32 | System Event # 0x83 | Timestamp Clock Sync | Asserted
10 | 09/17/2017 | 17:17:55 | System Event # 0x83 | OEM System boot event | Asserted

and on the BMC web console:

30 | 09/17/2017 17:39:32 | Pwr Unit Status | Power Unit | reports the power unit is powered off or being powered down - Asserted
29 | 09/17/2017 17:37:19 | BIOS Evt Sensor | System Event | reports OEM System Boot Event - Asserted
28 | 09/17/2017 17:36:56 | BIOS Evt Sensor | System Event | reports Timestamp Clock Sync. Event is one of two expected events from BIOS on every power on. - Asserted
27 | 09/17/2017 17:36:56 | BIOS Evt Sensor | System Event | reports Timestamp Clock Sync. Event is one of two expected events from BIOS on every power on. - Asserted
26 | 09/17/2017 17:36:49 | System Fan 3 | Fan | reports the sensor is in a low, critical, and going lower state - Asserted
25 | 09/17/2017 17:36:49 | System Fan 3 | Fan | reports the sensor is in a low, but non-critical, and going lower state - Asserted
24 | 09/17/2017 17:36:36 | System Fan 3 | Fan | reports the sensor is in a low, critical, and going lower state - Deasserted
23 | 09/17/2017 17:36:36 | System Fan 3 | Fan | ...
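
The SEL entries above follow a pipe-delimited layout (record ID, date, time, sensor, event, state), so a short script can summarize which sensors asserted events around a reboot. What follows is a minimal Python sketch, assuming that layout, MM/DD/YYYY dates, and hexadecimal record IDs as seen in the log. The function names `parse_sel` and `asserted_by_sensor` are illustrative and not part of any Intel or IPMI tool, and only a few of the entries are reproduced as sample input.

```python
from collections import defaultdict
from datetime import datetime

# A few entries copied from the SEL listing above, used as sample input.
SEL_TEXT = """\
2 | 09/17/2017 | 17:16:55 | Power Unit # 0x01 | Failure detected | Asserted
3 | 09/17/2017 | 17:16:56 | Power Unit # 0x01 | Power off/down | Asserted
8 | 09/17/2017 | 17:17:13 | Fan # 0x32 | Lower Non-critical going low | Deasserted
d | 09/17/2017 | 17:17:24 | Fan # 0x32 | Lower Critical going low | Asserted
10 | 09/17/2017 | 17:17:55 | System Event # 0x83 | OEM System boot event | Asserted
"""


def parse_sel(text):
    """Parse pipe-delimited SEL lines into dicts; skip lines without six fields."""
    events = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 6:
            continue
        record_id, date, time, sensor, event, state = fields
        events.append({
            # Assumption: record IDs are hexadecimal (the log counts ..., 9, a, ..., f, 10).
            "id": int(record_id, 16),
            "timestamp": datetime.strptime(f"{date} {time}", "%m/%d/%Y %H:%M:%S"),
            "sensor": sensor,
            "event": event,
            "state": state,
        })
    return events


def asserted_by_sensor(events):
    """Group the names of asserted events by sensor, preserving log order."""
    grouped = defaultdict(list)
    for entry in events:
        if entry["state"] == "Asserted":
            grouped[entry["sensor"]].append(entry["event"])
    return dict(grouped)


events = parse_sel(SEL_TEXT)
summary = asserted_by_sensor(events)
for sensor, names in summary.items():
    print(sensor, "->", names)
```

In this excerpt the Power Unit failure is logged before the fan threshold events. Fan readings dropping while the system powers down would be consistent with that ordering, although the log alone does not establish the cause.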
36 Replies
GMigu1
Beginner

Hi Mike, thank you for your time and support with this issue.

I truly think that it could be related to a board problem and not to the OS itself. Anyway, I am downloading an evaluation version of SUSE Enterprise to our main cluster server so that I can install it via PXE.

The machine reboots even when I try to install openSUSE Leap via PXE or USB. I am hoping that this will not be an issue when booting SUSE Enterprise via PXE.

Only today did I find out that the motherboard is out of its warranty period, so I would like to know: can you keep supporting me, or will this case be closed?

Once again, thank you for your time and support.

Best regards,

idata
Employee

Hi, Guilhermefsmiguel,

The board is still under support, so Intel will continue supporting it for a while. Intel provides a 3-year warranty from the date of purchase; I suggest you check the date on the receipt.

I will be waiting for the outcome of the new operating system installation.

Regards,

Mike C
idata
Employee

Hi, Guilhermefsmiguel,

I am following your case with random reboots on the Intel® Server Board S2600COE.

I will keep the case open in case you need anything else.

Regards,

Mike C
GMigu1
Beginner

Hi Mike,

I tried to install SLES 11 on the machine, via PXE and via USB.

I also tried both options, legacy boot and EFI boot.

None of them worked. I couldn't even install the OS. The machine continues to restart.

As I told you, I have replaced the power supply and the problem persists.

Could you kindly tell me the procedure for sending this board to Intel under warranty, so that they can test it more thoroughly and find a solution to the problem?

Thank you for your time and support,

Best regards,

idata
Employee

Hi, Guilhermefsmiguel,

Regarding the random reboots with the Intel® Server Board S2600COE: I reviewed the board markings and, according to the date of production, it is out of warranty. However, if you have proof of purchase dated within the last 3 years, please contact us by phone, email, or chat support and we will gladly replace it.

PBA: G29920-205
S/N: QSC022700376

You can reach us through our contact support link; select the Servers option:

https://www.intel.com/content/www/us/en/support/contact-support.html

In your previous posts, you mentioned that the system can enter the BIOS and EFI mode without any random reboots. I suggest you try installing the operating system on a different hard drive; we need to narrow down the issue.

I look forward to hearing from you.

Regards,

Mike C

GMigu1
Beginner

Hi Mike,

The professor who bought the machine confirmed to me that the purchase date was also more than three years ago, so the warranty is not an option.

I will just add one observation: when the machine is in the BIOS or in the Intel UEFI (the one that I used to upgrade the firmware), it does not restart.

When I tried to install the OS from the USB pen drive, I used the EFI boot option (the one that appears when F6 is pressed during boot).

I don't know whether it is possible to use the Intel UEFI to boot the OS, or how to do it.

I believe that this is not a hard disk issue, but I will try to install the OS on Thursday, since I have to travel in a few hours.

I asked the professor to buy another power supply for the machine (we have already replaced it, but he wants to try another one).

Thank you for your time and support,

GMigu1
Beginner

Mike,

When I said:

I believe that this is not a hard disk issue, but I will try to install the OS on Thursday, since I have to travel in a few hours.

Please read:

I believe that this is not a hard disk issue, but I will try to install the OS on another HD on Thursday, since I have to travel in a few hours. I will try to install the OS on an SSD.

Also, if you have a procedure that I can use to install SLES via the Intel UEFI (the one that I used to upgrade the firmware), please send it to me.

Best regards,

idata
Employee

Hi, Guilhermefsmiguel,

I am following up on the random reboots with the Intel® Server Board S2600COE.

I will be waiting for your results using a new hard drive.

Regarding your question about how to install the operating system in UEFI Boot Mode, the instructions are below.

1-Reset the BIOS settings (press F9, say "yes", and save the changes with F10).
2-On the Main tab, change the Quiet Boot option to Disabled.
3-On the Advanced tab, go to Mass Storage Controller Configuration, select SATA Port 0-5, and double-check that AHCI is selected for AHCI Capable SATA Controller.
4-Go to Boot Maintenance Manager, select Advanced Boot Options, and set Boot Mode to "UEFI".
5-Save the changes with F10.

Make sure the bootable SUSE USB drive is prepared for UEFI boot.

-Plug the USB drive into your server.
-Restart the system and keep tapping F6, go to the EFI shell, select the USB drive, and install the OS.

I will be waiting for your results.

Regards,

Mike C
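
One way to read the instruction above about the USB drive being ready for UEFI is to check that the drive carries a UEFI boot loader at the default removable-media path for 64-bit x86, `EFI/BOOT/BOOTX64.EFI`, as defined by the UEFI specification. The sketch below is only an illustration under that assumption: whether a given SUSE image or this board's firmware relies on that exact path has not been confirmed in this thread, and the helper names `find_child` and `has_uefi_loader` are hypothetical. A temporary directory stands in for the mounted USB drive so the example runs anywhere.

```python
import tempfile
from pathlib import Path

# Default fallback loader path for 64-bit x86 removable media (UEFI specification).
FALLBACK_LOADER = ("EFI", "BOOT", "BOOTX64.EFI")


def find_child(directory, name):
    """Return the entry in `directory` matching `name` case-insensitively, or None."""
    if not directory.is_dir():
        return None
    for child in directory.iterdir():
        if child.name.lower() == name.lower():
            return child
    return None


def has_uefi_loader(mount_point):
    """True if the mounted drive contains EFI/BOOT/BOOTX64.EFI in any letter case."""
    current = Path(mount_point)
    for part in FALLBACK_LOADER:
        current = find_child(current, part)
        if current is None:
            return False
    return current.is_file()


# Demonstration: a temporary directory plays the role of the USB mount point.
with tempfile.TemporaryDirectory() as fake_usb:
    before = has_uefi_loader(fake_usb)           # nothing on the "drive" yet
    loader_dir = Path(fake_usb) / "efi" / "boot"
    loader_dir.mkdir(parents=True)
    (loader_dir / "bootx64.efi").write_bytes(b"")  # empty placeholder file
    after = has_uefi_loader(fake_usb)

print(before, after)
```

On a real system the function would be called with the actual mount point of the USB drive. The check only confirms that a loader file is present; it does not verify that the loader is valid or that the drive's partition is formatted as FAT.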
GMigu1
Beginner

Hi Mike,

Last time I followed all 5 steps you gave me. When selecting the boot device, I selected the EFI entry for the USB drive. I did not enter the Intel EFI.

Once in the EFI, do you have any instructions on how to proceed? Can you check internally for the procedure to install SLES via the Intel EFI shell?

The professor bought a brand-new 580W power supply which, because of Brazilian regulations, should arrive at the lab in about a week.

As I told you, I can only test the OS installation on Thursday, so I ask you to hold on until then, when I can share the results with you.

Once again, thank you for your time and support.

GMigu1
Beginner

Hi Mike,

I found the instructions on the Novell website:

https://www.novell.com/support/kb/doc.php?id=7003263 (UEFI SLES/SLED 11 partitioning recommendations)

Could you confirm this procedure for me?

Thank you,

idata
Employee

Hi, Guilhermefsmiguel,

I am following your case with random reboots on the Intel® Server Board S2600COE.

I have read the document you posted and it looks fine. It explains how to set up the operating system.

The document requires setting the Boot Mode to UEFI in the board's BIOS (under Boot Maintenance Manager).

Intel engineers have tested SUSE Linux Enterprise Server 10 SP3, 32-bit and 64-bit; it works with limitations. More details about operating system compatibility are available here:

https://www.intel.com/content/www/us/en/support/articles/000007556/server-products/server-boards.html (Operating System Compatibility for Intel® Server Board S2600CO Family)

Regards,

Mike C
GMigu1
Beginner

Hi Mike,

I tested the machine with another power supply, an 850W unit, and it is still rebooting.

Do you have any other options for us to test?

idata
Employee

Hi, Guilhermefsmiguel,

Regarding the random reboots on the Intel® Server Board S2600COE with the SUSE operating system:

I suggest testing your board with SUSE Linux* Enterprise Server 11 or Red Hat* Enterprise Linux* 6.3 as a workaround. If the problem continues, I am afraid your board is damaged.

Regards,

Mike C

idata
Employee

Hi, Guilhermefsmiguel,

I am following the random reboots on the Intel® Server Board S2600COE with the SUSE operating system.

I was wondering whether you have had time to test your system with another operating system.

Regards,

Mike C
GMigu1
Beginner

Hi Mike,

I was planning to but, due to stress, I had some health problems during the week and I am still recovering. I believe that I can test it next week and send you the results, OK?

Thank you

David_A_Intel
Moderator

Hello guilhermefsmiguel,

Feel free to post back again once you have the results.

Best regards,

David A.
