Hi,
I have a custom board based on the C3758 SoC. When I try to boot OpenBSD, the board hangs at random. Sometimes it does not even boot into the OS. Sometimes it boots the OS successfully and reaches the console, but after a while it hangs again at random.
I examined the hardware in the hung state and found that the IERR_N error signal is asserted. The other error signals (MCERR and the ERROR[N] signals) are inactive.
Also, we can't find any fault in the clocks or power supply rails when the hang occurs.
What can we do about this problem?
What can trigger the IERR_N signal?
Hello @M_Serhan_Ozyigit,
Thank you for contacting Intel Embedded Community.
That signal is tied to an unrecoverable internal error. It comes from the Punit, so it reports a power-related error.
The following are some possible cases of such catastrophic errors:
• Retirement watchdog time-out from the core
• Internal error detected by the SoC power management circuitry
Best regards,
Hi Diego,
First, thank you for your answer.
1. What can cause a "Retirement watchdog time-out from the core"?
2. I have LEDs on the board that show the sleep states and the PLTRST signal. When I get an IERR_N error and the board freezes, I see that the sleep state 0 powergood signal is still asserted and the platform reset is deasserted (the board is not in reset). Shouldn't these signals be deasserted in the event of a power failure?
Thank you.
Also, sometimes when the board is booting up I get an error message like this: "
Hello @M_Serhan_Ozyigit,
A watchdog time-out can sometimes be related to memory problems: incompatible memory, bad memory, or a failure in the processor's memory controllers.
https://www.kernel.org/doc/html/v5.9/watchdog/watchdog-api.html
https://www.makeuseof.com/fix-clock-watchdog-timeout-error-windows/
I'm reviewing document #558579.
On page 270:
IERR_N: Internal Error: This active-low signal indicates to the external circuitry that the SoC has detected an error.
While the SoC active-low output signal PMU_PLTRST_N is asserted, this signal is not valid and must be ignored by the platform board circuitry.
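As a rough illustration of the rule quoted above, here is a minimal Python sketch that filters a logic-analyzer style capture so that IERR_N is only treated as a real error while PMU_PLTRST_N is deasserted. The `Sample` record, its field names, and the capture values are hypothetical and only stand in for whatever export format your instrument produces; the sketch covers only the reset-gating rule and no other validity condition.

```python
from typing import Iterable, NamedTuple, Optional


class Sample(NamedTuple):
    """One captured point (hypothetical format, not from any Intel tool)."""
    t_us: float    # timestamp in microseconds
    pltrst_n: int  # PMU_PLTRST_N level: 0 = asserted (platform in reset)
    ierr_n: int    # IERR_N level: 0 = asserted (error reported)


def ierr_valid_and_asserted(s: Sample) -> bool:
    # Both signals are active-low. IERR_N is ignored while
    # PMU_PLTRST_N is asserted (low), per the quoted description.
    return s.pltrst_n == 1 and s.ierr_n == 0


def first_valid_ierr(samples: Iterable[Sample]) -> Optional[float]:
    """Return the timestamp of the first IERR_N assertion seen out of reset."""
    for s in samples:
        if ierr_valid_and_asserted(s):
            return s.t_us
    return None


# Made-up capture, for illustration only.
capture = [
    Sample(0.0, 0, 0),   # in reset: IERR_N low here must be ignored
    Sample(10.0, 0, 1),  # still in reset
    Sample(20.0, 1, 1),  # out of reset, no error
    Sample(30.0, 1, 0),  # out of reset, IERR_N asserted: a real event
]
print(first_valid_ierr(capture))
```

Comparing the timestamp of the first valid assertion with the boot stage reached at that moment may help narrow down what the SoC was doing when the error was raised.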
From page 288:
Board designs must not consider IERR_N and MCERR_N valid until after the
Have you verified that all voltage rails are correct?
Also, I found a debug handbook for Xeon (a server processor) that may be worth checking for your case:
Document #576242 - System Hang Issue Debug Handbook
Best regards,