Hello,
I have a custom board (RC10) that uses the E3845 and is similar to the MinnowBoard MAX. I customized the MinnowBoard MAX firmware in Intel Firmware Engine for the RC10 by enabling i2c-0, PCIe-2, etc. When Linux boots, it reports "mce: [Hardware Error]: Machine check events logged" 300 seconds after boot.
1. Since the original configuration came from the MinnowBoard MAX, which uses the E3825, the MCE error might come from that. If so, how can I change the processor to E3845?
2. Other than #1, I have no idea where the MCE error comes from. Is there any way to track it down by disabling HW components (e.g. PCIE-0)?
We'd like to get the log of the machine check exception to figure out what's going on.
On Linux systems, you should be able to get this using mcelog - http://mcelog.org/
As an example you can install this on Ubuntu/Debian using apt-get:
sudo apt-get install mcelog
The events will be logged to /var/log/mcelog. You can also run:
sudo mcelog --client
to query the mcelog daemon for errors.
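
Once the log exists, the raw status registers can be pulled out for decoding. The following is a minimal Python sketch, assuming the log contains plain-text records with a line of the form `STATUS <hex> MCGSTATUS <hex>`; the function name `extract_status_values` and the regular expression are illustrative and not part of mcelog.

```python
import re
import sys

# Assumed record line format: "STATUS <hex> MCGSTATUS <hex>" (hex without 0x prefix).
STATUS_RE = re.compile(
    r"^STATUS\s+([0-9a-fA-F]+)\s+MCGSTATUS\s+([0-9a-fA-F]+)", re.MULTILINE
)

def extract_status_values(log_text):
    """Return a list of (status, mcgstatus) integer pairs found in the log text."""
    return [(int(s, 16), int(m, 16)) for s, m in STATUS_RE.findall(log_text)]

if __name__ == "__main__":
    # Default path is the log location mentioned above; override on the command line.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/mcelog"
    with open(path) as f:
        for status, mcg in extract_status_values(f.read()):
            print(f"STATUS={status:#018x} MCGSTATUS={mcg:#x}")
```

Reading the log may require root privileges, depending on how the distribution sets the file permissions.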
Hello Brian,
Here is the output of mcelog --client:
mcelog: failed to prefill DIMM database from DMI data
Kernel does not support page offline interface
mcelog: Family 6 Model 37 CPU: only decoding architectural errors
Hardware event. This is not a software error.
MCE 0
CPU 0 BANK 0
ADDR fef80000
TIME 978536917 Wed Jan 3 10:48:37 2001
MCG status:
MCi status:
Uncorrected error
MCi_ADDR register valid
Processor context corrupt
MCA: Internal unclassified error: 410
Running trigger `unknown-error-trigger'
STATUS a600000007600410 MCGSTATUS 0
MCGCAP 806 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 55
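
For readers who want to cross-check the decoded lines against the raw STATUS value, here is a minimal Python sketch. It assumes the architectural IA32_MCi_STATUS layout described in the Intel Software Developer's Manual (flag bits 63 to 57, model-specific error code in bits 31:16, MCA error code in bits 15:0); the function name `decode_mci_status` is illustrative, and the model-specific field is left uninterpreted.

```python
# Architectural flag bits of IA32_MCi_STATUS (assumed from the Intel SDM layout).
FLAG_BITS = {
    63: "VAL",    # register contents valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting enabled
    59: "MISCV",  # MCi_MISC register valid
    58: "ADDRV",  # MCi_ADDR register valid
    57: "PCC",    # processor context corrupt
}

def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into flags and error-code fields."""
    flags = [name for bit, name in sorted(FLAG_BITS.items(), reverse=True)
             if (status >> bit) & 1]
    mca_code = status & 0xFFFF
    return {
        "flags": flags,
        "mca_error_code": mca_code,
        "model_specific_code": (status >> 16) & 0xFFFF,
        # "Internal unclassified" codes have the bit pattern 0000 01xx xxxx xxxx.
        "internal_unclassified": (mca_code & 0xFC00) == 0x0400,
    }

if __name__ == "__main__":
    result = decode_mci_status(0xA600000007600410)  # STATUS value from the log above
    print(result["flags"])
    print(hex(result["mca_error_code"]), hex(result["model_specific_code"]))
```

For the STATUS value above, the set flags are VAL, UC, ADDRV and PCC, which lines up with the "Uncorrected error", "MCi_ADDR register valid" and "Processor context corrupt" lines that mcelog printed, and the low 16 bits are 0x410, the "Internal unclassified error" code.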
Thanks Jong. We'll investigate this and let you know what we find.
Jong: did you try to enable ECC memory on your board?
Hello Brian,
Unfortunately, we don't have ECC (E3845 - DRAM1_DQ[56..x] aka DRAM0_ECC_DQ[0..x]) in the RC10 board design. We didn't think it was necessary.
Are you recommending that we add ECC to the RC10 board design? Do you think the MCE message comes from memory?
No, I don't recommend using ECC with this project. It wasn't a feature enabled on the MinnowBoard Max. I'm just trying to rule it out as a problem. Thanks for the information.
Hello Brian,
For your information, I tried the 0.84 firmware from https://firmware.intel.com/projects/minnowboard-max on both the RC10 and the MinnowBoard MAX. Neither showed the MCE error. I think the firmware built with the Intel Firmware Engine has a problem. What do you think?
We're investigating the 0.84 codebase differences already. There may be some delay on our end due to the Christmas holiday, but I'll keep you posted. Thanks.
Hello,
Is there any update? Thank you.
What kind of Linux did you try? Yocto?
It's Debian 8 Jessie. As I mentioned previously, the 0.84 firmware from https://firmware.intel.com/projects/minnowboard-max did not show the MCE error with Debian 8.
I can reproduce it in Ubuntu and Yocto. After debugging, I found that this machine check error actually happens during BIOS POST. It is a minor, non-critical error that happens only once and will not affect the OS once it is running, so you can ignore it for now. The root cause has been found, and we will fix this bug in a later release.
Hi,
Is it possible to send you a release candidate to see if you see the issue again?
Hello Laurie,
Yes, I can try.
Can you send an email to Firmware_Engine@intel.com so I can give you instructions for downloading a pre-release for testing?