Processor event alarm details

IHaru
Beginner
1,158 Views

Hi,

My Alarm Management Module is handling IPMI traps from a Mullins TIGW1U. What user corrective action should I take in case of a processor event alarm?

This alarm is mapped to the IPMI traps below. Can anyone advise on the corrective action for each of these event traps?

1) processorIERREvent - 1.3.6.1.4.1.343.2.10.3.5.1000.40.1 --> internal error

2) processorFRB1Event - 1.3.6.1.4.1.343.2.10.3.5.1000.40.3 --> fault resilient boot error, BIST failure

3) processorFRB2Event - 1.3.6.1.4.1.343.2.10.3.5.1000.40.4 --> fault resilient boot error, hang in POST failure

4) processorFRB3Event - 1.3.6.1.4.1.343.2.10.3.5.1000.40.5 --> fault resilient boot error, initialization failure

5) processorConfigurationErrorEvent - 1.3.6.1.4.1.343.2.10.3.5.1000.40.6 --> configuration error

6) processorSMBIOSUncorrectableCPUEvent - 1.3.6.1.4.1.343.2.10.3.5.1000.40.10 --> SMBIOS uncorrectable error

Regards

DSilv11
Valued Contributor III
329 Views

I would remove all add-in cards and try again.

IERR is the most maligned error code.

What it means is that the CPU is not able to make forward progress in its operational code, i.e. it is waiting for something.

Most often it is an add-in card failing rather than the processor.

The rest of the errors look like a cascade from the first.

The next three are types of FRB timer, each indicating a hang.

A configuration error could be the hung CPU, a bad CPU, or mismatched CPUs.

The last one is telling you the CPU is dead and could not recover by itself.

If it works with no cards installed, add the cards back one at a time until it fails again.
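
If it helps on the alarm-management side, here is a rough sketch of how the trap OIDs from the question could be mapped to the notes above. The OIDs and trap names are copied from the question; the table name `PROCESSOR_TRAPS`, the function `corrective_action`, and the wording of each note are my own assumptions for illustration, not anything from Intel's MIB or documentation.

```python
# Hypothetical lookup table. OIDs and trap names come from the question;
# the notes paraphrase the advice in this reply and are not official guidance.
CARD_ISOLATION = (
    "Remove all add-in cards and retry; if it works, add the cards back "
    "one at a time until it fails again."
)

PROCESSOR_TRAPS = {
    "1.3.6.1.4.1.343.2.10.3.5.1000.40.1": (
        "processorIERREvent",
        "CPU cannot make forward progress. " + CARD_ISOLATION,
    ),
    "1.3.6.1.4.1.343.2.10.3.5.1000.40.3": (
        "processorFRB1Event",
        "FRB timer (hang), likely a cascade from IERR. " + CARD_ISOLATION,
    ),
    "1.3.6.1.4.1.343.2.10.3.5.1000.40.4": (
        "processorFRB2Event",
        "FRB timer (hang), likely a cascade from IERR. " + CARD_ISOLATION,
    ),
    "1.3.6.1.4.1.343.2.10.3.5.1000.40.5": (
        "processorFRB3Event",
        "FRB timer (hang), likely a cascade from IERR. " + CARD_ISOLATION,
    ),
    "1.3.6.1.4.1.343.2.10.3.5.1000.40.6": (
        "processorConfigurationErrorEvent",
        "Could be a hung CPU, a bad CPU, or mismatched CPUs. " + CARD_ISOLATION,
    ),
    "1.3.6.1.4.1.343.2.10.3.5.1000.40.10": (
        "processorSMBIOSUncorrectableCPUEvent",
        "CPU could not recover by itself. " + CARD_ISOLATION,
    ),
}


def corrective_action(oid: str) -> str:
    """Return 'trapName: note' for a known trap OID, or a fallback message."""
    # Some SNMP tools print OIDs with a leading dot; normalise that away.
    key = oid.strip().lstrip(".")
    entry = PROCESSOR_TRAPS.get(key)
    if entry is None:
        return "Unknown trap OID; no action mapped."
    name, note = entry
    return f"{name}: {note}"
```

Calling `corrective_action` with the OID of a received trap returns the text to show the operator. Since every note ends in the same card-isolation step, it is kept in one constant so it can be changed in one place.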
