Excessive event log messages from micx64 on Windows

Stefan_F
Beginner
Hello!
 
Running on Windows Server 2012, we encountered a situation where the micx64 driver started to log the message "Driver detected an internal error in its data structures for ." about 80 times per second. This filled up the event log rapidly, leaving only about 15 minutes of history with the default event log size.
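To put those numbers in perspective, here is a rough back-of-the-envelope sketch. The rate (about 80 messages per second) and the retention (about 15 minutes) come from the observation above. The 20 MiB maximum log size is only an assumption for illustration and should be checked on the affected host. The function names are made up for this example.

```python
def events_logged(rate_per_second: float, minutes: float) -> int:
    """Number of events written at a steady rate over the given duration."""
    return round(rate_per_second * minutes * 60)


def retention_minutes(log_bytes: int, rate_per_second: float,
                      bytes_per_event: float) -> float:
    """Minutes of history a circular log holds before old entries are overwritten."""
    return log_bytes / (rate_per_second * bytes_per_event) / 60


# Figures from the post: about 80 messages per second, about 15 minutes of history.
flood_events = events_logged(80, 15)

# ASSUMPTION: a 20 MiB maximum log size; verify the real value on the host.
assumed_log_bytes = 20 * 1024 * 1024
implied_bytes_per_event = assumed_log_bytes / flood_events

# Under the same flood, doubling the log size only doubles the retention.
doubled = retention_minutes(2 * assumed_log_bytes, 80, implied_bytes_per_event)

print(flood_events, round(implied_bytes_per_event), round(doubled))
```

Under these assumptions, enlarging the log buys time roughly in proportion to its size, so it mitigates the loss of history but does not address the flood itself.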
 
Restarting the system resolved the issue, but I'd like to get to the bottom of this. For one, I want to make sure everything is in order with the Xeon Phi cards. Furthermore, I don't want to lose event log data again due to the driver going on a logging spree.
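One way to notice such a logging spree early would be a simple rate check over event timestamps. The following is a minimal sketch under stated assumptions, not part of MPSS or the driver: `find_floods`, its parameters, and the thresholds are hypothetical, and the timestamps (in seconds, sorted ascending) would have to be exported from the event log by some other means.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple


def find_floods(timestamps: Iterable[float],
                window_seconds: float = 1.0,
                max_events: int = 50) -> List[Tuple[float, int]]:
    """Report each time the event count in the trailing window first exceeds max_events.

    Expects timestamps in seconds, sorted in ascending order.
    Returns (timestamp, events_in_window) for the start of each flood.
    """
    window: Deque[float] = deque()
    alerts: List[Tuple[float, int]] = []
    flooding = False
    for ts in timestamps:
        window.append(ts)
        # Drop events that have fallen out of the trailing window.
        while window and window[0] <= ts - window_seconds:
            window.popleft()
        over = len(window) > max_events
        if over and not flooding:
            alerts.append((ts, len(window)))
        flooding = over
    return alerts


# Hypothetical data: 80 events per second for two seconds, as in the observed flood.
flood = [i / 80 for i in range(160)]
print(find_floods(flood))
```

The check reports only the start of each flood rather than every event above the threshold, so an alerting system built on it would not itself be swamped.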
 
The server in question has four Xeon Phi 31S1P cards installed and is running MPSS 3.8.1 under a fully patched Windows Server 2012. The output of micinfo is attached to this post.
 
We also have a second server with the same hardware configuration where this issue did not occur. It might be worth noting that the machines are usually rebooted once a month for security updates, so this might be something that becomes more likely with longer uptime.

Does anyone have suggestions on how to proceed with debugging this? Is anyone here using Xeon Phi cards on Windows Server and, if so, have you encountered any similar issues?

Kind regards,
Stefan
5 Replies
gaston-hillar
Valued Contributor I

Hi Stefan,

You can set the logging level by calling the micras tool with the level number specified after -loglevel. The micras tool and its available options are described in the documentation for the Intel® Manycore Platform Software Stack (Intel® MPSS) for Windows.

gaston-hillar
Valued Contributor I

Hi Stefan,

In my previous post, I did not include the link to the documentation for the Intel® Manycore Platform Software Stack (Intel® MPSS) for Windows.

You can check the available options in the documentation referenced in the link.

gaston-hillar
Valued Contributor I

Hi Stefan,

In case the version changes, the correct link for downloading the latest available documentation is the following: https://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss

Stefan_F
Beginner

Hello Gastón, 

Thank you for your quick reply.

We are not currently running micras as we have our own monitoring system in place. I will look into getting micras started as a service so we can review its log should the error occur again. 

That said, our monitoring never reported any errors while the Windows event log was swamped with driver errors, and I know that the application we run on the coprocessors was working as intended. I am not familiar with the kinds of events micras can capture, but from what the MPSS user guide suggests, it should do essentially the same as our custom monitoring solution.

So, to sum it up: the coprocessors were doing what they were intended to do and were able to communicate with the host via the virtual network adapters, but the Windows driver logged a massive number of messages to the system event log.

gaston-hillar
Valued Contributor I

Hi Stefan,

Got it. I thought you were running micras. In many projects, micras has been an extremely useful tool for me.
