<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Unknown header type 7f in Software Archive</title>
    <link>https://community.intel.com/t5/Software-Archive/Unknown-header-type-7f/m-p/1014591#M35517</link>
    <description>&lt;P class="p1"&gt;I'm running RHEL 7.0 and the system seems to have a problem talking to the Phi card.&lt;/P&gt;

&lt;P class="p1"&gt;This is what I see in lspci:&lt;/P&gt;

&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;03:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor 5100 series (rev ff) (prog-if ff)&lt;/SPAN&gt;&lt;/P&gt;

&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; !!! Unknown header type 7f&lt;/SPAN&gt;&lt;/P&gt;

&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Kernel driver in use: &lt;/SPAN&gt;&lt;SPAN class="s2"&gt;mic&lt;/SPAN&gt;&lt;/P&gt;

&lt;P class="p1"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P class="p1"&gt;I've attached the micdebug log.&lt;/P&gt;</description>
    <pubDate>Tue, 07 Apr 2015 23:48:26 GMT</pubDate>
    <dc:creator>Jacob_F_</dc:creator>
    <dc:date>2015-04-07T23:48:26Z</dc:date>
    <item>
      <title>Unknown header type 7f</title>
      <link>https://community.intel.com/t5/Software-Archive/Unknown-header-type-7f/m-p/1014591#M35517</link>
      <description>&lt;P class="p1"&gt;I'm running RHEL 7.0 and the system seems to have a problem talking to the Phi card.&lt;/P&gt;

&lt;P class="p1"&gt;This is what I see in lspci:&lt;/P&gt;

&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;03:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor 5100 series (rev ff) (prog-if ff)&lt;/SPAN&gt;&lt;/P&gt;

&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; !!! Unknown header type 7f&lt;/SPAN&gt;&lt;/P&gt;

&lt;P class="p1"&gt;&lt;SPAN class="s1"&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Kernel driver in use: &lt;/SPAN&gt;&lt;SPAN class="s2"&gt;mic&lt;/SPAN&gt;&lt;/P&gt;

&lt;P class="p1"&gt;&amp;nbsp;&lt;/P&gt;

&lt;P class="p1"&gt;I've attached the micdebug log.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Apr 2015 23:48:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Unknown-header-type-7f/m-p/1014591#M35517</guid>
      <dc:creator>Jacob_F_</dc:creator>
      <dc:date>2015-04-07T23:48:26Z</dc:date>
    </item>
    <item>
      <title>I want to thank you for</title>
      <link>https://community.intel.com/t5/Software-Archive/Unknown-header-type-7f/m-p/1014592#M35518</link>
      <description>&lt;P&gt;I want to thank you for including the micdebug output. It was useful in eliminating a number of possibilities.&lt;/P&gt;

&lt;P&gt;I suspect that your card is overheating. You might want to look at&amp;nbsp;https://software.intel.com/en-us/forums/topic/532366, where they were also seeing the unknown header type message after the system had been up for a few minutes. They also had a problem with the BIOS version, but a card that returns nothing but strings of 1s has basically given up and shut down. You can check this by unplugging the host, letting everything come back to room temperature, then powering the system back up and checking lspci as soon as the host is back up. You can use micsmc (there is a man page) to monitor the temperature. &amp;nbsp;(From the micdebug output, it looks like the card might have come up right after it was installed but didn't stay up long. It also looks like you might have run 'micctrl --initdefaults' after you installed the MPSS but it didn't complete: /etc/mpss/default.conf is there but /etc/mpss/mic0.conf is either missing or corrupted. Do you know if that is true?)&lt;/P&gt;
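
&lt;P&gt;As a rough illustration of why a card that returns only 1s shows up as "rev ff", "prog-if ff" and "header type 7f": the sketch below decodes those three fields from a block of PCI configuration bytes. It is a minimal sketch, not lspci's actual code. The offsets are the standard PCI header offsets, but the helper names and the 64-byte all-ones sample are made up for illustration.&lt;/P&gt;

&lt;PRE&gt;
```python
import operator

# Standard PCI configuration-space offsets for the fields lspci printed.
REVISION_ID_OFFSET = 0x08
PROG_IF_OFFSET = 0x09
HEADER_TYPE_OFFSET = 0x0E
HEADER_LAYOUT_MASK = 0x7F  # bit 7 of the header type is the multi-function flag


def decode_header_fields(config):
    """Decode revision, prog-if and header layout from raw config bytes."""
    raw_header = config[HEADER_TYPE_OFFSET]
    return {
        "rev": config[REVISION_ID_OFFSET],
        "prog_if": config[PROG_IF_OFFSET],
        # operator.and_ is the bitwise AND; it strips the multi-function bit.
        "header_type": operator.and_(raw_header, HEADER_LAYOUT_MASK),
    }


def reads_all_ones(config):
    """True when every byte is 0xff, i.e. the device is not answering reads."""
    return len(config) != 0 and all(byte == 0xFF for byte in config)


# Hypothetical sample: what a host sees when the device stops responding.
dead_card = bytes([0xFF] * 64)
fields = decode_header_fields(dead_card)
print({name: format(value, "02x") for name, value in fields.items()})
# prints {'rev': 'ff', 'prog_if': 'ff', 'header_type': '7f'}
print(reads_all_ones(dead_card))
# prints True
```
&lt;/PRE&gt;

&lt;P&gt;On a Linux host the same bytes can usually be read from the device's sysfs config file, for example /sys/bus/pci/devices/0000:03:00.0/config (assuming PCI domain 0000), and passed to these helpers.&lt;/P&gt;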

&lt;P&gt;In any event, this may be an issue you need to take back to the supplier for your host system to make sure it is configured correctly for the coprocessor. Let us know what you find out.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Apr 2015 08:43:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Unknown-header-type-7f/m-p/1014592#M35518</guid>
      <dc:creator>Frances_R_Intel</dc:creator>
      <dc:date>2015-04-08T08:43:27Z</dc:date>
    </item>
    <item>
      <title>I do remember seeing some</title>
      <link>https://community.intel.com/t5/Software-Archive/Unknown-header-type-7f/m-p/1014593#M35519</link>
      <description>&lt;P&gt;I do remember seeing some weirdness with micctrl --initdefaults before.&lt;/P&gt;

&lt;P&gt;I tried uninstalling MPSS, powering down to let it cool off, then reinstalling.&lt;BR /&gt;
	&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;When I got to the modprobe mic step I got this:&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;sudo modprobe mic&lt;BR /&gt;
	Message from syslogd@monster at Apr &amp;nbsp;8 11:23:33 ...&lt;BR /&gt;
	&amp;nbsp;kernel:BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:2:263]&lt;BR /&gt;
	BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:2:263]&lt;BR /&gt;
	rcu_sched self-detected stall on CPU {0} (t=60000 jiffies g=2492 c=2491 q=0)&lt;BR /&gt;
	ETC timer compensation(-1000000ppm) is much higherthan expected&lt;BR /&gt;
	&lt;BR /&gt;
	Then when I ran micctrl --initdefaults:&lt;BR /&gt;
	&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;sudo micctrl --initdefaults&lt;/SPAN&gt;&lt;BR /&gt;
	&lt;SPAN class="s2" style="font-size: 1em; line-height: 1.5;"&gt;[Warning]&lt;/SPAN&gt;&lt;SPAN class="s1" style="font-size: 1em; line-height: 1.5;"&gt; mic0: Generating compatibility network config file /opt/intel/mic/filesystem/mic0/etc/sysconfig/network/ifcfg-mic0 for IDB.&lt;/SPAN&gt;&lt;BR /&gt;
	&lt;SPAN class="s2" style="font-size: 1em; line-height: 1.5;"&gt;[Warning]&lt;/SPAN&gt;&lt;SPAN class="s1" style="font-size: 1em; line-height: 1.5;"&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; This may be problematic at best and will be removed in a future release, Check with the IDB release.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P class="p1"&gt;&lt;SPAN class="s1" style="font-size: 1em; line-height: 1.5;"&gt;I've attached my latest micdebug.&lt;BR /&gt;
	&lt;BR /&gt;
	Thanks Frances, let me know if there's anything else I can try, or why you think the CPUs might be getting into soft lockup.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Apr 2015 18:41:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Unknown-header-type-7f/m-p/1014593#M35519</guid>
      <dc:creator>Jacob_F_</dc:creator>
      <dc:date>2015-04-08T18:41:11Z</dc:date>
    </item>
    <item>
      <title>Thank you for your attention,</title>
      <link>https://community.intel.com/t5/Software-Archive/Unknown-header-type-7f/m-p/1014594#M35520</link>
      <description>&lt;P&gt;Unknown header type 7f&lt;/P&gt;

&lt;P&gt;Thank you for your attention. I see the same problem in the lspci output:&lt;/P&gt;

&lt;P&gt;84:00.0 Co-processor: &amp;nbsp;Intel Corporation Xeon Phi coprocessor 31S1 (rev ff) (prog-if ff)&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; !!!Unknown header type 7f&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Kernel driver in use: mic&lt;/P&gt;

&lt;P&gt;After the whole system had cooled to room temperature, I powered it on and the lspci output was the same. When I tried to use micsmc -t to check the temperature of mic0, I got this error message:&lt;/P&gt;

&lt;P&gt;Error: mic0: unable to determin device status: get post code: read: /sys/class/mic/mic0/post_code: No such device or address&lt;/P&gt;

&lt;P&gt;The output of micdebug.sh is attached.&lt;/P&gt;

&lt;P&gt;I would greatly appreciate your help!&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jul 2015 03:21:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Unknown-header-type-7f/m-p/1014594#M35520</guid>
      <dc:creator>lu_S_</dc:creator>
      <dc:date>2015-07-13T03:21:25Z</dc:date>
    </item>
    <item>
      <title>hi,</title>
      <link>https://community.intel.com/t5/Software-Archive/Unknown-header-type-7f/m-p/1014595#M35521</link>
      <description>&lt;P&gt;hi,&lt;/P&gt;

&lt;P&gt;Just out of curiosity: can you try unloading the mic driver and then run 'lspci -vv -s 84:0' again?&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;JJK&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jul 2015 09:01:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Unknown-header-type-7f/m-p/1014595#M35521</guid>
      <dc:creator>JJK</dc:creator>
      <dc:date>2015-07-13T09:01:16Z</dc:date>
    </item>
    <item>
      <title>I sat down and walked my way</title>
      <link>https://community.intel.com/t5/Software-Archive/Unknown-header-type-7f/m-p/1014596#M35522</link>
      <description>&lt;P&gt;I sat down and walked my way through all the information Lu S. sent and I still think this is an overheating problem.&lt;/P&gt;

&lt;P&gt;In the messages log, we can see the coprocessor booting successfully during the host boot.&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
	&lt;P&gt;Jul 13 18:25:32 localhost kernel: mic0: Transition from state ready to booting&lt;BR /&gt;
		Jul 13 18:25:32 localhost kernel: mic image: /usr/share/mpss/boot/rasmm-kernel.knightscorner-ab.elf&lt;BR /&gt;
		Jul 13 18:25:32 localhost kernel: MIC 0 Booting&lt;BR /&gt;
		Jul 13 18:25:32 localhost kernel: mic0: Transition from state booting to online&lt;BR /&gt;
		Jul 13 18:25:32 localhost kernel: ELF booted succesfully&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;There is nothing to show what the lspci output was at that time, but the card cannot boot if the mic kernel module cannot read the header. So at that time the header must have been valid.&lt;/P&gt;

&lt;P&gt;However, the coprocessor doesn't stay up for long.&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
	&lt;P&gt;Jul 13 18:26:50 localhost kernel: mic0: Transition from state online to resetting&lt;BR /&gt;
		Jul 13 18:26:52 localhost kernel: mic0: Resetting (Post Code 3C)&lt;BR /&gt;
		Jul 13 18:26:53 localhost kernel: mic0: Resetting (Post Code 3d)&lt;BR /&gt;
		Jul 13 18:26:54 localhost kernel: mic0: Resetting (Post Code 3d)&lt;BR /&gt;
		Jul 13 18:26:55 localhost kernel: mic0: Resetting (Post Code 3d)&lt;BR /&gt;
		Jul 13 18:26:56 localhost kernel: mic0: Resetting (Post Code 3d)&lt;BR /&gt;
		Jul 13 18:26:57 localhost kernel: mic0: Resetting (Post Code 3d)&lt;BR /&gt;
		Jul 13 18:26:58 localhost kernel: mic0: Resetting (Post Code 3E)&lt;BR /&gt;
		Jul 13 18:26:59 localhost kernel: mic0: Resetting (Post Code 3E)&lt;BR /&gt;
		Jul 13 18:27:00 localhost kernel: mic0: Resetting (Post Code 3E)&lt;BR /&gt;
		Jul 13 18:27:01 localhost kernel: mic0: Resetting (Post Code 09)&lt;BR /&gt;
		Jul 13 18:27:02 localhost kernel: mic0: Resetting (Post Code 09)&lt;BR /&gt;
		Jul 13 18:27:03 localhost kernel: mic0: Resetting (Post Code 12)&lt;BR /&gt;
		Jul 13 18:27:03 localhost kernel: mic0: Transition from state resetting to ready&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;This is followed by a couple of attempts to bring up the network connection to the coprocessor, which fail because the coprocessor isn't online.&lt;/P&gt;

&lt;P&gt;Then the coprocessor reboots.&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
	&lt;P&gt;Jul 13 18:34:28 localhost kernel: mic0: Transition from state ready to booting&lt;BR /&gt;
		Jul 13 18:34:28 localhost kernel: mic image: /usr/share/mpss/boot/rasmm-kernel.knightscorner-ab.elf&lt;BR /&gt;
		Jul 13 18:34:28 localhost kernel: MIC 0 Booting&lt;BR /&gt;
		Jul 13 18:34:28 localhost kernel: mic0: Transition from state booting to online&lt;BR /&gt;
		Jul 13 18:34:28 localhost kernel: ELF booted succesfully&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;But again the coprocessor stays up for only about four minutes before it comes down, and this time the reset fails.&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
	&lt;P&gt;Jul 13 18:38:28 localhost kernel: mic0: Transition from state online to resetting&lt;BR /&gt;
		Jul 13 18:38:29 localhost kernel: Invalid Postcode : ��&lt;BR /&gt;
		Jul 13 18:38:30 localhost kernel: mic0: Resetting (Post Code ��&lt;BR /&gt;
		Jul 13 18:38:30 localhost kernel: mic0: Transition from state resetting to reset failed&lt;BR /&gt;
		Jul 13 18:38:30 localhost kernel: MIC 0 RESETFAIL postcode ��1&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;and apparently brings the mpss daemon down with it, since the service needs to be restarted.&lt;/P&gt;

&lt;P&gt;So, at system boot, the header was valid and the coprocessor booted but by the time the host was up in multi-user mode and Lu S. was able to run lspci, the coprocessor had shut itself down.&lt;/P&gt;
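
&lt;P&gt;To put numbers on how long the coprocessor stayed online each time, the state transitions in the messages log can be paired up. This is only a minimal sketch: it assumes the syslog line format shown in the excerpts above, an English locale for month names, and a year supplied by the caller (syslog timestamps do not carry one). The function name online_periods is made up for illustration.&lt;/P&gt;

&lt;PRE&gt;
```python
import re
from datetime import datetime

# Matches lines such as:
# "Jul 13 18:25:32 localhost kernel: mic0: Transition from state booting to online"
TRANSITION = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ kernel: (mic\d+): "
    r"Transition from state (.+?) to (.+)$"
)


def online_periods(lines, year=2015):
    """Return (device, seconds) for each stretch spent in the 'online' state."""
    went_online = {}
    periods = []
    for line in lines:
        match = TRANSITION.match(line.strip())
        if match is None:
            continue  # not a state-transition line
        stamp, device, old_state, new_state = match.groups()
        # The year is an assumption passed in by the caller.
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if new_state == "online":
            went_online[device] = when
        elif old_state == "online" and device in went_online:
            started = went_online.pop(device)
            periods.append((device, int((when - started).total_seconds())))
    return periods


log = [
    "Jul 13 18:25:32 localhost kernel: mic0: Transition from state booting to online",
    "Jul 13 18:26:50 localhost kernel: mic0: Transition from state online to resetting",
    "Jul 13 18:34:28 localhost kernel: mic0: Transition from state booting to online",
    "Jul 13 18:38:28 localhost kernel: mic0: Transition from state online to resetting",
]
print(online_periods(log))
# prints [('mic0', 78), ('mic0', 240)]
```
&lt;/PRE&gt;

&lt;P&gt;Run against a full messages file, short online periods like these would be consistent with a card that shuts itself down soon after each boot.&lt;/P&gt;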

&lt;P&gt;People who have been seeing this behavior might want to contact their supplier to determine that the card is working properly, then check out the posts in this forum where people have been talking about solutions for cooling their cards in systems which do not provide adequate cooling by default.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 22 Jul 2015 01:47:23 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Unknown-header-type-7f/m-p/1014596#M35522</guid>
      <dc:creator>Frances_R_Intel</dc:creator>
      <dc:date>2015-07-22T01:47:23Z</dc:date>
    </item>
  </channel>
</rss>

