SCIF error

Minh_H_

Hi,
We have an application in Python that calls C/C++ code, which uses SCIF to communicate with a Fortran/C++ server on the Xeon Phi (CPU <--> Xeon Phi on the same computer). The application runs once a month. It used to work fine, but recently we got the following kernel error (see the end of this post).
To recover we have to reboot the computer; rebooting the Xeon Phi(s) alone does not resolve the problem. We would like to ask if there is any way to
resolve this issue without rebooting the computer.
   Many thanks,
     Minh

---------------------------------------------------------------------------------------------
[Mon Mar 14 11:55:19 2016] [<ffffffff81766c59>] schedule+0x29/0x70
[Mon Mar 14 11:55:19 2016] [<ffffffff817699fd>] rwsem_down_write_failed+0x1ed/0x390
[Mon Mar 14 11:55:19 2016] [<ffffffff81393093>] call_rwsem_down_write_failed+0x13/0x20
[Mon Mar 14 11:55:19 2016] [<ffffffff8176932d>] ? down_write+0x2d/0x40
[Mon Mar 14 11:55:19 2016] [<ffffffffc06c9657>] __scif_pin_pages+0xf7/0x500 [mic]
[Mon Mar 14 11:55:19 2016] [<ffffffffc06d5880>] ? micscif_nodeqp_send+0xf0/0x1c0 [mic]
[Mon Mar 14 11:55:19 2016] [<ffffffffc06caf27>] __scif_register+0x1c7/0x830 [mic]
[Mon Mar 14 11:55:19 2016] [<ffffffffc06c8c91>] ? scif_user_send+0x191/0x370 [mic]
[Mon Mar 14 11:55:19 2016] [<ffffffff810a17b0>] ? wake_up_state+0x10/0x20
[Mon Mar 14 11:55:19 2016] [<ffffffff810e3b76>] ? wake_futex+0x66/0x90
[Mon Mar 14 11:55:19 2016] [<ffffffffc06cf5b5>] scif_process_ioctl+0xc85/0xeb0 [mic]
[Mon Mar 14 11:55:19 2016] [<ffffffffc06b2c8d>] mic_ioctl+0x3d/0x60 [mic]
[Mon Mar 14 11:55:19 2016] [<ffffffff811e69d0>] do_vfs_ioctl+0x2e0/0x4c0
[Mon Mar 14 11:55:19 2016] [<ffffffff810e6e31>] ? SyS_futex+0x71/0x150
[Mon Mar 14 11:55:19 2016] [<ffffffff811e6c31>] SyS_ioctl+0x81/0xa0
[Mon Mar 14 11:55:19 2016] [<ffffffff8176aced>] system_call_fastpath+0x1a/0x1f
[Mon Mar 14 11:55:23 2016] __scif_register 2607 err -107
[Mon Mar 14 11:55:24 2016] __scif_register 2626 err -107
[Mon Mar 14 11:56:22 2016] __scif_register 2607 err -107
[Mon Mar 14 11:56:22 2016] __scif_register 2626 err -107
[Mon Mar 14 11:57:12 2016] __scif_register 2626 err -14
[Mon Mar 14 11:57:12 2016] __scif_register 2626 err -14
[Mon Mar 14 11:57:12 2016] __scif_register 2626 err -14
[Mon Mar 14 11:57:12 2016] __scif_register 2626 err -14
[Mon Mar 14 11:57:12 2016] __scif_register 2626 err -14
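
The `err` values in the log look like negated Linux errno codes, which is the usual kernel convention. Assuming that, -107 would be ENOTCONN and -14 would be EFAULT on Linux. The Python sketch below pulls these lines out of a dmesg dump and decodes them. The function names `parse_scif_errors` and `describe` are made up for illustration and are not part of SCIF or MPSS, and the number after `__scif_register` is only presumed to be a source line in the driver.

```python
import errno
import os
import re

# Matches lines such as "__scif_register 2607 err -107".
# Assumption: the middle number is a driver source line, the last is -errno.
_ERR_RE = re.compile(r"(__scif_\w+)\s+(\d+)\s+err\s+(-?\d+)")


def parse_scif_errors(log_text):
    """Return a list of (function, number, err) tuples found in log_text."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in _ERR_RE.finditer(log_text)
    ]


def describe(err):
    """Map a (possibly negated) errno value to its symbolic name and message."""
    code = abs(err)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    sample = (
        "[Mon Mar 14 11:55:23 2016] __scif_register 2607 err -107\n"
        "[Mon Mar 14 11:57:12 2016] __scif_register 2626 err -14\n"
    )
    for func, number, err in parse_scif_errors(sample):
        print(func, number, err, describe(err))
```

The symbolic names come from the platform's own errno table, so this should be run on the Linux host that produced the log.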
