General Exploration analysis on Intel Xeon Phi does not work

silvio_stanzani

We are trying to run a General Exploration analysis on the matrix multiplication example that ships with VTune (attached).

The run fails with the following error:
 
abstract_Reserve_PMU ret error
abstract_Release_PMU ret error
Invalid error code
amplxe: Collection failed.
amplxe: Internal Error


The users' home directories (including root's) are shared with the Xeon Phi over NFS, so the binary is available on the card and can be executed:

ssh mic3 env LD_LIBRARY_PATH=/opt/intel/lib/mic ~/matrix/linux/matrix.mic
 
with the output:
Threads #: 240 OpenMP threads
Matrix size: 512
Using multiply kernel: multiply3
Execution time = 0.432 seconds
Freq = 1.238094 GHz


We've successfully compiled the vtsspp and sep3 drivers, and they are loaded in the card's OS:

[silvio@phi02 linux]$ ssh mic3 lsmod |egrep "sep|vtss"
vtsspp                315729  0
sep4_0                610187  2
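
For anyone repeating this check, here is a minimal sketch of a filter that pulls the collector modules and their use counts out of `lsmod` output. The function name `loaded_collectors` is hypothetical, and it assumes the module names start with "sep" or "vtss", as they do in the listing above.

```shell
# Hypothetical helper (not part of VTune): reads `lsmod` output on stdin and
# prints the name and use count of every module whose name starts with
# "sep" or "vtss". The name prefixes are an assumption based on the listing.
loaded_collectors() {
    awk '$1 ~ /^(sep|vtss)/ { printf "%s used_by=%s\n", $1, $3 }'
}

# Usage (mic3 is the card host name used in this post):
#   ssh mic3 lsmod | loaded_collectors
```

The third `lsmod` column is the module's use count, so this only shows that the modules are loaded and whether something currently holds them; it says nothing about permissions.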


We believe this is probably permission-related, but we still cannot find what is wrong.

After logging in to the mic, we noticed that a file named /sep3.15/sep_mic_server.txt contains connection error messages:

ERROR Reading data from socket: Connection reset by peer
ERROR reading header from socket: Connection reset by peer

At first we thought this was a firewall issue, but we tried again with the firewall disabled and got the same result.

We have already tried recompiling the vtsspp and sep drivers, as suggested in some threads on the Intel forums, and we still get the same issue.

The Intel Xeon Phi OS reports the following in its /var/log/messages (I don't know whether it is an error):

Jul 21 14:46:03 phi02-mic3 user.info kernel: [1290689.232827] sep4_0: LWPMUDRV_IOCTL_VERSION
Jul 21 14:46:46 phi02-mic3 user.info kernel: [1290732.228897] sep4_0: LWPMUDRV_IOCTL_VERSION
Jul 21 14:47:29 phi02-mic3 user.info kernel: [1290775.311339] sep4_0: LWPMUDRV_IOCTL_VERSION
Jul 21 14:47:46 phi02-mic3 user.info kernel: [1290791.547044] sep4_0: LWPMUDRV_IOCTL_VERSION
Jul 21 14:48:14 phi02-mic3 user.info kernel: [1290820.433531] sep4_0: LWPMUDRV_IOCTL_VERSION


This looks like an issue with the version of the sep driver, but running the command /opt/intel/vtune_amplifier_xe_2016.3.0.463186/bin64/sep -version -mic gives the following output:
 
Sampling Enabling Product version: 4.0 (private) built by patbbinn on Apr 20 2016 02:39:28
SEP User Mode Version: 4.0.0
mic 0 (phi02-mic0.ncc.unesp.br): SEP driver version 4.0.0
mic 1 (phi02-mic1.ncc.unesp.br): SEP driver version 4.0.0
mic 2 (phi02-mic2.ncc.unesp.br): SEP driver version 4.0.0
mic 3 (phi02-mic3.ncc.unesp.br): SEP driver version 4.0.0
mic 4 (phi02-mic4.ncc.unesp.br): SEP driver version 4.0.0
 

which suggests that the driver versions are OK.
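
The comparison we did by eye can be scripted. This is only a sketch: the function name `sep_versions_match` is hypothetical, and the parsing assumes the output keeps the exact format shown above (one "SEP User Mode Version:" line, and one "SEP driver version" line per card).

```shell
# Hypothetical helper (not part of VTune): reads `sep -version -mic` output on
# stdin and checks that every card's driver version equals the user-mode
# version. The line formats are assumed from the output quoted in this post.
sep_versions_match() {
    awk '
        /^SEP User Mode Version:/ { user = $NF }        # host-side version
        /SEP driver version/      { drv[$NF]++; n++ }   # one line per card
        END {
            if (user == "" || n == 0) { print "incomplete output"; exit 2 }
            for (v in drv)
                if (v != user) {
                    print "mismatch: driver " v " vs user mode " user
                    exit 1
                }
            print "ok: " n " cards at " user
        }'
}

# Usage:
#   /opt/intel/vtune_amplifier_xe_2016.3.0.463186/bin64/sep -version -mic | sep_versions_match
```

The function exits with status 0 when all versions agree, 1 on a mismatch, and 2 when the expected lines are missing, so it can be used in a script.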

We are using Intel Parallel Studio Cluster Edition 2016 Update 3 on a CentOS 7.2.1511 machine with kernel 3.10.0-327.18.2.el7.x86_64.

You'll find attached the log files generated by the VTune run and by the application.
