Amlesh_K_
Beginner
94 Views

VTune with Xeon Phi shows internal error

I am trying to analyse an offload application using VTune. I have put the following command in the job script, along with mpirun:

amplxe-cl -target-system=mic-host-launch -collect advanced-hotspots -r xeonPhiAnalysis

I also added the following line to my .bashrc and sourced it:

source /opt/intel/vtune_amplifier_xe_2015.2.0.393444/amplxe-vars.sh

 

I am submitting the job through PBS with 1 MPI process. (When I log on to the node and run "top", it doesn't show the code running.) The job runs for about 1 minute under PBS and then I get the following error:

Copyright (C) 2009-2014 Intel Corporation. All rights reserved.
Intel(R) VTune(TM) Amplifier XE 2015 (build 393444)
amplxe: Using target: mic-host-launch:
amplxe: The analyzed process has a rank of 0. This rank will be added to the result path/name.
amplxe: Collection started. To stop the collection, either press CTRL-C or enter from another console window: amplxe-cl -r /storage/home/amlesh/optimize/cesm1_2_2/cases/opt1case/run/xeonPhiAnalysis.0 -command stop.
amplxe: Warning: To enable hardware event-base sampling, VTune Amplifier has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.
amplxe: Internal Error

 

4 Replies
Dmitry_P_Intel1
Employee

Hello Amlesh,

Is there any chance you could upgrade VTune to a newer version, e.g. VTune Amplifier XE 2016 Gold? We have made a number of fixes since then, for both stability and performance.

BTW, are you able to do a native application collection on the node with the card? Something like:

amplxe-cl -target-system=mic-native -collect advanced-hotspots ./bin/ls

just to check that the card side is installed correctly? Also, how many cards do you have per node?

Thanks & Regards, Dmitry

Amlesh_K_
Beginner

Hi Dmitry,

We may be able to upgrade to VTune 2016 Gold, but it will take time. I have not tried a native run with my application (CESM); I am only doing offloading. We have run experiments on the Xeon Phi with sample codes, and they show the expected results.

We have 2 Xeon Phi cards per node.

 

Thanks.

Dmitry_P_Intel1
Employee

Hello Amlesh,

Could you please run:

<VTune_install_dir>/bin64/sep -version -mic

and provide the output.

Also, just in case, please specify the card number explicitly, like:

amplxe-cl -target-system=mic-host-launch:0 -collect advanced-hotspots -r xeonPhiAnalysis

Thanks & Regards, Dmitry

Amlesh_K_
Beginner

Hi Dmitry,

This is what I get on the master node:

Sampling Enabling Product version: 3.15 (private) built by patbbinn on Jan 30 2015 02:39:55
SEP User Mode Version: 3.15.5
There is no MIC card exist or online.

 

This is what I get on one of the nodes (the one on which I am running my application):

Sampling Enabling Product version: 3.15 (private) built by patbbinn on Jan 30 2015 02:39:55
SEP User Mode Version: 3.15.5
mic 0 (node1-mic0): SEP driver version 3.15.5
mic 1 (node1-mic1): SEP driver version 3.15.5

 

This is what I get on the rest of the nodes:

Sampling Enabling Product version: 3.15 (private) built by patbbinn on Jan 30 2015 02:39:55
SEP User Mode Version: 3.15.5
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card <1:1047>, errno 111
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card, make sure sep_mic_server is running on the card, retrying connection...
ERROR connecting to MIC card <1:1047>, errno 111
mic 0: Error retrieving SEP driver version

 

I also ran with the MIC number specified explicitly, but it shows the same error message that I posted.

 

Thanks
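
For anyone checking many nodes the same way, the three kinds of `sep -version -mic` output quoted above can be told apart with a small shell helper. This is only a sketch: the function name `classify_sep_output` and its labels are made up for illustration, and the matched strings are copied from the SEP 3.15 output in this thread, so other versions may word them differently. On Linux, errno 111 is ECONNREFUSED, which is consistent with the message's own hint that sep_mic_server is not running on the card.

```shell
#!/bin/sh
# Hypothetical helper: classify the text printed by `sep -version -mic`,
# read from stdin. The patterns are taken from the outputs quoted in this
# thread (SEP 3.15.5) and are not guaranteed for other versions.
classify_sep_output() {
    out=$(cat)
    case "$out" in
        # Host with no card installed or online (the master node above).
        *"There is no MIC card exist or online"*)
            echo "no-card" ;;
        # Card present but the host cannot reach sep_mic_server on it.
        # Checked before the success pattern, because the failure output
        # ends with "Error retrieving SEP driver version".
        *"ERROR connecting to MIC card"*)
            echo "server-unreachable" ;;
        # At least one card reported its SEP driver version.
        *"SEP driver version"*)
            echo "ok" ;;
        *)
            echo "unknown" ;;
    esac
}
```

It could be used on each node as `<VTune_install_dir>/bin64/sep -version -mic 2>&1 | classify_sep_output`. Nodes that report `server-unreachable` are the ones to investigate before attempting a collection there.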

 
