Cannot load raw collector data

ye_f_1
Beginner

I want to use VTune to measure the TLB misses of a program that wastes a lot of time. When I run the collection, I get the following output.

wj@mcc21:~/graph/sssp-mpi$ mpirun -host mcc21 -n 1 amplxe-cl -target-system=mic-host-launch -c advanced-hotspots   -r /tmp/c ./main.cpu : -host mic0 -n 1 /home/wj/graph/main.mic
amplxe: Using target: mic-host-launch
amplxe: Analyzing data in the node-wide mode. The hostname (mcc21) will be added to the result path/name.
amplxe: Collection started. To stop the collection, either press CTRL-C or enter from another console window: amplxe-cl -r /tmp/c.mcc21 -command stop.
amplxe: Warning: To enable hardware event-base sampling, VTune Amplifier has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes

......

......

mcc21-mic0 Finish.
^C[mpiexec@mcc21] Sending Ctrl-C to processes as requested
[mpiexec@mcc21] Press Ctrl-C again to force abort
amplxe: CTRL-C signal is received.
amplxe: Error: The given command is not valid now. Please check the current state of the launcher using `status' command.
amplxe: Collection stopped.
amplxe: Error: target - ERROR: ld.so: object 'libittnotify_collector.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
amplxe: Error: target - ERROR: ld.so: object 'libittnotify_collector.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
amplxe: Using result path `/tmp/c.mcc21'
amplxe: Executing actions  8 % Loading raw data to the database                

amplxe: Error: Cannot load data file `/tmp/c.mcc21/data.0/sep2b0a4d55c700.20160411T161114.414300.tb6' (tbrw call "TBRW_dobind(tbrwFile->getHandle(), streamIndex)" failed: invalid string (97)).
amplxe: Executing actions 50 % done                                            
amplxe: Error: 0x4000001e (Cannot load raw collector data)


Then I used amplxe-cl -finalize to check the data file.

wj@mcc21:/tmp/c.mcc21$ amplxe-cl -finalize -r /tmp/c.mcc21
amplxe: Using result path `/tmp/c.mcc21'
amplxe: Executing actions 16 % Loading raw data to the database                
amplxe: Error: Cannot load data file `/tmp/c.mcc21/data.0/sep2b0a4d55c700.20160411T161114.414300.tb6' (tbrw call "TBRW_reading_section(ptr, m_sectionId)" failed: Section does not exist (78)).
amplxe: Executing actions 100 % done                                           
amplxe: Error: 0x4000001e (Cannot load raw collector data)

But when I add the -duration option, it works well. I want to collect information for the whole run, not just for a fixed time window. What should I do?

Dmitry_P_Intel1
Employee

Hello,

Do I understand correctly that you used Ctrl-C to terminate the program?

Could you please run the following check: wait until the host rank finishes so that the collection ends on its own, without pressing Ctrl-C?

It looks like the trace from the MIC target was broken, and I want to check whether this is connected to an issue with processing Ctrl-C from the host.

Thanks & Regards, Dmitry

ye_f_1
Beginner

It seems that Ctrl-C does not affect the result. I ran the program again without pressing Ctrl-C and got the same result.

wj@mcc21:~/graph/sssp-mpi$ mpirun -host mcc21 -n 1 amplxe-cl -target-system=mic-host-launch -c advanced-hotspots   -r /tmp/g  ./main.cpu : -host mic0 -n 1 /home/wj/graph/main.mic
amplxe: Using target: mic-host-launch
amplxe: Analyzing data in the node-wide mode. The hostname (mcc21) will be added to the result path/name.
amplxe: Collection started. To stop the collection, either press CTRL-C or enter from another console window: amplxe-cl -r /tmp/g.mcc21 -command stop.
amplxe: Warning: To enable hardware event-base sampling, VTune Amplifier has disabled the NMI watchdog timer. The watchdog timer will be re-enabled after collection completes.

....

...

mcc21-mic0 Finish.
amplxe: Collection stopped.
amplxe: Error: target - ERROR: ld.so: object 'libittnotify_collector.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
amplxe: Error: target - ERROR: ld.so: object 'libittnotify_collector.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
amplxe: Using result path `/tmp/g.mcc21'
amplxe: Executing actions  8 % Loading raw data to the database                
amplxe: Error: Cannot load data file `/tmp/g.mcc21/data.0/sep2b95f5c4a700.20160411T184403.378317.tb6' (tbrw call "TBRW_dobind(tbrwFile->getHandle(), streamIndex)" failed: invalid string (97)).
amplxe: Executing actions 50 % done                                            
amplxe: Error: 0x4000001e (Cannot load raw collector data)
Peter_W_Intel
Employee

The Ctrl-C handling issue has been fixed in the current VTune(TM) Amplifier XE 2016 U2 on Intel(R) Xeon(R) processors. Is there a specific issue on Intel(R) Xeon Phi(TM) processors?

This is a valuable report. Could you send the test case (main.mic?) via private message to help me reproduce this issue on my side?

By the way, advanced-hotspots doesn't provide TLB miss information.
Peter_W_Intel
Employee

Thank you for providing the test binary with the data file. You said, "I find that if I run simple program in symmetrical mode,everything is ok..." Unfortunately, I cannot run your test case on my side for some reason. Does it require a special environment, or does it use a third-party library (.so)? Please provide more information via private message.

For general MPI/MIC profiling, I wrote the steps below for others' reference:

1. Collect performance data in main.cpu (the host program), which will run offload code on the Xeon Phi

>mpirun -host mcc21 -n 1 amplxe-cl -target-system=mic-host-launch -c advanced-hotspots -search-dir=/tmp -r /tmp/r001_offload /tmp/main.cpu : -host mic0 -n 1 /tmp/main.mic

2. Collect performance data in main.mic (native)

>mpirun -host mcc21 -n 1 /tmp/main.cpu : -host mic0 -n 1 amplxe-cl -target-system=mic-native -c advanced-hotspots -search-dir=/tmp -r /tmp/r002_native /tmp/main.mic

3. You can collect all data on the MIC, whether it comes from offload code or native code (system profiling on the MIC)

>mpirun -host mcc21 -n 1 /tmp/main.cpu : -host mic0 -n 1 /tmp/main.mic   # manually launch the app first

>amplxe-cl -target-system=mic-native -c advanced-hotspots -search-dir=/tmp -r r003_sys_mic -d 60   # system profiling on the Xeon Phi

Note: if your application runs for a long time, you can use "-d 120" to stop the collection after 120 seconds instead of pressing Ctrl-C.
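
Two ways of ending a collection without Ctrl-C appear in this thread: a fixed "-d" duration, and the "-command stop" hint that the collector prints when it starts. A minimal dry-run sketch of both follows. It only prints command lines and executes nothing; the function names are hypothetical, and the flags are copied from the commands and logs in this thread rather than checked against a particular VTune release.

```shell
#!/bin/sh
# Dry-run helpers: they only print command lines, nothing is executed.
# Function names are hypothetical; flags are copied from this thread.

# Print a system-wide collection on the coprocessor that stops by itself
# after the given number of seconds.
timed_collect_cmd() {
    result_dir=$1
    seconds=$2
    echo "amplxe-cl -target-system=mic-native -c advanced-hotspots -search-dir=/tmp -r $result_dir -d $seconds"
}

# Print the command that asks a running collection to stop from another
# console window.
stop_collect_cmd() {
    result_dir=$1
    echo "amplxe-cl -r $result_dir -command stop"
}

timed_collect_cmd r003_sys_mic 120
stop_collect_cmd /tmp/c.mcc21
```

In node-wide mode the log reports that the hostname is appended to the result path, so the stop command has to name /tmp/c.mcc21 rather than the /tmp/c that was passed to -r.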

You can send me VTune's result directory via private message, and I can escalate it to the developers to investigate the issue specific to your program. Thank you.
ye_f_1
Beginner

wj@mcc21:/tmp$ mpirun -host mcc21 -n 1 /tmp/main.cpu:-host mic0 -n 1 amplxe-cl -target-system=mic-native -c advanced-hotspots -search-dir=/tmp -r /tmp/cc /tmp/main.mic
[proxy:0:0@mcc21] HYDU_create_process (../../utils/launch/launch.c:621): execvp error on file /tmp/main.cpu:-host (No such file or directory)

Then I copied amplxe-cl to the Phi with scp, but it seems that amplxe-cl cannot execute on the Phi.

wj@mcc21:/tmp$ mpirun -host mcc21 -n 1 /tmp/main.cpu : -host mic0 -n 1 amplxe-cl -target-system=mic-native -c advanced-hotspots -search-dir=/tmp -r /tmp/dd /tmp/main.mic
/home/wj/bin/amplxe-cl: /home/wj/bin/amplxe-cl: cannot execute binary file
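
The first failure above is consistent with the missing spaces around ":" in the command: the error shows mpirun trying to execute a file literally named "/tmp/main.cpu:-host". A minimal sketch of how the shell splits the two spellings into arguments (show_args is a hypothetical helper for illustration, not part of MPI or VTune):

```shell
#!/bin/sh
# Print every argument on its own line, in brackets, the way a launcher
# would receive them. show_args is a hypothetical helper for illustration.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Without spaces, the colon is glued to its neighbours: one argument.
show_args /tmp/main.cpu:-host mic0

# With spaces, the colon is an argument by itself, so the launcher can
# treat it as the separator between the host part and the MIC part.
show_args /tmp/main.cpu : -host mic0
```

The first call prints two lines, [/tmp/main.cpu:-host] and [mic0]; the second prints four, with [:] on a line of its own. The second failure ("cannot execute binary file") is a different problem, presumably because the copied amplxe-cl is a host binary.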

Peter_W_Intel
Employee

Sorry, something has to be corrected in my post of 04/14/2016 - 20:07.

I used a simple test case (pseudo-symmetric mode, attached) to try "-target-system=mic-host-launch" and "-target-system=mic-native". Copy the Intel C++ compiler's libraries for MIC and the Intel MPI (IMPI) libraries for MIC to "/lib64" and "/bin" on the Xeon Phi.
# export I_MPI_MIC=1
# export I_MPI_FABRICS=shm:tcp

# icc -g -O3 -openmp offload_pi.c -o offload_pi
# mpiicc -g -openmp -mmic -O3 omp_pi.c -o omp_pi.MIC   # then copy omp_pi.MIC to mic0:/tmp

There was no problem running:
# mpirun -host `hostname` -n 4 ./offload_pi : -host mic0 -n 4  /tmp/omp_pi.MIC

Case 1: // option "-target-system=mic-host-launch" works; it collects data for offload code on the MIC
# mpirun -host `hostname` -n 1 amplxe-cl -target-system=mic-host-launch -c advanced-hotspots -r r001host ./offload_pi : -host mic0 -n 1  /tmp/omp_pi.MIC
The result was as expected.

Case 2: // option "-target-system=mic-native" fails when collecting data for native code on the MIC
# mpirun -host `hostname` -n 1 ./offload_pi : -host mic0  -n 1 amplxe-cl -target-system=mic-native -c advanced-hotspots -r r002mic /tmp/omp_pi.MIC
[proxy:0:1@prc-mic01-mic0] HYDU_create_process (../../utils/launch/launch.c:590): execvp error on file amplxe-cl (No such file or directory)
Computed value of Pi by using MIC:  3.141592654
Elapsed time: 4.39 seconds

Even running "# mpirun -host mic0 -n 1 amplxe-cl -target-system=mic-native -c advanced-hotspots -r r002mic /tmp/omp_pi.MIC" gives the same result.

Case 3: // use system profiling on the MIC to collect data for native code
Alternatively, I used one terminal to run the collection on the MIC:
# amplxe-cl -target-system=mic-native -c advanced-hotspots -search-dir=. -r r002mic -d 15
Then I ran "mpirun -host `hostname` -n 1 ./offload_pi : -host mic0  -n 1 /tmp/omp_pi.MIC" in another terminal.

I can see performance data in the report for both offload code and native code.

I will talk with the developers to investigate why advanced-hotspots for MIC native code doesn't work directly when MPI is running on the MIC.
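
Case 3 could also be driven from a single terminal by putting the collector in the background. This is a minimal sketch, assuming the same binaries and paths as in the commands above; DRY_RUN, run, and the use of `uname -n` in place of `hostname` are illustrative choices of this sketch. With DRY_RUN=1 (the default here) the script only prints what it would run.

```shell
#!/bin/sh
# Sketch of Case 3 from one terminal. With DRY_RUN=1 (the default) commands
# are only printed; set DRY_RUN=0 on a machine with VTune and Intel MPI.
DRY_RUN=${DRY_RUN:-1}

# run: hypothetical helper that either prints or executes a command line.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start the system-wide collection on the coprocessor in the background;
# it stops on its own after 15 seconds (-d 15).
run amplxe-cl -target-system=mic-native -c advanced-hotspots \
    -search-dir=. -r r002mic -d 15 &
collector_pid=$!

# Launch the MPI job while the collector is sampling.
run mpirun -host "$(uname -n)" -n 1 ./offload_pi : -host mic0 -n 1 /tmp/omp_pi.MIC

# Wait for the background collector to exit before looking at the result.
wait "$collector_pid"
```

In a real run the collector may need a short pause before the MPI job starts so that sampling is already active, and the duration has to be long enough to cover the whole job.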

# icc --version
icc (ICC) 16.0.0 20150815
Copyright (C) 1985-2015 Intel Corporation.  All rights reserved.
# mpirun -version
Intel(R) MPI Library for Linux* OS, Version 5.0 Update 3 Build 20150128 (build id: 11250)
Copyright (C) 2003-2015, Intel Corporation. All rights reserved.
# amplxe-cl -version
Intel(R) VTune(TM) Amplifier XE 2016 Update 2 (build 444464) Command Line Tool