I want to use VTune to measure the TLB misses of a program. The program takes a long time to run. I get the following output.
wj@mcc21:~/graph/sssp-mpi$ mpirun -host mcc21 -n 1 amplxe-cl -target-system=mic-host-launch -c advanced-hotspots -r /tmp/c ./main.cpu : -host mic0 -n 1 /home/wj/graph/main.mic
amplxe: Using target: mic-host-launch
amplxe: Analyzing data in the node-wide mode. The hostname (mcc21) will be added to the result path/name.
amplxe: Collection started. To stop the collection, either press CTRL-C or enter from another console window: amplxe-cl -r /tmp/c.mcc21 -command stop.
amplxe: Warning: To enable hardware event-base
......
......
mcc21-mic0 Finish.
^C[mpiexec@mcc21] Sending Ctrl-C to processes as requested
[mpiexec@mcc21] Press Ctrl-C again to force abort
amplxe: CTRL-C signal is received.
amplxe: Error: The given command is not valid now. Please check the current state of the launcher using `status' command.
amplxe: Collection stopped.
amplxe: Error: target - ERROR: ld.so: object 'libittnotify_collector.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
amplxe: Error: target - ERROR: ld.so: object 'libittnotify_collector.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
amplxe: Using result path `/tmp/c.mcc21'
amplxe: Executing actions 8 % Loading raw data to the database
amplxe: Error: Cannot load data file `/tmp/c.mcc21/data.0/sep2b0a4d55c700.20160411T161114.414300.tb6' (tbrw call "TBRW_dobind(tbrwFile->getHandle(), streamIndex)" failed: invalid string (97)).
amplxe: Executing actions 50 % done
amplxe: Error: 0x4000001e (Cannot load raw collector data)
Then I used amplxe-cl to check the data file.
wj@mcc21:/tmp/c.mcc21$ amplxe-cl -finalize -r /tmp/c.mcc21
amplxe: Using result path `/tmp/c.mcc21'
amplxe: Executing actions 16 % Loading raw data to the database
amplxe: Error: Cannot load data file `/tmp/c.mcc21/data.0/sep2b0a4d55c700.20160411T161114.414300.tb6' (tbrw call "TBRW_reading_section(ptr, m_sectionId)" failed: Section does not exist (78)).
amplxe: Executing actions 100 % done
amplxe: Error: 0x4000001e (Cannot load raw collector data)
But when I add the -duration option, it works well. I want information for the whole run, not just for one window of it. What should I do?
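One way to make a fixed-length collection cover the whole run is to time an unprofiled run first and pad that figure. The following is only a sketch of that idea, not a VTune feature: `padded_duration` and its 20 % default margin are assumptions made for illustration.

```shell
# padded_duration ELAPSED_SECONDS [MARGIN_PERCENT]
# Prints a collection length, in whole seconds, that outlasts a run
# which took ELAPSED_SECONDS without the profiler attached.
padded_duration() {
    elapsed="$1"
    margin="${2:-20}"   # assumed default safety margin, in percent
    # Integer arithmetic, rounded up so the window never falls short.
    echo $(( (elapsed * (100 + margin) + 99) / 100 ))
}

# Example: an unprofiled run took 95 s, so collect for 114 s,
# e.g. by passing  -duration "$(padded_duration 95)"  to amplxe-cl.
```

Profiling overhead can lengthen the run, so the margin may need to be larger than the default used here.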
Hello,
Do I understand correctly that you used Ctrl-C to terminate the program?
Could you please check the following: wait until the host rank finishes, so that the collection ends on its own without pressing Ctrl-C?
It looks like the trace from the MIC target was broken, and I want to check whether this is connected to how Ctrl-C from the host is processed.
Thanks & Regards, Dmitry
It seems that Ctrl-C does not affect the result. I ran the program again without Ctrl-C and got the same result.
wj@mcc21:~/graph/sssp-mpi$ mpirun -host mcc21 -n 1 amplxe-cl -target-system=mic-host-launch -c advanced-hotspots -r /tmp/g ./main.cpu : -host mic0 -n 1 /home/wj/graph/main.mic
amplxe: Using target: mic-host-launch
amplxe: Analyzing data in the node-wide mode. The hostname (mcc21) will be added to the result path/name.
amplxe: Collection started. To stop the collection, either press CTRL-C or enter from another console window: amplxe-cl -r /tmp/g.mcc21 -command stop.
amplxe: Warning: To enable hardware event-base
....
...
mcc21-mic0 Finish.
amplxe: Collection stopped.
amplxe: Error: target - ERROR: ld.so: object 'libittnotify_collector.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
amplxe: Error: target - ERROR: ld.so: object 'libittnotify_collector.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
amplxe: Using result path `/tmp/g.mcc21'
amplxe: Executing actions 8 % Loading raw data to the database
amplxe: Error: Cannot load data file `/tmp/g.mcc21/data.0/sep2b95f5c4a700.20160411T184403.378317.tb6' (tbrw call "TBRW_dobind(tbrwFile->getHandle(), streamIndex)" failed: invalid string (97)).
amplxe: Executing actions 50 % done
amplxe: Error: 0x4000001e (Cannot load raw collector data)
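To compare two runs like the ones above, it can help to reduce each saved console log to its distinct error lines. A minimal sketch, assuming the output of `amplxe-cl` was captured to a file; `summarize_errors` is a hypothetical helper name, and the only log format assumed is the `amplxe: Error:` prefix visible in the logs above.

```shell
# summarize_errors FILE
# Prints every distinct "amplxe: Error:" line of FILE once, prefixed by
# the number of times it occurred, most frequent first.
summarize_errors() {
    awk '/^amplxe: Error:/ { count[$0]++ }
         END { for (msg in count) printf "%d\t%s\n", count[msg], msg }' "$1" |
        sort -rn
}

# Usage: summarize_errors run_with_ctrl_c.log
#        summarize_errors run_without_ctrl_c.log
```

If both summaries list the same messages, as they do for the two logs in this thread, the Ctrl-C handling is unlikely to be the cause.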
The Ctrl-C handling issue has been fixed in the current VTune(TM) Amplifier XE 2016 U2 on Intel® Xeon® processors. Is there a specific issue on the Intel® Xeon® Phi™ processor?
This is a valuable report. Could you send a test case (main.mic?) via private message to help me reproduce this issue on my side?
By the way, advanced-hotspots doesn't provide TLB miss info.
Thank you for providing the test binary and data file. You said, "I find that if I run simple program in symmetrical mode,everything is ok...". Unfortunately, I cannot run your test case on my side. Does it require a special environment, or use a third-party library (.so)? Please provide more info via private message.
For general MPI/MIC profiling, I wrote the steps below for others' reference:
1. Collect performance data in main.cpu (host program), which will run offload code on Xeon Phi
>mpirun -host mcc21 -n 1 amplxe-cl -target-system=mic-host-launch -c advanced-hotspots -search-dir=/tmp -r /tmp/r001_offload /tmp/main.cpu : -host mic0 -n 1 /tmp/main.mic
2. Collect performance data in main.mic (native)
>mpirun -host mcc21 -n 1 /tmp/main.cpu : -host mic0 -n 1 amplxe-cl -target-system=mic-native -c advanced-hotspots -search-dir=/tmp -r /tmp/r002_native /tmp/main.mic
3. Alternatively, collect all data on MIC, whether it comes from offload code or native code (system profiling on MIC). Launch the application manually first, then start system profiling on the Xeon Phi:
>mpirun -host mcc21 -n 1 /tmp/main.cpu : -host mic0 -n 1 /tmp/main.mic
>amplxe-cl -target-system=mic-native -c advanced-hotspots -search-dir=/tmp -r r003_sys_mic -d 60
Note: if your application runs for a long time, you can use "-d 120" to stop the collection instead of pressing Ctrl-C.
You can send me VTune's result directory via private message, and I can escalate to the developers to investigate the issue specific to your program. Thank you.
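In commands like these, mpirun separates the argument sets for the host and for mic0 with a colon that has to be a word of its own. Written as `main.cpu:-host`, the colon becomes part of the executable name. The sketch below only illustrates that spacing rule; `mpmd_join` is a hypothetical helper, not part of Intel MPI.

```shell
# mpmd_join ARGSET [ARGSET...]
# Joins per-host argument sets with a standalone " : " token and prints
# the resulting argument string for mpirun.
mpmd_join() {
    joined=""
    for argset in "$@"; do
        if [ -z "$joined" ]; then
            joined="$argset"
        else
            joined="$joined : $argset"
        fi
    done
    printf '%s\n' "$joined"
}

# Usage (relies on word splitting, so paths must not contain spaces):
#   mpirun $(mpmd_join "-host mcc21 -n 1 /tmp/main.cpu" \
#                      "-host mic0 -n 1 /tmp/main.mic")
```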
wj@mcc21:/tmp$ mpirun -host mcc21 -n 1 /tmp/main.cpu:-host mic0 -n 1 amplxe-cl -target-system=mic-native -c advanced-hotspots -search-dir=/tmp -r /tmp/cc /tmp/main.mic
[proxy:0:0@mcc21] HYDU_create_process (../../utils/launch/launch.c:621): execvp error on file /tmp/main.cpu:-host (No such file or directory)
Then I copied amplxe-cl to the Phi with scp, but it seems that amplxe-cl can't execute on the Phi.
wj@mcc21:/tmp$ mpirun -host mcc21 -n 1 /tmp/main.cpu : -host mic0 -n 1 amplxe-cl -target-system=mic-native -c advanced-hotspots -search-dir=/tmp -r /tmp/dd /tmp/main.mic
/home/wj/bin/amplxe-cl: /home/wj/bin/amplxe-cl: cannot execute binary file
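"cannot execute binary file" most likely means the copied amplxe-cl is a host (x86-64) executable, which the coprocessor cannot run. One way to check which architecture a binary was built for is to read the `e_machine` field of its ELF header. A minimal sketch, assuming a little-endian ELF file and a POSIX `od`; the function names are made up for illustration, and the values 62 (x86-64) and 181 (K1OM, the first-generation Xeon Phi coprocessor) come from the ELF machine-type list.

```shell
# elf_machine FILE
# Prints the ELF e_machine value of FILE as a decimal number.
elf_machine() {
    # e_machine is a 2-byte field at byte offset 18 of the ELF header.
    # Read the two bytes separately so the result does not depend on
    # the byte order of the machine running od.
    set -- $(od -An -tu1 -j18 -N2 "$1")
    echo $(( $1 + 256 * $2 ))
}

# elf_arch_name FILE
# Maps the two machine types relevant to this thread to readable names.
elf_arch_name() {
    case "$(elf_machine "$1")" in
        62)  echo "x86-64 (host)" ;;
        181) echo "k1om (Xeon Phi coprocessor)" ;;
        *)   echo "other" ;;
    esac
}

# Usage: elf_arch_name /home/wj/bin/amplxe-cl
```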
Sorry, something has to be corrected in my post of 04/14/2016 - 20:07.
I used a simple test case (pseudo-symmetric mode, attached) to try "-target-system=mic-host-launch" and "-target-system=mic-native". I copied the Intel C++ compiler's libraries for MIC and IMPI's libraries for MIC to "/lib64" and "/bin" on the Xeon Phi.
# export I_MPI_MIC=1
# export I_MPI_FABRICS=shm:tcp
# icc -g -O3 -openmp offload_pi.c -o offload_pi
# mpiicc -g -openmp -mmic -O3 omp_pi.c -o omp_pi.MIC
Then copy omp_pi.MIC to mic0:/tmp.
The following ran without problems:
# mpirun -host `hostname` -n 4 ./offload_pi : -host mic0 -n 4 /tmp/omp_pi.MIC
Case 1: // option "-target-system=mic-host-launch" works; it collects data for offload code on MIC
# mpirun -host `hostname` -n 1 amplxe-cl -target-system=mic-host-launch -c advanced-hotspots -r r001host ./offload_pi : -host mic0 -n 1 /tmp/omp_pi.MIC
The result was as expected.
Case 2: // option "-target-system=mic-native" failed; it should collect data for native code on MIC
# mpirun -host `hostname` -n 1 ./offload_pi : -host mic0 -n 1 amplxe-cl -target-system=mic-native -c advanced-hotspots -r r002mic /tmp/omp_pi.MIC
[proxy:0:1@prc-mic01-mic0] HYDU_create_process (../../utils/launch/launch.c:590): execvp error on file amplxe-cl (No such file or directory)
Computed value of Pi by using MIC: 3.141592654
Elapsed time: 4.39 seconds
Even running "# mpirun -host mic0 -n 1 amplxe-cl -target-system=mic-native -c advanced-hotspots -r r002mic /tmp/omp_pi.MIC" gives the same result.
Case 3: // use system profiling on MIC to collect data for native code
Alternatively, I used one terminal to collect data on MIC:
# amplxe-cl -target-system=mic-native -c advanced-hotspots -search-dir=. -r r002mic -d 15
Then I ran "mpirun -host `hostname` -n 1 ./offload_pi : -host mic0 -n 1 /tmp/omp_pi.MIC" in another terminal.
I can see performance data in the report for both offload code and native code.
I will talk with the developers to investigate why advanced-hotspots for MIC native code doesn't work when MPI launches it directly on MIC.
# icc --version
icc (ICC) 16.0.0 20150815
Copyright (C) 1985-2015 Intel Corporation. All rights reserved.
# mpirun -version
Intel(R) MPI Library for Linux* OS, Version 5.0 Update 3 Build 20150128 (build id: 11250)
Copyright (C) 2003-2015, Intel Corporation. All rights reserved.
# amplxe-cl -version
Intel(R) VTune(TM) Amplifier XE 2016 Update 2 (build 444464) Command Line Tool
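The two-terminal workflow of Case 3 can also be driven from one script by starting the collector in the background. This is only a sketch under assumptions: `run`, `profile_case3` and the `DRY_RUN` switch are hypothetical names, and the commands are the ones quoted in Case 3. With `DRY_RUN=1`, the default here, the script prints the commands instead of executing them.

```shell
# DRY_RUN=1 prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# profile_case3 DURATION_SECONDS RESULT_DIR HOST
profile_case3() {
    duration="$1"     # must be long enough to cover the whole MPI run
    result_dir="$2"
    host="$3"
    # Start system-wide collection on the coprocessor in the background.
    run amplxe-cl -target-system=mic-native -c advanced-hotspots \
        -search-dir=. -r "$result_dir" -d "$duration" &
    collector_pid=$!
    # Launch the MPI job. Nothing here guarantees that the collector is
    # already sampling, so the first moments of the run may be missed.
    run mpirun -host "$host" -n 1 ./offload_pi : -host mic0 -n 1 /tmp/omp_pi.MIC
    # Wait for the fixed-length collection to finish.
    wait "$collector_pid"
}

# Usage: DRY_RUN=0 profile_case3 15 r002mic "$(hostname)"
```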