Pujiang_H_Intel
Employee
301 Views

Any suggestions to avoid an internal error in Advisor?

I am trying to use Advisor (2019 Update 4) to collect a roofline profile, but I hit an internal error and Advisor hung (it did not crash). Do you have any suggestions? Thanks!

The log is attached below:

# advixe-cl -collect roofline --project-dir roofline -target-pid=426780 -stop-after=10              

advixe: Warning: The Roofline is a special batch mode of data collection. It runs two analyses one by one. There are Survey Analysis and Trip Counts Analysis with FLOP respectively.
advixe: Starting command line: advixe-cl --collect survey --project-dir roofline --stop-after 10 --target-pid 426780 --
Intel(R) Advisor Command Line Tool
Copyright (C) 2009-2019 Intel Corporation. All rights reserved.
advixe: Opening result 25 % done
advixe: Preparing frequently used data  0 % done
advixe: Preparing frequently used data 100 % done
advixe: Collection started. To stop the collection, either press CTRL-C or enter from another console window: advixe-cl -r /home/pujiang/tensorflow/roofline/e000/hs002 -command stop.
^Cadvixe: Collection detached.
advixe: Collection stopped.
advixe: Opening result 19 % Loading '418800-426780.0.trace' file
advixe: Error: Cannot load data file `/home/pujiang/tensorflow/roofline/e000/hs002/data.1/.import' ().
advixe: Opening result 21 % Resolving information for `libm.so.6'
advixe: Warning: Cannot locate debugging information for file `/lib64/libgomp.so.1'.
advixe: Warning: Cannot locate debugging information for file `/lib64/libpthread.so.0'.
advixe: Warning: Cannot locate debugging information for file `/lib64/libstdc++.so.6'.
advixe: Warning: Cannot locate debugging information for file `/lib64/libm.so.6'.
advixe: Warning: Cannot locate debugging information for file `/lib64/libc.so.6'.
advixe: Opening result 24 % Resolving information for `libtensorflow_framework.
advixe: Warning: Cannot locate debugging information for file `/root/.cache/bazel/_bazel_root/3a774584f77b95d5cd1e53968137dc39/execroot/org_tensorflow/bazel-out/k8-opt/bin/_solib_k8/_U_S_Stensorflow_Scc_Sexample_Cexample___Utensorflow/libtensorflow_framework.so'.
advixe: Opening result 25 % Resolving information for `example'
advixe: Warning: Cannot locate debugging information for file `/root/.cache/bazel/_bazel_root/3a774584f77b95d5cd1e53968137dc39/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/cc/example/example'.
advixe: Opening result 99 % done
advixe: Preparing frequently used data  0 % done
advixe: Preparing frequently used data 100 % done

Elapsed Time: 7.28s
Total CPU time: 149.93
Time in 5 vectorized loops: 3.72

advixe: Starting command line: advixe-cl --collect tripcounts --project-dir roofline --stop-after 10 --target-pid 426780 --flop --no-trip-counts --
Intel(R) Advisor Command Line Tool
Copyright (C) 2009-2019 Intel Corporation. All rights reserved.
advixe: Opening result 25 % done
advixe: Preparing frequently used data  0 % done
advixe: Preparing frequently used data 100 % done
advixe: Collection started. To stop the collection, either press CTRL-C or enter from another console window: advixe-cl -r /home/pujiang/tensorflow/roofline/e000/trc001 -command stop.
advixe: Error: An internal error has occurred. Our apologies for this inconvenience. Please gather a description of the steps leading up to the problem and contact the Intel customer support team.

7 Replies
Ruslan_M_Intel
Employee
301 Views

Could you please share your project?

Ruslan_M_Intel
Employee
301 Views

I've managed to reproduce the issue: attach/detach does not work as expected for the roofline collection. The error happens because the survey collection does not detach correctly, so the tripcounts collection fails to attach to the same PID. We need to investigate further.

Pujiang_H_Intel
Employee
301 Views

Got it. Until there is a fix, I will launch the process from the command line instead of attaching.

By the way, a question: I see many points for small functions in the roofline chart, but how do I get the roofline for a big function that spans multiple modules? (I am new to Advisor.)

Thanks!

Ruslan_M_Intel
Employee
301 Views

Are you talking about call chains? If so, you can try the "roofline" collection with stacks enabled (-stacks).

Dmitry_Dontsov
Employee
301 Views

Roofline is a combination of two collections: Survey and Tripcounts/Flops. Either collection can attach to a process on its own, but it is impossible to attach to a process that has already been profiled (and detached from) by another collector. This point is currently missing from the Advisor User Guide.

By the way, roofline in "attach to process" mode is an unusual case. Survey collects basic information, and Tripcounts/Flops collects extended information that is then matched against the Survey results. If we collect roofline in "attach to process" mode, Survey records one application state and Tripcounts/Flops records another, so the Tripcounts/Flops data cannot be matched to the Survey results.

Note: The Tripcounts/Flops collector does not support "-stacks" data collection in "attach-to-process" mode.

Dmitry_Dontsov
Employee
301 Views

何普江 (Intel) wrote:

how to get the roofline for a big function which across multiple modules?

You can use the "pause-resume" feature: put itt_resume at the beginning of the function and itt_pause at the end of the function, then start the collection in "start-paused" mode. As a result, Advisor collects data only for the function of interest.
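
A minimal sketch of that pause/resume pattern in C, in case it helps. In the ITT API header (ittnotify.h) the calls are spelled `__itt_resume()` and `__itt_pause()`. Everything else here is an assumption for illustration: `big_function`, the `USE_ITT` build switch, and the `COLLECT_*` wrapper macros are hypothetical names, and the loop just stands in for real work spread over several modules.

```c
#include <stddef.h>

/* Assumption: building with -DUSE_ITT and linking against the ittnotify
   library shipped with Advisor enables the real calls. Without that
   define, the wrappers expand to no-ops so the file still compiles. */
#ifdef USE_ITT
#include <ittnotify.h>
#define COLLECT_RESUME() __itt_resume()
#define COLLECT_PAUSE()  __itt_pause()
#else
#define COLLECT_RESUME() ((void)0)
#define COLLECT_PAUSE()  ((void)0)
#endif

/* Hypothetical "big function": a placeholder for code that calls into
   several modules. Collection is resumed on entry and paused on exit,
   so only the work in between is recorded. */
double big_function(const double *x, size_t n)
{
    COLLECT_RESUME();               /* start collecting here */

    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)  /* stand-in for the real work */
        sum += x[i] * x[i];

    COLLECT_PAUSE();                /* stop collecting before returning */
    return sum;
}
```

The application would then be launched under Advisor with collection initially paused (the "start-paused" mode mentioned above) rather than attached to by PID, so that both the Survey and Tripcounts/Flops passes see the same region.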
