
Attaching to a process does not have the same function/module/etc information as normal run

Coby
Beginner

Hello,

I am running on an HPC cluster using MPI.

When I run vtune from the shell script that launches the app, I get function information as expected. For example:

 

mpirun --bind-to none vtune -collect memory-consumption -r=./cn45 ./test.exe $directory1/$file1 $directory2/$file2

 

This works fine. I can download the results from the Linux host over SSH (inside VS Code) to my Windows machine, load them, and analyze the data. The function, module, and allocation/deallocation information I need is all there.

 

I have a problem though. The scenario I actually want to analyze is a large task that takes ~3000 seconds without analysis attached. (It is running out of memory and I am investigating a leak. There is 115 GB of memory available.)

If I run vtune from the beginning, the analysis takes a very long time to reach the problem state where I want to start collecting. I found that it is possible to attach to a running process, so I logged into one of the nodes and attached mid-run. The exact command I executed is shown below. The symbols are built into the exe; I've checked. Attaching mid-run solves the problem of having to wait forever to reach the problem state.

However, when I detach and analyze the data, I see no function information. Every entry is shown as [unknown].

My preliminary investigation led me to try telling vtune where the symbols are (in the exe), but that has not made a difference. See the attached image.

 

Why does my function/module information disappear when I attach to the process on the compute nodes, compared with launching vtune from the very beginning?

 

vtune -verbose -collect memory-consumption -data-limit=0 -r ./.cn45 -target-pid=81076 -search-dir=/home/coby.soss/spareClone/595DS_FWL_C/3WL
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /home/coby.soss/spareClone/595DS_FWL_C/3WL/.cn45 -command stop.
^Cvtune: Collection detached.
vtune: Collection stopped.
vtune: Using result path `/home/coby.soss/spareClone/595DS_FWL_C/3WL/.cn45'
vtune: Executing actions 7 % Clearing the database
vtune: The database has been cleared, elapsed time is 0.768 seconds.
vtune: Executing actions 14 % Updating precomputed scalar metrics
vtune: Raw data has been loaded to the database, elapsed time is 0.263 seconds.
vtune: Executing actions 19 % Processing profile metrics and debug information
vtune: Data transformations have been finished, elapsed time is 0.019 seconds.
vtune: Executing actions 26 % Resolving interrupt name information
vtune: Symbol resolution has been finished, elapsed time is 0.086 seconds.
vtune: Executing actions 28 % Processing profile metrics and debug information
vtune: Deferred data transformations have been finished, elapsed time is 0.020 seconds.
vtune: Executing actions 28 % Setting data model parameters
vtune: Data model parameters have been set, elapsed time is 0.025 seconds.
vtune: Executing actions 35 % Updating precomputed scalar metrics
vtune: Precomputing frequently used data has been finished, elapsed time is 0.108 seconds.
vtune: Executing actions 41 % Saving the result
vtune: Redundant overtime data has been discarded, elapsed time is 0.006 seconds.
vtune: Raw collector data has been discarded, elapsed time is 0.000 seconds.
vtune: Executing actions 50 % Saving the result
vtune: Finalizing the result took 3.500 seconds.
vtune: Executing actions 75 % Setting data model parameters
vtune: Knob values have been set, elapsed time is 0.003 seconds.
vtune: Executing actions 75 % Generating a report
Collection and Platform Info
----------------------------
Parameter .cn45
------------------------ -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Application Command Line
Operating System 3.10.0-693.11.6.el7.x86_64 NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

Computer Name cn45
Result Size 3175430
Collection start time 09:00:00 29/01/2024 UTC
Collection stop time 09:05:39 29/01/2024 UTC
Collector Type User-mode sampling and tracing

CPU
---
Parameter .cn45
----------------- ------------------------------------------------------
Name Intel(R) Xeon(R) E5/E7 v3 Processor code named Haswell
Frequency 2593995593
Logical CPU Count 20

Summary
-------
Elapsed Time: 338.833
Paused Time: 0.0
vtune: Executing actions 100 % done
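One check that may help narrow down a report like the one above is to confirm which binary the attached PID is actually executing and which modules it has mapped, since under MPI the PID seen on a node can belong to a launcher or wrapper rather than to test.exe. The sketch below is Linux-only and relies on /proc. `describe_pid` is a hypothetical helper name, and this is a diagnostic aid under those assumptions, not a confirmed explanation of the [unknown] entries.

```shell
#!/bin/sh
# Print the executable and the distinct file-backed modules mapped by a PID.
# Linux-only: relies on /proc. describe_pid is a hypothetical helper name.
describe_pid() {
    pid="$1"
    if [ ! -d "/proc/$pid" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # Absolute path of the binary the process is executing.
    echo "exe: $(readlink "/proc/$pid/exe")"
    # Field 6 of /proc/<pid>/maps is the backing file, when there is one.
    # (Paths containing spaces are truncated at the first space.)
    awk '$6 ~ /^\// { print $6 }' "/proc/$pid/maps" | sort -u | sed 's/^/module: /'
}

# Example: inspect the current shell. On the cluster, pass the PID given to -target-pid.
describe_pid "$$"
```

If the `exe:` line does not point at the application binary, the PID being attached to is probably not the process that performs the allocations.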

 

 

3 Replies
yuzhang3_intel
Moderator

To confirm: in order to avoid unnecessary collection, you profile the workload on a specific node by attaching mid-run with the command below, right?

vtune -verbose -collect memory-consumption -data-limit=0 -r ./.cn45 -target-pid=81076 -search-dir=/home/coby.soss/spareClone/595DS_FWL_C/3WL

 

Additionally, please upload the VTune result data. Thanks.

Coby
Beginner

Yes. I actually attach on both nodes at roughly the same time, issuing the same command in each shell (with a different data folder, of course).

I have attached the data.

yuzhang3_intel
Moderator

If you know the point in time at which profiling needs to start, you can try the -start-paused option together with the resume command, as in the example below. In this case, VTune launches with collection paused.

$ vtune -start-paused -collect hotspots --target-pid 537834

 

When you want to start profiling, run the command below in another console window to resume collection manually.

$ vtune -command resume

https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2024-0/start-paused.html
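The pause-then-resume flow described above can be sketched as a pair of commands issued from two consoles. This is a dry-run sketch that only prints the commands. `RESULT_DIR` and the helper names are hypothetical, the PID is the one from the original post, and passing -r to the control command is an assumption based on the `vtune -r <dir> -command stop` hint that VTune printed in the log earlier in this thread. Whether -start-paused behaves the same way for memory-consumption as for hotspots has not been verified here.

```shell
#!/bin/sh
# Dry-run sketch: prints the VTune commands instead of running them.
# RESULT_DIR and the helper names are hypothetical.
RESULT_DIR="${RESULT_DIR:-./r_paused}"

paused_attach_cmd() {
    # $1 = PID of the running rank; collection starts in the paused state.
    echo "vtune -collect hotspots -start-paused -r $RESULT_DIR -target-pid=$1"
}

control_cmd() {
    # $1 = resume | pause | stop, sent from a second console
    # against the same result directory.
    case "$1" in
        resume|pause|stop) echo "vtune -r $RESULT_DIR -command $1" ;;
        *) echo "unknown command: $1" >&2; return 1 ;;
    esac
}

paused_attach_cmd 81076   # console 1: attach with collection paused
control_cmd resume        # console 2: start collecting at the point of interest
control_cmd stop          # console 2: end the collection
```

Printing the commands first makes it easy to confirm that the result directory matches between the two consoles before anything is run on a compute node.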

 

You can also combine -start-paused with -resume-after to resume collection automatically.

https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2024-0/resume-after.html
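For the run in the original post, which takes roughly 3000 seconds to reach the problem state, the automatic variant might look like the dry-run sketch below. It only prints a launch command. `delayed_launch_cmd` is a hypothetical helper, the 2500 value is an illustrative guess, and the delay is assumed to be in seconds, which should be checked against the linked -resume-after page. The run still goes through vtune from the start, so the timing will not match an uninstrumented run exactly.

```shell
#!/bin/sh
# Dry-run sketch: prints a launch command; nothing is executed.
# delayed_launch_cmd is a hypothetical helper; the delay is assumed to be in seconds.
delayed_launch_cmd() {
    delay="$1"; shift
    # Reject an empty delay or one containing characters other than digits and dots.
    case "$delay" in
        ''|*[!0-9.]*) echo "delay must be a number: $delay" >&2; return 1 ;;
    esac
    # Remaining arguments are the application and its inputs.
    echo "mpirun --bind-to none vtune -collect memory-consumption -start-paused -resume-after=$delay -r ./cn45 -- $*"
}

# Example: stay paused for the first 2500 seconds, then collect.
delayed_launch_cmd 2500 ./test.exe input1 input2
```

Because this launches the application under vtune from the beginning rather than attaching mid-run, it follows the same path as the launch that produced full function and module information in the original post.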

 

If you know the exact code region to profile, you can try the ITT API (itt_api), which lets the application control when VTune collects data.

https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2024-0/collection-control-api.html

 

 
