Hello,
I am running on an HPC cluster using MPI.
When I run vtune from the shell script that launches the app, I get complete function information. For example,
This works fine. I am able to download my results from a Linux SSH session (inside VS Code) to my Windows machine, then load and analyze the data. Function, module, and all the relevant allocation/deallocation information I need is there.
I have a problem, though. The actual scenario I want to analyze is a large task that takes ~3000 seconds without analysis attached. (It is running out of memory and I am investigating a leak. There is 115 GB of memory available.)
If I run vtune from the beginning, the run takes a very long time to reach the problem state where I want to start analysis. I found that vtune can attach to a running process, so I logged into one of the nodes and attached mid-run. Below is the exact command I executed. The symbols are built into the executable; I have checked. Attaching mid-run avoids the long wait to reach the problem state.
However, when I detach and analyze the data I see no function information. It says [unknown].
My preliminary investigation led me to try telling vtune where the symbols are (in the executable) with -search-dir, but that has not made a difference. See the attached image.
Why is my function/module information disappearing when I attach to the process on the compute nodes vs when I launch vtune from the very beginning?
vtune -verbose -collect memory-consumption -data-limit=0 -r ./.cn45 -target-pid=81076 -search-dir=/home/coby.soss/spareClone/595DS_FWL_C/3WL
vtune: Collection started. To stop the collection, either press CTRL-C or enter from another console window: vtune -r /home/coby.soss/spareClone/595DS_FWL_C/3WL/.cn45 -command stop.
^Cvtune: Collection detached.
vtune: Collection stopped.
vtune: Using result path `/home/coby.soss/spareClone/595DS_FWL_C/3WL/.cn45'
vtune: Executing actions 7 % Clearing the database
vtune: The database has been cleared, elapsed time is 0.768 seconds.
vtune: Executing actions 14 % Updating precomputed scalar metrics
vtune: Raw data has been loaded to the database, elapsed time is 0.263 seconds.
vtune: Executing actions 19 % Processing profile metrics and debug information
vtune: Data transformations have been finished, elapsed time is 0.019 seconds.
vtune: Executing actions 26 % Resolving interrupt name information
vtune: Symbol resolution has been finished, elapsed time is 0.086 seconds.
vtune: Executing actions 28 % Processing profile metrics and debug information
vtune: Deferred data transformations have been finished, elapsed time is 0.020 seconds.
vtune: Executing actions 28 % Setting data model parameters
vtune: Data model parameters have been set, elapsed time is 0.025 seconds.
vtune: Executing actions 35 % Updating precomputed scalar metrics
vtune: Precomputing frequently used data has been finished, elapsed time is 0.108 seconds.
vtune: Executing actions 41 % Saving the result
vtune: Redundant overtime data has been discarded, elapsed time is 0.006 seconds.
vtune: Raw collector data has been discarded, elapsed time is 0.000 seconds.
vtune: Executing actions 50 % Saving the result
vtune: Finalizing the result took 3.500 seconds.
vtune: Executing actions 75 % Setting data model parameters
vtune: Knob values have been set, elapsed time is 0.003 seconds.
vtune: Executing actions 75 % Generating a report
Collection and Platform Info
----------------------------
Parameter .cn45
------------------------ -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Application Command Line
Operating System 3.10.0-693.11.6.el7.x86_64 NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
Computer Name cn45
Result Size 3175430
Collection start time 09:00:00 29/01/2024 UTC
Collection stop time 09:05:39 29/01/2024 UTC
Collector Type User-mode sampling and tracing
CPU
---
Parameter .cn45
----------------- ------------------------------------------------------
Name Intel(R) Xeon(R) E5/E7 v3 Processor code named Haswell
Frequency 2593995593
Logical CPU Count 20
Summary
-------
Elapsed Time: 338.833
Paused Time: 0.0
vtune: Executing actions 100 % done
To confirm: in order to avoid unnecessary collection, you profile the workload on a specific node by attaching mid-run with the command below, right?
vtune -verbose -collect memory-consumption -data-limit=0 -r ./.cn45 -target-pid=81076 -search-dir=/home/coby.soss/spareClone/595DS_FWL_C/3WL
Also, please upload the VTune result data. Thanks.
Yes. I actually attach on both nodes at roughly the same time, issuing the same command in each shell (with a different data folder, of course).
I have attached the data.
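The per-node attach described above can be sketched as a small helper that derives a separate result directory from the node name. This is only an illustration based on the command posted earlier in the thread; the function name `attach_cmd_for_node` and its argument order are hypothetical, and the helper only prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the attach command for one node, using a
# per-node result directory (./.<node>) like the ./.cn45 directory in the
# original command. It does not run vtune itself.
attach_cmd_for_node() {
    local pid="$1"                      # PID of the running MPI rank on this node
    local search_dir="$2"               # directory containing the binary with symbols
    local node="${3:-$(hostname -s)}"   # defaults to the current host's short name
    echo "vtune -collect memory-consumption -data-limit=0 -r ./.${node} -target-pid=${pid} -search-dir=${search_dir}"
}
```

Calling `attach_cmd_for_node 81076 /home/coby.soss/spareClone/595DS_FWL_C/3WL cn45` prints the same options as the command used on cn45, minus `-verbose`; run the printed command in a shell on that node.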
If you know the point in time at which profiling should start, you can use the -start-paused option together with the resume command, as in the example below. In this case, VTune attaches with collection paused.
$ vtune -start-paused -collect hotspots --target-pid 537834
When you want to start profiling, run the command below in another console window to resume collection manually, where <result_dir> is the result directory of the running collection.
$ vtune -r <result_dir> -command resume
https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2024-0/start-paused.html
You can also combine -start-paused with -resume-after to resume collection automatically.
https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2024-0/resume-after.html
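The attach-paused and resume workflow above can be sketched as a short script. This is a minimal sketch, not a verified recipe: the variable names (`RESULT_DIR`, `TARGET_PID`, `ANALYSIS`, `DRY_RUN`) and the function names are assumptions made for this example, and the `-resume-after` value is assumed to be in seconds (check the linked page for your VTune version). By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: attach VTune to a running process with collection paused,
# then resume it manually or after a delay.
# All variable and function names here are hypothetical.
RESULT_DIR="${RESULT_DIR:-./r_attach}"   # result directory shared by all commands
TARGET_PID="${TARGET_PID:-537834}"       # PID from the example in this thread
ANALYSIS="${ANALYSIS:-hotspots}"         # analysis type passed to -collect
DRY_RUN="${DRY_RUN:-1}"                  # 1 = only print commands, 0 = execute them

run() {
    # Print the command line; execute it only when DRY_RUN=0.
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

attach_paused() {
    # Attach with collection paused. When executed, vtune keeps running in
    # this shell until the collection is stopped.
    run vtune -collect "$ANALYSIS" -start-paused -r "$RESULT_DIR" -target-pid="$TARGET_PID"
}

attach_resume_after() {
    # Attach paused and resume automatically after $1 (assumed seconds).
    run vtune -collect "$ANALYSIS" -start-paused -resume-after="$1" -r "$RESULT_DIR" -target-pid="$TARGET_PID"
}

resume_collection() {
    # Run from a second console to resume the paused collection.
    run vtune -r "$RESULT_DIR" -command resume
}

stop_collection() {
    # Run from a second console to stop the collection and finalize the result.
    run vtune -r "$RESULT_DIR" -command stop
}
```

In practice, `attach_paused` would run in one console on the compute node, with `resume_collection` and later `stop_collection` issued from a second console once the job reaches the state of interest.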
If you know the exact code region to profile, you can use the ITT API, which lets the application control when VTune collects data.