Hi,
I am getting the warning below when running the GPU Hotspots analysis with VTune on a SYCL application:
vtune: Warning: Cannot locate debugging information for the Linux kernel. Source-level analysis will not be possible. Function-level analysis will be limited to kernel symbol tables. See the Enabling Linux Kernel Analysis topic in the product online help for instructions.
I tried enabling debug information as mentioned in the post, using the -gline-tables-only and -fdebug-info-for-profiling flags, but with -gline-tables-only I got the error below:
llvm-foreach: Segmentation fault (core dumped)
icpx: error: gen compiler command failed with exit code 254 (use -v to see invocation)
Intel(R) oneAPI DPC++/C++ Compiler 2025.1.1 (2025.1.1.20250418)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/intel/oneapi/compiler/2025.1/bin/compiler
Configuration file: /opt/intel/oneapi/compiler/2025.1/bin/compiler/../icpx.cfg
icpx: note: diagnostic msg: Error generating preprocessed source(s)
Please help me resolve this issue.
My configuration:
OS: Ubuntu 22.04.4 LTS
GPU: Intel Data Center GPU Max 1550 (PVC)
VTune: Intel(R) VTune(TM) Profiler 2025.3.0 (build 630104)
Compiler: Intel(R) oneAPI DPC++/C++ Compiler 2025.1.1 (2025.1.1.20250418)
Output of sycl-ls:
[level_zero:gpu][level_zero:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Data Center GPU Max 1550 12.60.7 [1.6.31294+9]
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Xeon(R) Platinum 8352Y CPU @ 2.20GHz OpenCL 3.0 (Build 0) [2025.19.4.0.18_160000.xmain-hotfix]
[opencl:gpu][opencl:1] Intel(R) OpenCL Graphics, Intel(R) Data Center GPU Max 1550 OpenCL 3.0 NEO [24.39.31294]
This warning only tells you that VTune can't locate debugging information for the Linux kernel. Do you need to trace code in the Linux kernel? I suppose you want to profile the SYCL application, so you only need to build your application with debug information.
You can refer to the CMake file in the oneAPI sample code below:
oneAPI-samples/Tools/VTuneProfiler/matrix_multiply_vtune/CMakeLists.txt
set(CMAKE_CXX_COMPILER icpx)
cmake_minimum_required(VERSION 3.4)
project(matrix_multiply)
set(CMAKE_CXX_FLAGS "-g -O3 -fsycl -Wno-write-strings -w -D_Linux")
add_executable(matrix.dpcpp src/matrix.cpp src/multiply.cpp)
add_custom_target(run ./matrix.dpcpp)
Hi,
Thanks for the reply.
I need to trace the code in the kernel and find out which part of the source code is causing the bottleneck.
Does -g help with source analysis in VTune?
I have just run the VTune self-checker to verify whether there is any issue with the installation.
It gave the output below:
The system is ready for the following analyses:
* Performance Snapshot
* Hotspots and Threading with user-mode sampling
* Hotspots with HW event-based sampling, HPC Performance Characterization, etc.
* Microarchitecture Exploration
* Memory Access
* Hotspots with HW event-based sampling and call stacks
* Threading with HW event-based sampling
The following analyses have failed on the system:
* GPU Compute/Media Hotspots (characterization mode)
* GPU Compute/Media Hotspots (source analysis mode)
I am attaching the complete log.
If you need to profile kernel code, you need to build the kernel with debug information using the '-g' option.
You can profile a GPU workload to check whether GPU profiling is ready, for example:
vtune -collect gpu-hotspots -- {your application}
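Since the goal in this thread is source-level mapping, here is a minimal sketch of the same collection with the source-analysis profiling mode selected explicitly. APP and RESULT_DIR are placeholder names (not from the thread), and the script only prints the command unless RUN=1 is set and vtune is on PATH.

```shell
#!/bin/sh
# Sketch: GPU Hotspots collection in source-analysis mode.
# APP and RESULT_DIR are hypothetical placeholders; adjust to your setup.
APP="${APP:-./matrix.dpcpp}"
RESULT_DIR="${RESULT_DIR:-r_gpu_hotspots}"

# The profiling-mode knob selects source analysis rather than
# characterization mode for the gpu-hotspots analysis type.
CMD="vtune -collect gpu-hotspots -knob profiling-mode=source-analysis -result-dir $RESULT_DIR -- $APP"

# Always show the command that would be run.
echo "$CMD"

# Run the collection only on request and only if vtune is installed.
if [ "${RUN:-0}" = "1" ] && command -v vtune >/dev/null 2>&1; then
    $CMD
fi
```

Printing the command first makes it easy to confirm the knob and result directory before starting a collection on the remote machine.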
Hi @yuzhang3_intel ,
Thanks for the clarification.
Source code analysis is now enabled with the -g flag, but it points to incorrect lines in the source mapping window.
I have multiple variations of a kernel in an .hpp file, and the source analysis maps the metrics to a variant that is not even executed. Also, I run the VTune analysis on a remote Linux machine, where the VTune output is written. I then download this output folder and view the results on my Windows machine.
Will this create any problem with the mapping?
I am also attaching the source code files for your reference (reduction_sum_1d_header.c is the actual kernel file).
Note: I have renamed the .hpp file to _header.c because the portal reports an issue with the content of the .hpp file.
The binary with debug information must match the source. For example, if you rebuild the kernel, you need to re-map the binary to the corresponding source code.
I just rechecked.
The binary and source code match correctly.
Any suggestions on what else could cause this?
Are the binary/symbol and source search paths set correctly?
Yes.
I downloaded the binary and source files from the remote machine and added them to the correct search paths in the VTune GUI on my Windows machine.
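The thread sets the search paths in the GUI. As a rough command-line counterpart, the sketch below re-finalizes a downloaded result with explicit binary and source search directories. RESULT_DIR, BIN_DIR, and SRC_DIR are placeholder names assumed for illustration, and the script only prints the command unless RUN=1 is set and vtune is on PATH.

```shell
#!/bin/sh
# Sketch: re-finalize a copied VTune result with explicit search paths.
# All three directory names below are hypothetical placeholders.
RESULT_DIR="${RESULT_DIR:-r_gpu_hotspots}"
BIN_DIR="${BIN_DIR:-./downloaded/bin}"
SRC_DIR="${SRC_DIR:-./downloaded/src}"

# -search-dir points at binaries/symbols, -source-search-dir at sources.
CMD="vtune -finalize -result-dir $RESULT_DIR -search-dir $BIN_DIR -source-search-dir $SRC_DIR"

# Always show the command that would be run.
echo "$CMD"

# Re-finalize only on request and only if vtune is installed.
if [ "${RUN:-0}" = "1" ] && command -v vtune >/dev/null 2>&1; then
    $CMD
fi
```

The binary in BIN_DIR should be the exact build that was profiled; a rebuilt binary can shift the line mapping even when the source looks identical.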
I added the -O2 flag to the compilation this time, and this seems to have resolved the issue.
Now I am able to see the correct mapping. Thanks!
Great! Was the original option you used -O3?
I had not used any optimization flag previously.
I have actually added the -O3 flag now, not -O2. There was a typo in my previous answer; kindly note.
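The combination reported to work in this thread is -g together with -O3 for a SYCL build. A minimal sketch of such a compile line follows; the source and output file names are borrowed from the sample CMake file quoted earlier and are assumptions for any other project. The script only prints the command unless RUN=1 is set and icpx is on PATH.

```shell
#!/bin/sh
# Sketch: build a SYCL application with debug info and -O3.
# File names come from the oneAPI sample's CMakeLists.txt; treat them
# as placeholders for your own sources.
SOURCES="${SOURCES:-src/matrix.cpp src/multiply.cpp}"
OUTPUT="${OUTPUT:-matrix.dpcpp}"

# -fsycl enables SYCL, -g emits debug information, -O3 sets optimization.
CXXFLAGS="-fsycl -g -O3"
CMD="icpx $CXXFLAGS $SOURCES -o $OUTPUT"

# Always show the command that would be run.
echo "$CMD"

# Compile only on request and only if icpx is installed.
if [ "${RUN:-0}" = "1" ] && command -v icpx >/dev/null 2>&1; then
    $CMD
fi
```

After rebuilding with these flags, the result has to be collected again so that the profiled binary and the debug information match.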