dpcpp program performance drops when compiled with '-g'

Jim
Beginner

Hi all,

I found that the performance of a dpcpp program drops when the compile option '-g' is added.

I also tested the equivalent CUDA program, and nvcc does not have this problem.

The program source code has been uploaded.

The compile command lines are:

dpcpp -o a -O2 ./convSep_nocg.dp.cpp

dpcpp -o a_g -g -O2 ./convSep_nocg.dp.cpp

 

Program 'a_g' takes nearly twice as long as 'a' (about 1.7x in the runs below):

# ./a 10240 10240 1000
[./a] - Starting...
Image Width x Height = 10240 x 10240

Allocating and initializing host arrays...
Allocating and initializing CUDA arrays...
Running GPU convolution (1000 identical iterations)...

convolutionSeparable, Throughput = 1321.5942 MPixels/sec, Time = 0.07934 s, Size = 104857600 Pixels, NumDevsUsed = 1, Workgroup = 0

Reading back GPU results...

Checking the results...
 ...running convolutionRowCPU()
 ...running convolutionColumnCPU()
 ...comparing the results
 ...Relative L2 norm: 0.000000E+00

Shutting down...


# ./a_g 10240 10240 1000
[./a_g] - Starting...
Image Width x Height = 10240 x 10240

Allocating and initializing host arrays...
Allocating and initializing CUDA arrays...
Running GPU convolution (1000 identical iterations)...

convolutionSeparable, Throughput = 773.0526 MPixels/sec, Time = 0.13564 s, Size = 104857600 Pixels, NumDevsUsed = 1, Workgroup = 0

Reading back GPU results...

Checking the results...
 ...running convolutionRowCPU()
 ...running convolutionColumnCPU()
 ...comparing the results
 ...Relative L2 norm: 0.000000E+00

Shutting down...
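
As a quick sanity check on the two runs above, the following Python sketch recomputes the slowdown from the printed Time and Throughput values. The variable names are chosen here for illustration only. It also assumes that Time is the average time per iteration, so that throughput is roughly Size / Time; that is inferred from the output and not confirmed by the program source.

```python
# Figures copied from the program output of ./a (-O2) and ./a_g (-g -O2).
size_pixels = 104_857_600          # 10240 x 10240 image

time_o2 = 0.07934                  # seconds, ./a
time_g = 0.13564                   # seconds, ./a_g
mpix_o2 = 1321.5942                # MPixels/sec, ./a
mpix_g = 773.0526                  # MPixels/sec, ./a_g

# Slowdown of the -g build, computed two ways; both should agree.
slowdown_by_time = time_g / time_o2
slowdown_by_throughput = mpix_o2 / mpix_g

# Assumption: throughput = size / time, expressed in MPixels/sec.
recomputed_mpix_o2 = size_pixels / time_o2 / 1e6
recomputed_mpix_g = size_pixels / time_g / 1e6

print(f"slowdown (time):       {slowdown_by_time:.2f}x")
print(f"slowdown (throughput): {slowdown_by_throughput:.2f}x")
print(f"recomputed throughput: {recomputed_mpix_o2:.1f} vs {recomputed_mpix_g:.1f} MPixels/sec")
```

Both ratios come out at about 1.7x, so the -g build is clearly slower, though somewhat short of a full 2x.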

 

OS Version: Ubuntu 18.04.3 LTS

linux-kernel: 4.15.18

oneAPI Basekit Version: 2021.1-beta06

CPU: Intel(R) Xeon(R) CPU E3-1585 v5 @ 3.50GHz

GPU: Intel Corporation Iris Pro Graphics P580

PrasanthD_intel
Moderator

Hi Jim,

When the -g debug flag is used with dpcpp, it generates debug information for both the host and the device parts of the code.

Enabling -g adds a separate debug section to the binary. This adds overhead during compilation and can also considerably increase run time.

For more information, refer to this link:

https://software.intel.com/content/www/us/en/develop/documentation/get-started-with-debugging-dpcpp-linux/top.html

However, when the -g flag is passed to nvcc, it generates debug information only for the host code. To generate debug information for the device code, a different flag (-G) must be passed to nvcc.

Regards

Prasanth

GouthamK_Intel
Moderator

Hi Jim,


Could you please let us know whether your issue is resolved?

If not, do let us know so that we can help you further.

Regards

--Goutham


PrasanthD_intel
Moderator

Hi Jim,


This issue has been resolved, and we will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.


Regards

Prasanth

