Re:dpcpp program performance gets drop by adding c...

Jim · 06-17-2020

Hi all

I found that the performance of a dpcpp program drops when the compile option '-g' is added.

I also tested the corresponding CUDA program, and nvcc does not have this problem.

The program source code has been uploaded.

The compile command lines are:

dpcpp -o a -O2 ./convSep_nocg.dp.cpp

dpcpp -o a_g -g -O2 ./convSep_nocg.dp.cpp

Program 'a_g' takes about 1.7x as long as 'a' (0.13564 s vs. 0.07934 s in the runs below).

# ./a 10240 10240 1000
[./a] - Starting...
Image Width x Height = 10240 x 10240

Allocating and initializing host arrays...
Allocating and initializing CUDA arrays...
Running GPU convolution (1000 identical iterations)...

convolutionSeparable, Throughput = 1321.5942 MPixels/sec, Time = 0.07934 s, Size = 104857600 Pixels, NumDevsUsed = 1, Workgroup = 0

Reading back GPU results...

Checking the results...
 ...running convolutionRowCPU()
 ...running convolutionColumnCPU()
 ...comparing the results
 ...Relative L2 norm: 0.000000E+00

Shutting down...


# ./a_g 10240 10240 1000
[./a_g] - Starting...
Image Width x Height = 10240 x 10240

Allocating and initializing host arrays...
Allocating and initializing CUDA arrays...
Running GPU convolution (1000 identical iterations)...

convolutionSeparable, Throughput = 773.0526 MPixels/sec, Time = 0.13564 s, Size = 104857600 Pixels, NumDevsUsed = 1, Workgroup = 0

Reading back GPU results...

Checking the results...
 ...running convolutionRowCPU()
 ...running convolutionColumnCPU()
 ...comparing the results
 ...Relative L2 norm: 0.000000E+00

Shutting down...
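
For reference, the slowdown can be worked out from the two "Time" values in the logs above. The following is only a minimal sketch: the helper names `mpixels_per_sec` and `slowdown` are hypothetical and are not taken from convSep_nocg.dp.cpp, and it assumes the reported throughput is simply pixels divided by elapsed time.

```cpp
#include <cassert>

// Hypothetical helper (not from the uploaded source): throughput in
// MPixels/sec, assuming it is computed as pixels / seconds / 1e6.
double mpixels_per_sec(double pixels, double seconds) {
    assert(seconds > 0.0);  // guard against division by zero
    return pixels / seconds / 1.0e6;
}

// Hypothetical helper: how many times longer the '-g' build takes
// compared with the build without '-g'.
double slowdown(double seconds_with_g, double seconds_without_g) {
    assert(seconds_without_g > 0.0);
    return seconds_with_g / seconds_without_g;
}

// Values copied from the two runs above.
constexpr double kPixels   = 104857600.0;  // 10240 x 10240
constexpr double kTimeA    = 0.07934;      // seconds, ./a
constexpr double kTimeAWithG = 0.13564;    // seconds, ./a_g
```

Calling `slowdown(kTimeAWithG, kTimeA)` gives roughly 1.71, which matches the ratio of the two reported throughputs (1321.5942 / 773.0526), so the '-g' build is about 1.7x slower rather than a full 2x.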

OS Version: Ubuntu 18.04.3 LTS

linux-kernel: 4.15.18

oneAPI Basekit Version: 2021.1-beta06

CPU: Intel(R) Xeon(R) CPU E3-1585 v5 @ 3.50GHz

GPU: Intel Corporation Iris Pro Graphics P580

PrasanthD_intel · 06-18-2020

Hi Jim,

When the -g debug flag is used with dpcpp, it generates debug information for both the host and the device parts of the code.

Enabling the -g option creates an additional section in the binary, the debug section. This adds overhead during compilation, and there can in turn be a considerable increase in run time.

For more information, refer to this link:

https://software.intel.com/content/www/us/en/develop/documentation/get-started-with-debugging-dpcpp-linux/top.html

However, when the -g flag is passed to the nvcc compiler, it generates debug information only for the host code. To generate debug information for the device code, a different flag (-G) must be passed to nvcc.

Regards

Prasanth

GouthamK_Intel · 06-30-2020

Hi Jim,

Could you please let us know whether your issue is resolved?

If not, do let us know so that we can help you further.

Regards

--Goutham

PrasanthD_intel · 07-08-2020

Hi Jim,

This issue has been resolved, and we will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.

Regards

Prasanth