dawsfox
Beginner
345 Views

dpct "Could not generate replacement" (1004), terminate after "std::bad_alloc"

I've been trying to convert some CUDA code to DPC++ using the DPC++ Compatibility Tool on DevCloud, but I get the following errors:

Processing: /home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/Kernels/kernel1.cxx
In file included from <built-in>:1:
In file included from /glob/development-tools/versions/oneapi/beta07/inteloneapi/dpcpp-ct/2021.1-beta07/lib/clang/11.0.0/include/__clang_cuda_runtime_wrapper.h:127:
In file included from /home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/include/cuda_runtime.h:95:
/home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/include/channel_descriptor.h:106:44: warning: DPCT1004:0: Could not generate replacement.
return cudaCreateChannelDesc(0, 0, 0, 0, cudaChannelFormatKindNone);
^
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Meet signal:SIGABRT
Intel(R) DPC++ Compatibility Tool trys to give analysis reports and terminates...

I'm not sure how to handle warning DPCT1004, or whether it is what causes the std::bad_alloc. Can anybody offer some advice?

0 Kudos
5 Replies
GouthamK_Intel
Moderator
326 Views

Hi,


Thanks for reaching out to us.


If possible, could you please provide the source code, logs, and steps to reproduce the issue so that we can investigate?



Regards

Goutham


dawsfox
Beginner
322 Views

Source code was cloned from here: https://bitbucket.org/berkeleylab/cs-roofline-toolkit/src/master/

In the Empirical_Roofline_Tool-1.1.0 directory I used the Makefile below to build a single step of the tool's process (the tool creates multiple kernels to record performance; this Makefile builds only one). Running the intercept-build tool with the Makefile produced the .json shown below.

Makefile:

build:
	mkdir -p Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064
	nvcc -O3 -I./Kernels -DERT_FLOP=64 -DERT_ALIGN=32 -DERT_MEMORY_MAX=1073741824 -DERT_WORKING_SET_MIN=128 -DERT_TRIALS_MIN=1 -DERT_WSS_MULT=1.1 -DERT_GPU -std=c++11 -x cu -arch=sm_70 -DERT_FP16 -DERT_FP32 -DERT_FP64 -c ./Drivers/driver1.cxx -o Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/driver1.o
	nvcc -O3 -I./Kernels -DERT_FLOP=64 -DERT_ALIGN=32 -DERT_MEMORY_MAX=1073741824 -DERT_WORKING_SET_MIN=128 -DERT_TRIALS_MIN=1 -DERT_WSS_MULT=1.1 -DERT_GPU -std=c++11 -x cu -arch=sm_70 -DERT_FP16 -DERT_FP32 -DERT_FP64 -c ./Kernels/kernel1.cxx -o Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/kernel1.o
	nvcc Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/driver1.o Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/kernel1.o -o Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/driver1.kernel1

compile_commands.json

[
  {
    "command": "nvcc -c -O3 -I./Kernels -DERT_FLOP=64 -DERT_ALIGN=32 -DERT_MEMORY_MAX=1073741824 -DERT_WORKING_SET_MIN=128 -DERT_TRIALS_MIN=1 -DERT_WSS_MULT=1.1 -DERT_GPU -std=c++11 --cuda-gpu-arch=sm_70 -DERT_FP16 -DERT_FP32 -DERT_FP64 -o Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/driver1.o -D__CUDACC__=1 ./Drivers/driver1.cxx",
    "directory": "/home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0",
    "file": "/home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/Drivers/driver1.cxx"
  },
  {
    "command": "nvcc -c -O3 -I./Kernels -DERT_FLOP=64 -DERT_ALIGN=32 -DERT_MEMORY_MAX=1073741824 -DERT_WORKING_SET_MIN=128 -DERT_TRIALS_MIN=1 -DERT_WSS_MULT=1.1 -DERT_GPU -std=c++11 --cuda-gpu-arch=sm_70 -DERT_FP16 -DERT_FP32 -DERT_FP64 -o Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/kernel1.o -D__CUDACC__=1 ./Kernels/kernel1.cxx",
    "directory": "/home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0",
    "file": "/home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/Kernels/kernel1.cxx"
  }
]

I would then run dpct like this:

dpct --cuda-include-path=./include/ -p=./compile_commands.json

Here ./include/ holds all the necessary CUDA header files. Below is the output I get from that command:

The directory "dpct_output" is used as "out-root"
Processing: /home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/Kernels/kernel1.cxx
In file included from <built-in>:1:
In file included from /glob/development-tools/versions/oneapi/beta07/inteloneapi/dpcpp-ct/2021.1-beta07/lib/clang/11.0.0/include/__clang_cuda_runtime_wrapper.h:127:
In file included from /home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/include/cuda_runtime.h:95:
/home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/include/channel_descriptor.h:106:44: warning: DPCT1004:0: Could not generate replacement.
return cudaCreateChannelDesc(0, 0, 0, 0, cudaChannelFormatKindNone);
^
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Meet signal:SIGABRT
Intel(R) DPC++ Compatibility Tool trys to give analysis reports and terminates...

 

By the way, I am trying this on DevCloud, and I couldn't find any CUDA installation there, so I copied the include folder from a different machine that uses CUDA 10.2.

GouthamK_Intel
Moderator
273 Views

Hi,


Thanks for providing the required information.


According to the logs you provided, you are running the following dpct command:


dpct --cuda-include-path=./include/ -p=./compile_commands.json


With this command, dpct also tries to migrate all the files inside the ./include directory (the CUDA 10.2 include directory), which is unnecessary.


Instead, we suggest passing the files you want to migrate explicitly to the dpct command, for example:


dpct --cuda-include-path=./include/ -p compile_commands.json --in-root=./ Drivers/driver1.cxx Kernels/kernel1.cxx
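
For a project with more than two sources, typing each file by hand gets tedious. As a rough sketch (not part of dpct itself), the explicit file list could be derived from compile_commands.json with a few lines of Python. The function name build_dpct_command is hypothetical, and the sketch assumes each entry carries "directory" and "file" fields as in the JSON shown earlier in this thread; it uses only the dpct options that already appear above.

```python
import json
import os


def build_dpct_command(compile_db_path, cuda_include_path, in_root):
    """Return a dpct argument list that names each source file explicitly.

    Hypothetical helper: it only assembles the options already used in this
    thread (--cuda-include-path, -p, --in-root) plus the source files.
    """
    with open(compile_db_path) as fh:
        entries = json.load(fh)

    in_root_abs = os.path.abspath(in_root)
    sources = []
    for entry in entries:
        # "file" may be absolute or relative to the entry's "directory".
        path = os.path.abspath(os.path.join(entry["directory"], entry["file"]))
        # Keep only files located under in_root, without duplicates.
        if os.path.commonpath([in_root_abs, path]) == in_root_abs:
            rel = os.path.relpath(path, in_root_abs)
            if rel not in sources:
                sources.append(rel)

    return [
        "dpct",
        "--cuda-include-path=" + cuda_include_path,
        "-p", compile_db_path,
        "--in-root=" + in_root,
        *sources,
    ]
```

Called from the project directory as build_dpct_command("compile_commands.json", "./include/", "./"), this should produce an argument list equivalent to the command above, which can then be joined with shlex.join for printing or passed to subprocess.run.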



Thanks & Regards

Goutham


GouthamK_Intel
Moderator
256 Views

Hi,

Just a quick reminder.


If the issue persists, please let us know so that we can investigate.


If it has been resolved, please let us know as well.


Thanks & Regards

Goutham


GouthamK_Intel
Moderator
212 Views

Hi,

As we have not heard back from you, we are assuming that your issue has been resolved and that we have answered all your queries, so we will no longer respond to this thread. If you require any additional assistance from Intel, please start a new thread.

Any further interaction in this thread will be considered community only.


Have a good day!


Thanks & Regards

Goutham

