dpct "Could not generate replacement" (1004), terminate after "std::bad_alloc"

dawsfox · 07-07-2020

I've been trying to convert some CUDA code to DPC++ using the DPC++ Compatibility Tool on DevCloud, but I get the following errors:

Processing: /home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/Kernels/kernel1.cxx
In file included from <built-in>:1:
In file included from /glob/development-tools/versions/oneapi/beta07/inteloneapi/dpcpp-ct/2021.1-beta07/lib/clang/11.0.0/include/__clang_cuda_runtime_wrapper.h:127:
In file included from /home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/include/cuda_runtime.h:95:
/home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/include/channel_descriptor.h:106:44: warning: DPCT1004:0: Could not generate replacement.
return cudaCreateChannelDesc(0, 0, 0, 0, cudaChannelFormatKindNone);
^
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Meet signal:SIGABRT
Intel(R) DPC++ Compatibility Tool trys to give analysis reports and terminates...

I'm not sure how to handle warning DPCT1004, and I don't know whether it is what causes the std::bad_alloc. Can anybody offer some advice?

GouthamK_Intel · 07-08-2020

Hi,

Thanks for reaching out to us!

If possible, could you please provide the source code, logs, and steps to reproduce the issue so that we can investigate?

Regards

Goutham

dawsfox · 07-08-2020

Source code was cloned from here: https://bitbucket.org/berkeleylab/cs-roofline-toolkit/src/master/

In the Empirical_Roofline_Tool-1.1.0 directory I used the Makefile below to build a single step of the tool's process (the tool creates multiple kernels to record performance; this Makefile builds only one). Running the intercept-build tool on the Makefile produced the compile_commands.json shown below.

Makefile:

build:
	mkdir -p Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064
	nvcc -O3 -I./Kernels -DERT_FLOP=64 -DERT_ALIGN=32 -DERT_MEMORY_MAX=1073741824 -DERT_WORKING_SET_MIN=128 -DERT_TRIALS_MIN=1 -DERT_WSS_MULT=1.1 -DERT_GPU -std=c++11 -x cu -arch=sm_70 -DERT_FP16 -DERT_FP32 -DERT_FP64 -c ./Drivers/driver1.cxx -o Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/driver1.o
	nvcc -O3 -I./Kernels -DERT_FLOP=64 -DERT_ALIGN=32 -DERT_MEMORY_MAX=1073741824 -DERT_WORKING_SET_MIN=128 -DERT_TRIALS_MIN=1 -DERT_WSS_MULT=1.1 -DERT_GPU -std=c++11 -x cu -arch=sm_70 -DERT_FP16 -DERT_FP32 -DERT_FP64 -c ./Kernels/kernel1.cxx -o Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/kernel1.o
	nvcc Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/driver1.o Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/kernel1.o -o Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/driver1.kernel1

compile_commands.json:

[
{
"command": "nvcc -c -O3 -I./Kernels -DERT_FLOP=64 -DERT_ALIGN=32 -DERT_MEMORY_MAX=1073741824 -DERT_WORKING_SET_MIN=128 -DERT_TRIALS_MIN=1 -DERT_WSS_MULT=1.1 -DERT_GPU -std=c++11 --cuda-gpu-arch=sm_70 -DERT_FP16 -DERT_FP32 -DERT_FP64 -o Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/driver1.o -D__CUDACC__=1 ./Drivers/driver1.cxx",
"directory": "/home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0",
"file": "/home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/Drivers/driver1.cxx"
},
{
"command": "nvcc -c -O3 -I./Kernels -DERT_FLOP=64 -DERT_ALIGN=32 -DERT_MEMORY_MAX=1073741824 -DERT_WORKING_SET_MIN=128 -DERT_TRIALS_MIN=1 -DERT_WSS_MULT=1.1 -DERT_GPU -std=c++11 --cuda-gpu-arch=sm_70 -DERT_FP16 -DERT_FP32 -DERT_FP64 -o Results.gpu_v100_smx2.jlse.anl.gov/Run.001/FLOPS.064/kernel1.o -D__CUDACC__=1 ./Kernels/kernel1.cxx",
"directory": "/home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0",
"file": "/home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/Kernels/kernel1.cxx"
}
]

I then ran dpct like this:

dpct --cuda-include-path=./include/ -p=./compile_commands.json

Here ./include/ holds all the necessary CUDA header files. Below is the output I get from that command:

The directory "dpct_output" is used as "out-root"
Processing: /home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/Kernels/kernel1.cxx
In file included from <built-in>:1:
In file included from /glob/development-tools/versions/oneapi/beta07/inteloneapi/dpcpp-ct/2021.1-beta07/lib/clang/11.0.0/include/__clang_cuda_runtime_wrapper.h:127:
In file included from /home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/include/cuda_runtime.h:95:
/home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0/include/channel_descriptor.h:106:44: warning: DPCT1004:0: Could not generate replacement.
return cudaCreateChannelDesc(0, 0, 0, 0, cudaChannelFormatKindNone);
^
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Meet signal:SIGABRT
Intel(R) DPC++ Compatibility Tool trys to give analysis reports and terminates...

By the way, I am trying this on DevCloud, and I couldn't find any CUDA installation there, so I copied the include folder from a different machine that uses CUDA 10.2.

GouthamK_Intel · 07-10-2020

Hi,

Thanks for providing the required information!

According to the logs you provided, you are running the following dpct command:

dpct --cuda-include-path=./include/ -p=./compile_commands.json

With this command, dpct also tries to migrate all the files inside the ./include directory (the CUDA 10.2 include directory), which is unnecessary.

Instead, we suggest passing the files you want to migrate explicitly to the dpct command, for example:

dpct --cuda-include-path=./include/ -p compile_commands.json --in-root=./ Drivers/driver1.cxx Kernels/kernel1.cxx
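When the compilation database lists more than a couple of files, the explicit file list can be derived from it instead of typed by hand. The following is only a minimal sketch: the helper name build_dpct_command is hypothetical, it uses only the dpct options that already appear in this thread, and it assumes every entry has a "file" field (absolute, or relative to its "directory").

```python
import os
import shlex


def build_dpct_command(entries, in_root, cuda_include="./include/",
                       database="compile_commands.json"):
    """Build a dpct argument list that names every source file explicitly.

    Hypothetical helper: it only mirrors the options used in this thread
    (--cuda-include-path, -p, --in-root) and does not run dpct itself.
    """
    in_root_abs = os.path.abspath(in_root)
    sources = []
    for entry in entries:
        # "file" may be absolute or relative to the entry's "directory".
        path = os.path.join(entry.get("directory", ""), entry["file"])
        rel = os.path.relpath(os.path.abspath(path), in_root_abs)
        # Skip anything that lives outside the in-root directory.
        if rel == os.pardir or rel.startswith(os.pardir + os.sep):
            continue
        if rel not in sources:
            sources.append(rel)
    return ["dpct", f"--cuda-include-path={cuda_include}",
            "-p", database, f"--in-root={in_root}", *sources]


if __name__ == "__main__":
    # Entries copied from the compile_commands.json posted above
    # (the "command" field is not needed here and is omitted).
    root = "/home/u44750/ert-master/Empirical_Roofline_Tool-1.1.0"
    demo_entries = [
        {"directory": root, "file": root + "/Drivers/driver1.cxx"},
        {"directory": root, "file": root + "/Kernels/kernel1.cxx"},
    ]
    print(shlex.join(build_dpct_command(demo_entries, root)))
```

In practice the entries would come from json.load on the real compile_commands.json, and the printed command should be reviewed before it is run.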

Thanks & Regards

Goutham

GouthamK_Intel · 07-19-2020

Hi,

Just a quick reminder.

If your issue persists, please let us know so that we can investigate further.

If your issue is resolved, please let us know that as well.

Thanks & Regards

Goutham

GouthamK_Intel · 08-05-2020

Hi,

As we have not heard back from you, we assume that your issue has been resolved and that we have answered all your queries, so we will no longer respond to this thread. If you require any additional assistance from Intel, please start a new thread.

Any further interaction in this thread will be considered community only.

Have a good day!

Thanks & Regards

Goutham