Conflict due to overloaded functions when running DPCT


I am not sure how to resolve the conflicts I see when running dpct on certain CUDA code. The __clz device function appears to be defined twice (it is redefined in __clang_cuda_device_functions.h, which ships with dpcpp_ct), so when it is referenced from global scope in the CUDA app, dpct does not know which definition to use. Other overloaded functions show the same problem. Please see the log below:


In file included from /home/farshad/proj/cudamigration/apps/cuda/hoomd-blue/hoomd/
In file included from /home/farshad/proj/cudamigration/apps/cuda/hoomd-blue/hoomd/extern/kernels/scan.cuh:37:
In file included from /home/farshad/proj/hoomd-blue/hoomd/extern/kernels/../device/../mgpudevice.cuh:38:
In file included from /home/farshad/proj/hoomd-blue/hoomd/extern/kernels/../device/../device/deviceutil.cuh:37:
/home/farshad/proj/cudamigration/apps/cuda/hoomd-blue/hoomd/extern/device/intrinsics.cuh:224:9: error: reference to __device__ function '__clz' in __host__ __device__ function
return __clz(x);
/home/farshad/proj/cudamigration/apps/cuda/hoomd-blue/hoomd/extern/device/intrinsics.cuh:283:15: note: called by 'FindLog2'
int a = 31 - clz(x);
/home/farshad/proj/cudamigration/apps/cuda/hoomd-blue/hoomd/extern/kernels/mergesort.cuh:174:18: note: called by 'MergesortPairs<unsigned int, unsigned int, mgpu::less<unsigned int>>'
int numPasses = FindLog2(numBlocks, true);
/home/farshad/proj/cudamigration/apps/cuda/hoomd-blue/hoomd/extern/kernels/mergesort.cuh:211:2: note: called by 'MergesortPairs<unsigned int, unsigned int>'
MergesortPairs(keys_global, values_global, count, mgpu::less<KeyType>(),
/home/farshad/proj/cudamigration/apps/cuda/hoomd-blue/hoomd/ note: called by 'gpu_update_group_table<2, group_storage<2>>'
mgpu::MergesortPairs(d_scratch_idx, d_scratch_g, group_size*n_groups, *mgpu_context);
/opt/intel/oneapi/dpcpp-ct/2021.1-beta08/lib/clang/11.0.0/include/__clang_cuda_device_functions.h:47:16: note: '__clz' declared here
__DEVICE__ int __clz(int __a) { return __nv_clz(__a); }


I am running dpct on the code above with the following command:

 dpct --report-type=all --cuda-include-path=/usr/local/cuda-10.2/include -p compile_commands.json /home/farshad/proj/hoomd-blue/hoomd/


Please note that the CUDA application is built with CUDA 11, which is installed on the development machine. I also see CUDA 11 available in the clang folder of the dpcpp_ct directory. Meanwhile, I am pointing dpct at cuda-10.2 for the tool's own compatibility. It should be possible to converge on a single CUDA version.

I am attaching all dependencies to this inquiry. Any idea how to satisfy the tool? I can't remove the CUDA toolkit from the system because of other issues.


One additional note: the include path to CUDA 11 has been removed from compile_commands.json. I tried both with and without it, and the result is the same. The path is included for other files, and they work as expected.




1 Reply


Can you attach all the migrated files that DPCT generated when you ran the command you mentioned?

I would like to see the replacement generated by DPCT for the above statements (shown in the dpct migration log).

I cannot use the JSON file you provided to reproduce this issue because of path inconsistencies (a Makefile would help).

As long as your CUDA application is compatible with CUDA 10.2, there shouldn't be any problem during migration. But if your application strictly needs CUDA 11, that may cause issues during migration, since DPCT does not currently support CUDA 11.
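
Separately from the CUDA version question, the log suggests the diagnostic is raised because a __host__ __device__ function (clz in intrinsics.cuh) calls the device-only __clz intrinsic and is then reached from host code through FindLog2 and MergesortPairs. One possible workaround, offered only as a sketch, is to give the host-reachable path a version that does not call the intrinsic at all. The names HOST_DEVICE, portable_clz and find_log2 below are hypothetical and are not taken from hoomd-blue or from DPCT. Whether this is enough to get dpct past the error has not been verified here.

```cpp
// HOST_DEVICE is a hypothetical macro for this sketch. It expands to nothing
// when the file is compiled by a plain C++ compiler instead of a CUDA compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Count the leading zero bits of a 32-bit value without calling the
// device-only __clz intrinsic, so the same body is valid on host and device.
HOST_DEVICE inline int portable_clz(unsigned int x) {
    if (x == 0) return 32;               // no bit set: all 32 bits are zero
    int n = 0;
    while ((x & 0x80000000u) == 0) {     // shift left until the top bit is set
        x <<= 1;
        ++n;
    }
    return n;
}

// Hypothetical stand-in for a FindLog2-style helper: floor(log2(x)),
// rounded up for non-powers of two when round_up is true.
HOST_DEVICE inline int find_log2(unsigned int x, bool round_up) {
    int a = 31 - portable_clz(x);
    if (round_up && (x & (x - 1)) != 0) ++a;
    return a;
}
```

The loop is slower than the hardware instruction, so in a real port you may prefer to keep __clz on the device path and use a fallback like this only where the function is called from host code.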

