Intel® oneAPI Data Parallel C++
Support for Intel® oneAPI DPC++ Compiler, Intel® oneAPI DPC++ Library, Intel® DPC++ Compatibility Tool, and GDB*

Fat binary with PTX backend

Viet-Duc
Novice

 

Hi,


To compile SYCL code for a specific NVIDIA device, I've used the following:

clang++ \
  -fsycl \
  -fsycl-targets=nvptx64-nvidia-cuda-sycldevice \
  -fsycl-unnamed-lambda \
  -Xsycl-target-backend "--cuda-gpu-arch=sm_35" \
  test.cpp

Is there a way to generate a fat binary containing code for several sm_* compute capabilities?

I have gone through the manual to no avail.

 

Thanks


5 Replies
RahulV_intel
Moderator

Hi,

 

Could you try the following command and let us know whether it works?

 

clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice sample.cpp -o sample

 

IMO, the above command should work on any SM architecture.

 

 

Thanks,

Rahul

 

Viet-Duc
Novice

Thanks for the suggestion.

But if I remove '-Xsycl-target-backend', the program fails at runtime, for instance on a Tesla K40:

PI CUDA ERROR:
        Value:           209
        Name:            CUDA_ERROR_NO_BINARY_FOR_GPU
        Description:     no kernel image is available for execution on the device
        Function:        build_program
        Source Location: .../apps/src/llvm/unstable/sycl/plugins/cuda/pi_cuda.cpp:516


PI CUDA ERROR:
        Value:           400
        Name:            CUDA_ERROR_INVALID_HANDLE
        Description:     invalid resource handle
        Function:        cuda_piProgramRelease
        Source Location: .../apps/src/llvm/unstable/sycl/plugins/cuda/pi_cuda.cpp:2938

The program was built for 1 devices
Build program log for 'Tesla K40m':
 -999 (Unknown OpenCL error code)

If I force sm_35, the binary will not work on other GPUs such as the V100.

Since the manual is quite terse, I asked the question in case I had overlooked something.
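While the fat-binary question is open, one possible workaround is to build a separate executable for each architecture. The sketch below only reuses the flags from the command earlier in this thread. The output naming scheme (test_sm_35, test_sm_70) and the DRY_RUN switch are hypothetical, and sm_70 is assumed here because it is the V100's compute capability. It has not been checked against any particular DPC++ build.

```shell
#!/bin/sh
# Sketch: build one executable per GPU architecture instead of a fat binary.
# Flags are copied from the command earlier in this thread; the output names
# (test_sm_35, test_sm_70) and DRY_RUN are hypothetical conventions.

ARCHS="sm_35 sm_70"   # sm_35: Tesla K40, sm_70: V100
SRC="test.cpp"

# Print the compile command for one architecture.
build_cmd() {
  arch="$1"
  printf '%s\n' "clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice -fsycl-unnamed-lambda -Xsycl-target-backend --cuda-gpu-arch=${arch} ${SRC} -o test_${arch}"
}

# Dry run by default: only print the commands. Set DRY_RUN=0 to execute them.
for arch in $ARCHS; do
  cmd=$(build_cmd "$arch")
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$cmd"
  else
    sh -c "$cmd" || exit 1
  fi
done
```

Each machine would then run the executable that matches its GPU, for example test_sm_35 on the K40 node and test_sm_70 on the V100 node. This sidesteps the question rather than answering it: it does not produce a single binary.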

RahulV_intel
Moderator

Hi,


According to the documentation, architectures SM 50 and above are supported.

https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md


Please note that only the GitHub version of DPC++ supports the CUDA backend. If you have further queries, please raise a new issue at the link below:

https://github.com/intel/llvm/issues



Thanks,

Rahul


Viet-Duc
Novice

Hi,

 

You are right.

For CUDA-related questions, I should have asked the intel/llvm developers instead.

I will raise the issue on the GitHub page. Nevertheless, thanks for your time.

 

Regards.

RahulV_intel
Moderator

Intel will no longer monitor this thread. Further discussions on this thread will be considered community only.