Hi,
To compile SYCL code for a specific NVIDIA device, I've used the following:
clang++ \
-fsycl \
-fsycl-targets=nvptx64-nvidia-cuda-sycldevice \
-fsycl-unnamed-lambda \
-Xsycl-target-backend "--cuda-gpu-arch=sm_35" \
test.cpp
Is there a way to generate a fat binary containing device code for several sm_* architectures?
I have gone through the manual to no avail.
Thanks
Hi,
Could you try the following command and let us know?
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice sample.cpp -o sample
IMO, the above command should work on any SM architecture.
Thanks,
Rahul
Thanks for the suggestion.
But if you remove '-Xsycl-target-backend', the program fails at runtime, for instance on a Tesla K40:
PI CUDA ERROR:
Value: 209
Name: CUDA_ERROR_NO_BINARY_FOR_GPU
Description: no kernel image is available for execution on the device
Function: build_program
Source Location: .../apps/src/llvm/unstable/sycl/plugins/cuda/pi_cuda.cpp:516
PI CUDA ERROR:
Value: 400
Name: CUDA_ERROR_INVALID_HANDLE
Description: invalid resource handle
Function: cuda_piProgramRelease
Source Location: .../apps/src/llvm/unstable/sycl/plugins/cuda/pi_cuda.cpp:2938
The program was built for 1 devices
Build program log for 'Tesla K40m':
-999 (Unknown OpenCL error code)
If I force sm_35, the binary will not work on other GPUs such as the V100.
Since the manual is quite terse, I have asked the question in case I overlooked something.
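Both failures described above are consistent with CUDA's binary compatibility rule for native device code: code compiled for sm_XY runs only on devices with the same major version X and a minor version of at least Y. The sketch below encodes that rule in plain C++. The struct and function names are hypothetical, and it assumes the binary carries only native code for one architecture (no PTX that the driver could JIT-compile for a newer GPU), which this thread does not confirm for DPC++.

```cpp
#include <cassert>

// Hypothetical helper type: a CUDA compute capability such as 3.5 or 7.0.
struct ComputeCapability {
    int cc_major;
    int cc_minor;
};

// CUDA's rule for native device code: a binary built for X.y runs on a
// device of capability X.z only when z >= y (same major version).
// Assumption: no PTX is embedded, so there is no forward JIT path.
bool native_code_runs_on(ComputeCapability built_for, ComputeCapability device) {
    assert(built_for.cc_major > 0 && device.cc_major > 0);
    return built_for.cc_major == device.cc_major &&
           device.cc_minor >= built_for.cc_minor;
}
```

With the Tesla K40 at compute capability 3.5 and the V100 at 7.0, `native_code_runs_on({3, 5}, {7, 0})` is false, which matches the sm_35 build not working on the V100. Likewise, a build for sm_50 or newer gives false for a 3.5 device, matching the `CUDA_ERROR_NO_BINARY_FOR_GPU` report on the K40.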
Hi,
As per the documentation, SM-50 and above architectures are supported.
https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md
Please note that only the GitHub version of DPC++ supports the CUDA backend. If you have further queries, please raise a new issue at the link below:
https://github.com/intel/llvm/issues
Thanks,
Rahul
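To make the documented limit concrete, here is a minimal C++ sketch that formats the `--cuda-gpu-arch` value used in the original command and checks a compute capability against the sm_50 minimum quoted in the reply above. Both function names are hypothetical helpers for illustration and are not part of DPC++ or CUDA.

```cpp
#include <cassert>
#include <string>

// Formats the backend option from the original command, e.g.
// "--cuda-gpu-arch=sm_35" for compute capability 3.5.
// Hypothetical helper; assumes a single-digit minor version.
std::string cuda_gpu_arch_flag(int cc_major, int cc_minor) {
    assert(cc_major > 0 && cc_minor >= 0 && cc_minor <= 9);
    return "--cuda-gpu-arch=sm_" + std::to_string(cc_major) +
           std::to_string(cc_minor);
}

// True when the architecture is at or above sm_50, the minimum the
// reply cites from the Get Started Guide.
bool meets_documented_minimum(int cc_major, int cc_minor) {
    return cc_major * 10 + cc_minor >= 50;
}
```

For the Tesla K40 (compute capability 3.5), `cuda_gpu_arch_flag(3, 5)` gives `--cuda-gpu-arch=sm_35` and `meets_documented_minimum(3, 5)` is false, so that GPU falls below the documented range even though an explicit sm_35 build was reported to run on it.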
Hi,
You are right.
For CUDA-related questions, I should have asked the intel-llvm developers instead.
I will raise the issue on the GitHub page. Nevertheless, thanks for your time.
Regards.
Intel will no longer monitor this thread. Further discussions on this thread will be considered community only.