Intel® oneAPI DPC++/C++ Compiler

SYCL builtins have issues "just working" - test for sqrt, cos, and sin attached to show a breakdown of different scenarios

JamesR
New Contributor II

I have compiler bugs to report - they are strongly related, and may come back to a single issue in the compiler design:

(1) The compiler emits different kernel code for the functions sqrt, cos, and sin (which I will now call X) when they are coded as X() or std::X() than it does for cl::sycl::X() or sycl::X(). This is true even if “using namespace sycl” is used. I believe that when -fsycl (or the dpcpp driver) is used, compiling a kernel should follow the SYCL standard.

(2) The Intel GPU driver crashes while JIT-compiling kernel code that contains sqrt, cos, or sin when the compiler was fed X() or std::X() instead of sycl::X() or cl::sycl::X(). The HOST driver handles them properly.

(3) With -ffast-math, the LLVM backend works for sqrt but CRASHES (in the compilation pass) for cos and sin.

All the files needed to replicate this are in a ZIP file, attached.

 

Xinmin is aware of this issue, and may have some commits for beta07 to address it - but wanted me to document this complete report so that all the bugs could be considered.

That said, the SYCL builtins for FP32 and FP64 need to "just work" for users of DPC++. Ideally, the entire matrix of options (sycl:: and std::) in the attached ZIP file would work without the user doing anything other than compiling.

 

$ uname -a
Linux s001-n176 4.15.18 #1 SMP Fri Oct 18 11:54:23 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux

$ clang++ -v
Intel(R) oneAPI DPC++ Compiler 2021.1-beta06 (2020.4.0.0415)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /glob/development-tools/versions/oneapi/beta06/inteloneapi/compiler/latest/linux/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.4.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.4.0
Candidate multilib: .;@m64
Selected multilib: .;@m64

 

RahulV_intel
Moderator

Hi James,

Thanks for providing the documentation for various bugs that you have encountered.

If I were to make guesses for some of the anomalies that you have encountered, it would be as follows:

 This is true even if “using namespace sycl” is used.

The using namespace sycl; statement is only in effect in the host part of the code. Since the host and device compilation passes process host code and device code separately, the namespace statement remains valid only in the host part of the code. Hence, there could be unresolved dependencies in the device part of the code.

The Intel GPU driver crashes when it gets a sqrt, cos, or sin in the kernel code, to JIT, when the compiler was fed X() or std::X() instead of sycl::X() or cl::sycl::X().

DPC++ offers support for some functions in the "std" namespace inside the kernel.

Link: https://software.intel.com/content/www/us/en/develop/documentation/oneapi-dpcpp-library-guide/top/tested-standard-c-apis.html

As per the documentation, these "std" functions have been tested on GCC* 7.5.0. std::cos and std::sqrt come from GCC's libstdc++ library (tested on GCC 7.5.0). I can see that your GCC version is 7.4.0. Let me confirm with the team whether GCC 7.4.0's libraries (libstdc++) are supported inside the SYCL kernel.

The HOST driver handles them properly.

As per SYCL specs 1.2.1,

The SYCL host device is a native C++ implementation of a device. It does not have an OpenCL cl_device_id and it will only appear in the available SYCL devices, as it is not an OpenCL device. It has full SYCL capabilities and reports them through the SYCL information retrieval interface. The SYCL host device is mandatory for every SYCL implementation and is always available, but may not achieve the same performance as an OpenCL CPU device. Any C++ application debugger can be used for debugging SYCL kernels executing on a SYCL host device.

When running on SYCL host device, it runs the device code as a native C++ implementation of the device. DPC++'s current implementation of host device uses TBB as its backend.

When using -ffast-math - the llvm backend works for sqrt but CRASHES (in the compilation pass) for cos and sin

This part is extremely intriguing. With the -ffast-math flag, the code builds and runs successfully for the sqrt function inside the kernel (even without sycl::sqrt!), whereas it fails for the sin and cos functions. I'm not sure what the -ffast-math flag is doing in the backend. I will check this with the team and get back to you at the earliest.

 

Regards,

Rahul

JamesR
New Contributor II

using namespace does apply within kernels; it is NOT limited to the host code.

The std:: functions you refer to are known to me - I put a table of them together for Chapter 17 of your upcoming book on SYCL and DPC++.

I've outlined a number of issues, and they are quite real.

Let's work to make sure they all get investigated.

Let me know if I can help.

- james

 

RahulV_intel
Moderator

Hi James,

That was just my assumption regarding the namespace (I did not find any reference in the SYCL 1.2.1 specification).

The issues that you have encountered are reproducible at my end too. I was in the process of verifying your claims and hence the delay.

I will escalate these issues to the concerned team.

Thanks for the thorough documentation.

 

--Rahul

Varsha_M_Intel
Employee

Hi James,

Regarding using the sycl namespace for math functions:

We have previously discussed using the sycl namespace before a call to a math function on the device.
1. On GPU, you cannot call an arbitrary function if it wasn't compiled for the device, so an external call to exp is unresolved.
2. The "using namespace" directive just helps the compiler find a function; it does not force the compiler to look only in that scope (otherwise "using namespace cl::sycl;" would break usage of std::cout).
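To illustrate point 2, here is a minimal sketch in plain C++ (no SYCL required). The namespace dev and the function root are hypothetical stand-ins, not real SYCL or DPC++ names: the global root(double) plays the role of a C library function such as ::sqrt(double), and dev::root plays the role of a templated builtin. Under these assumptions, an unqualified call after the using-directive is settled by ordinary overload resolution across both scopes, so only the qualified call is guaranteed to reach dev::root.

```cpp
// Hypothetical global function, standing in for a C library routine
// such as ::sqrt(double). It returns 0 to mark "global version chosen".
int root(double) { return 0; }

namespace dev {
// Hypothetical templated overload, standing in for a device builtin.
// It returns 1 to mark "dev version chosen".
template <typename T>
int root(T) { return 1; }
}  // namespace dev

// Makes dev::root visible to unqualified lookup, but does not hide ::root.
using namespace dev;

// Both candidates are exact matches for a double argument, so the
// non-template global function is preferred.
int call_with_double() { return root(2.0); }

// For a float argument the template is an exact match, while the global
// function needs a float-to-double promotion, so dev::root is chosen.
int call_with_float() { return root(2.0f); }

// Explicit qualification always selects the dev version.
int call_qualified() { return dev::root(2.0); }
```

With these declarations, call_with_double() selects the global function, while call_with_float() and call_qualified() select dev::root. Whether a particular SYCL implementation declares its builtins this way is not something this sketch claims; it only shows why a using-directive alone does not pin a call to one namespace.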

The ambiguity a using-directive can introduce is shown in this example:
 https://godbolt.org/z/8ntX-9 . Here a class "Example" is declared both in the global scope and in a namespace "A". It is then used in the main() function after "using namespace A;". The result is a compilation failure because the reference to "Example" is ambiguous.
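The linked example is not reproduced here. The following is a hypothetical reconstruction from the description above, written in plain C++; the member function where() and the helpers pick_global() and pick_a() are invented for illustration. The ambiguous line is left commented out so that the rest compiles.

```cpp
#include <string>

// The same class name declared in the global scope...
struct Example {
    std::string where() const { return "global"; }
};

// ...and again inside namespace A.
namespace A {
struct Example {
    std::string where() const { return "A"; }
};
}  // namespace A

using namespace A;

std::string pick_global() {
    // Example e;  // would not compile: reference to 'Example' is ambiguous
    ::Example e;   // explicit qualification selects the global class
    return e.where();
}

std::string pick_a() {
    A::Example e;  // explicit qualification selects the class in A
    return e.where();
}
```

Uncommenting the unqualified declaration reproduces the ambiguity error, which is the reason fully qualified names are the safer choice when two scopes offer the same name.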

Thanks,
Varsha

 

PrasanthD_intel
Moderator

Hi James,


Thanks for your patience. The issue you raised has been fixed in the latest oneAPI version, 2021.2. Please download it and let us know your experience with it.


PrasanthD_intel
Moderator

Hi James,


We haven't heard back from you. Please let us know your feedback after testing the latest version.


Regards

Prasanth


PrasanthD_intel
Moderator

Hi,


We are closing this thread as the issue has been resolved in the latest version.

We will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.


Regards

Prasanth

