sirgienko
Employee

DPC++ crashes inside LLVM code at runtime if the kernel doesn't use the output buffer

The title describes most of the problem; here are more details.

In a lambda submitted to a queue, we often create an accessor for an output buffer, like this:

auto addend_1_accessor = addend_1_buf.template get_access<sycl_read>(cgh);
auto addend_2_accessor = addend_2_buf.template get_access<sycl_read>(cgh);

// This is the output buffer (it is opened with write access)
auto sum_accessor = sum_buf.template get_access<sycl_write>(cgh);

In the kernel code, for example inside parallel_for, we then write to it:

cgh.parallel_for<class VectorAdd>(num_items, [=](id<1> wiID) {
    sum_accessor[wiID] = addend_1_accessor[wiID] + addend_2_accessor[wiID];
});

The problem is that if you remove the write to the output buffer, like this:

cgh.parallel_for<class VectorAdd>(num_items, [=](id<1> wiID) {
    int sum = addend_1_accessor[wiID] + addend_2_accessor[wiID];
});

then the code crashes at runtime with an error inside LLVM code in the DPC++ libOclCpuBackEnd library.

A minimal reproducer is attached. The archive also contains a file with build instructions, the full backtrace, and a system description.

RahulV_intel
Moderator

Hi,

 

It looks like the shared object (.so) file has something to do with this error. The error is reproducible with the CPU device (cpu_selector). However, it works fine with the GPU device (gpu_selector); I tried it on a Gen9 Intel iGPU.

 

Without the dynamic library, the program (even without using the output buffer) compiles and runs fine with the CPU selector as well. Refer to the code below.

 

#include <CL/sycl.hpp>
#include <iostream>
#include <vector>

#define size 1024

void vecAdd(std::vector<int> &A, std::vector<int> &B, std::vector<int> &C) {
    //cl::sycl::queue q(cl::sycl::gpu_selector{});
    cl::sycl::queue q(cl::sycl::cpu_selector{});
    cl::sycl::range<1> R(size);
    std::cout<< "Running on: "<<q.get_device().get_info<cl::sycl::info::device::name>()<<"\n";
    {
    cl::sycl::buffer<int,1> buffA(A.data(),R);
    cl::sycl::buffer<int,1> buffB(B.data(),R);
    cl::sycl::buffer<int,1> buffC(C.data(),R);
    q.submit([&](cl::sycl::handler &cgh) {
            auto acc_buffA = buffA.get_access<cl::sycl::access::mode::read>(cgh);
            auto acc_buffB = buffB.get_access<cl::sycl::access::mode::read>(cgh);
            auto acc_buffC = buffC.get_access<cl::sycl::access::mode::write>(cgh);
            //cl::sycl::stream out(10240, 25, cgh);
            cgh.parallel_for(R,[=](cl::sycl::id<1> it) {
                auto id = it[0];
                //auto id = it.get_global_linear_id();
                //out << it.get_global_linear_id()<< cl::sycl::endl;
                // The write to the output buffer is intentionally disabled to mirror the reported case.
                //acc_buffC[id] = acc_buffB[id] + acc_buffA[id];
                int sum = acc_buffB[id] + acc_buffA[id];
            }
            );
        }
    ).wait();
    }
}

int main() {
    std::vector<int> A(size),B(size),C(size);
    for(int i=0;i<size;i++) {
        A[i] = i;
        B[i] = size - i;
    }
    vecAdd(A,B,C);
    std::vector<int> C_chk(size,size);
    (C==C_chk) ? std::cout << "Success\n" : std::cout<<"Failure\n";
    return 0;
}
//To compile and run
dpcpp vecadd_sample.cpp && ./a.out

 

 

I need to investigate this issue (with the dynamic library) a bit more and will get back to you as soon as possible.

 

Thanks,

Rahul

 

RahulV_intel
Moderator

Hi,


I have escalated this issue as a bug to the relevant team. Thanks for reporting it.


--Rahul

