Intel® oneAPI Math Kernel Library

MKL examples segmentation fault

Jose_M_Monsalve_Diaz

Hi everyone,

 

I am trying to run simple Intel MKL examples on the integrated GPU using OpenMP offload, but I am running into problems.

Machine:

  • Dell Precision 3630 Tower 
  • Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
  • 2 x 16 GB memory for a total of 32GB DDR4 2666MHz 
  • Ubuntu 20.04, kernel 5.4.0-48-generic 

Installed drivers and packages:

intel-gpu-tools/focal,now 1.25-2 amd64 [installed]
intel-hpckit-getting-started/all,all,now 2021.1-2205.beta09 all [installed,automatic]
intel-hpckit/all,now 2021.1-2205.beta09 amd64 [installed]
intel-level-zero-gpu/focal,now 1.0.17906+i374~u20.04 amd64 [installed]
intel-media-va-driver-non-free/focal,now 20.2.0+i374~u20.04 amd64 [installed]
intel-microcode/focal-updates,focal-security,now 3.20200609.0ubuntu0.20.04.2 amd64 [installed,automatic]
intel-oneapi-clck-2021.1-beta09/all,now 2021.1-760.beta09 amd64 [installed,automatic]
intel-oneapi-clck/all,now 2021.1-760.beta09 amd64 [installed,automatic]
intel-oneapi-common-licensing-2021.1-beta09/all,all,now 2021.1-541.beta09 all [installed,automatic]
intel-oneapi-common-licensing/all,all,now 2021.1-541.beta09 all [installed,automatic]
intel-oneapi-common-vars/all,all,now 2021.1-541.beta09 all [installed,automatic]
intel-oneapi-compiler-shared-2021.1-beta09/all,now 2021.1-2198.beta09 amd64 [installed,automatic]
intel-oneapi-compiler-shared-common-2021.1-beta09/all,all,now 2021.1-2198.beta09 all [installed,automatic]
intel-oneapi-compiler-shared-common-runtime-2021.1-beta09/all,all,now 2021.1-2198.beta09 all [installed,automatic]
intel-oneapi-compiler-shared-runtime-2021.1-beta09/all,now 2021.1-2198.beta09 amd64 [installed,automatic]
intel-oneapi-condaindex/all,now 2021.1-490.beta06 amd64 [installed,automatic]
intel-oneapi-dev-utilities-2021.1-beta09/all,now 2021.1-2165.beta09 amd64 [installed,automatic]
intel-oneapi-dev-utilities-eclipse-cfg/all,all,now 2021.1-2165.beta09 all [installed,automatic]
intel-oneapi-dev-utilities/all,now 2021.1-2165.beta09 amd64 [installed,automatic]
intel-oneapi-dpcpp-cpp-2021.1-beta09/all,now 2021.1-2198.beta09 amd64 [installed,automatic]
intel-oneapi-dpcpp-cpp-compiler-common-2021.1-beta09/all,all,now 2021.1-2198.beta09 all [installed,automatic]
intel-oneapi-dpcpp-cpp-compiler-pro-2021.1-beta09/all,now 2021.1-2198.beta09 amd64 [installed,automatic]
intel-oneapi-dpcpp-cpp-compiler-pro-common-2021.1-beta09/all,all,now 2021.1-2198.beta09 all [installed,automatic]
intel-oneapi-dpcpp-cpp-compiler-pro-eclipse-cfg/all,all,now 2021.1-2198.beta09 all [installed,automatic]
intel-oneapi-dpcpp-cpp-compiler-pro/all,now 2021.1-2198.beta09 amd64 [installed,automatic]
intel-oneapi-dpcpp-cpp-compiler-runtime-2021.1-beta09/all,now 2021.1-2198.beta09 amd64 [installed,automatic]
intel-oneapi-dpcpp-cpp-pro-fortran-compiler-shared-common-2021.1-beta09/all,all,now 2021.1-2198.beta09 all [installed,automatic]
intel-oneapi-dpcpp-cpp-pro-fortran-compiler-shared-runtime-2021.1-beta09/all,now 2021.1-2198.beta09 amd64 [installed,automatic]
intel-oneapi-ifort-2021.1-beta09/all,now 2021.1-2198.beta09 amd64 [installed,automatic]
intel-oneapi-ifort-common-2021.1-beta09/all,all,now 2021.1-2198.beta09 all [installed,automatic]
intel-oneapi-ifort-runtime-2021.1-beta09/all,now 2021.1-2198.beta09 amd64 [installed,automatic]
intel-oneapi-ifort/all,now 2021.1-2198.beta09 amd64 [installed,automatic]
intel-oneapi-inspector/all,now 2021.1-319.beta09 amd64 [installed,automatic]
intel-oneapi-itac-2021.1-beta09/all,now 2021.1-376.beta09 amd64 [installed,automatic]
intel-oneapi-itac/all,now 2021.1-376.beta09 amd64 [installed,automatic]
intel-oneapi-mkl-2021.1-beta09/all,now 2021.1-2089.beta09 amd64 [installed,automatic]
intel-oneapi-mkl-common-2021.1-beta09/all,all,now 2021.1-2089.beta09 all [installed,automatic]
intel-oneapi-mkl-common-devel-2021.1-beta09/all,all,now 2021.1-2089.beta09 all [installed,automatic]
intel-oneapi-mkl-devel-2021.1-beta09/all,now 2021.1-2089.beta09 amd64 [installed,automatic]
intel-oneapi-mkl-devel/all,now 2021.1-2089.beta09 amd64 [installed]
intel-oneapi-mkl/all,now 2021.1-2089.beta09 amd64 [installed]
intel-oneapi-mpi-2021.1-beta09/all,now 2021.1-2099.beta09 amd64 [installed,automatic]
intel-oneapi-mpi-devel-2021.1-beta09/all,now 2021.1-2099.beta09 amd64 [installed,automatic]
intel-oneapi-mpi-devel/all,now 2021.1-2099.beta09 amd64 [installed,automatic]
intel-oneapi-mpi/all,now 2021.1-2099.beta09 amd64 [installed,automatic]
intel-oneapi-openmp-2021.1-beta09/all,now 2021.1-2198.beta09 amd64 [installed,automatic]
intel-oneapi-openmp/all,now 2021.1-2198.beta09 amd64 [installed,automatic]
intel-oneapi-tbb-2021.1-beta09/all,now 2021.1-2193.beta09 amd64 [installed,automatic]
intel-oneapi-tbb-common-2021.1-beta09/all,all,now 2021.1-2193.beta09 all [installed,automatic]
intel-oneapi-tbb-common-devel-2021.1-beta09/all,all,now 2021.1-2193.beta09 all [installed,automatic]
intel-oneapi-tbb-devel-2021.1-beta09/all,now 2021.1-2193.beta09 amd64 [installed,automatic]
intel-oneapi-tbb-devel/all,now 2021.1-2193.beta09 amd64 [installed,automatic]
intel-oneapi-tbb/all,now 2021.1-2193.beta09 amd64 [installed,automatic]
intel-opencl-icd/focal,now 20.37.17906+i374~u20.04 amd64 [installed]
libdrm-intel1/focal,now 2.4.101+i374~u20.04 amd64 [installed,automatic]
level-zero/focal,now 1.0.0+i374~u20.04 amd64 [installed]

Everything was installed through the package manager.

The code:

It is one of the shipped examples, so there is not much to it. I have simplified it a little and it still fails. I have tried several other examples with the same result. Even the SYCL examples fail.

...

    float *a, *b, *c, alpha, beta;
    MKL_INT m, n, k, lda, ldb, ldc, i, j;
    MKL_INT sizea, sizeb, sizec;
    
    alpha = 1.0;
    beta = 1.0;

    m = 10;
    n = 10;
    k = 10;

    lda = m;
    ldb = k;
    ldc = m;

    sizea = lda * k;
    sizeb = ldb * n;
    sizec = ldc * n;
    
    // allocate matrices
    a = (float *)mkl_malloc((lda * k) * sizeof(float), 64);
    b = (float *)mkl_malloc(ldb * n * sizeof(float), 64);
    c = (float *)mkl_malloc(ldc * n * sizeof(float), 64);

    if ((a == NULL) || (b == NULL) || (c == NULL)) {
        printf("Cannot allocate matrices\n");
        return 1;
    }

    // initialize matrices
    init_single_array(lda * k, a, 1);
    init_single_array(ldb * n, b, 1);

#pragma omp target map(c[0:sizec])
    {
        for (i = 0; i < sizec; i++) {
            c[i] = 42;
        }
    }
    
    MKL_INT bound_m = (m > 10) ? 10 : m;
    MKL_INT bound_n = (n > 10) ? 10 : n;
    
#pragma omp target data map(to:a[0:sizea],b[0:sizeb]) map(tofrom:c[0:sizec]) device(dnum)
    {

        // run gemm on gpu, use standard oneMKL interface within a variant dispatch construct
#pragma omp target variant dispatch device(dnum) use_device_ptr(a, b, c)
        {
            sgemm("N", "N", &m, &n, &k, &alpha, a, &lda, b, &ldb, &beta, c, &ldc);
        }
        
    }
...

 

The compilation and run commands:

Again, these all come from the examples' makefile. I am just expanding them here to add -g.

> icx -c -g -DMKL_ILP64 -m64 -fiopenmp -fopenmp-targets=spir64 \
  -mllvm -vpo-paropt-use-raw-dev-ptr \
  -I/opt/intel/oneapi/mkl/2021.1-beta09/include \
  -Icommon blas/sgemm.c \
  -o _results/intel64_ilp64_sequential_so/blas/sgemm.o

> icx -g _results/intel64_ilp64_sequential_so/blas/sgemm.o  \
  -fiopenmp -fopenmp-targets=spir64 -mllvm -vpo-paropt-use-raw-dev-ptr \
  -L"/opt/intel/oneapi/mkl/2021.1-beta09/lib/intel64" \
  -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core  -lOpenCL \
  -lpthread -ldl -lm -o _results/intel64_ilp64_sequential_so/blas/sgemm.out

> LD_LIBRARY_PATH="/opt/intel/oneapi/mkl/2021.1-beta09/lib/intel64":\
    /opt/intel/oneapi/tbb/2021.1-beta09/env/../lib/intel64/gcc4.8:\
    /opt/intel/oneapi/mkl/2021.1-beta09/lib/intel64:\
    /opt/intel/oneapi/compiler/2021.1-beta09/linux/lib:\
    /opt/intel/oneapi/compiler/2021.1-beta09/linux/lib/x64:\
    /opt/intel/oneapi/compiler/2021.1-beta09/linux/lib/emu:\
    /opt/intel/oneapi/compiler/2021.1-beta09/linux/compiler/lib/intel64_lin:\
    /opt/intel/oneapi/compiler/2021.1-beta09/linux/compiler/lib:\
    /usr/local/cuda/lib64: \
    :"/opt/intel/oneapi/mkl/2021.1-beta09/../compiler/lib/intel64" \
    _results/intel64_ilp64_sequential_so/blas/sgemm.out

 

The stacktrace:

#0 0x0000000000000000 in ?? ()
#1 0x00007fffedb1c8ae in MKL_CL_Create_Handle () from /opt/intel/oneapi/mkl/2021.1-beta09/lib/intel64/libmkl_core.so
#2 0x00007fffedb36ba9 in mkl_cblas_sgemm_cl_offload_ilp64 () from /opt/intel/oneapi/mkl/2021.1-beta09/lib/intel64/libmkl_core.so
#3 0x0000000000401fbf in main.DIR.OMP.TARGET.DATA.629.split () at blas/sgemm.c:84
#4 0x0000000000401879 in main () at blas/sgemm.c:78

 

Last lines of the strace output:

Just in case this is helpful.

...
mprotect(0x7ff0b7fe9000, 28672, PROT_READ) = 0
openat(AT_FDCWD, "/opt/intel/oneapi/mkl/2021.1-beta09/lib/intel64/libmkl_vml_avx2.so", O_RDONLY|O_CLOEXEC) = 4
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`%\5\0\0\0\0\0"..., 832) = 832
fstat(4, {st_mode=S_IFREG|0755, st_size=14731352, ...}) = 0
mmap(NULL, 16379656, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4, 0) = 0x7ff0bcf9d000
mprotect(0x7ff0bdd1e000, 2097152, PROT_NONE) = 0
mmap(0x7ff0bdf1e000, 118784, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 4, 0xd81000) = 0x7ff0bdf1e000
mmap(0x7ff0bdf3b000, 3848, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7ff0bdf3b000
close(4)                                = 0
mprotect(0x7ff0bdf1e000, 16384, PROT_READ) = 0
ioctl(3, DRM_IOCTL_I915_GEM_USERPTR, 0x7ffcf899eca0) = 0
ioctl(3, DRM_IOCTL_I915_GEM_USERPTR, 0x7ffcf899eca0) = 0
ioctl(3, DRM_IOCTL_I915_GEM_USERPTR, 0x7ffcf899eca0) = 0
ioctl(3, DRM_IOCTL_I915_GEM_USERPTR, 0x7ffcf899ef80) = 0
ioctl(3, DRM_IOCTL_I915_GEM_USERPTR, 0x7ffcf899ef80) = 0
openat(AT_FDCWD, "/opt/intel/oneapi/tbb/2021.1-beta09/env/../lib/intel64/gcc4.8/libOpenCL.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/opt/intel/oneapi/mkl/2021.1-beta09/lib/intel64/libOpenCL.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/opt/intel/oneapi/compiler/2021.1-beta09/linux/lib/libOpenCL.so", O_RDONLY|O_CLOEXEC) = 4
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 832) = 832
fstat(4, {st_mode=S_IFREG|0755, st_size=39848, ...}) = 0
close(4)                                = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
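
The openat lines above show the dynamic loader trying each LD_LIBRARY_PATH entry in order until it finds libOpenCL.so under the compiler's lib directory. A rough sketch of that lookup, useful for checking which copy would be picked up on a given machine. The helper name find_in_path is made up for this illustration, and it only mirrors the LD_LIBRARY_PATH part of the search (RPATH/RUNPATH and ld.so.cache are ignored):

```shell
# Print the first directory in a colon-separated search path that contains
# the given library file. Only the LD_LIBRARY_PATH-style search is mirrored;
# RPATH/RUNPATH and ld.so.cache are not considered.
find_in_path() {
    lib=$1
    search=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search; do
        # Empty entries mean "current directory" to the loader; skip them here.
        [ -n "$dir" ] || continue
        if [ -e "$dir/$lib" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

find_in_path libOpenCL.so "${LD_LIBRARY_PATH:-}" \
    || echo "libOpenCL.so not found on LD_LIBRARY_PATH"
```

Running this on both the working and the failing machine would show whether they resolve libOpenCL.so from the same place.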

 

Output with LIBOMPTARGET_DEBUG=1:

 

Libomptarget --> TargetOffloadPolicy = DEFAULT
Libomptarget --> Initialized OMPT
Libomptarget --> Loading RTLs...
Libomptarget --> Loading library 'libomptarget.rtl.level0.so'...
Target LEVEL0 RTL --> Target device type is set to GPU
Target LEVEL0 RTL --> omp_get_thread_limit() returned 2147483647
Target LEVEL0 RTL --> omp_get_max_teams() returned 0
Libomptarget --> Successfully loaded library 'libomptarget.rtl.level0.so'!
Libomptarget --> Optional interface: __tgt_rtl_data_submit_nowait
Libomptarget --> Optional interface: __tgt_rtl_data_retrieve_nowait
Libomptarget --> Optional interface: __tgt_rtl_manifest_data_for_region
Libomptarget --> Optional interface: __tgt_rtl_data_alloc_base
Libomptarget --> Optional interface: __tgt_rtl_data_alloc_user
Libomptarget --> Optional interface: __tgt_rtl_create_buffer
Libomptarget --> Optional interface: __tgt_rtl_release_buffer
Libomptarget --> Optional interface: __tgt_rtl_run_target_team_nd_region
Libomptarget --> Optional interface: __tgt_rtl_run_target_region_nowait
Libomptarget --> Optional interface: __tgt_rtl_run_target_team_region_nowait
Libomptarget --> Optional interface: __tgt_rtl_run_target_team_nd_region_nowait
Libomptarget --> Optional interface: __tgt_rtl_create_offload_queue
Libomptarget --> Optional interface: __tgt_rtl_release_offload_queue
Libomptarget --> Optional interface: __tgt_rtl_get_platform_handle
Libomptarget --> Optional interface: __tgt_rtl_get_device_handle
Libomptarget --> Optional interface: __tgt_rtl_data_alloc_managed
Libomptarget --> Optional interface: __tgt_rtl_data_delete_managed
Libomptarget --> Optional interface: __tgt_rtl_is_managed_ptr
Libomptarget --> Optional interface: __tgt_rtl_data_alloc_explicit
Libomptarget --> Optional interface: __tgt_rtl_init_ompt
Target LEVEL0 RTL --> Looking for Level0 devices...
Target LEVEL0 RTL --> Initialized L0, API 65536
Target LEVEL0 RTL --> Found 1 driver(s)!
Target LEVEL0 RTL --> Found a GPU device, Name = Intel(R) Gen9
Target LEVEL0 RTL --> Found 1 available devices.
Target LEVEL0 RTL --> Initialized OMPT
Libomptarget --> Registering RTL libomptarget.rtl.level0.so supporting 1 devices!
Libomptarget --> Loading library 'libomptarget.rtl.opencl.so'...
Target OPENCL RTL --> omp_get_thread_limit() returned 2147483647
Target OPENCL RTL --> omp_get_max_teams() returned 0
Target OPENCL RTL --> Target device type is set to GPU
Libomptarget --> Successfully loaded library 'libomptarget.rtl.opencl.so'!
Libomptarget --> Optional interface: __tgt_rtl_data_submit_nowait
Libomptarget --> Optional interface: __tgt_rtl_data_retrieve_nowait
Libomptarget --> Optional interface: __tgt_rtl_manifest_data_for_region
Libomptarget --> Optional interface: __tgt_rtl_data_alloc_base
Libomptarget --> Optional interface: __tgt_rtl_data_alloc_user
Libomptarget --> Optional interface: __tgt_rtl_create_buffer
Libomptarget --> Optional interface: __tgt_rtl_get_device_name
Libomptarget --> Optional interface: __tgt_rtl_release_buffer
Libomptarget --> Optional interface: __tgt_rtl_run_target_team_nd_region
Libomptarget --> Optional interface: __tgt_rtl_run_target_region_nowait
Libomptarget --> Optional interface: __tgt_rtl_run_target_team_region_nowait
Libomptarget --> Optional interface: __tgt_rtl_run_target_team_nd_region_nowait
Libomptarget --> Optional interface: __tgt_rtl_create_offload_queue
Libomptarget --> Optional interface: __tgt_rtl_release_offload_queue
Libomptarget --> Optional interface: __tgt_rtl_get_platform_handle
Libomptarget --> Optional interface: __tgt_rtl_data_alloc_managed
Libomptarget --> Optional interface: __tgt_rtl_data_delete_managed
Libomptarget --> Optional interface: __tgt_rtl_is_managed_ptr
Libomptarget --> Optional interface: __tgt_rtl_data_alloc_explicit
Libomptarget --> Optional interface: __tgt_rtl_init_ompt
Target OPENCL RTL --> Start initializing OpenCL
Target OPENCL RTL --> Platform OpenCL 2.1  has 1 Devices
Target OPENCL RTL --> Device 0: Intel(R) Gen9 HD Graphics NEO
Target OPENCL RTL --> Number of execution units on the device is 24
Target OPENCL RTL --> Maximum work group size for the device is 256
Target OPENCL RTL --> Addressing mode is 64 bit
Target OPENCL RTL --> Device local mem size: 65536
Target OPENCL RTL --> Initialized OMPT
Libomptarget --> Registering RTL libomptarget.rtl.opencl.so supporting 1 devices!
Libomptarget --> Loading library 'libomptarget.rtl.ve.so'...
Libomptarget --> Unable to load library 'libomptarget.rtl.ve.so': libomptarget.rtl.ve.so: cannot open shared object file: No such file or directory!
Libomptarget --> Loading library 'libomptarget.rtl.ppc64.so'...
Libomptarget --> Unable to load library 'libomptarget.rtl.ppc64.so': libomptarget.rtl.ppc64.so: cannot open shared object file: No such file or directory!
Libomptarget --> Loading library 'libomptarget.rtl.x86_64.so'...
Libomptarget --> Unable to load library 'libomptarget.rtl.x86_64.so': libffi.so.6: cannot open shared object file: No such file or directory!
Libomptarget --> Loading library 'libomptarget.rtl.cuda.so'...
Libomptarget --> Unable to load library 'libomptarget.rtl.cuda.so': libomptarget.rtl.cuda.so: cannot open shared object file: No such file or directory!
Libomptarget --> Loading library 'libomptarget.rtl.aarch64.so'...
Libomptarget --> Unable to load library 'libomptarget.rtl.aarch64.so': libomptarget.rtl.aarch64.so: cannot open shared object file: No such file or directory!
Libomptarget --> RTLs loaded!
Target LEVEL0 RTL --> Target binary is VALID
Libomptarget --> Image 0x000000000040b960 is compatible with RTL libomptarget.rtl.level0.so!
Libomptarget --> RTL 0x0000000001ac6150 has index 0!
Libomptarget --> Registering image 0x000000000040b960 with RTL libomptarget.rtl.level0.so!
Libomptarget --> Done registering entries!
Libomptarget --> Call to omp_get_num_devices returning 1
Libomptarget --> Default TARGET OFFLOAD policy is now mandatory (devices were found)
Libomptarget --> Entering target region with entry point 0x000000000040b186 and device Id -1
Libomptarget --> Checking whether device 0 is ready.
Libomptarget --> Is the device 0 (local ID 0) initialized? 0
Target LEVEL0 RTL --> Initialize requires flags to 1
Target LEVEL0 RTL --> Initialized Level0 device 0
Libomptarget --> Device 0 is ready to use.
Target LEVEL0 RTL --> Device 0: Loading binary from 0x000000000040b960
Target LEVEL0 RTL --> Expecting to have 1 entries defined
Target LEVEL0 RTL --> Module compilation options: -cl-std=CL2.0 
Target LEVEL0 RTL --> Created a module for libomp-fallback-cassert.spv
Target LEVEL0 RTL --> Created a module for libomp-fallback-cmath.spv
Target LEVEL0 RTL --> Created a module for libomp-fallback-cmath-fp64.spv
Target LEVEL0 RTL --> Created a module for libomp-fallback-complex.spv
Target LEVEL0 RTL --> Created a module for libomp-fallback-complex-fp64.spv
Target LEVEL0 RTL --> Allocated a shared memory object 0x00000000022d9000
Libomptarget --> Entry  0: Base=0x0000000001fe3640, Begin=0x0000000001fe3640, Size=400, Type=0x23
Libomptarget --> Entry  1: Base=0x00007fffeed03a90, Begin=0x00007fffeed03a90, Size=8, Type=0x21
Libomptarget --> Entry  2: Base=0x00007fffeed03a48, Begin=0x00007fffeed03a48, Size=8, Type=0x21
Libomptarget --> Looking up mapping(HstPtrBegin=0x0000000001fe3640, Size=400)...
Target LEVEL0 RTL --> Allocated a shared memory object 0x00000000022d9000
Target LEVEL0 RTL --> Allocated target memory 0x00000000022d9000 (Base: 0x00000000022d9000, Size: 65536) for host ptr 0x0000000001fe3640
Libomptarget --> Creating new map entry: HstBase=0x0000000001fe3640, HstBegin=0x0000000001fe3640, HstEnd=0x0000000001fe37d0, TgtBegin=0x00000000022d9000
Libomptarget --> There are 400 bytes allocated at target address 0x00000000022d9000 - is new
Libomptarget --> Moving 400 bytes (hst:0x0000000001fe3640) -> (tgt:0x00000000022d9000)
Target LEVEL0 RTL --> Copied 400 bytes (hst:0x0000000001fe3640) -> (tgt:0x00000000022d9000)
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffeed03a90, Size=8)...
Target LEVEL0 RTL --> Allocated a shared memory object 0x0000000001fe9000
Target LEVEL0 RTL --> Allocated target memory 0x0000000001fe9000 (Base: 0x0000000001fe9000, Size: 65536) for host ptr 0x00007fffeed03a90
Libomptarget --> Creating new map entry: HstBase=0x00007fffeed03a90, HstBegin=0x00007fffeed03a90, HstEnd=0x00007fffeed03a98, TgtBegin=0x0000000001fe9000
Libomptarget --> There are 8 bytes allocated at target address 0x0000000001fe9000 - is new
Libomptarget --> Moving 8 bytes (hst:0x00007fffeed03a90) -> (tgt:0x0000000001fe9000)
Target LEVEL0 RTL --> Copied 8 bytes (hst:0x00007fffeed03a90) -> (tgt:0x0000000001fe9000)
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffeed03a48, Size=8)...
Target LEVEL0 RTL --> Allocated a shared memory object 0x0000000002a95000
Target LEVEL0 RTL --> Allocated target memory 0x0000000002a95000 (Base: 0x0000000002a95000, Size: 65536) for host ptr 0x00007fffeed03a48
Libomptarget --> Creating new map entry: HstBase=0x00007fffeed03a48, HstBegin=0x00007fffeed03a48, HstEnd=0x00007fffeed03a50, TgtBegin=0x0000000002a95000
Libomptarget --> There are 8 bytes allocated at target address 0x0000000002a95000 - is new
Libomptarget --> Moving 8 bytes (hst:0x00007fffeed03a48) -> (tgt:0x0000000002a95000)
Target LEVEL0 RTL --> Copied 8 bytes (hst:0x00007fffeed03a48) -> (tgt:0x0000000002a95000)
Libomptarget --> Looking up mapping(HstPtrBegin=0x0000000001fe3640, Size=400)...
Libomptarget --> Mapping exists with HstPtrBegin=0x0000000001fe3640, TgtPtrBegin=0x00000000022d9000, Size=400, RefCount=1
Libomptarget --> Obtained target argument (Begin: 0x00000000022d9000, Offset: 0) from host pointer 0x0000000001fe3640
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffeed03a90, Size=8)...
Libomptarget --> Mapping exists with HstPtrBegin=0x00007fffeed03a90, TgtPtrBegin=0x0000000001fe9000, Size=8, RefCount=1
Libomptarget --> Obtained target argument (Begin: 0x0000000001fe9000, Offset: 0) from host pointer 0x00007fffeed03a90
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffeed03a48, Size=8)...
Libomptarget --> Mapping exists with HstPtrBegin=0x00007fffeed03a48, TgtPtrBegin=0x0000000002a95000, Size=8, RefCount=1
Libomptarget --> Obtained target argument (Begin: 0x0000000002a95000, Offset: 0) from host pointer 0x00007fffeed03a48
Libomptarget --> Launching target execution __omp_offloading_801_1a056af__Z4main_l64 with pointer 0x00000000027c4940 (index=0).
Libomptarget --> Manifesting used target pointers:
Target LEVEL0 RTL --> Executing a kernel 0x00000000027c4940...
Target LEVEL0 RTL --> Kernel argument 0 (value: 0x00000000022d9000) was set successfully
Target LEVEL0 RTL --> Kernel argument 1 (value: 0x0000000001fe9000) was set successfully
Target LEVEL0 RTL --> Kernel argument 2 (value: 0x0000000002a95000) was set successfully
Target LEVEL0 RTL --> Assumed kernel SIMD width is 32
Target LEVEL0 RTL --> Max group count is set to 1 (num_teams clause or no teams construct)
Target LEVEL0 RTL --> Group sizes = {32, 1, 1}
Target LEVEL0 RTL --> Group counts = {1, 1, 1}
Target LEVEL0 RTL --> Executed a kernel 0x00000000027c4940
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffeed03a48, Size=8)...
Libomptarget --> Mapping exists with HstPtrBegin=0x00007fffeed03a48, TgtPtrBegin=0x0000000002a95000, Size=8, updated RefCount=1
Libomptarget --> There are 8 bytes allocated at target address 0x0000000002a95000 - is last
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffeed03a90, Size=8)...
Libomptarget --> Mapping exists with HstPtrBegin=0x00007fffeed03a90, TgtPtrBegin=0x0000000001fe9000, Size=8, updated RefCount=1
Libomptarget --> There are 8 bytes allocated at target address 0x0000000001fe9000 - is last
Libomptarget --> Looking up mapping(HstPtrBegin=0x0000000001fe3640, Size=400)...
Libomptarget --> Mapping exists with HstPtrBegin=0x0000000001fe3640, TgtPtrBegin=0x00000000022d9000, Size=400, updated RefCount=1
Libomptarget --> There are 400 bytes allocated at target address 0x00000000022d9000 - is last
Libomptarget --> Moving 400 bytes (tgt:0x00000000022d9000) -> (hst:0x0000000001fe3640)
Target LEVEL0 RTL --> Copied 400 bytes (tgt:0x00000000022d9000) -> (hst:0x0000000001fe3640)
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffeed03a48, Size=8)...
Libomptarget --> Deleting tgt data 0x0000000002a95000 of size 8
Target LEVEL0 RTL --> Deleted device memory 0x0000000002a95000 (Base: 0x0000000002a95000, Size: 65536)
Libomptarget --> Removing mapping with HstPtrBegin=0x00007fffeed03a48, TgtPtrBegin=0x0000000002a95000, Size=8
Libomptarget --> Looking up mapping(HstPtrBegin=0x00007fffeed03a90, Size=8)...
Libomptarget --> Deleting tgt data 0x0000000001fe9000 of size 8
Target LEVEL0 RTL --> Deleted device memory 0x0000000001fe9000 (Base: 0x0000000001fe9000, Size: 65536)
Libomptarget --> Removing mapping with HstPtrBegin=0x00007fffeed03a90, TgtPtrBegin=0x0000000001fe9000, Size=8
Libomptarget --> Looking up mapping(HstPtrBegin=0x0000000001fe3640, Size=400)...
Libomptarget --> Deleting tgt data 0x00000000022d9000 of size 400
Target LEVEL0 RTL --> Deleted device memory 0x00000000022d9000 (Base: 0x00000000022d9000, Size: 65536)
Libomptarget --> Removing mapping with HstPtrBegin=0x0000000001fe3640, TgtPtrBegin=0x00000000022d9000, Size=400
Libomptarget --> Entering data begin region for device 0 with 3 mappings
Libomptarget --> Checking whether device 0 is ready.
Libomptarget --> Is the device 0 (local ID 0) initialized? 1
Libomptarget --> Device 0 is ready to use.
Libomptarget --> Entry  0: Base=0x0000000001e5c480, Begin=0x0000000001e5c480, Size=400, Type=0x21
Libomptarget --> Entry  1: Base=0x0000000001fe3240, Begin=0x0000000001fe3240, Size=400, Type=0x21
Libomptarget --> Entry  2: Base=0x0000000001fe3640, Begin=0x0000000001fe3640, Size=400, Type=0x23
Libomptarget --> Looking up mapping(HstPtrBegin=0x0000000001e5c480, Size=400)...
Target LEVEL0 RTL --> Allocated a shared memory object 0x00000000022d9000
Target LEVEL0 RTL --> Allocated target memory 0x00000000022d9000 (Base: 0x00000000022d9000, Size: 65536) for host ptr 0x0000000001e5c480
Libomptarget --> Creating new map entry: HstBase=0x0000000001e5c480, HstBegin=0x0000000001e5c480, HstEnd=0x0000000001e5c610, TgtBegin=0x00000000022d9000
Libomptarget --> There are 400 bytes allocated at target address 0x00000000022d9000 - is new
Libomptarget --> Moving 400 bytes (hst:0x0000000001e5c480) -> (tgt:0x00000000022d9000)
Target LEVEL0 RTL --> Copied 400 bytes (hst:0x0000000001e5c480) -> (tgt:0x00000000022d9000)
Libomptarget --> Looking up mapping(HstPtrBegin=0x0000000001fe3240, Size=400)...
Target LEVEL0 RTL --> Allocated a shared memory object 0x0000000001fe9000
Target LEVEL0 RTL --> Allocated target memory 0x0000000001fe9000 (Base: 0x0000000001fe9000, Size: 65536) for host ptr 0x0000000001fe3240
Libomptarget --> Creating new map entry: HstBase=0x0000000001fe3240, HstBegin=0x0000000001fe3240, HstEnd=0x0000000001fe33d0, TgtBegin=0x0000000001fe9000
Libomptarget --> There are 400 bytes allocated at target address 0x0000000001fe9000 - is new
Libomptarget --> Moving 400 bytes (hst:0x0000000001fe3240) -> (tgt:0x0000000001fe9000)
Target LEVEL0 RTL --> Copied 400 bytes (hst:0x0000000001fe3240) -> (tgt:0x0000000001fe9000)
Libomptarget --> Looking up mapping(HstPtrBegin=0x0000000001fe3640, Size=400)...
Target LEVEL0 RTL --> Allocated a shared memory object 0x0000000002a95000
Target LEVEL0 RTL --> Allocated target memory 0x0000000002a95000 (Base: 0x0000000002a95000, Size: 65536) for host ptr 0x0000000001fe3640
Libomptarget --> Creating new map entry: HstBase=0x0000000001fe3640, HstBegin=0x0000000001fe3640, HstEnd=0x0000000001fe37d0, TgtBegin=0x0000000002a95000
Libomptarget --> There are 400 bytes allocated at target address 0x0000000002a95000 - is new
Libomptarget --> Moving 400 bytes (hst:0x0000000001fe3640) -> (tgt:0x0000000002a95000)
Target LEVEL0 RTL --> Copied 400 bytes (hst:0x0000000001fe3640) -> (tgt:0x0000000002a95000)
Libomptarget --> Checking whether device 0 is ready.
Libomptarget --> Is the device 0 (local ID 0) initialized? 1
Libomptarget --> Device 0 is ready to use.
Libomptarget --> Call to omp_get_num_devices returning 1
Libomptarget --> Call to __tgt_create_interop_obj with device_id 0, is_async false, async_obj 0x0000000000000000
Libomptarget --> Checking whether device 0 is ready.
Libomptarget --> Is the device 0 (local ID 0) initialized? 1
Libomptarget --> Device 0 is ready to use.
Target LEVEL0 RTL --> __tgt_rtl_create_offload_queue returns a new asynchronous command queue 0x00000000022b2d60
Libomptarget --> Entering data begin region for device 0 with 3 mappings
Libomptarget --> Checking whether device 0 is ready.
Libomptarget --> Is the device 0 (local ID 0) initialized? 1
Libomptarget --> Device 0 is ready to use.
Libomptarget --> Entry  0: Base=0x0000000001e5c480, Begin=0x0000000001e5c480, Size=0, Type=0x60
Libomptarget --> Entry  1: Base=0x0000000001fe3240, Begin=0x0000000001fe3240, Size=0, Type=0x60
Libomptarget --> Entry  2: Base=0x0000000001fe3640, Begin=0x0000000001fe3640, Size=0, Type=0x60
Libomptarget --> Looking up mapping(HstPtrBegin=0x0000000001e5c480, Size=0)...
Libomptarget --> Mapping exists with HstPtrBegin=0x0000000001e5c480, TgtPtrBegin=0x00000000022d9000, Size=0, updated RefCount=2
Libomptarget --> There are 0 bytes allocated at target address 0x00000000022d9000 - is not new
Libomptarget --> Returning device pointer 0x00000000022d9000
Libomptarget --> Looking up mapping(HstPtrBegin=0x0000000001fe3240, Size=0)...
Libomptarget --> Mapping exists with HstPtrBegin=0x0000000001fe3240, TgtPtrBegin=0x0000000001fe9000, Size=0, updated RefCount=2
Libomptarget --> There are 0 bytes allocated at target address 0x0000000001fe9000 - is not new
Libomptarget --> Returning device pointer 0x0000000001fe9000
Libomptarget --> Looking up mapping(HstPtrBegin=0x0000000001fe3640, Size=0)...
Libomptarget --> Mapping exists with HstPtrBegin=0x0000000001fe3640, TgtPtrBegin=0x0000000002a95000, Size=0, updated RefCount=2
Libomptarget --> There are 0 bytes allocated at target address 0x0000000002a95000 - is not new
Libomptarget --> Returning device pointer 0x0000000002a95000
Libomptarget --> Call to __tgt_get_interop_property with interop_obj 0x000000000209ac40, property_id 2
Libomptarget --> Call to __tgt_get_interop_property with interop_obj 0x000000000209ac40, property_id 5
Libomptarget --> Call to __tgt_get_interop_property with interop_obj 0x000000000209ac40, property_id 6
Segmentation fault (core dumped)
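
In a trace this long, the line that matters most for choosing a backend is the "Registering image ... with RTL ..." entry, which names the plugin the offload image was bound to. A small sketch for pulling it out of a saved trace. The function name rtl_for_image and the file name trace.log are assumptions made for this example:

```shell
# Print the RTL plugin named in each "Registering image ... with RTL ...!"
# line of a saved LIBOMPTARGET_DEBUG=1 trace.
rtl_for_image() {
    sed -n 's/.*Registering image .* with RTL \(.*\)!$/\1/p' "$1"
}

# Hypothetical file name: a trace captured with something like
#   LIBOMPTARGET_DEBUG=1 ./sgemm.out > trace.log 2>&1
if [ -f trace.log ]; then
    rtl_for_image trace.log
fi
```

For the trace shown above, this prints libomptarget.rtl.level0.so, which means the run went through the Level Zero plugin.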

 

Other systems:

I have run the same code on a different system (based on an Intel(R) Xeon(R) CPU E3-1585 v5 @ 3.50GHz) and it works there. It is only on my desktop that I cannot get it to work.

 

Any suggestions would be much appreciated. If any more information is needed, please let me know.

 

Thanks

1 Solution
Jose_M_Monsalve_Diaz

I was able to get help from another person within my organization.

The solution was as simple as changing LIBOMPTARGET_PLUGIN. The Level Zero (L0) plugin does not work on this system.

Taken from my conversation:

When I looked at the LIBOMPTARGET_DEBUG trace you put on the forum, it looks like it’s using the L0 backend: Libomptarget --> Loading library 'libomptarget.rtl.level0.so'...

So it’s worth a shot to export LIBOMPTARGET_PLUGIN=OPENCL on your system.

That last part did the trick.
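
For anyone hitting the same crash, the fix amounts to setting one environment variable before launching the program. A minimal sketch for a POSIX shell, assuming the oneAPI library paths are already set up as in the question and reusing the binary path from the link command there (adjust it for your own build):

```shell
# Ask libomptarget to use the OpenCL plugin instead of Level Zero.
export LIBOMPTARGET_PLUGIN=OPENCL

# Path of the example binary as produced by the link command in the question.
app=_results/intel64_ilp64_sequential_so/blas/sgemm.out

# Rerun with the runtime's debug trace enabled to confirm which plugin is used.
if [ -x "$app" ]; then
    LIBOMPTARGET_DEBUG=1 "$app"
else
    echo "binary not found: $app" >&2
fi
```

The export only affects the current shell session. To make it permanent it would have to go into a shell startup file.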

4 Replies
PrasanthD_intel
Moderator

Hi Jose,

 

There is a dedicated forum for MKL (https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/bd-p/oneapi-math-kernel-library). We are moving this thread to that forum.

Regards

Prasanth

 

Jose_M_Monsalve_Diaz

Sorry about that, I didn't see it at first. Thanks for moving the post. 

Gennady_F_Intel
Moderator

Thanks for the update. This issue is being closed and we will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.

