GPU Compute Software
Ask questions about Intel® Graphics Compute software technologies, such as OpenCL* GPU driver and oneAPI Level Zero
191 Discussions

OpenCL Kernel Name appends ".1"

Scott_S_2
Novice
4,784 Views

Hi,

I've been compiling some OpenCL kernels but I've been getting CL_INVALID_KERNEL_NAME errors when generating a program and calling clCreateKernel

heres the example kernel:

 

__kernel void window_level(__read_only image2d_t srcImg,__write_only image2d_t dstImg,float window_bottom,float width_rcp)
{
    int2 coord = (int2) ((int)get_global_id(0), (int)get_global_id(1));

    int2 size = get_image_dim(srcImg);

    if(coord.x >= size.x || coord.y >= size.y){
    return;
    } 
 
    sampler_t sampler = CLK_FILTER_NEAREST | CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP; 
 
    int4 src_value = read_imagei(srcImg, sampler, coord); 
 
    int4 dst_value;
    dst_value.x = (int)((src_value.x - window_bottom) * width_rcp);
 
    if(dst_value.x > 255){ dst_value.x = 255; }
    if(dst_value.x < 0){ dst_value.x = 0; }

    write_imagei(dstImg, coord, dst_value); 
}

 

and the corresponding generation calls:

 

clang -target spirv64 -emit-llvm -O0 -c -o window_level.bc window_level.clcpp
llvm-spirv -spirv-max-version=1.2 -o window_level.spv window_level.bc 

 

This generates window_level.spv which seems to have the kernel name "window_level" when I inspect the SPRIV with https://www.khronos.org/spir/visualizer/

 

[0]  OpCapability Addresses
[1]  OpCapability Linkage
[2]  OpCapability Kernel
[3]  OpCapability Int64
[4]  OpCapability ImageBasic
[5]  OpCapability LiteralSampler
[6]  %1 = OpExtInstImport "OpenCL.std"
[7]  OpMemoryModel Physical64 OpenCL
[8]  OpEntryPoint Kernel %101 "window_level" %5
[9]  OpExecutionMode %101 ContractionOff
Debug Information
[10]  OpSource OpenCL_C 200000
[11]  OpName %5 "__spirv_BuiltInGlobalInvocationId"
[12]  OpName %11 "window_level"
[13]  OpName %74 "TempSampledImage"
Annotations
[14]  OpDecorate %5 LinkageAttributes "__spirv_BuiltInGlobalInvocationId" Import
[15]  OpDecorate %5 Constant
[16]  OpDecorate %5 BuiltIn GlobalInvocationId
[17]  OpDecorate %11 LinkageAttributes "window_level" Export
...

 

However when I do

 

cl_int ret = clBuildProgram(program, 1, &device_id[0], NULL, NULL, NULL);
cl_kernel kernel = clCreateKernel(program, "window_level", &ret);

 

I get a CL_INVALID_KERNEL_NAME error.  Turns out that a ".1" has been appended to the kernel name and I have to call clCreateKernel with "window_level.1" to find the correct kernel name.

Before "window_level" worked fine.  Is there something I'm missing here?  Where does the ".1" come from?

Thanks
Scott


System Setup Information:
-----------------------------------------

System Used: Inspiron 3670
CPU SKU: Intel(R) Core(TM) i3-8100 CPU @ 3.60GHz 3.60 GHz
GPU SKU: Intel(R) UHD Graphics 630
System BIOS Version: Dell Inc. 1.3.4, 24/05/2018
Graphics Driver Version: 30.0.101.1069
Operating System: Windows 10 Pro
OS Version: 10.0.19042 Build 19042
API: OpenCL
Occurs on non-Intel GPUs?: no tested

0 Kudos
7 Replies
Anita_Intel
Employee
4,732 Views

Dear User,

 

Thanks for reaching out to us.

 

Could you please provide us with a sample reproducer (your complete programs/codes) and the steps you have followed so that we can try it at our end?

 

Thanks & Regards,

Anita

0 Kudos
Scott_S_2
Novice
4,711 Views

sure

#include "CL/cl.h"
#include <iostream>
#include <fstream>
#include <vector>

#define MAX_PLATFORM_SIZE 256
#define MAX_DEVICE_SIZE 256

namespace {
std::vector<unsigned char> load_file(const std::string& file)
{
    std::fstream input(file, std::ios::in | std::ios::binary | std::ios::ate);
    auto size = input.tellg();
    input.seekg(0, std::ios::beg);
    std::vector<unsigned char> binary(size);
    input.read((char*)binary.data(),size);
    input.close();

    return binary;
}
}

int main()
{  
    std::vector<unsigned char> data = load_file("window_level.spv");

    cl_device_id device_id[256];
    cl_platform_id platform_id[256];
    cl_uint ret_num_devices;
    cl_uint ret_num_platforms;
    char buf[4096];

    cl_int ret = clGetPlatformIDs(0, 0, &ret_num_platforms);
    ret = clGetPlatformIDs(ret_num_platforms, platform_id, &ret_num_platforms);
    if (ret != CL_SUCCESS) {
        std::cout << "something went wrong in clGetPlatformIDs" << std::endl;
        return 0;
    }

    for (unsigned int i=0; i < ret_num_platforms; i++) 
    {
        ret = clGetPlatformInfo(platform_id[i], CL_PLATFORM_NAME, sizeof(buf), buf, NULL);
        if (ret != CL_SUCCESS) {
            std::cout << "something went wrong in clGetPlatformInfo" << std::endl;
            return 1;
        }

        ret = clGetDeviceIDs(platform_id[i], CL_DEVICE_TYPE_ALL, MAX_DEVICE_SIZE, device_id, &ret_num_devices);
        if (ret != CL_SUCCESS) {
            std::cout << "something went wrong in clGetDeviceIDs" << std::endl;
            return 1;
        }

        for (unsigned int j=0; j < ret_num_devices; j++) {
            ret = clGetDeviceInfo(device_id[j], CL_DEVICE_NAME, sizeof(buf), buf, NULL);
            if (ret != CL_SUCCESS) {
                std::cout << "something went wrong in clGetDeviceInfo" << std::endl;
                return 1;
            }

            ret = clGetDeviceInfo(device_id[j], CL_DEVICE_VENDOR, sizeof(buf), buf, NULL);
            if (ret != CL_SUCCESS) {
                std::cout << "something went wrong in clGetDeviceInfo" << std::endl;
                return 1;
            }
        }
        std::cout << std::endl;
    }
 
    cl_context context = clCreateContext( NULL, 1, &device_id[0], NULL, NULL, &ret);
 
    cl_int err = 0;
    cl_program program = clCreateProgramWithIL((cl_context)context, data.data(), data.size(), &err); 
    if(err != CL_SUCCESS){
        std::cout << "failed to create opencl program with IL " << std::endl;
        return 1;
    }

    err = clBuildProgram(program, 1, &device_id[0], NULL, NULL, NULL);

    cl_int build_status;
    err = clGetProgramBuildInfo(program, device_id[0], CL_PROGRAM_BUILD_STATUS, sizeof(cl_build_status), &build_status, NULL);

    if(build_status != CL_SUCCESS){
        size_t ret_val_size;
        err = clGetProgramBuildInfo(program, device_id[0], CL_PROGRAM_BUILD_LOG, 0, NULL, &ret_val_size);
                 
        std::vector<char> build_log(ret_val_size+1,0);
        err = clGetProgramBuildInfo(program, device_id[0], CL_PROGRAM_BUILD_LOG, ret_val_size, build_log.data(), NULL);

        build_log[ret_val_size] = '\0';
        std::cout << "failed to build opencl program: \n" << build_log.data() << std::endl;
    }


    size_t ret_val_size = 0;
    err = clGetProgramInfo(program, CL_PROGRAM_KERNEL_NAMES, 0, NULL, &ret_val_size);
                 
    std::vector<char> kernel_names(ret_val_size+1,0);
    err = clGetProgramInfo(program, CL_PROGRAM_KERNEL_NAMES, ret_val_size, kernel_names.data(), NULL);

    kernel_names[ret_val_size] = '\0';
    std::cout << "kernel names: \n" << kernel_names.data() << std::endl;

    // prints out window_level.1, expecting window_level from viewing spirv file

    return 0;
}

I have attached the spirv file too.

Thanks
Scott

0 Kudos
Ben_A_Intel
Employee
4,702 Views

FWIW I can't reproduce this problem on my Linux NUC with a similar GPU and recent drivers.

bashbaug@bashbaug-nuc:~/git/SimpleOpenCLSamples/install/RelWithDebInfo$ ./enumopencl
Enumerated 1 platforms.

Platform:
	Name:           Intel(R) OpenCL HD Graphics
	Vendor:         Intel(R) Corporation
	Driver Version: OpenCL 3.0 
Device[0]:
	Type:           GPU 
	Name:           Intel(R) HD Graphics 620 [0x5916]
	Vendor:         Intel(R) Corporation
	Device Version: OpenCL 3.0 NEO 
	Driver Version: 22.34.24023

Done.
bashbaug@bashbaug-nuc:~/git/temp/forum-bug4$ ./test 

kernel names: 
window_level

We may need to find a Windows system with your driver version (Graphics Driver Version: 30.0.101.1069).

Would it be possible for you to try newer drivers on your end?

0 Kudos
Scott_S_2
Novice
4,696 Views

ok I updated to CL_DRIVER_VERSION: 31.0.101.2111 and ".1" addition is gone, but all my kernels now report CL_INVALID_KERNEL_ARGS

I'm not entirely sure why, none of the code has changed between driver changes.

Having an intermittent ".1" appended to kernel names makes it difficult to support many driver versions.

Scott

 

0 Kudos
Scott_S_2
Novice
4,567 Views

So I've been getting CL_INVALID_KERNEL_ARGS for all my kernels on newer drivers, so I've been reverting driver versions all day.  The latest driver version that works is:

CL_DRIVER_VERSION: 30.0.101.1340

Has something specific changed for driver versions greater than this that would cause this CL_INVALID_KERNEL_ARGS issue?

0 Kudos
pxcc
Novice
2,402 Views

Any progress/resolution for this? I have the same problem, and I'm using very similar code, toolchain, and graphics device.

Here is some info that might help:

  1. Windows 11 Pro 10.0.22621, HD Graphics 630 on i7-10700K, Intel driver 30.0.101.1273 and 31.0.101.2125
  2. If I compile kernels to SPIR-V using the ocloc.exe offline compiler from the latest oneAPI installation, everything works fine. But this is not a desirable workaround (explained below)
  3. If I use clang.exe and llvm-spirv.exe (from a recent LLVM 17.0.0 snapshot), I observe the following:
    1. Feeding the resulting SPIR-V to clCreateProgramWithIL succeeds
    2. With driver 30.0.101.1273, clBuildProgram succeeds, but creating a kernel fails with CL_INVALID_KERNEL_NAME (so, same as described by Scot_S_2)
    3. After updating to 31.0.101.2125, clBuildProgram crashes with an access violation
    4. I get the same result with both std=cl3.0 (C language) and std=clc++2021 (C++ language) set for clang.exe
  4. I've attached sample SPIR-V and spirv-dis disassembly output for both ocloc and clang.

This could be a deal-breaker for me with regard to supporting Intel devices. I'm trying to achieve offline compilation of C++ kernels to SPIR-V. I've developed a cross-platform framework with backends for Metal, Cuda (driver API), OpenCL, and CPU (SIMD), to handle compute for a consumer desktop application. The generic programming features of C++ are critical for productivity when working with the variety of data types, layouts, dimensionalities, memory spaces, access modes, and platform idiosyncrasies encountered in modern compute-heavy applications. 

ocloc.exe is not an effective workaround, because it only accepts C kernels. The clang.exe + llvm-spirv.exe toolchain seems to be the only way to compile C++ kernels to SPIR-V.  SYCL/HIP/Cuda single-source C++ compilation isn't a near-term solution for various reasons.  With some effort, I could probably replace the OpenCL backend with a Level Zero backend, but it won't make a difference if the Level Zero driver uses the same SPIR-V-to-device compiler that the OpenCL driver is using.

So:

A) If Scott_S_2's reproducer is still broken on Windows with current drivers, I hope this new report of the same problem will cause someone at Intel to take a closer look.

B) If the reproducer now works on Windows, let me know if anything specific was needed to make it work. If I still can't get things to work on my side, I'll try to extract a standalone test case.

 

0 Kudos
pxcc
Novice
2,400 Views

[Reposting.  My initial reply disappeared shortly after I posted it.]

Any progress/resolution for this? I have the same problem, and I'm using very similar code, toolchain, and graphics device.

Here is some info that might help:

  1. Windows 11 Pro 10.0.22621, HD Graphics 630 on i7-10700K, Intel driver 30.0.101.1273 and 31.0.101.2125
  2. If I compile kernels to SPIR-V using the ocloc.exe compiler from the latest oneAPI installation, everything works fine. But this is not a desirable workaround (need C++ kernels, but ocloc.exe only accepts C)
  3. If I use clang.exe and llvm-spirv.exe (from a recent LLVM 17.0.0 snapshot), I observe the following:
    1. Feeding the resulting SPIR-V to clCreateProgramWithIL succeeds
    2. With driver 30.0.101.1273, clBuildProgram succeeds, but creating a kernel fails with CL_INVALID_KERNEL_NAME (so, same as described in the original post for this thread)
    3. With driver 31.0.101.2125, clBuildProgram crashes with an access violation
    4. I get the same result with both std=cl3.0 (C language) and std=clc++2021 (C++language) set for clang.exe
  4. I've attached sample SPIR-V and spirv-dis disassembly output for both ocloc and clang.

 

0 Kudos
Reply