I hit this problem while experimenting with ray tracing in OpenCL using this RNG: https://github.com/bstatcomp/RandomCL/blob/master/generators/kiss09.cl
I have a 100% reliable repro on Ubuntu 22.04 with an Intel Core i7-1165G7.
I was able to reduce the repro to the attached code. Invoking the `rnd()` function results in odd and invalid behavior, depending on the specific program that invokes it:
- At a minimum, the computer hangs in every case (not even the mouse pointer moves). Sometimes the hang is permanent; sometimes it clears after a few seconds.
- The full-scale ray-tracing program fails to execute the kernel with an "out of resources" error.
- A unit test that invokes the same kernels completes but fails, because some buffers are not synced from the device to the host after `queue::finish()` returns.
Changing the `rnd()` function to call the `kiss09_func_ulong` function instead of using the `kiss09_ulong` macro fixes the problem in all invocation scenarios.
So it seems that a macro consisting of a series of expressions separated by the comma operator is what causes the breakage.
The same binary executes without any problems under `oclgrind` and on an NVIDIA GPU.
Is this the right place to report such problems, or is there a bug database where bugs can be opened directly?
break_repro.cl
```c
#define KISS09_FLOAT_MULTI 5.4210108624275221700372640e-20f
#define KISS09_DOUBLE_MULTI 5.4210108624275221700372640e-20

struct kiss09_state {
  ulong x, c, y, z;
};

#define kiss09_ulong(state) \
  (/*multiply with carry*/ \
   state->c = state->x >> 6, state->x += (state->x << 58) + state->c, \
   state->c += state->x < (state->x << 58) + state->c, /*xorshift*/ \
   state->y ^= state->y << 13, state->y ^= state->y >> 17, \
   state->y ^= state->y << 43, /*linear congruential*/ \
   state->z = 6906969069UL * state->z + 1234567UL, \
   state->x + state->y + state->z)

ulong kiss09_func_ulong(__private struct kiss09_state* state) {
  // multiply with carry
  ulong t = (state->x << 58) + state->c;
  state->c = state->x >> 6;
  state->x += t;
  state->c += state->x < t;
  // xorshift
  state->y ^= state->y << 13;
  state->y ^= state->y >> 17;
  state->y ^= state->y << 43;
  // linear congruential
  state->z = 6906969069UL * state->z + 1234567UL;
  return state->x + state->y + state->z;
}

void kiss09_seed(__private struct kiss09_state* state, ulong j) {
  state->x = 1234567890987654321UL ^ j;
  state->c = 123456123456123456UL ^ j;
  state->y = 362436362436362436UL ^ j;
  if (state->y == 0) {
    state->y = 1;
  }
  state->z = 1066149217761810UL ^ j;
}

float rnd(__private struct kiss09_state* rnd_state) {
  return kiss09_ulong(rnd_state) * KISS09_FLOAT_MULTI;
}

__kernel void break_repro(__global float* output) {
  int gid = get_global_id(0);
  struct kiss09_state rnd_state;
  kiss09_seed(&rnd_state, gid * 0x9e3779b9);
  output[gid] = rnd(&rnd_state);
}
```
break_repro.cpp
```cpp
#include <array>
#include <cstdlib>
#include <fstream>
#include <string>

#include "boost/compute.hpp"
#include "gflags/gflags.h"
#include "glog/logging.h"

namespace {

boost::compute::kernel compile_and_load_kernel(
    boost::compute::context& context) {
  std::ifstream kernel_file("break_repro.cl");
  if (!kernel_file.is_open()) {
    LOG(FATAL) << "Failed to open kernel file";
  }
  std::string kernel_source((std::istreambuf_iterator<char>(kernel_file)),
                            std::istreambuf_iterator<char>());
  kernel_file.close();
  LOG(INFO) << "Building kernel source:\n" << kernel_source;
  boost::compute::program program =
      boost::compute::program::create_with_source(kernel_source, context);
  program.build();
  LOG(INFO) << "Build log: " << program.build_log();
  LOG(INFO) << "Creating kernel: break_repro";
  return program.create_kernel("break_repro");
}

}  // namespace

int main(int argc, char* argv[]) {
  FLAGS_logtostderr = true;
  FLAGS_stderrthreshold = 0;
  google::InitGoogleLogging(argv[0]);
  gflags::ParseCommandLineFlags(&argc, &argv, true);
  try {
    auto cl_device = boost::compute::system::default_device();
    auto cl_context = boost::compute::context(cl_device);
    auto cl_queue = boost::compute::command_queue(cl_context, cl_device);
    // print device info
    LOG(INFO) << "Device name: " << cl_device.get_info<CL_DEVICE_NAME>();
    auto kernel = compile_and_load_kernel(cl_context);
    std::array<cl_float, 4> output{};
    boost::compute::buffer output_buffer(cl_context,
                                         output.size() * sizeof(cl_float));
    LOG(INFO) << "Enqueueing kernel: break_repro";
    kernel.set_arg(0, output_buffer);
    auto const kernel_event =
        cl_queue.enqueue_1d_range_kernel(kernel, 0, output.size(), 0);
    cl_queue.enqueue_read_buffer(output_buffer, 0,
                                 output.size() * sizeof(cl_float),
                                 output.data(), {kernel_event});
    LOG(INFO) << "Waiting for kernel: break_repro";
    cl_queue.finish();
    LOG(INFO) << "output:";
    for (auto const& o : output) {
      LOG(INFO) << "\t" << o;
    }
    return EXIT_SUCCESS;
  } catch (boost::compute::opencl_error const& e) {
    LOG(ERROR) << "OpenCL error: " << e.what();
    return EXIT_FAILURE;
  }
}
```
It sounds like you have encountered a bug in the OpenCL implementation for your specific hardware and operating system. The best place to report it is to the developers of that implementation. You appear to be using Intel's OpenCL implementation, so you should report the bug to Intel's developer support team, who can investigate and potentially provide a fix or workaround. You could also open a bug report on the GitHub repository of the project you are using (RandomCL) so that its developers are aware of the issue.
