Intel OpenCL kernel hangs - how to debug?


I am having some trouble running my OpenCL code on Intel CPUs and GPUs. The code is a Monte Carlo simulation code, and can be downloaded from

The code runs without issue on NVIDIA's OpenCL implementation. However, on Intel's OpenCL (CPU or HD GPU), it hangs almost every time when simulating a large number of photons.

The kernel has only one while() loop that could cause such a hang:

However, I added a counter that forces the loop to exit once it exceeds a limit, and that did not stop the hang. It appears that something else must be responsible. I now suspect that the clWaitForEvents call at this line fails to return for some reason:

What else can I do to debug this issue? Is it possible that the kernel completed execution but clWaitForEvents stalled?

Any suggestions are welcome. Thanks in advance!


PS: You can reproduce this issue with the following commands:

git clone
cd mcxcl/src
cd ../example/quicktest
../../bin/mcxcl  -t 128 -T 8 -g 10 -n 1e7 -f qtest.inp -s qtest -r 1 -a 0 -b 0 -k ../../src/ -d 1 -G 1

Without the hang, the program should finish in about 15-25 seconds on an Intel CPU/GPU. If it hangs, try lowering the number after -n; values below 1e6 typically run without hanging.
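
Since runs below 1e6 photons typically finish, one experiment is to split a large run into several launches that each stay under that size. If the hang disappears, that would point at the per-launch workload rather than at clWaitForEvents itself. The sketch below only computes the batch sizes. The names launch_count and launch_size are hypothetical and not part of mcxcl, and it is an assumption that the host code can accumulate results across several launches.

```c
#include <assert.h>
#include <stddef.h>

/* Number of launches needed so that no launch exceeds max_per_launch photons. */
size_t launch_count(size_t total, size_t max_per_launch)
{
    assert(max_per_launch > 0);
    return (total + max_per_launch - 1) / max_per_launch;  /* ceiling division */
}

/* Photons assigned to launch i (0-based). The photons are spread as evenly
 * as possible, so the first (total % n) launches get one extra photon. */
size_t launch_size(size_t total, size_t max_per_launch, size_t i)
{
    size_t n = launch_count(total, max_per_launch);
    if (n == 0 || i >= n)
        return 0;
    size_t base = total / n;
    size_t extra = total % n;
    return base + (i < extra ? 1 : 0);
}
```

For example, 1e7 photons with a cap of 1e6 per launch gives 10 launches of 1e6 photons each, and each launch can then be waited on separately.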

After some further testing, it looks like the hang only happens with Intel's HD Graphics (HD 5500); the Intel CPU backend runs fine.

I found out how to use gdb to debug OpenCL code on the Intel CPU, but this method does not work for HD Graphics. When I press Ctrl+C after the program hangs, gdb always gives me the following:

fangq@dayu:/Project/github/mcxcl/example/quicktest$ ../../bin/mcxcl -L
Platform [0] Name Intel(R) OpenCL
============ GPU device ID 1 [1 of 1]: Intel(R) HD Graphics ============
 Compute units   :    23 core(s)
 Global memory   :    6587885159 B
 Local memory    :    65536 B
 Constant memory :    1646971289 B
 Clock speed     :    950 MHz 

Starting program: /Project/github/mcxcl/bin/mcxcl -t 12800 -T 64 -g 10 -n 5e6 -f qtest.inp -s qtest -r 1 -a 0 -b 1 -k ../../src/ -d 1 -G 1 -J -g\ -s\ /home/fangq/space/Gitroot/Project/github/mcxcl/src/
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/".
=                     Monte Carlo eXtreme (MCX) -- OpenCL                     =
=           Copyright (c) 2009-2016 Qianqian Fang <q.fang at>         =
= =
=                    Computational Imaging Laboratory (CIL)                   =
=             Department of Bioengineering, Northeastern University           =
$MCXCL$Rev::    $ Last Commit $Date::                     $ by $Author:: fangq$
- variant name: [Detective MCXCL] compiled with OpenCL version [1]
- compiled with: [RNG] Logistic-Lattice [Seed Length] 5
initializing streams ...    init complete : 0 ms
build program complete : 584 ms
- [device 0] threadph=390 oddphotons=8000 np=5000000.0 nthread=12800 repetition=1
set kernel arguments complete : 584 ms
lauching mcx_main_loop for time window [0.0ns 5.0ns] ...
simulation run# 1 ...

Program received signal SIGINT, Interrupt.
0x00007ffff6dec816 in __pthread_mutex_unlock_usercnt (decr=1, mutex=0x895f70) at pthread_mutex_unlock.c:73
73    pthread_mutex_unlock.c: No such file or directory.
(gdb) where
#0  0x00007ffff6dec816 in __pthread_mutex_unlock_usercnt (decr=1, mutex=0x895f70) at pthread_mutex_unlock.c:73
#1  __GI___pthread_mutex_unlock (mutex=0x895f70) at pthread_mutex_unlock.c:310
#2  0x00007ffff6206089 in ?? () from /opt/intel/opencl/
#3  0x00007ffff613039f in ?? () from /opt/intel/opencl/
#4  0x00007ffff612cf4e in ?? () from /opt/intel/opencl/
#5  0x00007ffff612d8f8 in ?? () from /opt/intel/opencl/
#6  0x00007ffff6135013 in ?? () from /opt/intel/opencl/
#7  0x00007ffff6149495 in ?? () from /opt/intel/opencl/
#8  0x00007ffff6156d84 in ?? () from /opt/intel/opencl/
#9  0x00007ffff61b5203 in ?? () from /opt/intel/opencl/
#10 0x00007ffff7bd5813 in clWaitForEvents () from /opt/intel/opencl/
#11 0x00000000004043c0 in mcx_run_simulation (cfg=<optimized out>, fluence=<optimized out>, totalenergy=<optimized out>) at mcx_host.cpp:443
#12 0x00000000004018ab in main (argc=27, argv=0x7fffffffdc18) at mcxcl.c:35 

I can't tell whether the kernel is stuck somewhere or clWaitForEvents simply failed to return.
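
One way to separate these two cases might be to replace the blocking wait with a bounded polling loop that reads the event's execution status. The sketch below is a minimal version of that idea, not mcxcl's actual code. The helper poll_event_status, the stand-in fake_event_status, and the USE_OPENCL macro are all hypothetical names. The only OpenCL call used is clGetEventInfo with CL_EVENT_COMMAND_EXECUTION_STATUS, and that adapter is compiled only when USE_OPENCL is defined. The timeout is approximate, since it counts only the time spent sleeping.

```c
#define _POSIX_C_SOURCE 199309L  /* for nanosleep() */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Callback returning the current execution status of a command.
 * It follows the OpenCL convention: 3 = CL_QUEUED, 2 = CL_SUBMITTED,
 * 1 = CL_RUNNING, 0 = CL_COMPLETE, negative = error. */
typedef int (*status_fn)(void *ctx);

/* Poll until the command completes, fails, or roughly timeout_ms elapses.
 * Returns the last observed status: 0 on completion, negative on error,
 * positive if the timeout expired while the command was in that state. */
int poll_event_status(status_fn get_status, void *ctx,
                      long timeout_ms, long interval_ms)
{
    assert(get_status != NULL && interval_ms > 0);
    long waited_ms = 0;
    for (;;) {
        int status = get_status(ctx);
        if (status <= 0 || waited_ms >= timeout_ms)
            return status;
        struct timespec ts = { interval_ms / 1000,
                               (interval_ms % 1000) * 1000000L };
        nanosleep(&ts, NULL);
        waited_ms += interval_ms;
    }
}

/* Stand-in status source for trying the helper without a device:
 * reports "running" (1) for `remaining` polls, then "complete" (0). */
struct fake_event { int remaining; };

int fake_event_status(void *ctx)
{
    struct fake_event *ev = ctx;
    return ev->remaining-- > 0 ? 1 : 0;
}

#ifdef USE_OPENCL
#include <CL/cl.h>

/* Adapter for a real event: ctx must point at a cl_event. */
int cl_event_status(void *ctx)
{
    cl_int status = CL_QUEUED;
    cl_int err = clGetEventInfo(*(cl_event *)ctx,
                                CL_EVENT_COMMAND_EXECUTION_STATUS,
                                sizeof(status), &status, NULL);
    return err != CL_SUCCESS ? (int)err : (int)status;
}
#endif
```

In the host code this would go where clWaitForEvents is called now, after a clFlush on the command queue so the kernel is actually submitted. If the helper times out with status 1 (CL_RUNNING), the kernel itself has not finished. If it returns 0 while clWaitForEvents on the same event still blocks, the wait is the more likely culprit.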

Are there any other ways to debug on HD Graphics?

Replicated in this thread:

Will get back to you with more info on debugging on that thread soon.



