- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello!
I got some undesired behavior. (Opencl 14.0 on CentOs 6.5 with Pico m506) I have two identical kernels that differs only in __attribute__((reqd_work_group_size(32,1,1))) One has workgroup size 32, other 512. My host program has this: size_t global_work_size[1] = { NDRANGE }; size_t local_work_size[1] = { WORK_GROUP_SIZE }; status = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, global_work_size, local_work_size, 1, write_event, &kernel_event); I have problem with launching kernel with 512 work group size with big NDRANGE value. It is ok with NDRANGE = 512, but with NDRANGE = 100000 i have kernel panic (listed below). I don't have this problem with kernel with workgroup size = 32 and NDRANGE = 40 000 000. Crash report: <4> <4>Pid: 3987, comm: aclkmdq Tainted: P --------------- 2.6.32-431.20.3.el6.x86_64 #1 (https://picocomputing.zendesk.com/tickets/1) ASUS All Series/Z87-PLUS <4>RIP: 0010:[<ffffffff8152a4f3>] [<ffffffff8152a4f3>] down_write+0x23/0x40 <4>RSP: 0018:ffff8807ad497d40 EFLAGS: 00010246 <4>RAX: 0000000000000068 RBX: 0000000000000068 RCX: 00000000000ff940 <4>RDX: ffffffff00000001 RSI: ffff8807f48aa000 RDI: 0000000000000068 <4>RBP: ffff8807ad497d50 R08: 0000000000000000 R09: 00000000ffffffff <4>aclpci_close (185): <7>aclpci = ffff88080d87a000, pid = 3978, dma_idle = 0 <4>R10: 000000a47dceaa6c R11: 0000000000000001 R12: 0000000000000100 <4>R13: ffff8807f48aa000 R14: ffff8807ad497fd8 R15: ffffe8ffffc0f048 <4>FS: 0000000000000000(0000) GS:ffff880028340000(0000) knlGS:0000000000000000 <4>CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b <4>CR2: 0000000000000068 CR3: 00000007e882a000 CR4: 00000000001407e0 <4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 <4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 <4>Process aclkmdq (pid: 3987, threadinfo ffff8807ad496000, task ffff8807f1a22040) <4>Stack: <4> 0000000000000000 ffff88080f773500 ffff8807ad497d80 ffffffffa01458f2 <4><d> 0000000000000000 ffff8807ad4e8030 ffff88080d87a000 00000001000633ad <4><d> ffff8807ad497db0 ffffffffa01434b9 ffff8807ad497db0 ffff88080d87a000 <4>Call Trace: <4> [<ffffffffa01458f2>] aclpci_release_user_pages+0x32/0x70 [aclpci_drv] <4> [<ffffffffa01434b9>] unlock_dma_buffer+0x39/0x80 [aclpci_drv] <4> [<ffffffffa01441a0>] ? wq_func_dma_update+0x0/0x20 [aclpci_drv] <4> [<ffffffffa0143bf3>] aclpci_dma_update+0xe3/0x690 [aclpci_drv] <4> [<ffffffffa01441a0>] ? wq_func_dma_update+0x0/0x20 [aclpci_drv] <4> [<ffffffffa01441b7>] wq_func_dma_update+0x17/0x20 [aclpci_drv] <4> [<ffffffff81094a20>] worker_thread+0x170/0x2a0 <4> [<ffffffff8109afa0>] ? autoremove_wake_function+0x0/0x40 <4> [<ffffffff810948b0>] ? worker_thread+0x0/0x2a0 <4> [<ffffffff8109abf6>] kthread+0x96/0xa0 <4> [<ffffffff8100c20a>] child_rip+0xa/0x20 <4> [<ffffffff8109ab60>] ? kthread+0x0/0xa0 <4> [<ffffffff8100c200>] ? child_rip+0x0/0x20 <4>Code: c3 e8 62 77 b4 ff 00 00 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 48 89 fb e8 0a ed ff ff 48 ba 01 00 00 00 ff ff ff ff 48 89 d8 <f0> 48 0f c1 10 48 85 d2 74 05 e8 fe 4c d6 ff 48 83 c4 08 5b c9 <1>RIP [<ffffffff8152a4f3>] down_write+0x23/0x40 <4> RSP <ffff8807ad497d40> <4>CR2: 0000000000000068Link Copied
3 Replies
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The Linux kernel panic happens during DMA transfer, which is done by the host on the CPU with a help of Altera OpenCL Linux kernel driver (aclpci_drv). My guess here is that you're trying to DMA more memory than you actually allocated. Double-check sizes passed to clCreateBuffer versus clEnqueueReadBuffer or clEnqueueWriteBuffer. I suspect that the clEnqueueRead/WriteBuffer got a larger size argument than the one to clCreateBuffer.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
No, all sizes are equal.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I am also stuck up in the same position. have you found out the solution for that?

Reply
Topic Options
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page