Intel® Quartus® Prime Software

OpenCL kernel is freezing

Altera_Forum

Hello.

I'm using CentOS 6 and a Pico Computing m506 board.

What is the maximum total size of a kernel's buffer arguments? Must the sum of the cl_mem buffer sizes be less than CL_DEVICE_GLOBAL_MEM_SIZE, or is some space reserved?

I'm seeing some strange behavior:

input_size = 4288000000;
output_size = 1340000000;
global_work_size = 67000000;

/* Host-side buffers */
unsigned int* buffer = (unsigned int*) alignedMalloc(input_size);
unsigned int* digest = (unsigned int*) alignedMalloc(output_size);

/* Device-side buffers */
cl_mem inputBuffer = clCreateBuffer(context, CL_MEM_READ_ONLY, input_size, NULL, &status);
cl_mem outputBuffer = clCreateBuffer(context, CL_MEM_WRITE_ONLY, output_size, NULL, &status);

/* Non-blocking copy of the input to the device */
status = clEnqueueWriteBuffer(queue, inputBuffer, CL_FALSE, 0, input_size, (void *) buffer, 0, NULL, &write_event[0]);

status = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *) &inputBuffer);
status = clSetKernelArg(kernel, 1, sizeof(cl_mem), (void *) &outputBuffer);

/* The kernel waits for the write; the read-back into digest waits for the kernel */
size_t gws[1] = { global_work_size };
status = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, gws, NULL, 1, write_event, &kernel_event);
status = clEnqueueReadBuffer(queue, outputBuffer, CL_FALSE, 0, output_size, digest, 1, &kernel_event, &finish_event);

clReleaseEvent(write_event[0]);
clWaitForEvents(1, &finish_event);

 

 

This pseudocode works.

But the same code with different values for the input and output sizes freezes:

input_size = 4352000000;
output_size = 1360000000;
global_work_size = 68000000;

With these values, clWaitForEvents(1, &finish_event); never returns.
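For reference, the two sets of sizes in the post differ in one visible way: the working input_size is below 2^32 bytes and the freezing one is above it, while the per-work-item sizes are identical. The sketch below is not part of the original post and only checks that arithmetic; the helper names fits_in_u32 and bytes_per_item are hypothetical. Whether the 32-bit boundary has anything to do with the hang is not established anywhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a byte count can be represented in an unsigned 32-bit integer. */
static bool fits_in_u32(uint64_t bytes)
{
    return bytes <= UINT32_MAX;
}

/* Bytes handled per work-item, rounded down; work_items must be non-zero. */
static uint64_t bytes_per_item(uint64_t bytes, uint64_t work_items)
{
    assert(work_items != 0);
    return bytes / work_items;
}
```

With the values from the post, fits_in_u32(4288000000) is true and fits_in_u32(4352000000) is false, and both cases work out to 64 input bytes and 20 output bytes per work-item.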
Altera_Forum

The m506 has only 4 GB. I'm not sure how the first one even passed, since input_size + output_size > 4 GB. OpenCL also reserves some memory for its own use, so no, you don't get the full 4 GB. You can query CL_DEVICE_MAX_MEM_ALLOC_SIZE to get the largest buffer size you can actually allocate.

Make sure you check for CL_SUCCESS after the clCreateBuffer calls.
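The size check described above can be sketched in plain C. This is a minimal illustration, not code from the SDK: buffers_fit is a hypothetical helper, and the two limits are assumed to be the values that clGetDeviceInfo returns for CL_DEVICE_MAX_MEM_ALLOC_SIZE and CL_DEVICE_GLOBAL_MEM_SIZE on the actual device.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when every buffer is no larger than max_alloc and the
 * buffers together are no larger than global_mem (all sizes in bytes). */
static bool buffers_fit(const uint64_t *sizes, size_t count,
                        uint64_t max_alloc, uint64_t global_mem)
{
    uint64_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        if (sizes[i] > max_alloc)
            return false;
        /* total never exceeds global_mem here, so the subtraction
         * cannot wrap and the running sum cannot overflow. */
        if (sizes[i] > global_mem - total)
            return false;
        total += sizes[i];
    }
    return true;
}
```

Taking 4 GB as 4294967296 bytes, the buffers from the question (4288000000 and 1340000000 bytes) fail this check, which matches the surprise expressed above. It is only a pre-flight estimate; the status codes returned by the OpenCL calls remain the authoritative answer.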
Altera_Forum

Our m506 has 8GB. 

 

Verified that the kernel mode driver is installed on the host machine.

Using platform: Altera SDK for OpenCL
Using Device with name: m506 : m506
Using Device from vendor: Pico Computing Inc.
clGetDeviceInfo CL_DEVICE_GLOBAL_MEM_SIZE = 8589934592
clGetDeviceInfo CL_DEVICE_MAX_MEM_ALLOC_SIZE = 8588886016
Memory consumed for internal use = 1048576
Actual maximum buffer size = 8588886016 bytes
Writing 8191 MB to global memory ...
Write speed: 2179.79 MB/s [2178.55 -> 2180.43]
Reading and verifying 8191 MB from global memory ...
Read speed: 3121.04 MB/s [3120.68 -> 3121.29]
Successfully wrote and readback 8191 MB buffer

Transferring 8192 KBs in 16 512 KB blocks ... 2625.61 MB/s
Transferring 8192 KBs in 8 1024 KB blocks ... 2680.01 MB/s
Transferring 8192 KBs in 4 2048 KB blocks ... 2875.58 MB/s
Transferring 8192 KBs in 2 4096 KB blocks ... 2995.90 MB/s
Transferring 8192 KBs in 1 8192 KB blocks ... 3048.22 MB/s

PCIe Gen2.0 peak speed: 500MB/s/lane

Writing 8192 KBs with block size (in bytes) below:

Block_Size  Avg      Max      Min      End-End (MB/s)
524288      1791.03  1890.40  1718.31  1784.67
1048576     1885.19  1918.46  1837.53  1880.12
2097152     2013.79  2042.29  1960.92  2011.01
4194304     2100.40  2107.05  2093.78  2099.05
8388608     2115.81  2115.81  2115.81  2115.81

Reading 8192 KBs with block size (in bytes) below:

Block_Size  Avg      Max      Min      End-End (MB/s)
524288      2528.00  2625.61  2357.35  2514.59
1048576     2630.47  2680.01  2604.23  2619.32
2097152     2848.14  2875.58  2826.59  2839.07
4194304     2980.27  2995.90  2964.80  2977.00
8388608     3048.22  3048.22  3048.22  3048.22

Write top speed = 2115.81 MB/s
Read top speed = 3048.22 MB/s
Throughput = 2582.02 MB/s

DIAGNOSTIC_PASSED
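The memory figures in the diagnostic output above are consistent with each other, which a few lines of C can confirm. This is only an arithmetic sketch: reserved_bytes and whole_mib are hypothetical helper names, and the assumption that the tool counts 1 MB as 1048576 bytes is inferred from the numbers, not stated in the output.

```c
#include <assert.h>
#include <stdint.h>

/* Memory not available to a single buffer: global size minus the
 * largest allocation. Requires max_alloc <= global_mem. */
static uint64_t reserved_bytes(uint64_t global_mem, uint64_t max_alloc)
{
    assert(max_alloc <= global_mem);
    return global_mem - max_alloc;
}

/* Whole mebibytes (1 MiB = 1048576 bytes) in a byte count, rounded down. */
static uint64_t whole_mib(uint64_t bytes)
{
    return bytes / 1048576u;
}
```

With the reported values, reserved_bytes(8589934592, 8588886016) gives 1048576, matching the "Memory consumed for internal use" line, and whole_mib(8588886016) gives 8191, matching the size of the buffer the diagnostic wrote and read back.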
Altera_Forum

What version of the SDK are you using? 13.x only works with <= 4GB. Make sure you're using 14.0 or 14.1.

Altera_Forum

I'm using 14.0.
