Altera_Forum
Honored Contributor I
736 Views

clFlush or any other OpenCL API call takes longer once kernel execution has completed

In function1() we launch a kernel but do not wait for it to complete. After the launch, the CPU does other processing that takes around 300 ms. During that CPU processing we poll the kernel's execution status and time each poll. For the first few iterations the status is Running and the time taken by clFlush is negligible. On the first iteration where the status is Completed, clFlush takes 6711 microseconds (if we call clEnqueueReadBuffer at that point instead, it takes the same time). Why does the call take so much longer once execution has completed? Is there an alternative approach that reduces this time?

 

void function1()
{
    printf("In function-1\n");

    ...

    /* Launch the kernel without waiting for it to complete. */
    err = clEnqueueTask(commandQueue[0], kernel, 0, NULL, &kernel_event[0]);
    if (CL_SUCCESS != err) {
        printf("Error in clEnqueueTask kernel-1 %d\n\n", err);
        exit(-1);
    }
}

void function2()
{
    printf("In function-2\n");

    struct timeval start_timer, end_timer;

    /* Blocking read of the first output buffer, timed on the host. */
    gettimeofday(&start_timer, NULL);
    err = clEnqueueReadBuffer(commandQueue[0], dstOut1, CL_TRUE, 0, 120 * 23 * sizeof(cl_short2), output1, 0, NULL, NULL);
    if (CL_SUCCESS != err) {
        printf("Error in clEnqueueReadBuffer dstOut1 %d\n", err);
        exit(-1);
    }
    gettimeofday(&end_timer, NULL);
    time_taken = ((end_timer.tv_sec * 1000000 + end_timer.tv_usec) - (start_timer.tv_sec * 1000000 + start_timer.tv_usec));
    printf("Time taken by clEnqueueReadBuffer-1 %ld\n", time_taken);

    /* Blocking read of the second output buffer, timed the same way. */
    gettimeofday(&start_timer, NULL);
    err = clEnqueueReadBuffer(commandQueue[0], dstOut2, CL_TRUE, 0, 120 * 23 * sizeof(cl_short2), output2, 0, NULL, NULL);
    if (CL_SUCCESS != err) {
        printf("Error in clEnqueueReadBuffer dstOut2 %d\n", err);
        exit(-1);
    }
    gettimeofday(&end_timer, NULL);
    time_taken = ((end_timer.tv_sec * 1000000 + end_timer.tv_usec) - (start_timer.tv_sec * 1000000 + start_timer.tv_usec));
    printf("Time taken by clEnqueueReadBuffer-2 %ld\n", time_taken);

    /* launching another kernel */
    err = clEnqueueTask(commandQueue[0], kernel, 0, NULL, &kernel_event[0]);
    if (CL_SUCCESS != err) {
        printf("Error in clEnqueueTask kernel-1 %d\n\n", err);
        exit(-1);
    }
}

 

 

 

 

int main()
{
    for (int i = 0; i < 100; i++) {
        if (i == 0)
            function1();
        else
            function2();

        /* processing on cpu */

        for (int id = 0; id < 1000; id++) {
            /* processing on cpu */

            /* The timed region below covers clGetEventInfo, the status
               printf, and clFlush together, not clFlush alone. */
            struct timeval begin_cq, end_cq;
            gettimeofday(&begin_cq, NULL);

            cl_int res, status;
            res = clGetEventInfo(kernel_event[0], CL_EVENT_COMMAND_EXECUTION_STATUS, sizeof(cl_int), &status, NULL);
            if (CL_SUCCESS != res) {
                printf("Error in clGetEventInfo %d\n", res);
                exit(-1);
            }
            switch (status) {
            case CL_QUEUED:
                printf("Execution Status: Queued\n");
                break;
            case CL_SUBMITTED:
                printf("Execution Status: Submitted\n");
                break;
            case CL_RUNNING:
                printf("Execution Status: Running\n");
                break;
            case CL_COMPLETE:
                printf("Execution Status: Completed\n");
                break;
            default:
                printf("Execution Status: Error (%d)\n", status);
                break;
            }
            clFlush(commandQueue[0]);

            gettimeofday(&end_cq, NULL);
            long time_taken_cq = ((end_cq.tv_sec * 1000000 + end_cq.tv_usec) - (begin_cq.tv_sec * 1000000 + begin_cq.tv_usec));
            printf("Time taken by clFlush %ld micro sec\n", time_taken_cq);
        } //for(id)
    } //for(i)

    return 0;
}
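In the loop above, the interval reported as "Time taken by clFlush" spans the clGetEventInfo call, the status printf, and clFlush, so the figure cannot be attributed to clFlush alone. One way to narrow it down is to time each step separately. The following is only a sketch under assumptions: elapsed_us, time_call_us, and sleep_step are hypothetical names that do not appear in the original code, and clock_gettime(CLOCK_MONOTONIC) is swapped in for gettimeofday so that wall-clock adjustments cannot distort the interval. sleep_step merely stands in for an OpenCL call so that the sketch runs without an OpenCL runtime.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime and nanosleep */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Microseconds between two CLOCK_MONOTONIC readings (end must not precede start). */
static int64_t elapsed_us(struct timespec start, struct timespec end)
{
    int64_t sec  = (int64_t)end.tv_sec - (int64_t)start.tv_sec;
    int64_t nsec = (int64_t)end.tv_nsec - (int64_t)start.tv_nsec;
    return (sec * 1000000000 + nsec) / 1000;
}

/* Hypothetical helper: run exactly one step and return its duration in
   microseconds. Nothing else (no printf, no second API call) is inside
   the timed region. */
static int64_t time_call_us(void (*step)(void *), void *arg)
{
    struct timespec start, end;

    assert(step != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);
    step(arg);
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_us(start, end);
}

/* Stand-in step for this sketch only: sleeps for *(long *)arg microseconds.
   In the posted program a step would wrap a single OpenCL call instead. */
static void sleep_step(void *arg)
{
    long us = *(long *)arg;
    struct timespec ts = { us / 1000000, (us % 1000000) * 1000 };

    nanosleep(&ts, NULL);
}
```

In the posted program, one step would wrap only clGetEventInfo and another only clFlush(commandQueue[0]), with each result stored and printed after both calls have returned. That would show which of the calls accounts for the long iteration.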

 

 

 

 

 

 

Output:

In function-1
Execution Status: Running
Time taken by clFlush 5 micro sec
Execution Status: Running
Time taken by clFlush 4 micro sec
Execution Status: Running
Time taken by clFlush 4 micro sec
Execution Status: Running
Time taken by clFlush 4 micro sec
Execution Status: Running
Time taken by clFlush 4 micro sec
Execution Status: Running
Time taken by clFlush 4 micro sec
Execution Status: Running
Time taken by clFlush 4 micro sec
Execution Status: Completed
Time taken by clFlush 6711 micro sec
Execution Status: Completed
Time taken by clFlush 1 micro sec
Execution Status: Completed
Time taken by clFlush 1 micro sec
Execution Status: Completed
Time taken by clFlush 2 micro sec
Execution Status: Completed
Time taken by clFlush 2 micro sec
Execution Status: Completed
Time taken by clFlush 2 micro sec
Execution Status: Completed
Time taken by clFlush 1 micro sec
Execution Status: Completed
Time taken by clFlush 2 micro sec
In function-2
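
Since every measurement is printed from inside the polling loop, and the status printf sits inside the timed region, console I/O is mixed in with the numbers above. A common way to take that variable out is to store the samples in memory and print them once the loop has finished. The sketch below assumes this approach; sample_log, log_sample, slowest_index, and print_samples are hypothetical names, and MAX_SAMPLES is set to 1000 only to match the id loop in the post. It does not explain the 6711 microsecond iteration, it only makes the measurement cleaner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_SAMPLES 1000   /* assumption: matches the id-loop count in the post */

struct sample_log {
    long   us[MAX_SAMPLES];   /* one duration per polling iteration */
    size_t count;
};

/* Store one measurement without doing any I/O. Returns 0 when the log is full. */
static int log_sample(struct sample_log *log, long us)
{
    assert(log != NULL);
    if (log->count >= MAX_SAMPLES)
        return 0;
    log->us[log->count++] = us;
    return 1;
}

/* Index of the largest sample, or -1 when the log is empty. */
static long slowest_index(const struct sample_log *log)
{
    size_t best = 0;

    assert(log != NULL);
    if (log->count == 0)
        return -1;
    for (size_t i = 1; i < log->count; i++)
        if (log->us[i] > log->us[best])
            best = i;
    return (long)best;
}

/* Print every sample once the timed loop has finished. */
static void print_samples(const struct sample_log *log)
{
    assert(log != NULL);
    for (size_t i = 0; i < log->count; i++)
        printf("iteration %zu: %ld micro sec\n", i, log->us[i]);
}
```

Inside the id loop, log_sample would replace the per-iteration printf; print_samples and slowest_index would then be called after the loop to locate the long iteration.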

 

 

Thanks in advance.
2 Replies
Altera_Forum
Honored Contributor I
27 Views

Hi All, 

I would like to know whether anybody has faced a similar issue while pipelining host work to overlap FPGA execution time with host processing. Any help from Altera would be greatly appreciated.

Thanks in Advance.
Altera_Forum
Honored Contributor I
27 Views

 

--- Quote Start ---

Hi All,

I would like to know whether anybody has faced a similar issue while pipelining host work to overlap FPGA execution time with host processing. Any help from Altera would be greatly appreciated.

Thanks in Advance.

--- Quote End ---

 

 

I am afraid this forum is not monitored by Altera. You can go to Altera.com, create an account, and open a service request.