Intel® High Level Design
Support for Intel® High Level Synthesis Compiler, DSP Builder, OneAPI for Intel® FPGAs, Intel® FPGA SDK for OpenCL™

How to debug why OpenCL command queue stalled on Cyclone V SoC?

OChen25
Beginner

On a Terasic DE10-Nano, I've created an OpenCL BSP based on one that was originally modified from de10_nano_sharedonly. It works fine for hello_world and 'aocl diagnose'. However, vector_add runs successfully only once: the second invocation of vector_add hangs in clWaitForEvents().

 

Using clGetEventInfo(), I found that the status of the command enqueued by clEnqueueReadBuffer() stays at CL_QUEUED and never changes. In contrast, during the first invocation of vector_add, that status becomes CL_COMPLETE immediately.

 

I'm wondering if there is any way I can debug why this command queue is stalled? For example, maybe I can probe some signal/bus inside the circuit, or call some API to know the status of the command queue?
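One host-side way to narrow this down is to replace the blocking wait with a bounded poll, so that a stalled command shows up as a timeout with its last observed status instead of a hang. The following is only a minimal sketch under stated assumptions: `poll_until_complete`, `PollResult`, and `status_name` are hypothetical helper names, and the status constants are redefined locally with the values the OpenCL specification gives for CL_COMPLETE, CL_RUNNING, CL_SUBMITTED, and CL_QUEUED so the sketch compiles without the OpenCL headers. In the real host, the callable would wrap clGetEventInfo() with CL_EVENT_COMMAND_EXECUTION_STATUS.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Execution-status values as defined by the OpenCL specification.
// Redefined under local names (an assumption of this sketch) so that
// the code compiles without CL/cl.h.
constexpr int kComplete  = 0;  // CL_COMPLETE
constexpr int kRunning   = 1;  // CL_RUNNING
constexpr int kSubmitted = 2;  // CL_SUBMITTED
constexpr int kQueued    = 3;  // CL_QUEUED

// Human-readable name for a status value; negative values are error codes.
std::string status_name(int status) {
  switch (status) {
    case kComplete:  return "CL_COMPLETE";
    case kRunning:   return "CL_RUNNING";
    case kSubmitted: return "CL_SUBMITTED";
    case kQueued:    return "CL_QUEUED";
    default:
      return status < 0 ? "ERROR(" + std::to_string(status) + ")"
                        : "UNKNOWN(" + std::to_string(status) + ")";
  }
}

// Hypothetical result type: what the poll saw before it stopped.
struct PollResult {
  bool completed;   // true if the command reached CL_COMPLETE
  int last_status;  // last status value returned by the query
  int polls;        // number of queries performed
};

// Polls query_status up to max_polls times, sleeping `interval` between
// attempts. Stops early on completion or on a negative (error) status.
PollResult poll_until_complete(const std::function<int()>& query_status,
                               int max_polls,
                               std::chrono::milliseconds interval) {
  assert(max_polls > 0);
  PollResult result{false, kQueued, 0};
  for (int i = 0; i < max_polls; ++i) {
    result.last_status = query_status();
    result.polls = i + 1;
    if (result.last_status == kComplete) {
      result.completed = true;
      return result;
    }
    if (result.last_status < 0) {
      return result;  // the command terminated abnormally
    }
    std::this_thread::sleep_for(interval);
  }
  return result;  // timed out; last_status shows where the command is stuck
}

// In the real host application, query_status could be a lambda such as:
//   [&]() { cl_int s = 0;
//           clGetEventInfo(event, CL_EVENT_COMMAND_EXECUTION_STATUS,
//                          sizeof(s), &s, NULL);
//           return (int)s; }
```

If the poll times out with the status still at CL_QUEUED, that matches the symptom described above and at least lets the host log the state and exit cleanly instead of blocking forever.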

 

Kenny_Tan
Moderator
Are you using the emulator? You can use printf to debug the issue.
OChen25
Beginner

No, I'm not using the emulator. This issue occurs on an FPGA board: the Terasic DE10-Nano.

Any comment?

Kenny_Tan
Moderator
OChen25
Beginner

Thanks for this suggestion.

That document says I can use "aocl diagnose" to check the hardware.

In my case "aocl diagnose" passed without any problem, as I've mentioned in my original post.

Any other way I can test?

Kenny_Tan
Moderator

Can you attach your kernel files and host code so I can try it out?

OChen25
Beginner

OK. In the attached vector_add.tgz you can find

  1. the OpenCL kernel source: vector_add/device/vector_add.cl
  2. the host application source: vector_add/host/src/main.cpp

 

I got them as part of the OpenCL BSP of DE10-Nano from Terasic.

I guess its original source is the v17.1 Arm32 Linux package (.tar.gz) from:

https://www.intel.com/content/www/us/en/programmable/support/support-resources/design-examples/design-software/opencl/vector-addition.html

 

Please note that you won't be able to reproduce this symptom on the official DE10-Nano OpenCL BSP from Terasic.

It only happens on the BSP I created.

Anyway, as I said in my original post, the symptom is that the host application hangs in the call to clWaitForEvents() (line 361) on its second run.

 

Kenny_Tan
Moderator

What modifications have you made to the BSP? Do you know how to use Signal Tap for debugging?

OChen25
Beginner

My OpenCL BSP is based on this one: https://github.com/thinkoco/c5soc_opencl, which was originally modified from de10_nano_sharedonly from Terasic.

I modified the Quartus project, the U-Boot SPL, U-Boot, the kernel, etc. to enable 1920x1080 HDMI output.

 

Yes, I know how to use SignalTap. Which signals or buses inside the circuit would you suggest I probe?

Kenny_Tan
Moderator
To enable Signal Tap debugging, you can follow https://www.intel.com/content/www/us/en/programmable/support/support-resources/support-centers/opencl-bsp-support.html -> 4 debug -> signal tap debug.

Make sure timing is closed in your design. You can refer to https://www.intel.com/content/www/us/en/programmable/support/support-resources/support-centers/opencl-bsp-support.html -> 2. floor planning and timing closure.

Since the design comes from them, can you also ask them which signals to probe? You can post the question at https://github.com/thinkoco/c5soc_opencl/issues.
OChen25
Beginner

Actually, I've already contacted the author of c5soc_opencl about this issue, but he has no idea how to debug it, even though he has solved many of my issues before.

OK, I'll try to debug in the way you suggest.

Thanks for your kind help. I wish you a wonderful day!
