QTian
Beginner

OpenCL clGetDeviceIDs() returns a wrong num_devices

* Environment

FPGA board: Arria 10 GX FPGA development kit(DKDEV10AX115SA)

OpenCL and Quartus version: 18.1 pro.

OS: Win10(64bit)

PC: Dell precision tower 7910 (with two CPU installed)

 

* Problem

This FPGA board was initialized and installed in the PC according to “AN 807: Configuring the Intel Arria 10 GX FPGA Development Kit for the Intel® FPGA SDK for OpenCL”.

I ran two Intel Altera OpenCL examples on it: “exm_opencl_hello_world_x64_windows” and “exm_opencl_vector_add_x64_windows”.

For “exm_opencl_hello_world_x64_windows”, both the emulator run and the FPGA run are OK.

For “exm_opencl_vector_add_x64_windows”, the emulator run is OK, but the FPGA run stops with errors. While debugging, I found that clGetDeviceIDs() returns num_devices = 128 for the platform “Intel(R) FPGA SDK for OpenCL(TM)”, although I have only one Intel FPGA OpenCL board installed. When the example tries to use 128 devices, it crashes. (“exm_opencl_hello_world_x64_windows” is OK because it only uses the first device.)

If I modify the code to force it to use only the first device, “exm_opencl_vector_add_x64_windows” runs well on the FPGA. But this is not a real solution.
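For readers who want to reproduce the "use only the first device" change, here is a minimal sketch in C. The helper name clamp_device_count and its parameters are hypothetical (they are not part of the Intel example or of the OpenCL API); it only caps the count that the runtime reports at the number of boards you know are installed.

```c
/* Hypothetical helper, not part of the vector_add example or of OpenCL:
 * cap the device count reported by the runtime at the number of boards
 * known to be physically installed. Passing installed == 0 means
 * "no cap" and returns the reported count unchanged. */
unsigned clamp_device_count(unsigned reported, unsigned installed)
{
    if (installed == 0) {
        return reported;            /* caller did not specify a cap */
    }
    return (reported < installed) ? reported : installed;
}
```

In a host program this would be applied right after the counting call, e.g. after clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 0, NULL, &num_devices), with num_devices = clamp_device_count(num_devices, 1); before the device array is allocated. As noted above, this only hides the wrong count; it does not explain it.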

I listed all the platforms and the first device of each platform, as shown below. Is something wrong with my setup? How can I fix clGetDeviceIDs() returning a wrong device count?

 

 

 

***

When running on FPGA:

(vector_add_fpga_output.txt)

Initializing OpenCL

num_platforms: 3

 

print out platform info from No. 1 upto No. 3

 

 

Platform number 1:

Platform Name: NVIDIA CUDA

Platform Profile: FULL_PROFILE

Platform Version: OpenCL 1.1 CUDA 6.5.14

Platform Vendor: NVIDIA Corporation

This platform has 1 device(s), and its 1st device is:

Querying device for info:

========================

CL_DEVICE_NAME                          = Quadro K5200

CL_DEVICE_VENDOR                        = NVIDIA Corporation

CL_DEVICE_VENDOR_ID                     = 4318

CL_DEVICE_VERSION                       = OpenCL 1.1 CUDA

CL_DRIVER_VERSION                       = 340.66

CL_DEVICE_ADDRESS_BITS                  = 32

CL_DEVICE_AVAILABLE                     = true

CL_DEVICE_ENDIAN_LITTLE                 = true

CL_DEVICE_GLOBAL_MEM_CACHE_SIZE         = 196608

CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE     = 128

CL_DEVICE_GLOBAL_MEM_SIZE               = 0

CL_DEVICE_IMAGE_SUPPORT                 = true

CL_DEVICE_LOCAL_MEM_SIZE                = 49151

CL_DEVICE_MAX_CLOCK_FREQUENCY           = 771

CL_DEVICE_MAX_COMPUTE_UNITS             = 12

CL_DEVICE_MAX_CONSTANT_ARGS             = 9

CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE      = 65536

CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS      = 3

CL_DEVICE_MEM_BASE_ADDR_ALIGN           = 4096

CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE      = 128

CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR   = 1

CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT  = 1

CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT    = 1

CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG   = 1

CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT  = 1

CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE = 1

Command queue out of order?             = true

Command queue profiling enabled?        = true

 

 

Platform number 2:

Platform Name: Intel(R) FPGA SDK for OpenCL(TM)

Platform Profile: EMBEDDED_PROFILE

Platform Version: OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 18.1

Platform Vendor: Intel(R) Corporation

This platform has 128 device(s), and its 1st device is:

Querying device for info:

========================

CL_DEVICE_NAME                          = a10gx : Arria 10 Reference Platform (acla10_ref0)

CL_DEVICE_VENDOR                        = Intel(R) Corporation

CL_DEVICE_VENDOR_ID                     = 4466

CL_DEVICE_VERSION                       = OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 18.1

CL_DRIVER_VERSION                       = 18.1

CL_DEVICE_ADDRESS_BITS                  = 64

CL_DEVICE_AVAILABLE                     = true

CL_DEVICE_ENDIAN_LITTLE                 = true

CL_DEVICE_GLOBAL_MEM_CACHE_SIZE         = 32768

CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE     = 0

CL_DEVICE_GLOBAL_MEM_SIZE               = 2147483648

CL_DEVICE_IMAGE_SUPPORT                 = false

CL_DEVICE_LOCAL_MEM_SIZE                = 16384

CL_DEVICE_MAX_CLOCK_FREQUENCY           = 1000

CL_DEVICE_MAX_COMPUTE_UNITS             = 1

CL_DEVICE_MAX_CONSTANT_ARGS             = 8

CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE      = 536870912

CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS      = 3

CL_DEVICE_MEM_BASE_ADDR_ALIGN           = 8192

CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE      = 1024

CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR   = 4

CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT  = 2

CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT    = 1

CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG   = 1

CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT  = 1

CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE = 0

Command queue out of order?             = false

Command queue profiling enabled?        = true

 

 

Platform number 3:

Platform Name: Intel(R) FPGA Emulation Platform for OpenCL(TM) (preview)

Platform Profile: EMBEDDED_PROFILE

Platform Version: OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 18.1

Platform Vendor: Intel(R) Corporation

This platform has 1 device(s), and its 1st device is:

Querying device for info:

========================

CL_DEVICE_NAME                          = Intel(R) FPGA Emulation Device (preview)

...

 

Platform: Intel(R) FPGA SDK for OpenCL(TM)

Using 128 device(s)

 a10gx : Arria 10 Reference Platform (acla10_ref0)

Using AOCX: vector_add.aocx

Launching for device 0 (1000000 elements)

 

Time: 183.631 ms

Kernel time (device 0): 174.791 ms

 

Verification: PASS

 


Hi @QTian​ 

 

I am looking into this issue. Did you see this in past releases as well, or does it happen only with the latest release, 18.1 Pro?

 

Thanks,

Arslan

QTian
Beginner

Thank you Arslan,

 

I am new to Intel OpenCL for FPGA and only have 18.1 Pro installed.

Do you suggest that I try an older version?

Could it be caused by a hardware fault on my Arria 10 GX board?

 

Best,

Qingyuan


 

I am checking on this.

Would you be able to test with only a single platform, i.e. the Arria 10 board only?

This will be helpful for providing feedback to the Engineering team.

 

Thanks,

Arslan

QTian
Beginner

Dear Arslan,

 

After disabling the NVIDIA platform and the Intel(R) FPGA Emulation Platform by changing registry values, the output looks like this:

 

>>>

Initializing OpenCL

num_platforms: 1

 

print out platform info from No. 1 upto No. 1

 

 

Platform number 1:

Platform Name: Intel(R) FPGA SDK for OpenCL(TM)

Platform Profile: EMBEDDED_PROFILE

Platform Version: OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 18.1

Platform Vendor: Intel(R) Corporation

This platform has 128 device(s), and its 1st device is:

Querying device for info:

========================

CL_DEVICE_NAME                          = a10gx : Arria 10 Reference Platform (acla10_ref0)

CL_DEVICE_VENDOR                        = Intel(R) Corporation

CL_DEVICE_VENDOR_ID                     = 4466

CL_DEVICE_VERSION                       = OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 18.1

CL_DRIVER_VERSION                       = 18.1

CL_DEVICE_ADDRESS_BITS                  = 64

CL_DEVICE_AVAILABLE                     = true

CL_DEVICE_ENDIAN_LITTLE                 = true

CL_DEVICE_GLOBAL_MEM_CACHE_SIZE         = 32768

CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE     = 0

CL_DEVICE_GLOBAL_MEM_SIZE               = 2147483648

CL_DEVICE_IMAGE_SUPPORT                 = false

CL_DEVICE_LOCAL_MEM_SIZE                = 16384

CL_DEVICE_MAX_CLOCK_FREQUENCY           = 1000

CL_DEVICE_MAX_COMPUTE_UNITS             = 1

CL_DEVICE_MAX_CONSTANT_ARGS             = 8

CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE      = 536870912

CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS      = 3

CL_DEVICE_MEM_BASE_ADDR_ALIGN           = 8192

CL_DEVICE_MIN_DATA_TYPE_ALIGN_SIZE      = 1024

CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR   = 4

CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT  = 2

CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT    = 1

CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG   = 1

CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT  = 1

CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE = 0

Command queue out of order?             = false

Command queue profiling enabled?        = true

Platform: Intel(R) FPGA SDK for OpenCL(TM)

Using 128 device(s)

 a10gx : Arria 10 Reference Platform (acla10_ref0)

Using AOCX: vector_add.aocx

Launching for device 0 (1000000 elements)

 

Time: 183.060 ms

Kernel time (device 0): 174.776 ms

 

Verification: PASS

<<<

 

Best,

Qingyuan

 


Hi,

Thanks for the information.

 

Here is a possible way to root-cause the problem.

The idea is to uninstall all the BSPs and install only a10_ref.

1. Can you confirm which BSP you are using? Is it the Arria 10 GX BSP "a10_ref" that comes with the installation package, and was it the one used during aocl install?

2. Could you share the console output of aocl install and aocl uninstall?

3. To cross-check whether any other BSP is installed, make sure there is only one fcd file in the directory specified by the reg key (i.e. HKEY_LOCAL_MACHINE\Software\Intel\OpenCL\Boards).

4. How do you compile the host? With a Makefile or with Visual Studio?

 

Thanks,

Arslan


Hi,

 

To work around this problem, set the environment variable "CL_OVERRIDE_NUM_DEVICES_INTELFPGA" to 1 so that the reported number of devices is overridden to 1.
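The simplest route is to set the variable in the system or shell environment before launching the host program. If you would rather set it from the host code, a minimal sketch in C could look like the following. The function name set_fpga_device_override is hypothetical, and it is an assumption (not confirmed anywhere in this thread) that the runtime honours a value set inside the process, as long as it is set before the first OpenCL call.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* exposes setenv() on POSIX systems */
#endif
#include <stdlib.h>

/* Hypothetical helper: set CL_OVERRIDE_NUM_DEVICES_INTELFPGA for the
 * current process. Call it before any OpenCL function is used.
 * ASSUMPTION: the runtime reads the variable when it initializes.
 * Returns 0 on success, non-zero on failure. */
int set_fpga_device_override(const char *count)
{
#ifdef _WIN32
    /* Windows C runtime: _putenv_s(name, value) */
    return _putenv_s("CL_OVERRIDE_NUM_DEVICES_INTELFPGA", count);
#else
    /* POSIX: the last argument 1 means overwrite an existing value */
    return setenv("CL_OVERRIDE_NUM_DEVICES_INTELFPGA", count, 1);
#endif
}
```

A host program would call set_fpga_device_override("1") at the very start of main(), before clGetPlatformIDs() or any other OpenCL call. If the reported count is still 128 afterwards, set the variable outside the process instead.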

 

You can find the detailed workaround in the release notes.

https://www.intel.com/content/www/us/en/programmable/documentation/ewa1412772636144.html#ewa14127730...

 

Thanks,

Arslan
