Intel® Quartus® Prime Software
Intel® Quartus® Prime Design Software, Design Entry, Synthesis, Simulation, Verification, Timing Analysis, System Design (Platform Designer, formerly Qsys)

Case #: 00288368 - Unable to configure Arria 10 GX FPGA Development Kit for OpenCL

aejjeh
Beginner
16,470 Views

I am trying to configure our Intel Arria 10 GX FPGA Development Kit for use with the OpenCL SDK by following the setup guide. I am able to reach "Installing the OpenCL Runtime Driver - step 3". However, when I try to program the flash memory ("Programming the Flash memory on the Intel Arria 10 GX FPGA Development Kit - step 1,b"), aocl flash is unable to detect the device. This started after I installed the Intel SDK for OpenCL Applications version 7.0.0 (note that we need that SDK because we also use GPUs in our development environment).

Note that even before I installed the SDK for OpenCL Applications, I was having an issue configuring the Arria 10 Development Kit: at the last step in the guide, which asks for a hard reboot of the machine, the board is no longer detected by the OpenCL runtime after the reboot. However, this is a different issue that I think might not be relevant to the first one above. I do, however, feel that there might be a hardware issue with the board not being able to boot from the on-board flash.

We need to resolve the first issue ASAP since our research depends on it. The second issue can be tackled later.

0 Kudos
1 Solution
25 Replies
HRZ
Valued Contributor III
6,946 Views

That step requires JTAG. If it cannot find the board, either your cable has a problem or your JTAG service is not working correctly. Try "quartus_pgm -a" to see if it detects the board. If not, kill the jtagd service and run "quartus_pgm -a" again as root. Note that there is a long-standing issue with the JTAG service: unless the first call to quartus_pgm is made by the root account, the cable will never be detected.
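The sequence described above can be sketched as a small script. This is only a sketch: `killall jtagd` is an assumed way of stopping the JTAG daemon (the post only says to kill the service), and the `run` helper and the `DRY_RUN` variable are hypothetical names introduced here so that the commands are printed rather than executed by default.

```shell
#!/bin/sh
# Print each command instead of running it unless DRY_RUN=0 is set.
# (run and DRY_RUN are illustrative helpers, not part of Quartus.)
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop any jtagd started by a non-root user, then let root make the first call.
restart_jtag_as_root() {
    run sudo killall jtagd      # assumption: jtagd is stopped with killall
    run sudo quartus_pgm -a     # auto-detect the devices on the cable
}
```

Calling `restart_jtag_as_root` previews the two commands; exporting `DRY_RUN=0` first makes it actually run them.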

0 Kudos
aejjeh
Beginner
6,946 Views

No, this is not a JTAG issue. The JTAG server is able to detect and program the board (which happens in the first step of the setup guide). However, the OpenCL runtime does not seem to detect the board as an OpenCL device, so aocl flash fails. I hope the people who were working on my case can get in touch and resume helping me resolve the issue. This is pretty urgent for our research lab at the moment.

0 Kudos
HRZ
Valued Contributor III
6,946 Views

If you think the problem is the run-time not detecting the board, then check "/etc/OpenCL/vendors/" and make sure that Altera.icd exists in that path. If not, copy it from the "hld" folder in your Quartus installation.
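A quick way to run that check is sketched below. The function name `check_icd` and its optional directory argument are hypothetical; the file name `Altera.icd` and the `/etc/OpenCL/vendors/` path come from the post above.

```shell
#!/bin/sh
# Report whether the FPGA ICD file is registered with the OpenCL ICD loader.
# check_icd is an illustrative helper; pass a directory to override the default.
check_icd() {
    dir="${1:-/etc/OpenCL/vendors}"
    if [ -f "$dir/Altera.icd" ]; then
        echo "found: $dir/Altera.icd"
    else
        echo "missing: $dir/Altera.icd"
        return 1
    fi
}
```

If the file is reported missing, copy it from the "hld" folder as described above; its exact location inside that folder may differ between versions.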

0 Kudos
MuhammadAr_U_Intel
6,946 Views

Hi,

Please refer to the document below.

https://www.intel.com/content/www/us/en/programmable/documentation/tgy1490191698959.html

 

I believe the problem you are facing is covered by the first two topics under "Troubleshooting".

 

--

Arslan

aejjeh
Beginner
6,946 Views

The issue I'm facing is different from what is described under Troubleshooting. I do not get a programming error. The issue is that when I run "aocl flash acl0 boardtest.aocx" after running "aocl install", I get the help message from aocl, as if the arguments were wrong. When I run "aocl diagnose" I get the following error:

Warning: No devices attached for package: /opt/intelFPGA_pro/18.0/hld/board/a10_ref
--------------------------------------------------------------------

Note that before we installed the Intel OpenCL SDK that issue didn't happen; I was able to run "aocl diagnose" and see the device listed after doing "aocl install". I hope this makes sense.

0 Kudos
MuhammadAr_U_Intel
6,947 Views

@aejjeh Can you share the console output after aocl install? Did you see any errors?

Also, can you confirm that you are using tool version 18.0?

 

0 Kudos
SSFyTMT
Novice
6,947 Views

I have nearly the same issue. I'm using tool version 17.1.2 on Ubuntu 16.04. I've also tried this under Windows 7 using 17.1.2. The issue is present from a fresh install.

Is it possible that the JTAG'd FPGA image is being blown away in the "warm reboot" step, so that when the PCIe bus is re-enumerated there is nothing to recognize? I've also tried skipping the warm reboot step and rescanning the PCIe bus using "sudo bash -c 'echo 1 > /sys/bus/pci/rescan'", but still no luck getting aocl diagnose or aocl list-devices to recognize the board.

 

Any help on this issue? Any way to flash the device with the .aocx file through the Quartus GUI?

 

0 Kudos
SSFyTMT
Novice
6,947 Views

I believe I was able to confirm that using /sbin/reboot or just restarting Windows 7 does not blow away the FPGA image that is configured using quartus_pgm in the "Initializing the Intel Arria 10 GX FPGA Development Kit for use with OpenCL" steps of AN 807.

Basically, I watched the behavior of the on-board LEDs. From a cold boot they act very differently than after top.sof is programmed via JTAG. After a reboot, they do NOT revert to the cold-boot behavior.

So now I'm stuck. In both Windows 7 and Ubuntu 16.04 I get to the "aocl flash boardtest.aocx" step and only get the help message, as if my arguments were wrong. At this point I have a $5,000 paperweight.

 

 

0 Kudos
MuhammadAr_U_Intel
6,947 Views

@SSFyTMT​ 

Correct, with a soft reset (also called a warm reboot) the FPGA won't lose its configuration.

Can you share the console output log from the "aocl install" step through "aocl flash" and "aocl diagnose" so we can take a further look?

Also, can you confirm which version of OpenCL you are using?

0 Kudos
SSFyTMT
Novice
6,947 Views

I'm using 17.1.1.273 as reported by aocl version.

 

From Windows 7 "aocl install" from an elevated command prompt:

 

******************************

 

e:\>aocl install

Do you want to install E:\intelFPGA_pro\17.1\hld\board\a10_ref? [y/n] y

aocl install: Running install from e:/intelFPGA_pro/17.1/hld/board/a10_ref/windows64/libexec

+------------------------------------------------------+

+ Performing initial checks...             +

+------------------------------------------------------+

 

+------------------------------------------------------+

+ Installing kernel driver module...          +

+------------------------------------------------------+

 

WDREG utility v10.21. Build Aug 31 2010 14:21:54

 

Processing HWID *WINDRVR6

Installing a signed driver package for *WINDRVR6

LOG ok: 1, ENTER: DriverPackageInstallA

LOG ok: 1, ENTER: DriverPackageInstallW

LOG ok: 1, Looking for Model Section [DeviceList.NTamd64]...

LOG ok: 1, Installing INF file 'e:\intelFPGA_pro\17.1\hld\board\a10_ref\windows64\driver\windrvr6.inf' (Plug and Play).

LOG ok: 1, Looking for Model Section [DeviceList.NTamd64]...

LOG ok: 1, Installing devices with Id "*WINDRVR6" using INF "E:\Windows\System32\DriverStore\FileRepository\windrvr6.inf_amd64_neutral_cc434239b4be1779\windrvr6.inf".

LOG ok: 1, ENTER UpdateDriverForPlugAndPlayDevices...

LOG ok: 0, RETURN UpdateDriverForPlugAndPlayDevices.

LOG ok: 1, Installation was successful.

LOG ok: 0, Install completed

LOG ok: 1, RETURN: DriverPackageInstallW (0x0)

LOG ok: 1, RETURN: DriverPackageInstallA (0x0)

 difx_install_preinstall_inf: err 0, last event 0, last error 0. SUCCESS

install: completed successfully

 

+------------------------------------------------------+

+ Installing board drivers...             +

+------------------------------------------------------+

 

WDREG utility v10.21. Build Aug 31 2010 14:21:54

 

Processing HWID PCI\VEN_1172&DEV_2494&SUBSYS_A1511172&REV_01

Installing a non-signed driver package for PCI\VEN_1172&DEV_2494&SUBSYS_A1511172&REV_01

Device node (hwid:PCI\VEN_1172&DEV_2494&SUBSYS_A1511172&REV_01): does not exist and is not configured. Pre-installing.

LOG ok: 1, ENTER: DriverPackagePreinstallA

LOG ok: 1, ENTER: DriverPackagePreinstallW

LOG ok: 0, e:\intelFPGA_pro\17.1\hld\board\a10_ref\windows64\driver\acl_boards_a10_ref.inf is preinstalled.

LOG ok: 1, RETURN: DriverPackagePreinstallW (0x0)

LOG ok: 1, RETURN: DriverPackagePreinstallA (0x0)

 difx_install_preinstall_inf: err 0, last event 0, last error 0. SUCCESS

install: completed successfully

 

+------------------------------------------------------+

+ ****** SUCCESS! Please reboot your system      +

+------------------------------------------------------+

Press any key to continue . . .

The operation completed successfully.

 

e:\>aocl version

aocl 17.1.1.273 (Intel(R) FPGA SDK for OpenCL(TM), Version 17.1.1 Build 273, Copyright (C) 2017 Intel Corporation)

 

*******************************

 

 

From aocl flash. It doesn't matter what parameters I put after aocl flash, it prints the help message.

 

***************************

e:\>aocl flash acl0 boardtest.aocx

 

aocl flash - Initialize the FPGA with a specific startup configuration.

 

 

Usage: aocl flash <device_name> <file.aocx>

 

  Supply the .aocx file for the design you wish to set as the default

  configuration which is loaded on power up.

 

Description:

 

  This command initializes the board with a default configuration

  that is loaded onto the FPGA on power up. Not all boards will

  support this, check with your board vendor documentation.

 

 

 

0 Kudos
SSFyTMT
Novice
6,947 Views

I'm using 17.1.2.304 as reported by aocl version on Ubuntu 16.04

 

From Ubuntu 16.04 "aocl install" run with sudo:

 

aocl install: Running install from /home/user/intelFPGA_pro/17.1/hld/board/a10_ref/linux64/libexec

/home/user/intelFPGA_pro/17.1/hld/board/a10_ref/linux64/libexec/install: 9: [: aclpci_a10_ref_drv: unexpected operator

Looking for kernel source files in /lib/modules/4.15.0-34-generic/build

Using kernel source files from /lib/modules/4.15.0-34-generic/build

Building driver for BSP with name a10_ref

make: Entering directory '/usr/src/linux-headers-4.15.0-34-generic'

 CC [M] /tmp/opencl_driver_fon1Cy/aclpci_queue.o

 CC [M] /tmp/opencl_driver_fon1Cy/aclpci.o

 CC [M] /tmp/opencl_driver_fon1Cy/aclpci_fileio.o

 CC [M] /tmp/opencl_driver_fon1Cy/aclpci_dma.o

 CC [M] /tmp/opencl_driver_fon1Cy/aclpci_pr.o

 CC [M] /tmp/opencl_driver_fon1Cy/aclpci_cmd.o

 LD [M] /tmp/opencl_driver_fon1Cy/aclpci_a10_ref_drv.o

 Building modules, stage 2.

 MODPOST 1 modules

 CC     /tmp/opencl_driver_fon1Cy/aclpci_a10_ref_drv.mod.o

 LD [M] /tmp/opencl_driver_fon1Cy/aclpci_a10_ref_drv.ko

make: Leaving directory '/usr/src/linux-headers-4.15.0-34-generic'

 

*********

If you notice the unexpected operator error at line 9 of install, I've traced that to the piece of code below. The uninstall script gives the same error, but I don't think it's causing any issues as the MODULE_NAME is being built correctly and it's just the check that's failing.

 

********

BSP_NAME=`aocl board-name`
MODULE_NAME=aclpci_$BSP_NAME\_drv

if [ "$MODULE_NAME" == "aclpci__drv" ]
then
  echo Failed to determine BSP name
  exit 1
fi

***********
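One possible explanation for the "unexpected operator" message, offered as an assumption rather than a confirmed diagnosis: on Ubuntu, /bin/sh is normally dash, whose `[` builtin accepts only `=` for string comparison, not `==`. A portable version of the same check could look like the sketch below; `module_name_for` and `check_module_name` are hypothetical helpers, not part of the BSP install script.

```shell
#!/bin/sh
# Build the driver module name from a BSP name (same pattern as the install script).
module_name_for() {
    printf 'aclpci_%s_drv\n' "$1"
}

# POSIX test: a single '=' works in dash, bash and other sh implementations.
check_module_name() {
    if [ "$(module_name_for "$1")" = "aclpci__drv" ]; then
        echo "Failed to determine BSP name"
        return 1
    fi
    echo "module: $(module_name_for "$1")"
}
```

Under dash, the failing `==` test simply evaluates as false, so the original script carries on. That is consistent with the observation above that the driver still builds.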

 

 

aocl flash is exactly the same as Windows 7 - just the help printout.

 

 

0 Kudos
aejjeh
Beginner
6,947 Views

@MUsman Sorry for the delay in my response; I have been a little busy. Since my post, I have moved the board to a new machine which has only the board connected to it (no other OpenCL drivers are installed). After installing Quartus and the Intel FPGA OpenCL SDK (version 18.0) I started following the AN 807 steps to set up the board for OpenCL. On the old machine, the Intel FPGA OpenCL SDK/drivers/runtime worked until we had to install the regular Intel OpenCL SDK (for GPU, AVX); on the new machine I am unable to get it to work at all. Here are the steps I've taken so far:

 

1. After verifying that the jtag is working by running jtagconfig and quartus_pgm -l, and after setting the jtag clock to 6M, I ran the following two commands:

  • quartus_pgm -c 1 -m JTAG -o "p;max5_150.pof@2"
  • quartus_pgm -c 1 -m JTAG -o "p;top.sof"

2. I performed a PCIe bus rescan using the command that @SSFyTMT referred to in his earlier post, followed by lspci, but did not see the board in the PCI list.

3. I performed a soft reboot and then ran lspci, but I still did not see the board in the list.

4. I ran aocl install. I think it succeeded; this is the output:

root@hpvmfpga:/srv/FPGA_Tools/a10_ref_initialization# aocl install
Do you want to install /opt/intelFPGA_pro/18.0/hld/board/a10_ref? [y/n] y
aocl install: Running install from /opt/intelFPGA_pro/18.0/hld/board/a10_ref/linux64/libexec
Looking for kernel source files in /lib/modules/4.15.0-34-generic/build
Using kernel source files from /lib/modules/4.15.0-34-generic/build
Building driver for BSP with name a10_ref
make: Entering directory '/usr/src/linux-headers-4.15.0-34-generic'
Makefile:976: "Cannot use CONFIG_STACK_VALIDATION=y, please install libelf-dev, libelf-devel or elfutils-libelf-devel"
 CC [M] /tmp/opencl_driver_Foizmr/aclpci_queue.o
 CC [M] /tmp/opencl_driver_Foizmr/aclpci.o
 CC [M] /tmp/opencl_driver_Foizmr/aclpci_fileio.o
 CC [M] /tmp/opencl_driver_Foizmr/aclpci_dma.o
 CC [M] /tmp/opencl_driver_Foizmr/aclpci_pr.o
 CC [M] /tmp/opencl_driver_Foizmr/aclpci_cmd.o
 LD [M] /tmp/opencl_driver_Foizmr/aclpci_a10_ref_drv.o
 Building modules, stage 2.
 MODPOST 1 modules
 CC /tmp/opencl_driver_Foizmr/aclpci_a10_ref_drv.mod.o
 LD [M] /tmp/opencl_driver_Foizmr/aclpci_a10_ref_drv.ko
make: Leaving directory '/usr/src/linux-headers-4.15.0-34-generic'

5. After running aocl install I ran aocl flash, and the following error came up:

root@hpvmfpga:/srv/FPGA_Tools/a10_ref_initialization# aocl flash acl0 ./boardtest.aocx
sh: 1: Syntax error: "(" unexpected

aocl flash - Initialize the FPGA with a specific startup configuration.

Usage: aocl flash <device_name> <file.aocx>

  Supply the .aocx file for the design you wish to set as the default
  configuration which is loaded on power up.

Description:

  This command initializes the board with a default configuration
  that is loaded onto the FPGA on power up. Not all boards will
  support this, check with your board vendor documentation.

6. Running lspci at this stage, the board still doesn't appear.

 

Any suggestions as to what might be the issue? I am running Ubuntu 16.04 by the way.

0 Kudos
SSFyTMT
Novice
6,947 Views

@MUsman any feedback on this issue? Since my last update, I've switched to CentOS 7, hoping the problem was specific to Ubuntu, with the exact same results. I also realized I had clicked on the 30-day eval license when installing Quartus and hoped that was the issue. I put my full license on this machine, with no change.

 

So now we have people reporting the exact same issue on Windows 7, Ubuntu, and CentOS 7. We follow the instructions in AN 807 and it doesn't work - and there is no way to debug it. What causes the aocl flash command to always return the help dialogue??

 

In step 5 of "Initializing the Intel Arria 10 GX FPGA Development Kit for use with OpenCL" it says:

 

After you program the FPGA and perform a soft reboot, the host system should recognize the Intel® Arria® 10 GX FPGA Development Kit PCIe card. Your system must recognize the card before you load the Intel® FPGA SDK for OpenCL™ driver.

 

OK, how do we confirm it recognizes the card? If it doesn't, what steps can we take? Should lspci show the board? If so, what will it show up as? Is there a way to see if it finds the board but can't find the driver for it?

0 Kudos
HRZ
Valued Contributor III
6,947 Views

If the board is correctly programmed with a valid OpenCL binary via JTAG and the PCI-E core is enabled, after soft reboot, "sudo lspci -v | grep -i intel" (or altera if you are using older versions of the compiler) should return something. If not, many possibilities exist:

 

1- Board is not programmed correctly via JTAG.

2- Board has not been seated on the PCI-E port correctly.

3- Board is powered off/broken.

 

Running aocl flash/program without correctly programming the board and installing the driver and passing aocl diagnose first is pointless.
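As a sketch of how the lspci check could be narrowed down, the function below looks for PCI vendor ID 1172 in numeric `lspci -n` output. That ID is taken from the hardware ID PCI\VEN_1172&DEV_2494 printed by the Windows driver install earlier in this thread; whether every BSP image enumerates with the same ID is an assumption. `has_fpga_pci` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `lspci -n` output on stdin and print lines whose vendor ID is 1172.
# Field 3 of `lspci -n` is vendor:device, e.g. "1172:2494".
has_fpga_pci() {
    awk '$3 ~ /^1172:/ { print; found = 1 } END { exit !found }'
}

# Typical use after programming via JTAG and a soft reboot:
#   lspci -n | has_fpga_pci || echo "no device with vendor ID 1172 found"
```

`lspci -d 1172:` applies the same vendor filter directly; the function form is only convenient when the output has been saved to a file.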

 

P.S. This thread might be relevant to your problem:

 

https://forums.intel.com/s/question/0D50P00003yyToNSAU/arria-10-gx-opencl-setup-question

0 Kudos
SSFyTMT
Novice
6,947 Views

Hi @HRZ - thanks for still being an active member on these forums. lspci run after JTAG configuration and soft reboot does list a lot of intel devices (pci bridges, 10GigE controllers, "Sunrise Point-H" devices) - but nothing that looks to me like an FPGA accelerator board, just system devices. But it would be nice to know what it should show up as. I would hope its description isn't "memory controller", but who knows.

 

1 - I've done my best to follow AN807 to use jtagconfig and quartus_pgm to configure the Max 5 CPLD and the Arria FPGA. I've watched the behaviour of the on board LEDs and believe they are acting accordingly. No JTAG commands report errors. I've confirmed soft rebooting the machine as instructed isn't resetting the FPGA configuration (see earlier post).

2 - It is.

3 - The board is on, LEDs are on, the JTAG chain is recognized. I've triple-checked the switch positions against AN807. The 6-pin PCIe power connector is connected. I've used this exact PCIe slot with a Bittware Arria 10 accelerator board without issues (I'm booting from a different hard drive with a fresh install of Linux and Intel tools).

 

I agree that running aocl flash without the board being programmed correctly is pointless, which is why I've tried to run the diagnose and list-devices commands. aocl diagnose shows no devices attached for the a10_ref BSP. aocl list-devices shows the same.

 

I've reviewed the linked thread; it doesn't seem to provide a solution either. Some things it mentions:

 

1) Re-install the driver if you're moving from 16.1 to 17.1. - This is a fresh install of 17.1.2. Yes, I've followed the instructions in AN807 about installing the driver from an elevated terminal after configuring via JTAG.

2) bash/dash configuration - I'm on CentOS. I've confirmed that /bin/sh references /bin/bash.

3) Set the JTAG clock to 6M again after the soft reboot - I've tried this, but I'm not sure it matters since the root issue seems to be that the board isn't recognized via PCIe.
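To check the bash/dash point on any distribution, one can resolve what /bin/sh actually points to. A minimal sketch, assuming a `readlink` that supports `-f` (GNU coreutils does); `sh_target` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the real binary behind a shell path (default: /bin/sh).
sh_target() {
    readlink -f "${1:-/bin/sh}"
}

# Example: prints something like /bin/dash or /usr/bin/bash, depending on the system.
#   sh_target
```

On Debian-based systems such as Ubuntu, where the link usually resolves to dash, `sudo dpkg-reconfigure dash` is the usual way to change it.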

 

Someone also mentions the lsmod command. I can see in my setup that lsmod does show the following:

Module                   Size  Used by

aclpci_a10_ref_drv    41403 0
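That lsmod check can be scripted. The sketch below scans `lsmod` output for the module name shown above; `driver_loaded` is a hypothetical helper, and the default module name assumes the a10_ref BSP.

```shell
#!/bin/sh
# Succeed if the named kernel module appears in `lsmod` output read from stdin.
driver_loaded() {
    awk -v mod="${1:-aclpci_a10_ref_drv}" '$1 == mod { found = 1 } END { exit !found }'
}

# Example:
#   lsmod | driver_loaded && echo "driver loaded"
```

A loaded module does not by itself mean the board was enumerated; in this setup the module is listed while aocl diagnose still reports no devices.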

 

I'll also complain that thread is very difficult to follow as it was migrated to this forum from the far superior altera forum - all names were lost and now "Altera Forum" (Intel) shows up as the author for every post. So I'm not sure who posted what - but clearly I'm not the only one with this issue.

0 Kudos
MuhammadAr_U_Intel
6,947 Views

@SSFyTMT

As for lspci, here is the output I get for the FPGA card (attached screenshot: lspci_accelerator.PNG).

 

0 Kudos
HRZ
Valued Contributor III
6,946 Views

@SSFyTMT, can you post the output of these two commands after configuring the FPGA via JTAG and doing a soft reboot?

 

sudo lspci -v | grep -i altera -A 15

 

sudo lspci -v | grep -i intel -A 15

 

(you can remove the lines belonging to other devices at the end of each hit, I am just trying to make sure all the configurations of each hit are included by adding the -A 15 argument)

0 Kudos
SSFyTMT
Novice
6,946 Views

@HRZ - No output for altera devices. For intel, just system devices. I've attached the output here as a txt file as it was too long for a post. (It took me 3 or 4 tries because the forum page kept crashing when trying to attach a file.)

 

@MUsman - thanks!

0 Kudos
HRZ
Valued Contributor III
6,946 Views

Indeed, the board is not detected. Can you elaborate on which binary you are using to configure the FPGA via JTAG, and give the exact command line you are using to do so?

 

By the way, you can use this script to convert the boardtest.aocx file to sof and use that to configure the FPGA instead. Just make sure to correct the path to the file first.

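#!/bin/bash

# Path to the .aocx shipped with the BSP; correct *path* before running.
aocxfile="*path*/boardtest.aocx"
binfile="fpga_temp.bin"
sofopenclfile="fpga_temp.sof"

# Extract the FPGA binary from the .aocx, then the .sof from that binary.
aocl binedit "$aocxfile" get .acl.fpga.bin "$binfile"
aocl binedit "$binfile" get .acl.sof "$sofopenclfile"

# Configure the FPGA over JTAG with the extracted .sof.
quartus_pgm --mode=JTAG --cable=1 -o "p;$sofopenclfile"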

 

0 Kudos
SSFyTMT
Novice
6,684 Views

@HRZ - The binary I'm using is from the a10_ref BSP design as described in AN807 https://www.intel.com/content/www/us/en/programmable/documentation/tgy1490191698959.html . It is provided as part of the Quartus/AOCL install. On my environment it is in intelFPGA_pro/17.1/hld/board/a10_ref/bringup.

 

Thanks for that command, I'll give it a shot.

 

I gave it a shot and created the .sof from the .aocx. I configured the FPGA via JTAG the same way as in AN807, but replaced the top.sof that comes with the a10_ref BSP with the .sof created from boardtest.aocx (which also comes with the a10_ref BSP). No luck in getting the board to be recognized.

0 Kudos