Arria 10 aocl diagnose failed

Altera_Forum
Honored Contributor II
3,570 Views

Dear all,

I am using an Arria 10 board on Ubuntu 12.04, and I successfully flashed the board with the OpenCL hello_world image. After rebooting the PC and running lspci | grep altera, I get:

01:00.0 Class 1200: Altera Corporation Device 2494 (rev 01) 

This should mean that the board is properly installed on the system. But when I run aocl diagnose, it gives this error:

aocl diagnose: Running diagnose from /root/altera_pro/16.0/hld/board/de5a_net_i2/linux64/libexec
aocl diagnose: failed 32 times. First error below:

Unable to find the kernel mode driver.

Please make sure you have properly installed the driver. To install the driver, run
aocl install

DIAGNOSTIC_FAILED

Then I ran aocl install, which completed without errors, and lsmod | grep aclpci indeed gives:

aclpci_de5a_net_i2_drv 36670 0 

But aocl diagnose still gives the same error, indicating that it cannot detect the board's kernel mode driver.

Does anyone know what could cause this?
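
The lsmod check described above can be wrapped in a small helper so it is easy to repeat after each reboot or reinstall. This is only a sketch: the function name module_loaded is made up for illustration and is not part of aocl. It reads lsmod-style text from standard input, so it can be tried on a pasted listing as well as on a live system. A positive result only shows that a module with that name prefix is loaded, not that aocl diagnose can talk to it.

```shell
#!/bin/sh
# module_loaded PREFIX  (hypothetical helper, not part of the aocl tooling)
# Reads lsmod-style output on stdin and succeeds if any module name
# (first column) starts with PREFIX.
module_loaded() {
    awk -v p="$1" 'index($1, p) == 1 { found = 1 } END { exit !found }'
}

# Example on a live system:
#   lsmod | module_loaded aclpci && echo "an aclpci module is loaded"
```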
Altera_Forum

Altera's PCI-E driver for OpenCL does not officially support Ubuntu; it only supports CentOS/Red Hat and Windows 7/8.1/10. It is quite likely that the PCI-E driver will not work on an Ubuntu machine even if it installs correctly. Still, try this command and see if the kernel module is actually loaded: sudo lspci -v | grep -A 10 -i altera

Altera_Forum

Well, sudo lspci -v | grep -A 10 -i altera gives results: 

01:00.0 Class 1200: Altera Corporation Device 2494 (rev 01) (prog-if 01)
    Subsystem: Altera Corporation Device 0002
    Flags: bus master, fast devsel, latency 0, IRQ 16
    Memory at e0d40000 (64-bit, prefetchable)
    Memory at e0d00000 (64-bit, prefetchable)
    Capabilities: [50] MSI: Enable- Count=1/4 Maskable- 64bit+
    Capabilities: [78] Power Management version 3
    Capabilities: [80] Express Endpoint, MSI 00
    Capabilities: [100] Virtual Channel
    Capabilities: [200] Vendor Specific Information: ID=1172 Rev=0 Len=044 <?>
    Capabilities: [300] #19
    Kernel driver in use: aclpci_de5a_net_i2

I think the kernel driver is installed and works...
Altera_Forum

Can you also try with -A 12 (or more) and see if you also have a "Kernel modules:" line?

Altera_Forum

I have switched to CentOS, but the same error occurs, and lspci -v | grep -A 12 -i altera gives:

01:00.0 Processing accelerators: Altera Corporation Device 2494 (rev 01) (prog-if 01)
    Subsystem: Altera Corporation Device 0002
    Flags: bus master, fast devsel, latency 0, IRQ 16
    Memory at e0d40000 (64-bit, prefetchable)
    Memory at e0d00000 (64-bit, prefetchable)
    Capabilities: [50] MSI: Enable- Count=1/4 Maskable- 64bit+
    Capabilities: [78] Power Management version 3
    Capabilities: [80] Express Endpoint, MSI 00
    Capabilities: [100] Virtual Channel
    Capabilities: [200] Vendor Specific Information: ID=1172 Rev=0 Len=044 <?>
    Capabilities: [300] #19
    Kernel driver in use: aclpci_de5a_net_i2
    Kernel modules: aclpci_de5a_net_i2_drv
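
Rather than guessing how many lines to pass to grep -A, the two fields of interest can be pulled out of the lspci output directly. The sketch below assumes the output format shown in this thread; the function names driver_in_use and candidate_modules are invented for illustration. For reference, "Kernel driver in use" reports the driver currently bound to the device, while "Kernel modules" lists modules that are able to handle it.

```shell
#!/bin/sh
# Hypothetical helpers; both read `lspci -v` (or `lspci -k`) text on stdin.

# Print the value of every "Kernel driver in use:" line.
driver_in_use() {
    sed -n '/^[[:space:]]*Kernel driver in use:/{s/^[^:]*:[[:space:]]*//;s/[[:space:]]*$//;p;}'
}

# Print the value of every "Kernel modules:" line.
candidate_modules() {
    sed -n '/^[[:space:]]*Kernel modules:/{s/^[^:]*:[[:space:]]*//;s/[[:space:]]*$//;p;}'
}

# Example on a live system, assuming the usual Altera PCI vendor ID 1172:
#   lspci -k -d 1172: | driver_in_use
```

If driver_in_use prints nothing for the board, no driver is bound to it, whatever lsmod says.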
Altera_Forum

The driver is certainly working correctly then. I can think of one more reason it might not be working for you, which I have also run into myself. Since Terasic distributes their Linux BSP as a zip file, which doesn't preserve file permissions, the diagnose script from the BSP might have incorrect permissions, and that could be why it is giving this error. Try chmod-ing all the files inside the BSP_install_folder/linux64/libexec path to 755 and retry the command. Better yet, switch to the root user temporarily, make sure the driver is also loaded correctly under the root user, and then run the diagnose command as that user.
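
The permission check suggested here could look roughly like the following. It is a sketch only: list_bad_perms and fix_perms are made-up names, and BSP_install_folder is the same placeholder as in the text, to be replaced with the real BSP location.

```shell
#!/bin/sh
# Hypothetical helpers for checking a BSP's libexec directory.

# Print regular files under DIR that are missing any of the 755 permission bits.
list_bad_perms() {
    find "$1" -type f ! -perm -755 -print
}

# Set mode 755 on exactly those files.
fix_perms() {
    find "$1" -type f ! -perm -755 -exec chmod 755 {} +
}

# Example (placeholder path):
#   list_bad_perms BSP_install_folder/linux64/libexec
#   fix_perms      BSP_install_folder/linux64/libexec
```

Listing first and fixing second makes it visible whether permissions were actually the problem; if list_bad_perms prints nothing, the cause lies elsewhere.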

Altera_Forum

Well, it seems that all the files inside BSP_install_folder/linux64/libexec already have 755 permissions.

[root@localhost libexec]# ls -al
total 35944
drwxr-xr-x. 2 root root 122 Jan 31 20:21 .
drwxr-xr-x. 5 root root 46 Jan 31 20:16 ..
-rwxr-xr-x. 1 root root 36729431 Jan 31 20:16 de5a_net_pfl.sof
-rwxr-xr-x. 1 root root 34033 Jan 31 20:16 diagnose
-rwxr-xr-x. 1 root root 922 Jan 31 20:16 flash
-rwxr-xr-x. 1 root root 5325 Jan 31 20:16 flash.pl
-rwxr-xr-x. 1 root root 1856 Jan 31 20:16 install
-rwxr-xr-x. 1 root root 14158 Jan 31 20:16 program
-rwxr-xr-x. 1 root root 785 Jan 31 20:16 uninstall

I ran chmod 755 again and then aocl diagnose: same error.

But I noticed one thing: every time I let the PC cool down and start it again, lspci shows no Altera device; after I then reboot, lspci shows the Altera device again. Does this mean that the device is somehow unstable?
Altera_Forum

I am really surprised that it is still not working; maybe Terasic's diagnose script is broken? Have you tried an actual OpenCL kernel to see if it works? You can try Altera's hello_world example.

Regarding the board disappearing from lspci: if you do a full shutdown and start-up, the FPGA image is wiped due to the power loss, but if you have put the image on the on-board flash, the board is reprogrammed after start-up. If your machine fully boots before the FPGA is fully programmed, chances are the OS will not recognize the device and will ignore it, which is why you don't see it in lspci; after a soft reboot, the device shows up since it is already programmed. I find it quite unlikely for the FPGA to take so long to be reprogrammed, though. I recommend contacting Terasic; maybe the board really is faulty.
Altera_Forum

I have also tried a real OpenCL kernel, but got a CL_DEVICE_NOT_FOUND error. Now I have found something interesting: the problem seems to lie in the PCIe driver. Running dmesg | grep -i taint gives:

[ 6.224217] aclpci_de5a_net_i2_drv: module verification failed: signature and/or required key missing - tainting kernel

which means that the driver module is not signed! How can I solve this?
Altera_Forum

You might be onto something, but the driver seems to be loading anyway; otherwise the "Kernel modules" field from lspci would be empty. I did some searching, and it seems that to avoid that issue you either have to recompile your OS kernel with driver signature verification disabled or get Terasic to sign the driver for you. Take a look here (http://stackoverflow.com/questions/24975377/kvm-module-verification-failed-signature-and-or-required-key-missing-taintin) for more info. As a last resort, you can try CentOS 6.x; I am pretty sure there is no signature verification on CentOS 6.
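
To tell "unsigned but loaded" apart from "rejected", it may help to look at the kernel log and the module metadata separately. The helpers below are a sketch with invented names (unsigned_modules, has_signer). The first extracts module names from taint messages of the form quoted in this thread, and the second checks whether modinfo output carries a non-empty "signer:" field. Whether an unsigned module is merely flagged or actually refused depends on how the kernel is configured, so neither check is conclusive on its own.

```shell
#!/bin/sh
# Hypothetical helpers; both read text on stdin.

# Print the names of modules that triggered a
# "module verification failed" message in dmesg output.
unsigned_modules() {
    sed -n 's/^\[[^]]*\][[:space:]]*\([^:[:space:]]*\): module verification failed.*/\1/p'
}

# Succeed if modinfo output contains a non-empty "signer:" field.
has_signer() {
    grep -Eq '^signer:[[:space:]]*[^[:space:]]'
}

# Example on a live system:
#   dmesg | unsigned_modules
#   modinfo aclpci_de5a_net_i2_drv | has_signer || echo "module has no signer field"
```

If a module shows up in both unsigned_modules and lsmod, the taint message was only a warning and the module did load.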

Altera_Forum

Well, the problem was finally solved by contacting Terasic support. They had made a mistake: the board name in the driver code had not been changed...
Altera_Forum

Ouch... Sometimes I wonder whether these people even test their own stuff before releasing it to the public... I am glad to hear the issue has been resolved, though.

P.S. It seems they have not updated the BSP on their website; do they expect people to contact them one by one to get the fixed BSP?
Altera_Forum

Well, it seems that they have already updated the BSP now.
Altera_Forum

Haha, yes, seems I spoke too soon.
