Cannot find device after installing two Arria 10 GX PACs in the server

wgshnyhlw
Beginner

Hi everyone, 

 

I want to use two acceleration cards with Arria 10 GX FPGAs so that our system can process larger amounts of data in parallel. Everything worked well before I plugged the second PAC into the server.

After I plugged in the second PAC, added a new PCI device to the server's VMware configuration, and reinstalled development stack v1.2.1, I found that the Linux kernel driver installation was incomplete. The output of 'lsmod | grep fpga', shown in the picture below, is missing 'intel_fpga_fme', 'intel_fpga_pac_hssi', and other modules. As a result, 'sudo fpgainfo fme' finds no device.

wgshnyhlw_0-1683358455197.png

I confirmed that both PACs appear in the PCI device list, but both cards show the same name, 09c5. Is that normal?

wgshnyhlw_2-1683358643151.png

 

My questions:

1. Why do the two acceleration cards have the same name in the PCI list?

2. I followed the steps in the troubleshooting section 'F.5. Troubleshooting OPAE Installation on RHEL' to update the Linux kernel and reinstall the software, but it did not help. Can two acceleration cards be used at the same time? Can anyone give me other advice for solving this problem?

I hope to get some useful support. Thanks.

JohnT_Intel
Employee

Hi,


  1. Both cards have the same name because they are the same board model. The enumerated number on the left is different for each card, as that is what the CPU uses to address them.
  2. Both cards can be used at the same time without any issue. Can you check whether the driver is loaded for both cards? You can check with "lspci -s 13:00.0 -v" and "lspci -s 1b:00.0 -v".
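
To see that the two cards share a device ID but sit at different bus addresses, something like the following sketch may help. It assumes the numeric output format of "lspci -n" (address, class, vendor:device) and assumes the vendor ID is 8086 (Intel); the helper name addresses_for_id is hypothetical, and only the device ID 09c5 comes from this thread.

```shell
# Hypothetical helper: reads `lspci -n` output on stdin and prints the
# bus address of every device whose vendor:device ID equals $1.
# Assumed line format: "13:00.0 1200: 8086:09c5 (rev ..)"
addresses_for_id() {
  awk -v id="$1" '$3 == id { print $1 }'
}

# Usage (vendor ID 8086 is an assumption, 09c5 is from the lspci listing):
#   lspci -n | addresses_for_id 8086:09c5
```

Two lines of output would mean both cards were enumerated, each at its own address.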


Thanks.


JohnT_Intel
Employee

Hi,


May I know if you have any other queries on this?


wgshnyhlw
Beginner

Hi John, 

 

Yes, I have some questions. I removed one of the two Arria 10 GX PACs to try to fix the driver. I tried reinstalling the development kit, but it still does not seem to install successfully.

After I reinstalled the development software, the "lspci" output shows that the PAC device 09c5 is found:

wgshnyhlw_0-1685429412070.png

However, the "lsmod | grep fpga" output is still incomplete:

wgshnyhlw_1-1685429468951.png

 

Before I started using two PAC cards, the driver was perfectly usable with my code. 

 

Question:

What is wrong with the driver now? What should I do first to fix it? I want to check whether I missed a step before reinstalling the driver.

If there are any documents that show the correct steps for running multiple PAC cards, I would really appreciate them.

 

Thanks. Best wishes.

JohnT_Intel
Employee

wgshnyhlw
Beginner

Hi John, 

 

I followed this document exactly to install the development driver. Before I inserted the second PAC card into the server, everything worked fine. I followed the Quick Start Guide to reinstall the OPAE software, and also followed the section "Troubleshooting OPAE Installation on RHEL" to try to fix the OPAE software. However, none of this helped.

 

Do you have any other advice? How can I find the cause of the problem? 

 

Thank you.

JohnT_Intel
Employee

Hi,


Can you run "lspci -vv" to see whether any driver is attached to the PAC card?
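
A compact way to read that information is to pull out the "Kernel driver in use" line for each device. This is only a sketch, assuming the output format of "lspci -k" (an unindented header line per device, followed by indented detail lines); the function name drivers_in_use is made up for illustration.

```shell
# Hypothetical helper: reads `lspci -k` (or `lspci -vv`) output on stdin and
# prints "<bus address> <bound driver>" per device, or "(none)" when the
# device has no "Kernel driver in use" line.
drivers_in_use() {
  awk '
    /^[0-9a-fA-F]/          { addr = $1; order[++n] = addr }  # device header line
    /Kernel driver in use:/ { drv[addr] = $NF }               # driver bound to it
    END {
      for (i = 1; i <= n; i++) {
        a = order[i]
        print a, ((a in drv) ? drv[a] : "(none)")
      }
    }
  '
}

# Usage, with the two addresses mentioned earlier in this thread:
#   lspci -k -s 13:00.0 | drivers_in_use
#   lspci -k -s 1b:00.0 | drivers_in_use
```

A card that reports "(none)" is visible on the PCI bus but has no kernel driver bound to it.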


wgshnyhlw
Beginner

Hi John,

 

The "lspci -vv" command output is shown below: 

lspci-vv-09c5.png

Can this information tell you what the problem is? Thanks for your support.

JohnT_Intel
Employee

Hi,


Are you seeing the same driver on both cards? You have provided information for only one card.


What do you observe when running "fpgainfo fme"?


wgshnyhlw
Beginner

Hey John,

 

The lspci log for the second PAC card is shown below:

pci-1b-00.png

The "fpgainfo fme" command prints "No device found".

JohnT_Intel
Employee

Hi,


I observed that when you grep for the OPAE driver, only 2 driver modules are loaded. Below is an example of what you should observe when OPAE is installed correctly.
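
One way to compare a system against an expected module list is sketched below. It reads "lsmod" output and prints any expected module that is not loaded. The helper name missing_modules is hypothetical, and the two module names in the usage line are only the ones mentioned earlier in this thread, not the complete list for the stack.

```shell
# Hypothetical helper: stdin is `lsmod` output, arguments are the module
# names expected to be loaded. Prints each expected module that is absent.
missing_modules() {
  loaded=$(awk '{ print $1 }')   # first column of lsmod is the module name
  for mod in "$@"; do
    # -x: match the whole line, -F: treat the name as a fixed string
    printf '%s\n' "$loaded" | grep -qxF -- "$mod" || printf '%s\n' "$mod"
  done
}

# Usage (names taken from the first post; the full list is an open question):
#   lsmod | missing_modules intel_fpga_fme intel_fpga_pac_hssi
```

Empty output would mean every module passed as an argument is loaded.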



wgshnyhlw
Beginner

Hi,

 

I know the OPAE driver is not fully installed. Before I used two PAC cards, the OPAE driver was installed correctly, and the output matched your picture.

So I want to know what happened to the driver after I started using two PAC cards. It is strange, because I expected the driver to detect a newly plugged-in card automatically. However, the driver does not seem to behave that way.

I have tried reinstalling the OPAE driver, but it did not help. Do you have any other suggestions?

JohnT_Intel
Employee

Hi,


Can you remove one of the cards and see whether you can still detect the remaining card without any issue? I have tried this on my side, and there is no issue with connecting another card to the system.


wgshnyhlw
Beginner

Hi John,

 

I tried removing one of the cards and reinstalling the OPAE driver, but "lsmod | grep fpga" still behaves oddly, with only two lines of output.

wgshnyhlw_0-1686299582655.png

My reinstall steps are:

"sudo rm -r /inteldecstack"

cd ~/a10_gx_pac_ias_1_2_1_pv_dev_installer

./setup.sh

 

Is there a problem with my installation steps? 

Have a good day. Thank you.

JohnT_Intel
Employee

Hi.


May I know where you got the installation file? Are you installing with the script only? Have you tried reinstalling the PAC installation as well?


wgshnyhlw
Beginner

Hi John,

 

I downloaded the installation file, version 1.2.1, from this URL: https://www.intel.com/content/www/us/en/software-kit/665840/intel-pac-with-intel-arria-10-gx-fpga-acceleration-stack-version-1-2-1.html? .

The compressed acceleration stack is 9.3 GB, so I clearly downloaded the full acceleration stack and not just a script.

wgshnyhlw_0-1687244479877.png

 

What do you mean by "reinstall the PAC installation"? I just tried to reinstall the OPAE driver with the acceleration stack I downloaded earlier.

JohnT_Intel
Employee

Hi,


Have you tried uninstalling before installing again? Do you have another system to test on?


wgshnyhlw
Beginner

Hi John,

 

What is the correct "un-install" procedure you mentioned? I uninstalled the acceleration stack with "sudo rm -r ../inteldevstack". If that was wrong, please tell me the correct way.

I have only one system that supports the acceleration card. Do you think I should reinstall the operating system?

 

Thank you.
