Beginner

Intel BD-NVV-N3000-2 issue

After installing a new FPGA card in a PCIe x16 slot, the server halts when we try to enter the BIOS setup, with the error "Nmi activated - system halted".

Note: The server boots to the OS without any issue.

Server Product Code : LWF2208IR848505
Server S/N : BQF900200624
FPGA Model : BD-NVV-N3000-2
FPGA S/N : 644C36123B68
Intel BIOS : 02.01.0012

OS : CentOS 7.7

Please find the lspci output attached.

We need your help to resolve this issue. Could you please escalate it to the Intel core development team? We assume this is not a hardware failure; it could be a compatibility issue, or a new modified/beta BIOS might resolve it.

Employee

Hi, I have sent you a private message.

Beginner

Hi Jonway,

I tried the same FPGA card in a Supermicro 7049GP-TRT server and observed that the FPGA is not detected. Please find the "lspci" command output attached.

Beginner

Hi Jonway,

Do you have any solution for this?

Employee

Hi @samir_bhansali 

1) If you suspect it is a card issue, can you try a different card?

2) If it is the same on all cards, can you let me know whether lspci fails to detect the card immediately after boot-up, or whether the card was detected at first but failed later on?

Also, is the LED blinking at a 1-second interval? If yes, does that happen immediately after boot-up, or only later on?
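The distinction asked about here (never detected versus detected at first and lost later) can be made explicit with a small helper. This is only a rough sketch, not an Intel tool: the function names are hypothetical, and the default search string "N3000" is an assumption about how the card is described in the lspci listing on a given system.

```python
def card_listed(lspci_text, needle="N3000"):
    """Return True if any line of the lspci output mentions the card.

    `needle` is an assumed search string; adjust it to whatever the
    card is called in your own lspci listing.
    """
    return any(needle in line for line in lspci_text.splitlines())


def classify_detection(samples):
    """Classify a time-ordered list of booleans (True = card was listed)."""
    if not samples:
        return "no data"
    if not any(samples):
        return "never detected"
    if all(samples):
        return "detected throughout"
    if samples[0] and not samples[-1]:
        return "detected at first, failed later"
    if not samples[0] and samples[-1]:
        return "not detected at first, appeared later"
    return "intermittent"


if __name__ == "__main__":
    # Example: listed right after boot, gone on the later checks.
    print(classify_detection([True, True, False]))
```

The samples could be collected by saving the lspci output right after boot and again at intervals, then passing each saved output through card_listed.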

Beginner

Hi Jonway,

I checked the Intel FPGA card again in the Supermicro SYS-7049GP-TRT server. After powering on the server, none of the LEDs on the FPGA card are lit. After booting to the OS, LEDs 1 and 3 blink green and LEDs 2 and 4 blink yellow.

We ran the "lspci" command immediately after booting to the OS and again later on, and the card now appears in the output.

Please find the output attached for your reference.

Kindly check the attachment and let us know whether the card is detected properly.

Please suggest what could be causing the FPGA card detection issue on the Intel server.

Employee

Hi @samir_bhansali

The Supermicro lspci output shows that the card is detected properly:

60:00.0 Ethernet controller: Intel Corporation Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking (rev 02)
60:00.1 Ethernet controller: Intel Corporation Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking (rev 02)
61:00.0 Processing accelerators: Intel Corporation Device 0b30
62:00.0 Ethernet controller: Intel Corporation Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking (rev 02)
62:00.1 Ethernet controller: Intel Corporation Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking (rev 02)
63:00.0 Processing accelerators: Intel Corporation Device 0b32
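For anyone who wants to check a listing like this automatically, here is a minimal sketch that parses the lines above and counts the functions the card exposes. The function names are hypothetical, the regular expression assumes the default lspci line format shown here (no PCI domain prefix), and the counts apply only to this particular listing.

```python
import re

# The six lines quoted above from the Supermicro lspci output.
SAMPLE = """\
60:00.0 Ethernet controller: Intel Corporation Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking (rev 02)
60:00.1 Ethernet controller: Intel Corporation Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking (rev 02)
61:00.0 Processing accelerators: Intel Corporation Device 0b30
62:00.0 Ethernet controller: Intel Corporation Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking (rev 02)
62:00.1 Ethernet controller: Intel Corporation Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking (rev 02)
63:00.0 Processing accelerators: Intel Corporation Device 0b32
"""

# bus:device.function, then the device class, then the description.
LINE_RE = re.compile(
    r"^(?P<addr>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(?P<cls>[^:]+):\s+(?P<desc>.+)$"
)


def parse_lspci(text):
    """Return a list of (address, device class, description) tuples."""
    devices = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            devices.append((m["addr"], m["cls"], m["desc"]))
    return devices


def summarize_n3000(devices):
    """Count the Ethernet functions and accelerator devices in the listing."""
    nics = [d for d in devices if d[1] == "Ethernet controller" and "N3000" in d[2]]
    accels = [d for d in devices if d[1] == "Processing accelerators"]
    return {"ethernet_functions": len(nics), "accelerators": len(accels)}


if __name__ == "__main__":
    print(summarize_n3000(parse_lspci(SAMPLE)))
```

If a later listing from the same machine shows fewer entries than an earlier one, that points to the card dropping off the bus after boot rather than never being enumerated.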

Employee

Given that the card works in the Supermicro server but not in the Wolf Pass server, you most probably need the air duct recommended for the server, which should be used with the card: https://ark.intel.com/content/www/us/en/ark/products/125929/passive-airduct-kit-awfcoproductad.html

 

Beginner

Hi Jonway,

The accessories listed below are already installed in the Intel LWF2208IR848505 server.

Accessories: AWFCOPRODUCTBKT (High Airflow Air Duct Bracket Kit) and A2UL16RISER2 (2-slot PCIe* riser card).

Please suggest what else could be causing the issue on the Intel server.

Employee

1. Confirm that the fans are running at maximum speed.

2. Does the card shut down even when idle? If not, you may want to check with your workload developer about how to reduce the workload.

 

Beginner

Hi Jonway,

The fans are running in normal mode.

Yes, the server is in idle mode.

Beginner

Hi Jonway,

For your reference, attached is the output from the Supermicro server when the command was run immediately after booting to the OS.

However, when the command is run again some time after boot, the output is different.

Employee

I think this further confirms that you don't have enough airflow.

When you first booted into the OS, the card was still working but very close to its shutdown temperature:

(12) FPGA Die Temperature : 96.50 Celsius

After a while, when the temperature exceeds 100 C, the card shuts down and you won't get any reading.

The card is fine and working as expected. The problem is that there is not enough airflow to the card. You will need to check with the server vendor on how to improve airflow to the devices in the PCIe slots.
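To make the reasoning about the temperature reading concrete, here is a minimal sketch that extracts the FPGA die temperature from a sensor line in the format quoted above and compares it with the 100 C shutdown point mentioned in this reply. The function names and the 5 C warning margin are assumptions for illustration, and the threshold comes from this thread rather than from a datasheet.

```python
import re

# Shutdown point as stated in this thread (an assumption, not a datasheet value).
SHUTDOWN_C = 100.0

# Matches a line such as "(12) FPGA Die Temperature : 96.50 Celsius".
TEMP_RE = re.compile(r"FPGA Die Temperature\s*:\s*([0-9]+(?:\.[0-9]+)?)\s*Celsius")


def fpga_die_temp(sensor_text):
    """Return the FPGA die temperature in Celsius, or None if no reading is present."""
    m = TEMP_RE.search(sensor_text)
    return float(m.group(1)) if m else None


def thermal_status(sensor_text, shutdown_c=SHUTDOWN_C, margin_c=5.0):
    """Classify a sensor dump; margin_c is a hypothetical warning margin."""
    temp = fpga_die_temp(sensor_text)
    if temp is None:
        return "no reading (card may already have shut down)"
    if temp >= shutdown_c:
        return "at or above shutdown threshold"
    if temp >= shutdown_c - margin_c:
        return "near shutdown threshold"
    return "ok"


if __name__ == "__main__":
    print(thermal_status("(12) FPGA Die Temperature : 96.50 Celsius"))
```

Running the same check on sensor output captured right after boot and again a few minutes later would show whether the reading climbs toward the threshold and then disappears, which is the pattern described here.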
