Beginner
860 Views

Intel BD-NVV-N3000-2 issue

After installing a new FPGA card in a PCIe x16 slot, the server halts when we try to enter the BIOS, with the error "Nmi activated - system halted".

Note: The server boots to the OS without any issue.

Server Product Code : LWF2208IR848505
Server S/N : BQF900200624
FPGA Model : BD-NVV-N3000-2
FPGA S/N : 644C36123B68
Intel BIOS : 02.01.0012

OS : CentOS 7.7

Please find attached lspci output.

We need your help to resolve this issue. Can you please escalate it to the Intel core development team? We assume this is not a hardware failure; it could be a compatibility issue, or a new modified/beta BIOS might resolve it.

0 Kudos
33 Replies
Employee
273 Views

Hi, I have sent you a private message.

0 Kudos
Beginner
262 Views

Hi Jonway,

I tried the same FPGA card in a Supermicro 7049GP-TRT server and observed that the FPGA is not detected. Please find attached the "lspci" command output.

0 Kudos
Beginner
252 Views

Hi Jonway,

Do you have any solution for this?

0 Kudos
Employee
247 Views

Hi @samir_bhansali 

1) If you suspect it is a card issue, can you try a different card?

2) If it is the same on all cards, can you let me know whether lspci fails to detect the card immediately after boot-up, or whether the card was detected at first but failed later on?

Also, is the LED blinking at a 1-second interval? If yes, does this happen immediately after boot-up or only later on?

0 Kudos
Beginner
235 Views

Hi Jonway,

I checked the Intel FPGA card again in the Supermicro SYS-7049GP-TRT server. After powering on the server, no LED on the FPGA card is lit. After booting to the OS, LEDs 1 and 3 blink green and LEDs 2 and 4 blink yellow.

We ran the "lspci" command immediately after booting to the OS and again later on, and the card now appears in the output.

Please find the attached output for your reference.

Kindly check the attachment and let us know whether the card is detected properly.

Please suggest what could be causing the FPGA card detection issue on the Intel server.

0 Kudos
Employee
210 Views

Hi @samir_bhansali

The Supermicro lspci output shows that the card is detected properly:

60:00.0 Ethernet controller: Intel Corporation Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking (rev 02)
60:00.1 Ethernet controller: Intel Corporation Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking (rev 02)
61:00.0 Processing accelerators: Intel Corporation Device 0b30
62:00.0 Ethernet controller: Intel Corporation Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking (rev 02)
62:00.1 Ethernet controller: Intel Corporation Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking (rev 02)
63:00.0 Processing accelerators: Intel Corporation Device 0b32
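
For anyone who wants to repeat this check on their own server, here is a minimal sketch that scans lspci text for the N3000-related entries. It assumes the plain `lspci` line format shown above (bus address, class, description) with no PCI domain prefix. The function name `find_n3000_devices` is made up for this example, and matching on the string "N3000" and on the device IDs 0b30/0b32 is just taken from the output quoted here, not from any official detection procedure.

```python
import re

# Lines copied from the lspci output quoted above.
SAMPLE = """\
60:00.0 Ethernet controller: Intel Corporation Ethernet Controller XXV710 Intel(R) FPGA Programmable Acceleration Card N3000 for Networking (rev 02)
61:00.0 Processing accelerators: Intel Corporation Device 0b30
63:00.0 Processing accelerators: Intel Corporation Device 0b32
"""

# Assumed line layout: "<bus>:<dev>.<fn> <class>: <description>"
LINE_RE = re.compile(
    r"^(?P<addr>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(?P<cls>[^:]+):\s+(?P<desc>.+)$"
)
# Device IDs seen in the output above (illustrative, not an official list).
FPGA_ID_RE = re.compile(r"Device 0b3[02]\b")


def find_n3000_devices(lspci_text):
    """Return (address, class, description) for lines that look N3000-related."""
    found = []
    for line in lspci_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        desc = match.group("desc")
        if "N3000" in desc or FPGA_ID_RE.search(desc):
            found.append((match.group("addr"), match.group("cls"), desc))
    return found


if __name__ == "__main__":
    for addr, cls, desc in find_n3000_devices(SAMPLE):
        print(addr, "|", cls, "|", desc)
```

To use it on a live system, the text passed in could come from saving the `lspci` output to a file and reading it back; an empty result would mean none of the expected entries were listed.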

0 Kudos
Employee
207 Views

Given that the Supermicro works and the Wolfpass server does not, you most probably need the air duct recommended for the server, which should be used with the card: https://ark.intel.com/content/www/us/en/ark/products/125929/passive-airduct-kit-awfcoproductad.html

 

0 Kudos
Beginner
190 Views

Hi Jonway,

The accessories listed below are already installed in the Intel LWF2208IR848505 server.

Accessories: AWFCOPRODUCTBKT (High Air Flow Air Duct Bracket Kit) and A2UL16RISER2 (2-slot PCIe* riser card).

Please suggest what else could be the issue in the Intel server.

0 Kudos
Employee
183 Views

1. Confirm that you are running at maximum fan speed.

2. Does the card shut down even when idle? If not, you may want to check with your workload developer how to reduce the workload.

 

0 Kudos
Beginner
172 Views

Hi Jonway,

The fans are running in normal mode.

Yes, the server is idle.

0 Kudos
Beginner
157 Views

Hi Jonway,

Attached for your reference is the output from the Supermicro server when the command is run immediately after booting to the OS.

Later on, after the OS has been up for a while, the output is different.

0 Kudos
Employee
150 Views

I think this further confirms that you don't have enough airflow.

When you first boot into the OS, the card is still working but is very close to its shutdown temperature.

(12) FPGA Die Temperature : 96.50 Celsius

After a while, when the temperature exceeds 100C, the card will shut down and you won't get any reading.

The card is fine and working as expected. The problem is that you don't have enough airflow to the card. You will need to check with the server vendor on how to improve airflow to the devices in the PCIe slots.
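
To make the margin explicit, here is a small sketch that pulls the die temperature out of a sensor line like the one above and compares it with the 100C shutdown point mentioned in this reply. The function names are invented for this example, and the line format is assumed from the single reading quoted here; other tool versions may print it differently.

```python
import re

# Shutdown point as stated in the reply above.
SHUTDOWN_C = 100.0

# Assumed format, based on the one line quoted above:
# "(12) FPGA Die Temperature : 96.50 Celsius"
TEMP_RE = re.compile(
    r"FPGA Die Temperature\s*:\s*([0-9]+(?:\.[0-9]+)?)\s*Celsius"
)


def die_temperature(sensor_text):
    """Return the FPGA die temperature in Celsius, or None if no reading is found."""
    match = TEMP_RE.search(sensor_text)
    return float(match.group(1)) if match else None


def margin_to_shutdown(sensor_text, shutdown_c=SHUTDOWN_C):
    """Return degrees remaining before the shutdown point, or None without a reading."""
    temp = die_temperature(sensor_text)
    return None if temp is None else shutdown_c - temp


if __name__ == "__main__":
    line = "(12) FPGA Die Temperature : 96.50 Celsius"
    print("die temperature:", die_temperature(line))
    print("margin to shutdown:", margin_to_shutdown(line))
```

A `None` result corresponds to the case described above where the card has already shut down and no reading is returned.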

0 Kudos
Beginner
77 Views

Some questions:

1. How can I confirm that the fans are running at maximum speed, or change the speed?

2. Is the fan referred to here the fan of the PAC board or the fan of the server?

Thanks very much.

0 Kudos
Employee
66 Views

Hi @ahaa

There is no fan on the PAC card. It is passively cooled.

On some vendors' servers the fan speed is changed in the BIOS, and on others it is changed via the server BMC (there may be other ways as well). You will need to check with the server vendor to find out what speed you are running at and how to change it.

0 Kudos