Anup_Agarwal
Beginner
523 Views

What do two yellow and two green blinking LEDs on the Intel N3000 PAC mean?

I was trying to install an Intel N3000 FPGA PAC in a Supermicro 2029GP-TR server. The FPGA shows 2 yellow and 2 green LEDs blinking. This behaviour is not described in the manual. The FPGA does not show up in lspci on the server. What can be done?

18 Replies
wchiah
Employee
430 Views

Hi,

I am not sure which LEDs you are referring to.
For details, you may refer to the Intel FPGA Programmable Acceleration Card N3000 Data Sheet, page 17 of 31.

 

 

Do you see the N3000 card enumerated on the PCIe bus?

  • I suggest removing the card from the slot and checking that the slot and the edge connector are clean
  • Please ensure that the server is running at maximum fan speed

 

Let me know if this helps.

Regards,

WeiChuan_C_Intel

Anup_Agarwal
Beginner
421 Views

LEDs:

Currently both the activity LEDs (for QSFP A, B) blink in green and both the connectivity LEDs blink yellow (see attached image to verify). This is when nothing is connected to the QSFP ports.

Note we have the Intel N3000-N PAC. This does not support 10G configuration and the data sheet never mentions connectivity activity with yellow. Only mentions all 4 leds yellow in case of power/fan issue.

 

PCIE Enumeration:

We cannot see the accelerator in `lspci | grep -i accel` or `lspci -d :0b30`. The server is brand new, no dust. We ensured server fans are at full speed through IPMI as well as by inspecting server fan noise.
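
The enumeration check described above can be sketched as a small shell helper. This is only an illustration: the device ID 0b30 comes from this thread, while the vendor ID 8086 (Intel) and the function name `pac_n3000_listed` are assumptions added for the example.

```shell
# Sketch: decide whether a PAC N3000 shows up in `lspci -n` style output.
# Assumptions: vendor ID 8086 (Intel) and the helper name are not from the thread;
# device ID 0b30 is the one used in the lspci commands above.

# Reads `lspci -n` output on stdin.
# Succeeds (exit 0) if any line contains the vendor:device pair 8086:0b30.
pac_n3000_listed() {
    grep -qiE '(^|[[:space:]])8086:0b30([[:space:]]|$)'
}
```

On the server this could be used as `lspci -n | pac_n3000_listed && echo "PAC enumerated" || echo "PAC not enumerated"`.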

 

Environment:

We are using CentOS 7.9 and Linux kernel 4.19. We started with CentOS 7.6 and compiled Linux kernel 4.19 with the real-time patch, following the instructions in the user guide. At some point a `sudo yum update` upgraded CentOS 7.6 to 7.9. We then installed the runtime stack for the N3000-N.

I would imagine we should still be able to see the board in lspci irrespective of CentOS 7.6 or 7.9. Am I wrong here? If so, I can reinstall CentOS 7.6 and ensure we don't update to CentOS 7.9.

 

Data sheet: https://www.intel.com/content/www/us/en/programmable/documentation/dlq1585950463484.html

User guide: https://www.intel.com/content/www/us/en/programmable/documentation/zsf1588015530773.html#kdq15984767...

 

 

wchiah
Employee
380 Views

Hi,

I would appreciate it if you could share a screenshot or error code for
"We cannot see the accelerator in `lspci | grep -i accel` or `lspci -d :0b30`".

In addition to the error screenshot, I may need the output of the following commands to narrow this down further:

  1. Check PCIe enumeration:
    $ lspci -vt
  2. Check the FPGA accelerator port status:
    $ fpgainfo port
  3. Check the PHY status:
    $ fpgainfo phy
  4. Check the MAC status:
    $ fpgainfo mac
  5. Check the BMC status:
    $ fpgainfo bmc
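
The five commands above can be collected into one log file for attaching to a reply. The following is a minimal sketch; the function name and the default log file name are hypothetical, and only the commands themselves come from the list above.

```shell
# Sketch: run the requested diagnostics and save all output in one log file.
# The function name and default log path are hypothetical.

collect_pac_diagnostics() {
    local log="${1:-pac_diagnostics.log}"
    local cmd
    : > "$log"    # start with an empty log
    for cmd in "lspci -vt" "fpgainfo port" "fpgainfo phy" "fpgainfo mac" "fpgainfo bmc"; do
        printf '===== %s =====\n' "$cmd" >> "$log"
        if command -v "${cmd%% *}" >/dev/null 2>&1; then
            # Unquoted $cmd is intentional: each entry is "program argument".
            $cmd >> "$log" 2>&1 || printf '(exit status %d)\n' "$?" >> "$log"
        else
            printf '(%s not found in PATH)\n' "${cmd%% *}" >> "$log"
        fi
    done
}
```

Calling `collect_pac_diagnostics n3000.log` writes one section per command. Depending on the system, fpgainfo may need to be run with root privileges.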

Looking forward to hearing back from you.
Regards,

WeiChuan_C_Intel

Anup_Agarwal
Beginner
363 Views

I am not sure what you mean by the error print screen. The system boots without any visible errors. But the N3000-N FPGA PAC does not show up in lspci/fpgainfo commands. The PAC LEDs blink in the fashion described above.

 

I have attached dmesg log instead of the error print screen.

I have also attached the lspci -vt output.

For all of the fpgainfo XXX commands, the output is the same: "No FPGA resources found."

 

The server is the same as in: https://community.intel.com/t5/Programmable-Devices/Is-the-Intel-N3000-FPGA-PAC-compatible-with-Supe...

wchiah
Employee
348 Views

Hi,

 

Since both threads concern the same server, let me answer your question here to avoid any confusion.

 

  • For all of the fpgainfo XXX commands, the output is the same: "No FPGA resources found."
  • You mentioned that the fans are at maximum speed and that the slot is free of dust.
  • Two yellow and two green blinking LEDs:
  1. I assume they are blinking at a 1-second interval, right?
    • When does this happen?
    • Immediately after power-on?
    • Or after the OS has been running for a few minutes?
  2. If the LEDs are blinking at a 1-second interval, the card has shut down for one of the following reasons:
    • FPGA core temperature reaches 100 °C
    • Board temperature reaches 100 °C
    • 12 V Auxiliary or 12 V backplane supply voltage is below 10.46 V
      • Make sure you connect the 12 V Aux cable. Note: The Intel FPGA PAC N3000 follows PCIe standards for 150 W add-in cards, where the maximum current from the 12 V slot power source is 5.5 A and from the 12 V Auxiliary connector is 6.25 A.
        I would appreciate it if you could double-check the power as well.
  3. You mentioned that you put the card back into another server (Dell) and the error followed the card.
    • Is the fpgainfo output the same ("No FPGA resources found") on the Dell server as it was on the Supermicro server?
      This information is important for me to identify the next debug step.
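
The shutdown conditions listed above can be expressed as a quick threshold check for whenever sensor readings are available. This is a sketch only: the function name is hypothetical, and treating "reaches 100 °C" as "greater than or equal to 100" is an assumption.

```shell
# Sketch: compare sensor readings against the shutdown thresholds quoted above.
# Arguments: FPGA core temp (C), board temp (C), 12 V backplane voltage, 12 V AUX voltage.
# Succeeds (exit 0) if no shutdown condition is met; otherwise prints each
# violated condition and returns 1.
pac_within_limits() {
    awk -v core="$1" -v board="$2" -v bp="$3" -v aux="$4" 'BEGIN {
        bad = 0
        if (core  >= 100)   { print "FPGA core temperature at or above 100 C"; bad = 1 }
        if (board >= 100)   { print "board temperature at or above 100 C";     bad = 1 }
        if (bp    <  10.46) { print "12 V backplane below 10.46 V";            bad = 1 }
        if (aux   <  10.46) { print "12 V AUX below 10.46 V";                  bad = 1 }
        exit bad
    }'
}
```

For example, `pac_within_limits 45.00 31.00 11.95 11.96` succeeds without printing anything, since none of the four values crosses a threshold.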

 

Looking forward to hearing back from you.

Regards,

WeiChuan_C_Intel

Anup_Agarwal
Beginner
320 Views

1. The blinking happens right after server boot

2. Our fans are at full speed. We use a standard PCIe 8-pin to PCIe 6-pin cable for the Supermicro servers, and for the Dell we use a cable from Dell that converts its power port to PCIe 6-pin. Since the FPGA never really starts (if 1-second LED blinking means card shutdown), it never gets hot either. The board does not get hot at all.

3. Yes the fpgainfo commands show No FPGA Resources found on the original Dell servers as well (The FPGA worked earlier on these machines).

wchiah
Employee
311 Views

Hi Anup,

Thanks for confirming this.
I also don't see the card enumerated in the "lspci -vt" output.

  1. Is this the only N3000 PAC card you have?
    If you have another, maybe you can try installing it in the server and see whether the error follows.
  2. Lastly, is this card still under warranty?

Looking forward to hear back from you.
Regards,
WeiChuan_C_Intel

Anup_Agarwal
Beginner
299 Views

1. We have another N3000 in our lab, but it is currently in use.

2. Our boards are less than 1 year old, so I would imagine that they are still under warranty. What would be the process for claiming warranty?

wchiah
Employee
283 Views

Hi Anup,

We need to confirm that the problem comes from the card (not the server)
before we can proceed to the next step.

  • I would appreciate it if you could try installing the extra N3000 card in the current Dell/Supermicro server.
  • Then see whether the error still follows.
  • Meanwhile, are you able to access the card's out-of-band interface (I2C or PLDM)?

Looking forward to hear back from you.
Regards,
WeiChuan_C_Intel


Anup_Agarwal
Beginner
236 Views

I can see if we can confirm the cause of the issue using the extra N3000.

I am not sure what you mean by the card's outband interface I2C or PLDM. Are you referring to a physical port? a software channel? or a serial number? In any case I have physical access to the board. What information did you want from the I2C/PLDM?

wchiah
Employee
228 Views

Hi Anup,

I hope to hear back from you with confirmation of the cause of the issue.
I may also need some information (board temperature, voltage, and current) from the I2C/PLDM interface.
That will help us confirm the issue as well.

For details about the BMC, you may refer to the N3000 BMC User Guide:
https://www.intel.co.jp/content/dam/www/programmable/us/en/pdfs/literature/ug/ug-pac-bmc-n3000-n.pdf


Hope to hear back from you.
Regards,
WeiChuan_C_Intel

Anup_Agarwal
Beginner
221 Views

I might need more help with I2C/PLDM.

 

For PLDM, the link just says `sudo fpgainfo bmc`. This command simply returns "No FPGA Resources found". Is there any other way to access the PLDM interface?

 

For I2C, according to the link you sent, a command like this is to be used:

`sudo ipmitool i2c bus=0x20 0xBC 4 0x01 0x00`

I couldn't find the master I2C bus address for either of the servers. Just assuming bus=0x20, I get:

On the Supermicro server I get this response:

I2C Master Write-Read command failed: NAK on Write
Unable to perform I2C Master Write-Read

On the Dell server I get this response irrespective of the register offset:

ff ff ff ff
11111111 11111111 11111111 11111111
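
A reply of all `ff` bytes that does not change with the register offset is often what an unanswered bus read looks like, although that interpretation is an assumption here and is not confirmed anywhere in this thread. A small sketch for flagging such replies (the function name is hypothetical):

```shell
# Sketch: flag an ipmitool I2C reply whose hex line is entirely 0xff.
# Assumption: an all-ff reply suggests that no device answered; this thread does not confirm it.

# Reads the ipmitool reply on stdin and inspects only the first (hex) line.
# Succeeds (exit 0) if that line is non-empty and every byte is "ff".
i2c_reply_all_ff() {
    awk 'NR == 1 {
             ok = (NF > 0)
             for (i = 1; i <= NF; i++) if (tolower($i) != "ff") ok = 0
         }
         END { exit !ok }'
}
```

It could be combined with the command above, for example `sudo ipmitool i2c bus=0x20 0xBC 4 0x01 0x00 | i2c_reply_all_ff && echo "reply is all ff"`.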

 

If this helps: the Supermicro server (with an X11 Motherboard) according to this link supports 14x I2c/SMBUS devices (https://supermicro.com.tr/www.supermicro.com/en/solutions/management-software/bmc-resources.html)

wchiah
Employee
209 Views

Hi Anup,

Thanks for your fast response.
Meanwhile, I need your help with the following, if possible:

  • Put the extra N3000 card into either the Dell or the Supermicro server
    (the Dell server will do)
  • Check whether the extra card responds to the "fpgainfo" command

Apologies for the many debug commands I have been asking you to run.
The error can be due to a myriad of reasons and might be difficult to pinpoint exactly.
Hence, it is really important for us to narrow it down.

Hope to hear back from you soon.
Regards,

WeiChuan_C_Intel

Anup_Agarwal
Beginner
204 Views

For reference, the extra N3000-N card on an identical Dell PowerEdge R720 (same OS/kernel image) works fine.

The `fpgainfo bmc` command shows:

```

Board Management Controller, MAX10 NIOS FW version D.2.1.24
Board Management Controller, MAX10 Build version D.2.0.7
//****** BMC SENSORS ******//
Object Id : 0xF100000
PCIe s:b:d.f : 0000:08:00.0
Device Id : 0x0b30
Numa Node : 0
Ports Num : 01
Bitstream Id : 0x23000010000000
Bitstream Version : 0.2.3
Pr Interface Id : 901dd697-ca79-4b05-b843-8138cefa2846
( 1) Board Power : 45.06 Watts
( 2) 12V Backplane Current : 2.04 Amps
( 3) 12V Backplane Voltage : 11.95 Volts
( 4) 1.2V Voltage : 1.20 Volts
( 6) 1.8V Voltage : 1.82 Volts
( 8) 3.3V Voltage : 3.29 Volts
(10) FPGA Core Voltage : 0.90 Volts
(11) FPGA Core Current : 5.95 Amps
(12) FPGA Core Temperature : 45.00 Celsius
(13) Board Temperature : 31.00 Celsius
(14) QSFP A Voltage : N/A
(15) QSFP A Temperature : N/A
(24) 12V AUX Current : 1.73 Amps
(25) 12V AUX Voltage : 11.96 Volts
(37) QSFP B Voltage : N/A
(38) QSFP B Temperature : N/A
(44) Retimer A Core Temperature : 50.00 Celsius
(45) Retimer A Serdes Temperature : 51.00 Celsius
(46) Retimer B Core Temperature : 51.50 Celsius
(47) Retimer B Serdes Temperature : 52.00 Celsius

```
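
For comparing a working card against a failing one, individual readings can be pulled out of `fpgainfo bmc` output like the listing above. This is a minimal sketch that assumes the `label : value unit` layout shown here; the function name is hypothetical, and the real tool's column spacing may differ.

```shell
# Sketch: extract one sensor value from `fpgainfo bmc` text read on stdin.
# Assumes lines of the form "(12) FPGA Core Temperature : 45.00 Celsius".
# Prints the value of the first line whose label contains $1.
bmc_sensor_value() {
    awk -F' : ' -v name="$1" 'index($1, name) { split($2, v, " "); print v[1]; exit }'
}
```

For instance, `sudo fpgainfo bmc | bmc_sensor_value 'Board Temperature'` would print only the numeric temperature reading.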

wchiah
Employee
185 Views

Hi Anup,

Thanks for your help.
I will forward this information to the relevant team immediately for further action.
It might take approximately 3-4 days due to the weekend.

What I can do now is highlight the urgency of this case.
I will get back to you as soon as possible.

Regards,
WeiChuan_C_Intel

wchiah
Employee
39 Views

Hi Anup,

I am closing this case, as the issue has already been resolved on another platform.
Feel free to reach out to us again if you have any other queries in the future.

Have a nice day ahead.

Regards,
WeiChuan_C_Intel

Anup_Agarwal
Beginner
421 Views

One more thing, the user guide says:

 

  • Enable the following options in the BIOS:
    • Intel VT-x (Intel Virtualization Technology for IA-32 and Intel 64 Processors)
    • Intel VT-d (Intel Virtualization Technology for Directed I/O)

 

We couldn't find these options in the BIOS, so we skipped this step.

Anup_Agarwal
Beginner
419 Views

Please forgive the typo in my reply:

Corrected line: This does not support 10G configuration and the data sheet never mentions connectivity LED with yellow. 
