
PCIe bridge crash on Icelake-SP running pktgen on multiple ports

PhaniNarasimham

Hi,

We are running pktgen on multiple ports on an Ice Lake-SP system. The PCI bridges are reported as follows:

*-pci:0
description: PCI bridge
product: Intel Corporation
vendor: Intel Corporation
physical id: 2
bus info: pci@0000:50:02.0
version: 04
width: 64 bits
clock: 33MHz
capabilities: pci pciexpress pm msi normal_decode bus_master cap_list
configuration: driver=pcieport
resources: iomemory:202f0-202ef irq:127 memory:202ffff20000-202ffff3ffff ioport:8000(size=4096) memory:d0b00000-d0efffff ioport:202ff4000000(size=173015040)
*-pci:1
description: PCI bridge
product: Intel Corporation
vendor: Intel Corporation
physical id: 4
bus info: pci@0000:50:04.0
version: 04
width: 64 bits
clock: 33MHz
capabilities: pci pciexpress pm msi normal_decode bus_master cap_list
configuration: driver=pcieport
resources: iomemory:202f0-202ef irq:128 memory:202ffff00000-202ffff1ffff ioport:9000(size=4096) memory:d0300000-d0afffff ioport:202fe0000000(size=307232768)

 

 

We are getting a hardware crash. After the reboot, the kernel log shows the following BERT error records from the previous boot:

Dec 16 14:02:34 HWHA2030006 kernel: BERT: Error records from previous boot:
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]: event severity: fatal
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:  Error 0, type: fatal
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:   section_type: PCIe error
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:   port_type: 4, root port
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:   version: 3.0
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:   command: 0x0540, status: 0x0010
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:   device_id: 0000:50:02.0
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:   slot: 2
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:   secondary_bus: 0x00
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:   vendor_id: 0x8086, device_id: 0x347a
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:   class_code: 000406
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:   bridge: secondary_status: 0x2000, control: 0x0000
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:   aer_uncor_status: 0x00000000, aer_uncor_mask: 0x00100020
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:   aer_uncor_severity: 0x00463010
Dec 16 14:02:34 HWHA2030006 kernel: [Hardware Error]:   TLP Header: 0a000000 51030004 fd810000 00000000
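
For anyone reading the AER fields in the record above, here is a minimal sketch that turns the three aer_uncor_* values into bit names. It assumes the standard Uncorrectable Error Status bit layout from the PCI Express Base Specification, which the mask and severity registers share. The table and the function name decode_aer_uncor are illustrative and not part of any driver or tool. Since aer_uncor_status reads 0x00000000, the record itself names no specific uncorrectable error. The mask and severity values only show how the port is configured.

```python
# Sketch only: bit names follow the AER Uncorrectable Error Status layout
# in the PCI Express Base Specification (an assumption about which spec
# revision applies; bits outside this table are reported by number).
AER_UNCOR_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request Error",
    21: "ACS Violation",
    22: "Uncorrectable Internal Error",
}


def decode_aer_uncor(value):
    """Return the names of all bits set in a 32-bit AER uncorrectable register."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            names.append(AER_UNCOR_BITS.get(bit, f"bit {bit} (not in table)"))
    return names


if __name__ == "__main__":
    # Values copied from the BERT record in this thread.
    record = {
        "aer_uncor_status": 0x00000000,
        "aer_uncor_mask": 0x00100020,
        "aer_uncor_severity": 0x00463010,
    }
    for label, value in record.items():
        print(f"{label} = {value:#010x}: {decode_aer_uncor(value) or 'no bits set'}")
```

With the values from this log, the mask has Surprise Down Error and Unsupported Request Error set, so those two error types are masked on this root port.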

 

Any ideas on how to debug and fix this issue?

 

4 Replies
smt
Employee

Hi, 


Which IP are you using? Can you share the .ip settings?

 

Thanks.

PhaniNarasimham

[root@HWHA3300011 log]# modinfo ice
filename: /lib/modules/4.18.0-372.32.1.rt7.189.el8.x86_64/updates/drivers/net/ethernet/intel/ice/ice.ko
firmware: intel/ice/ddp/ice.pkg
version: 1.9.11
license: GPL v2
description: Intel(R) Ethernet Connection E800 Series Linux Driver
author: Intel Corporation, <linux.nics@intel.com>
rhelversion: 8.6
srcversion: 06B8AD97B187AB8A177D9BB
alias: pci:v00008086d00001888sv*sd*bc*sc*i*
alias: pci:v00008086d0000579Fsv*sd*bc*sc*i*
alias: pci:v00008086d0000579Esv*sd*bc*sc*i*
alias: pci:v00008086d0000579Dsv*sd*bc*sc*i*
alias: pci:v00008086d0000579Csv*sd*bc*sc*i*
alias: pci:v00008086d0000151Dsv*sd*bc*sc*i*
alias: pci:v00008086d0000124Fsv*sd*bc*sc*i*
alias: pci:v00008086d0000124Esv*sd*bc*sc*i*
alias: pci:v00008086d0000124Dsv*sd*bc*sc*i*
alias: pci:v00008086d0000124Csv*sd*bc*sc*i*
alias: pci:v00008086d0000189Asv*sd*bc*sc*i*
alias: pci:v00008086d00001899sv*sd*bc*sc*i*
alias: pci:v00008086d00001898sv*sd*bc*sc*i*
alias: pci:v00008086d00001897sv*sd*bc*sc*i*
alias: pci:v00008086d00001894sv*sd*bc*sc*i*
alias: pci:v00008086d00001893sv*sd*bc*sc*i*
alias: pci:v00008086d00001892sv*sd*bc*sc*i*
alias: pci:v00008086d00001891sv*sd*bc*sc*i*
alias: pci:v00008086d00001890sv*sd*bc*sc*i*
alias: pci:v00008086d0000188Esv*sd*bc*sc*i*
alias: pci:v00008086d0000188Dsv*sd*bc*sc*i*
alias: pci:v00008086d0000188Csv*sd*bc*sc*i*
alias: pci:v00008086d0000188Bsv*sd*bc*sc*i*
alias: pci:v00008086d0000188Asv*sd*bc*sc*i*
alias: pci:v00008086d0000159Bsv*sd*bc*sc*i*
alias: pci:v00008086d0000159Asv*sd*bc*sc*i*
alias: pci:v00008086d00001599sv*sd*bc*sc*i*
alias: pci:v00008086d00001593sv*sd*bc*sc*i*
alias: pci:v00008086d00001592sv*sd*bc*sc*i*
alias: pci:v00008086d00001591sv*sd*bc*sc*i*
depends:
name: ice
vermagic: 4.18.0-372.32.1.rt7.189.el8.x86_64 SMP preempt_rt mod_unload modversions
parm: debug:netif level (0=none,...,16=all) (int)
parm: fwlog_level:FW event level to log. All levels <= to the specified value are enabled. Values: 0=none, 1=error, 2=warning, 3=normal, 4=verbose. Invalid values: >=5
(ushort)
parm: fwlog_events:FW events to log (32-bit mask)

Zhaoxuan1
Employee

Hi,

 

Which IP did you choose and compile in Quartus? Could you share the IP parameters you set in Platform Designer? Which tile are you using? Could you also share the device output before and after encountering the errors, captured with the lspci -vvvs B:D.F command?
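
One way to collect the before/after output is a small script like the sketch below, which saves the lspci output for one device to a timestamped file. The helper names (lspci_command, snapshot_filename, snapshot) and the file naming scheme are made up for illustration. The only external call is the lspci command itself, with -vvv and -s given as separate options.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def lspci_command(bdf):
    """Build the argv for a very verbose lspci dump of a single device."""
    return ["lspci", "-vvv", "-s", bdf]


def snapshot_filename(bdf, tag, stamp):
    """Hypothetical naming scheme: lspci_<bdf>_<tag>_<timestamp>.txt."""
    return f"lspci_{bdf.replace(':', '-')}_{tag}_{stamp}.txt"


def snapshot(bdf, tag, out_dir="."):
    """Run lspci for one device and write its output to a file.

    Raises FileNotFoundError if lspci is not installed and
    CalledProcessError if lspci exits with a non-zero status.
    """
    stamp = datetime.now().strftime("%Y%m%dT%H%M%S")
    path = Path(out_dir) / snapshot_filename(bdf, tag, stamp)
    result = subprocess.run(
        lspci_command(bdf), capture_output=True, text=True, check=True
    )
    path.write_text(result.stdout)
    return path
```

For the bridges in this thread, calling snapshot("0000:50:02.0", "before") ahead of the pktgen run and snapshot("0000:50:02.0", "after") once the system is back up would give two files to compare. Run it as root, otherwise lspci cannot read the full capability list.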

 

Best Regards

Zhao Xuan

KhaiChein_Y_Intel

Hi,

We have not received a response from you to the previous question. This thread will be transitioned to community support.

If you have a new question, feel free to open a new thread to get support from Intel experts.

Otherwise, community users will continue to help you on this thread.

Thank you.


Best regards,

Khai

