We've been testing a custom hardware card in a server with Ice Lake CPUs, and it keeps disconnecting from the PCIe link. An otherwise identical configuration with Cascade Lake CPUs works. This was done using CentOS Stream 8. Have there been any changes to the PCIe capabilities between these CPU generations, or anything else relevant, that could help explain this?
Here is the error message logged when we send the hardware a command that causes it to fail:
Hardware error from APEI Generic Hardware Error Source: 5
event severity: recoverable
Error 0, type: fatal
section_type: PCIe error
port_type: 4, root port
command: 0x0547, status: 0x4010
vendor_id: 0x8086, device_id: 0x347c
bridge: secondary_status: 0x2000, control: 0x0003
aer_uncor_status: 0x00004000, aer_uncor_mask: 0x01310000
TLP Header: 00000001 fc22fe01 dbc04000 00000000
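For anyone reading the log above, the `aer_uncor_status` and `aer_uncor_mask` values are bit fields, and naming the set bits makes the report easier to discuss. Below is a minimal Python sketch that does this. The bit names are taken from the PCIe AER Uncorrectable Error Status register layout as I understand it; the table and the `decode_aer_uncor` helper are my own illustration, not part of any Intel or kernel tool, so please check them against the spec before relying on them.

```python
# Hypothetical helper: names the bits set in a PCIe AER uncorrectable
# error status/mask value. The table is an assumption based on the
# PCIe AER register layout; verify against the specification.
AER_UNCOR_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
    21: "ACS Violation",
    22: "Uncorrectable Internal Error",
    23: "MC Blocked TLP",
    24: "AtomicOp Egress Blocked",
    25: "TLP Prefix Blocked",
}


def decode_aer_uncor(value):
    """Return the names of all bits set in a 32-bit AER uncorrectable value."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            # Fall back to the raw bit number for bits missing from the table.
            names.append(AER_UNCOR_BITS.get(bit, f"bit {bit} (not in table)"))
    return names


if __name__ == "__main__":
    # Values copied from the error message above.
    print("status:", decode_aer_uncor(0x00004000))
    print("mask:  ", decode_aer_uncor(0x01310000))
```

With the values from the log, the status decodes to Completion Timeout, and the mask covers Unexpected Completion, Unsupported Request, ACS Violation, and AtomicOp Egress Blocked. That only names the error the root port flagged; it does not by itself say why the completion never arrived.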
Thank you for posting on the Intel® communities.
To continue with your request, could you please provide the following:
- What exactly is this custom hardware card?
- Are you the developer of this card?
- What is the exact model of the CPUs that you tested this card with?
- Is the motherboard used with this card an Intel server board? If yes, please provide the model.
- Are you contacting Intel on behalf of a company? If yes, please provide as many details as possible about the company that you work for.
- Have you tested a similar hardware card from any other third-party brand? If yes, did it work?
Intel Technical Support Technician
Hi Victor, thanks for the reply. I'll add the info you asked for.
What exactly is this custom hardware card?
It's an FPGA readout card called FELIX used to buffer data for data acquisition for the DUNE experiment.
Are you the developer of this card?
No, I'm not the developer. If you'd like more info you can look up ATLAS FELIX. There's a user manual here for example: https://twiki.nevis.columbia.edu/twiki/pub/ATLAS/SliceTestboard/felix-user-manual.pdf
What is the exact model of the CPUs that you tested this card with?
Intel Xeon Gold 6348. I believe it was previously tested successfully with the Intel WolfPass C628 chipset and a Xeon Gold 6242.
Is the motherboard used with this card an Intel server board? If yes, please provide the model.
Dell PowerEdge R750 motherboard.
Are you contacting Intel on behalf of a company? If yes, please provide as many details as possible about the company that you work for.
This isn't on behalf of a company, I'm a researcher for the experiment.
Have you tested a similar hardware card from any other third-party brand? If yes, did it work?
I'm not sure about this.
Thank you for the information. The server board and the processor appear to be compatible, so we cannot yet confirm why your configuration fails only with that specific processor. Please allow us more time for research. We will let you know if we find a workaround for you.
Thank you for your patience.
Your query will be best answered by our Field Programmable Gate Array (FPGA) support team, so we will move this post to that team for further assistance.