I have an application that receives data using RoCEv2. It works fine with either a Broadcom or a Mellanox NIC, but I am having an issue with the E810 NIC. It receives data successfully for a short period of time, but then I get an IBV_WC_WR_FLUSH_ERR when processing the completion queue. The ID of the completed work request points to a valid buffer that had been queued successfully multiple times before (I use a fixed number of buffers, which get re-queued after the work item is completed). I also receive an async event of type IBV_EVENT_QP_FATAL, which points to the only QP that I am using.
Looking through the packet trace, I don't see any issues: there are no NAKs from either side, and the data transfer sizes look correct. Is there a way to get more information about what caused these errors, or are there further troubleshooting steps I can take to determine what specifically the problem is?
I am using:
Firmware v2.00 0x80003e53 1.2751.0
ice driver v1.9.11, irdma v1.9.30
Ubuntu 20.04, kernel 5.15.0-52