RoCEv2 between FPGA and E810

Yorick-REDS
Beginner

Hello,

I'm trying to develop software that receives data through RDMA.
The goal is to have an FPGA-based system that provides data via RDMA WRITE operations and a PC-based system that receives and processes the data.
For that, I connected an FPGA running an already developed and tested bitstream that sends data using the RoCEv2 protocol. On the other side, I installed an Intel E810-CQDA2 in a PC running Ubuntu 24.04 with a downgraded 6.8 kernel, ice 1.16.3 and irdma 1.16.10.
RoCEv2 is enabled with "options irdma roce_ena=1" in "/etc/modprobe.d/irdma.conf".

The driver seems to work as expected.

 

$ ibv_devices 
    device          	   node GUID
    ------          	----------------
    rocep1s0f0      	6efe54fffe403af8
    rocep1s0f1      	6efe54fffe403af9

$ ibv_devinfo -d rocep1s0f1
hca_id:	rocep1s0f1
	transport:			InfiniBand (0)
	fw_ver:				1.54
	node_guid:			6efe:54ff:fe40:3af9
	sys_image_guid:			6efe:54ff:fe40:3af9
	vendor_id:			0x8086
	vendor_part_id:			5522
	hw_ver:				0x2
	phys_port_cnt:			1
		port:	1
			state:			PORT_ACTIVE (4)
			max_mtu:		4096 (5)
			active_mtu:		4096 (5)
			sm_lid:			0
			port_lid:		1
			port_lmc:		0x00
			link_layer:		Ethernet

I can ping the FPGA network interface and I can run rping locally on the IP address assigned to "rocep1s0f1".

However, the C program I wrote to set up the RDMA reception does not seem to work.
I based it heavily on "rc_pingpong", which is available in the "libibverbs" examples.

Here is its log:

Get device list ...
Get device rocep1s0f1 ... found
Allocate context ...
Allocate buffer ...
Allocate Protection Domain ...
Allocate Memory Region ...
Allocate Completion Queue ...
Create Queue Pair ...
Modify Queue Pair to INIT ...
Query port ...
        state ............ ACTIVE
        max_mtu .......... 4096
        active_mtu ....... 4096
        max_msg_sz ....... 2147483647
        bad_pkey_cntr .... 0
        qkey_viol_cntr ... 0
        pkey_tbl_len ..... 1
        lid .............. 1
        lmc .............. 0
        phys_state ....... LinkUp
        link_layer ....... Ethernet
Get GID ...
Extract port details ...
        entity is ... local
        QPN ......... 0x000004
        PSN ......... 0xf8d193
        ADDR ........ 0x648277bc1000
        LID ......... 0x0001
        GID ......... ::ffff:192.168.100.5
Post WR to RQ ...
Modify Queue Pair to RTR ...
        entity is ... remote
        QPN ......... 0x0000b8
        PSN ......... 0x000000
        ADDR ........ 0x00000000
        LID ......... 0xffff
        GID ......... ::ffff:192.168.100.10
Query QP ...
Cannot query QP: No space left on device
Send QP info ...
        Packet QP Info sent
        Packet TX Meta sent
Wait incoming data ...
^C

 

I move the QP to the RTR state with the following code (note that the `fpga` struct contains the constant values of the FPGA side):

 

    struct ibv_qp_attr qp_mod_attr_rtr = {
        .qp_state           = IBV_QPS_RTR,
        .path_mtu           = IBV_MTU_2048,
        .dest_qp_num        = fpga.qpn,
        .rq_psn             = fpga.psn,
        .max_dest_rd_atomic = 1,
        .min_rnr_timer      = 12,
        .ah_attr            = {
            .is_global      = 1,
            .dlid           = fpga.lid,
            .sl             = 0,
            .src_path_bits  = 0,
            .port_num       = ib_port,
            .grh = {
                .hop_limit = 1,
                .dgid = fpga.gid,
                .sgid_index = 1,
            },
        },
    };
    /* ibv_modify_qp() returns 0 on success or an errno value on failure.
     * Report that return value instead of relying on the global errno
     * (needs <stdio.h> and <string.h>). */
    int ret = ibv_modify_qp(ctx->qp, &qp_mod_attr_rtr,
                        IBV_QP_STATE              |
                        IBV_QP_AV                 |
                        IBV_QP_PATH_MTU           |
                        IBV_QP_DEST_QPN           |
                        IBV_QP_RQ_PSN             |
                        IBV_QP_MAX_DEST_RD_ATOMIC |
                        IBV_QP_MIN_RNR_TIMER);
    if (ret) {
        fprintf(stderr, "Failed to modify QP to RTR: %s\n", strerror(ret));
        [...]
    }
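For reference, the GIDs shown in the log (`::ffff:192.168.100.5` and `::ffff:192.168.100.10`) are IPv4-mapped IPv6 addresses, which is how RoCEv2 over IPv4 addresses are normally expressed. Below is a minimal sketch of building such a 16-byte GID from a dotted-quad string. The helper name `ipv4_to_roce_gid` is hypothetical, and whether the FPGA side expects exactly this GID layout is an assumption.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill a 16-byte GID with the IPv4-mapped IPv6 form
 * ::ffff:a.b.c.d. Returns 0 on success, -1 if ipv4 is not a valid
 * dotted-quad string. */
static int ipv4_to_roce_gid(const char *ipv4, uint8_t gid[16])
{
    struct in_addr addr;

    assert(ipv4 != NULL && gid != NULL);
    if (inet_pton(AF_INET, ipv4, &addr) != 1)
        return -1;

    memset(gid, 0, 16);               /* bytes 0..9 stay zero */
    gid[10] = 0xff;                   /* bytes 10..11 are the ffff marker */
    gid[11] = 0xff;
    memcpy(&gid[12], &addr.s_addr, 4); /* s_addr is already big-endian */
    return 0;
}
```

The result could then be copied into the `raw` member of a `union ibv_gid` (for example the `fpga.gid` used above, assuming it has that type).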

 

This configuration is accepted (no error is raised), but when I then query the QP with `ibv_query_qp` I get the error "No space left on device" (which is strange for a read-only operation), and I receive no receive completion when polling the completion queue.

There is obviously an error somewhere, but I cannot find it.

Any ideas?
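One thing that may be worth ruling out: according to the libibverbs man pages, `ibv_query_qp` and `ibv_modify_qp` return 0 on success or an errno value on failure. If a message such as "No space left on device" was printed from the global `errno` (for example with `perror`), it may not reflect the real cause. The following is a small sketch of a formatter that uses the return value instead. The name `verbs_strerror` is hypothetical, and the handling of negative and `-1` return values is a defensive assumption, not documented provider behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: describe a failed verbs call from its return value.
 * ret > 0  : treated as an errno value (the documented case for
 *            ibv_modify_qp / ibv_query_qp).
 * ret == -1: assumed to mean "see errno" (some verbs calls use this style).
 * ret < -1 : assumed to be a negated errno value. */
static const char *verbs_strerror(int ret, char *buf, size_t len)
{
    int code;

    assert(buf != NULL && len > 0);
    if (ret == -1)
        code = errno;
    else
        code = (ret < 0) ? -ret : ret;

    snprintf(buf, len, "%s (ret=%d)", strerror(code), ret);
    return buf;
}
```

It would be used as `int ret = ibv_query_qp(...); if (ret) fprintf(stderr, "Cannot query QP: %s\n", verbs_strerror(ret, buf, sizeof buf));`.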

Thanks.
Yorick

4 Replies
Sazirah
Employee

Hi Yorick-REDS,


Thank you for posting in Intel Communities Forum.


Regarding the issue reported above, we would like to obtain some details from you:

1) The system that you are currently using with the Ethernet Adapter E810-CQDA2

2) Have you purchased this adapter together with the system or separately?


Additionally, you may want to refer to the datasheet of the product and also the Adapter User Guide. Kindly refer below:


Product datasheet:

https://www.intel.com/content/www/us/en/products/sku/192558/intel-ethernet-network-adapter-e810cqda2/specifications.html


Supplemental Information > Datasheet > View now


or,


https://www.intel.com/content/www/us/en/content-details/613875/intel-ethernet-controller-e810-datasheet.html?wapkw=e810%20datasheet&DocID=613875


Adapter User Guide for Intel Ethernet Adapters:

https://www.intel.com/content/www/us/en/download/19373/adapter-user-guide-for-intel-ethernet-adapters.html


If you have any other concerns, kindly let us know.


Regards,

Sazzy_Intel

Intel Customer Support Technician


Yorick-REDS
Beginner

Hi Sazirah,

 

Thank you for your answer!

 

1) The system that you are currently using with the Ethernet Adapter E810-CQDA2

 

The system is composed of:

 

  • Hardware
    • Motherboard AsRock Z790 Taichi (The Ethernet Adapter E810-CQDA2 is installed on PCIE1)
    • Kingston FURY Beast 2 x 32GB, 5600 MHz, DDR5 RAM
    • Intel i9 13900K
  • Software
    • Ubuntu 24.04 with kernel 6.8 (as specified in the irdma README); note that the system previously ran Ubuntu 22.04 and showed exactly the same problems as reported above

2) Have you purchased this adapter together with the system or separately?

 

The adapter was purchased separately.

However, the adapter has been used successfully in other projects that rely on its 100 Gbit/s capability (without RDMA). Thus we know that the adapter works on our system, at least for plain 100 Gbit/s traffic.

 

Additional information

 

What I'm really missing is a way to debug RDMA and understand the issue. Is it coming from the hardware? From the driver's configuration? Or from my piece of software that interacts with `libibverbs`?

I also know that the FPGA is working because it has been tested by a partner. However, I don't have access to the receiver that the partner used, so I'm now trying to build it.

How can I debug RoCEv2 with the `E810-CQDA2` and `irdma`?
I have tried `modprobe irdma roce_ena=1 dyndbg='+p'`, but this does not give me much detail. I'd like to have a log saying where the incoming packet is discarded and why.
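One possible starting point, as a sketch only: the kernel RDMA subsystem can expose per-port hardware counters as small text files under `/sys/class/infiniband/<device>/ports/<port>/hw_counters/`. Sampling them before and after the FPGA sends may show whether packets reach the RDMA engine at all. That irdma populates this directory on this setup, and which counter names it provides, are assumptions; the helper names `read_counter` and `dump_counters` are hypothetical.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>

/* Read one numeric counter file "<dir>/<name>".
 * Returns 0 on success, -1 if the file is missing or not a number. */
static int read_counter(const char *dir, const char *name,
                        unsigned long long *value)
{
    char path[512];
    FILE *f;
    int n, ok;

    assert(dir != NULL && name != NULL && value != NULL);
    n = snprintf(path, sizeof path, "%s/%s", dir, name);
    if (n < 0 || n >= (int)sizeof path)
        return -1;                      /* path would be truncated */

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%llu", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Print every readable counter in dir as "name value".
 * Returns how many were printed, or -1 if dir cannot be opened. */
static int dump_counters(const char *dir, FILE *out)
{
    DIR *d;
    struct dirent *e;
    int count = 0;

    assert(dir != NULL && out != NULL);
    d = opendir(dir);
    if (d == NULL)
        return -1;

    while ((e = readdir(d)) != NULL) {
        unsigned long long v;

        if (e->d_name[0] == '.')        /* skip ".", ".." and hidden files */
            continue;
        if (read_counter(dir, e->d_name, &v) == 0) {
            fprintf(out, "%-28s %llu\n", e->d_name, v);
            count++;
        }
    }
    closedir(d);
    return count;
}
```

For the device in this thread the call would be something like `dump_counters("/sys/class/infiniband/rocep1s0f1/ports/1/hw_counters", stdout)`, run once before and once after the FPGA transmits, then comparing the two outputs.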

 

Thanks for your help.

 

Best regards,

Yorick

Simon-Intel
Employee

Hi Yorick-REDS,

 

Thank you for your response.

 

I would like to let you know that we have a dedicated team that can assist you regarding your query.

 

Here's the link to connect with them: FPGA Team Community

 

Best regards,

Simon

Intel Customer Support Technician


Sazirah
Employee

Hi Yorick-REDS,


Greetings.


As advised by my colleague, you may contact the FPGA team by posting in the FPGA community forum. Since you will be contacting them for assistance, we will proceed with closing this case on our end. If you need any additional information, please submit a new question, as this thread will no longer be monitored.


Regards,

Sazzy_Intel

Intel Customer Support Technician

