Ethernet Products

X710 does not seem to respond while the i40e driver waits for a global reset during probing

LeiLi
Employee

We are from the virtualization team in Beijing, part of IAGS. We are working on part of seamless update, which aims to upgrade the Linux kernel on the host via a kexec reboot while the guest is suspended, and to resume the guest immediately after the kexec reboot. We call this full VMM fast restart.

The PoC now works. A device that is passed through to the guest is kept alive across the reboot: the passthrough device and its associated bridges skip shutdown before the reboot, and the new kernel does not touch their hardware registers while it boots (for example, during PCI enumeration).

We tested the PoC with an X540 NIC and an X710 NIC. The X540 works well. The X710 completes one round of VMM fast restart, but when I attempt a second consecutive VMM fast restart after the first round finishes, the i40e driver hangs waiting for a global reset while it probes the other physical ports, which are not passed through to the guest. After those attempts the guest is able to resume. However, neither the passthrough physical port nor the remaining native ports work at all.

The first VMM fast restart completes without any problem, and the X540 does not show this issue at all.

Specifically, it hangs in the function i40e_pf_reset. The code snippet is below. Judging from the code comments, it is waiting for a global reset to complete.

    if (reg & I40E_GLGEN_RSTAT_DEVSTATE_MASK) {
        hw_dbg(hw, "Global reset polling failed to complete.\n");
        return I40E_ERR_RESET_FAILED;
    }
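To make the failing check easier to reason about, here is a minimal userspace sketch of the polling pattern it belongs to: read a status register repeatedly and give up if the device-state bits never clear. It is not the driver's code. The mask value (the two low bits), the callback type, and the names wait_for_global_reset, fake_rstat and stuck_rstat are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the DEVSTATE field occupies the two low bits of the register. */
#define DEVSTATE_MASK 0x3u

/* Hypothetical register accessor; a real driver would do an MMIO read. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll until the DEVSTATE bits clear or max_polls reads have been made.
 * Returns 0 when the device reports it is out of reset, -1 on timeout.
 */
static int wait_for_global_reset(read_reg_fn read_reg, void *ctx,
                                 unsigned int max_polls)
{
    assert(read_reg != NULL);

    for (unsigned int i = 0; i < max_polls; i++) {
        uint32_t reg = read_reg(ctx);

        if (!(reg & DEVSTATE_MASK))
            return 0;   /* reset finished */
        /* A driver would sleep between reads here. */
    }
    return -1;          /* the "polling failed to complete" case */
}

/* Simulated device: DEVSTATE stays set for *ctx more reads, then clears. */
static uint32_t fake_rstat(void *ctx)
{
    unsigned int *reads_left = ctx;

    if (*reads_left == 0)
        return 0;
    (*reads_left)--;
    return 0x1u;
}

/* Simulated device that never leaves reset, as in the symptom described. */
static uint32_t stuck_rstat(void *ctx)
{
    (void)ctx;
    return 0x1u;
}
```

With fake_rstat the function returns 0 once the simulated bits clear. With stuck_rstat it always returns -1, which corresponds to the error path quoted above.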

I tried requesting a global reset in the code after the PF reset fails, but it does not seem to take effect. ☹

When the device is working normally and I send the global reset command, the log shows that the global reset works.

The command to send global reset is below:

$ echo "globr" > /sys/kernel/debug/i40e/0000:81:00.0/command
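For anyone who wants to issue the same command from a test program instead of the shell, the following is a small sketch. It assumes only that the command file accepts a plain ASCII string followed by a newline, which is what echo sends. The helper name send_debugfs_command is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Write a command string plus a trailing newline to a debugfs-style
 * command file. Returns 0 on success or a negative errno value.
 */
static int send_debugfs_command(const char *path, const char *cmd)
{
    assert(path != NULL && cmd != NULL);

    FILE *f = fopen(path, "w");
    if (!f)
        return -errno;

    int rc = 0;
    if (fprintf(f, "%s\n", cmd) < 0)
        rc = -EIO;
    /* fclose() flushes the buffer, so a write error can surface here. */
    if (fclose(f) != 0 && rc == 0)
        rc = -errno;
    return rc;
}
```

A call such as send_debugfs_command("/sys/kernel/debug/i40e/0000:81:00.0/command", "globr") would mirror the shell line above. It needs root and a mounted debugfs.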

The code change that requests the global reset is below (added lines are marked with +):

static i40e_status i40e_pf_loop_reset(struct i40e_pf *pf)
{
        const unsigned short MAX_CNT = 1000;
        const unsigned short MSECS = 10;
        struct i40e_hw *hw = &pf->hw;
        i40e_status ret;
        int cnt;

        for (cnt = 0; cnt < MAX_CNT; ++cnt) {
                ret = i40e_pf_reset(hw);
                if (!ret)
                        break;
+               i40e_do_reset_safe(pf, BIT(__I40E_GLOBAL_RESET_REQUESTED));
+               pci_info(pf->pdev, "global reset is requested.\n");
                msleep(MSECS);
        }
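Note that the patched loop requests a global reset on every failed iteration, so up to 1000 requests can be issued 10 ms apart. One variation, sketched below in userspace, requests the escalation once and then keeps retrying the PF reset. Whether this changes the behaviour on the X710 is not established in this thread. All names here (reset_ops, reset_stats, pf_loop_reset_sketch, the fake_* helpers) are hypothetical stand-ins, not driver APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hooks standing in for the PF reset and the global reset. */
struct reset_ops {
    int  (*pf_reset)(void *ctx);               /* returns 0 on success */
    void (*request_global_reset)(void *ctx);
    void (*sleep_ms)(void *ctx, unsigned int ms);
};

struct reset_stats {
    unsigned int pf_attempts;
    unsigned int global_requests;
};

/* Retry the PF reset up to max_cnt times, escalating only once. */
static int pf_loop_reset_sketch(const struct reset_ops *ops, void *ctx,
                                unsigned int max_cnt, unsigned int msecs,
                                struct reset_stats *stats)
{
    bool escalated = false;
    int ret = -1;

    assert(ops != NULL && stats != NULL);

    for (unsigned int cnt = 0; cnt < max_cnt; cnt++) {
        stats->pf_attempts++;
        ret = ops->pf_reset(ctx);
        if (ret == 0)
            break;
        if (!escalated) {
            /* Request the bigger reset a single time. */
            ops->request_global_reset(ctx);
            stats->global_requests++;
            escalated = true;
        }
        ops->sleep_ms(ctx, msecs);
    }
    return ret;
}

/* Simulated device: PF reset fails until a global reset was requested,
 * and then for a few more attempts while the device recovers. */
struct fake_dev {
    bool global_requested;
    unsigned int failures_after_global;
};

static int fake_pf_reset(void *ctx)
{
    struct fake_dev *d = ctx;

    if (!d->global_requested)
        return -1;
    if (d->failures_after_global > 0) {
        d->failures_after_global--;
        return -1;
    }
    return 0;
}

static void fake_global_reset(void *ctx)
{
    ((struct fake_dev *)ctx)->global_requested = true;
}

static void fake_sleep(void *ctx, unsigned int ms)
{
    (void)ctx;
    (void)ms;   /* no real delay in the simulation */
}
```

In the simulation the loop succeeds after a single escalation once the fake device recovers, and it still returns an error if the retry budget runs out first.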

 

It seems to me that the hardware stops working entirely after two VMM fast restarts.

The basic card information is below:

driver: i40e
version: 2.8.20-k
firmware-version: 5.05 0x800029eb 1.1313.0
bus-info: 0000:81:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
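When comparing driver and firmware versions across several machines, it can help to pull single fields out of this kind of "key: value" listing. The sketch below does that for text in the layout shown above. The function name ethtool_info_field is hypothetical, and the code assumes one field per line with a colon directly after the key.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of "key" from a block of "key: value" lines into out.
 * Returns 0 on success, -1 if the key is missing or out is too small.
 */
static int ethtool_info_field(const char *text, const char *key,
                              char *out, size_t outlen)
{
    assert(text != NULL && key != NULL && out != NULL && outlen > 0);

    size_t klen = strlen(key);
    const char *line = text;

    while (*line) {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);

        /* The key must start the line and be followed by ':'. */
        if (len > klen && strncmp(line, key, klen) == 0 &&
            line[klen] == ':') {
            const char *val = line + klen + 1;
            size_t vlen = len - klen - 1;

            while (vlen > 0 && *val == ' ') {   /* skip padding */
                val++;
                vlen--;
            }
            if (vlen >= outlen)
                return -1;
            memcpy(out, val, vlen);
            out[vlen] = '\0';
            return 0;
        }
        if (!eol)
            break;
        line = eol + 1;
    }
    return -1;
}
```

Because the key is matched only at the start of a line, asking for "version" does not pick up the "firmware-version" line.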

Do you have any insights or suggestions about this issue? Thanks so much!

AlfredoS_Intel
Moderator

Hi Leili,

Thank you for posting in our Intel® Ethernet Communities Page.

We would like to get some information to know the configuration of your system.

Please download and run our Intel® System Support Utility from this page: https://downloadcenter.intel.com/download/26735/Intel-System-Support-Utility-for-the-Linux-Operating-System. After running it, you will be given an option to save the logs to a text file. Please do so and attach the file to your reply.

Please also provide the output of the command ethtool -i ethX, where ethX is the Ethernet port.

We look forward to hearing from you. If we do not get your reply, we will follow up after 3 business days.



Best Regards,

Alfred S

Intel® Customer Support


LeiLi
Employee

Hi Alfred, 

      It seems that we resolved this issue yesterday. We will keep an eye on whether it appears again in the future. Thanks for helping to close it.

AlfredoS_Intel
Moderator

Hi Leili,

Thank you for the update.

It is a great joy to know that you have already resolved the issue.

If you would like to give us an update or if you have further questions, please submit a new question, as this thread will no longer be monitored.

Thank you for contacting Intel® and have a great week!


Best Regards,

Alfred S

Intel® Customer Support

