I/O request timeouts on Linux with Intel P3520/P4600 NVMe PCIe SSDs

SGoel5
New Contributor

Hi,

We are experiencing persistent I/O request timeouts on Linux with P3520/P4600 SSDs. We have tried multiple kernels (3.10, 4.4, 4.9) and see the timeouts on all of them. The P4600 seems to be more prone to them than the P3520, though we see them on the latter as well. We have the latest firmware installed on both drives, which are housed in the same machine (Supermicro 5018R-WR with X10SRW-F motherboard and E5-1650 V4 CPU). We can reproduce the timeouts by simply running mkfs -t xfs on the drive.

Here is the output from isdct (version isdct-3.0.9.400-17.x86_64):

- Intel SSD DC P3520 Series CVPF717100L01P2JGN -

Bootloader : MB1B0105

DevicePath : /dev/nvme0n1

DeviceStatus : Healthy

Firmware : MDV10271

FirmwareUpdateAvailable : The selected Intel SSD contains current firmware as of this tool release.

Index : 0

ModelNumber : INTEL SSDPEDMX012T7

ProductFamily : Intel SSD DC P3520 Series

SerialNumber : CVPF717100L01P2JGN

- Intel SSD DC P4600 Series BTLE736007F54P0KGN -

Bootloader : 0110

DevicePath : /dev/nvme1n1

DeviceStatus : Healthy

Firmware : QDV10150

FirmwareUpdateAvailable : The selected Intel SSD contains current firmware as of this tool release.

Index : 1

ModelNumber : INTEL SSDPEDKE040T7

ProductFamily : Intel SSD DC P4600 Series

SerialNumber : BTLE736007F54P0KGN

Here are the messages the 4.9 kernel prints when using the P4600:

[ 151.297903] nvme nvme1: I/O 568 QID 1 timeout, aborting

[ 151.303130] nvme nvme1: I/O 569 QID 1 timeout, aborting

[ 151.308347] nvme nvme1: I/O 570 QID 1 timeout, aborting

[ 151.313562] nvme nvme1: I/O 571 QID 1 timeout, aborting

[ 151.355465] nvme nvme1: completing aborted command with status: 0000

[ 151.411273] nvme nvme1: completing aborted command with status: 0000

[ 151.466903] nvme nvme1: completing aborted command with status: 0000

[ 151.522609] nvme nvme1: completing aborted command with status: 0000

[ 151.578226] nvme nvme1: completing aborted command with status: 0000

...

[ 165.395295] nvme nvme1: Abort status: 0x0

[ 165.399296] nvme nvme1: Abort status: 0x0

[ 165.403299] nvme nvme1: Abort status: 0x0

[ 165.407304] nvme nvme1: Abort status: 0x0
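For anyone comparing kernels or drives, here is a minimal sketch for tallying the timeout lines per controller and queue. The function name `count_timeouts` and the `SAMPLE` text are illustrative assumptions, and the regex only matches the "timeout, aborting" message format shown above, so other kernel versions may need a different pattern.

```python
import re
from collections import Counter

# Matches lines such as:
#   [ 151.297903] nvme nvme1: I/O 568 QID 1 timeout, aborting
# The pattern is an assumption based on the 4.9 kernel messages quoted above.
TIMEOUT_RE = re.compile(r"nvme (nvme\d+): I/O (\d+) QID (\d+) timeout, aborting")


def count_timeouts(log_text):
    """Return a Counter keyed by (controller, queue id) for timeout lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            controller, _tag, qid = match.groups()
            counts[(controller, int(qid))] += 1
    return counts


# Sample input copied from the log excerpt above.
SAMPLE = """\
[ 151.297903] nvme nvme1: I/O 568 QID 1 timeout, aborting
[ 151.303130] nvme nvme1: I/O 569 QID 1 timeout, aborting
[ 151.308347] nvme nvme1: I/O 570 QID 1 timeout, aborting
[ 151.313562] nvme nvme1: I/O 571 QID 1 timeout, aborting
[ 151.355465] nvme nvme1: completing aborted command with status: 0000
"""

if __name__ == "__main__":
    for (controller, qid), n in sorted(count_timeouts(SAMPLE).items()):
        print(f"{controller} QID {qid}: {n} timeouts")
```

In practice the text passed to `count_timeouts` would come from saved kernel log output rather than the embedded sample.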

We would appreciate your help in resolving this issue.

Regards,

Shantanu Goel

25 REPLIES

idata
Esteemed Contributor III

Hello Shantanu,

Thank you for your feedback. I'll inform the relevant team of the current state of the issue you are experiencing and ask whether there is an explanation for this discrepancy in the firmware version. I'll contact you as soon as I have more information. Thank you for your patience.

Regards,

Andres V.

idata
Esteemed Contributor III

Hello Shantanu,

I would like to inform you that the discrepancy was due to a documentation error that has already been fixed. As you can see at https://downloadcenter.intel.com/download/27497?v=t, under the "What's new?" section: "Intel® SSD DC P4500/P4600 Series products; latest firmware revision QDV10170". This means that you have the correct firmware version installed. Regarding the timeout issue, we are still investigating, and I will get in touch with you when we find something relevant.

Regards,

Andres V.

SGoel5
New Contributor

Hi Andres,

Thank you for the information and please keep us posted.

Regards,

Shantanu

idata
Esteemed Contributor III

Hello Shantanu,

I just wanted to let you know that all the information associated with the issue you are experiencing has been escalated to our engineering team. As soon as I receive any update from them, I'll contact you.

Regards,

Andres V.

SGoel5
New Contributor

Hi Andres,

Thanks for escalating this issue and please keep us posted.

Regards,

Shantanu