
my Intel SSD 750 (400GB) suddenly only does 2MB/s writes, and kernel reports aborted commands etc.

TVond
New Contributor

About 6 months ago I bought the Intel SSD 750 400GB, and have been using it for various database-related benchmarking tasks and such. It was working fine until this week, when the kernel suddenly started reporting strange issues about aborted commands:

Jan 12 13:10:27 bench2 kernel: nvme nvme0: I/O 0 QID 12 timeout, aborting

Jan 12 13:10:27 bench2 kernel: nvme nvme0: I/O 1 QID 12 timeout, aborting

Jan 12 13:10:27 bench2 kernel: nvme nvme0: I/O 2 QID 12 timeout, aborting

Jan 12 13:10:27 bench2 kernel: nvme nvme0: I/O 3 QID 12 timeout, aborting

Jan 12 13:10:27 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

Jan 12 13:10:27 bench2 kernel: nvme nvme0: Abort status: 0x0

Jan 12 13:10:27 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

Jan 12 13:10:27 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

Jan 12 13:10:27 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

Jan 12 13:10:27 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

Jan 12 13:10:27 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

...

Jan 12 13:10:33 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

Jan 12 13:10:33 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

Jan 12 13:10:33 bench2 kernel: nvme nvme0: I/O 196 QID 12 timeout, aborting

Jan 12 13:10:33 bench2 kernel: nvme nvme0: I/O 212 QID 12 timeout, aborting

Jan 12 13:10:33 bench2 kernel: nvme nvme0: I/O 273 QID 12 timeout, aborting

Jan 12 13:10:33 bench2 kernel: nvme nvme0: I/O 275 QID 12 timeout, aborting

Jan 12 13:10:33 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

Jan 12 13:10:33 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

Jan 12 13:10:33 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

...

Jan 12 13:16:59 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

Jan 12 13:16:59 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

Jan 12 13:16:59 bench2 kernel: nvme nvme0: completing aborted command with status: 0000

Jan 12 13:17:00 bench2 kernel: nvme nvme0: completing aborted command with status: fffffffc

Jan 12 13:17:00 bench2 kernel: blk_update_request: I/O error, dev nvme0n1, sector 422162944

Jan 12 13:17:00 bench2 kernel: Buffer I/O error on dev nvme0n1p1, logical block 52770079, lost async page write

Jan 12 13:17:00 bench2 kernel: Buffer I/O error on dev nvme0n1p1, logical block 52770080, lost async page write

Jan 12 13:17:00 bench2 kernel: Buffer I/O error on dev nvme0n1p1, logical block 52770081, lost async page write

Jan 12 13:17:00 bench2 kernel: Buffer I/O error on dev nvme0n1p1, logical block 52770082, lost async page write

Jan 12 13:17:00 bench2 kernel: Buffer I/O error on dev nvme0n1p1, logical block 52770083, lost async page write
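For anyone triaging a similar log, a small script can show whether the timeouts cluster on one queue and which sectors ended in hard I/O errors. This is only a sketch: the regular expressions are written against the message formats quoted above and may not match other kernel versions, and `summarize` and `SAMPLE` are illustrative names, not part of any tool.

```python
import re
from collections import Counter

# Patterns modeled on the kernel messages quoted in this post (assumed format).
TIMEOUT_RE = re.compile(r"nvme (\w+): I/O (\d+) QID (\d+) timeout, aborting")
ERROR_RE = re.compile(r"blk_update_request: I/O error, dev (\w+), sector (\d+)")


def summarize(lines):
    """Count timeouts per (controller, queue id) and collect failed sectors."""
    timeouts = Counter()
    error_sectors = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            timeouts[(m.group(1), int(m.group(3)))] += 1
            continue
        m = ERROR_RE.search(line)
        if m:
            error_sectors.append((m.group(1), int(m.group(2))))
    return timeouts, error_sectors


# A few lines copied from the log above, used as sample input.
SAMPLE = [
    "Jan 12 13:10:27 bench2 kernel: nvme nvme0: I/O 0 QID 12 timeout, aborting",
    "Jan 12 13:10:27 bench2 kernel: nvme nvme0: I/O 1 QID 12 timeout, aborting",
    "Jan 12 13:10:27 bench2 kernel: nvme nvme0: completing aborted command with status: 0000",
    "Jan 12 13:17:00 bench2 kernel: blk_update_request: I/O error, dev nvme0n1, sector 422162944",
]

if __name__ == "__main__":
    timeouts, error_sectors = summarize(SAMPLE)
    for (ctrl, qid), count in sorted(timeouts.items()):
        print(f"{ctrl} QID {qid}: {count} timeouts")
    for dev, sector in error_sectors:
        print(f"{dev}: I/O error at sector {sector}")
```

To run it against a real log, pass the lines of a saved log file to `summarize` instead of `SAMPLE`.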

I'm regularly testing new kernels / distributions, so at first I thought it was a bug in one of these, but after a lot of experiments I doubt that - I can reproduce the same issue even with older kernels that I previously used without any issue.

Interestingly enough, this only affects writes - the reads seem to be working just fine (easily >2GB/s in sequential workload), but only 2MB/s in writes. Not a filesystem issue either - this happens even with simple dd writing /dev/nvme0n1 directly.
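A rough way to reproduce the write measurement without dd is a sequential-write timing loop like the one below. It is a sketch under assumptions: `measure_write_mbps` is a hypothetical helper, it writes zero-filled blocks to a scratch file rather than to the raw device (pointing it at /dev/nvme0n1 would destroy the data there), and the final fsync is included in the timing so the page cache does not inflate the result.

```python
import os
import tempfile
import time


def measure_write_mbps(path, total_bytes=64 * 1024 * 1024, block_size=1024 * 1024):
    """Write total_bytes sequentially to path and return throughput in MB/s."""
    block = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        written = 0
        while written < total_bytes:
            # Trim the last block so exactly total_bytes are written.
            chunk = block[: min(block_size, total_bytes - written)]
            written += os.write(fd, chunk)
        os.fsync(fd)  # force the data to the device before stopping the clock
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return written / elapsed / 1e6


if __name__ == "__main__":
    # Scratch file in a temporary directory; put the directory on the
    # filesystem that lives on the SSD under test to measure that drive.
    with tempfile.TemporaryDirectory() as scratch_dir:
        rate = measure_write_mbps(os.path.join(scratch_dir, "write_test.bin"))
        print(f"sequential write: {rate:.1f} MB/s")
```

A healthy drive of this class should report far more than a few MB/s here, so a result near 2 MB/s would match the symptom described above.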

I've tried to install the newest firmware using the isdct tool (v 3.0.0), and `isdct show` now reports this:

[root@bench2 ~]# isdct show -a -intelssd 0

- Intel SSD 750 Series CVCQ55020067400AGN -

AggregationThreshold : 0

AggregationTime : 0

ArbitrationBurst : 0

Bootloader : 8B1B0131

CoalescingDisable : 1

DevicePath : /dev/nvme0n1

Device...

1 ACCEPTED SOLUTION

idata
Esteemed Contributor III

Hello Tomas_V,

Thanks for posting in our forum. We would like to review the information you've sent us and try to replicate the situation.

In the meantime, we recommend installing the latest firmware update, which you can find here: https://downloadcenter.intel.com/download/26491/Intel-SSD-Firmware-Update-Tool

The other program says you have the latest version, but that is because it does not include the latest firmware: FW 8EV101F0 with Bootloader 8B1B0133.

Let us know whether the firmware update fixes the problem; if not, we will check the information you provided.

Regards,
NC


8 REPLIES

TVond
New Contributor

Hi, thanks. I've been running some tests on the SSD the whole week, and it seems to be working fine. So the firmware upgrade likely resolved the issues.

Thanks!

idata
Esteemed Contributor III

Hi Tomas_V,

That is great news; we are glad to hear it is running fine now.

Regards,
NC

IIanM
New Contributor II

I noticed that AvgNandEraseCycles has reached 3148 and Percentage Used is 105%, which suggests the NAND flash is significantly worn.
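As a back-of-the-envelope check, those two numbers can be combined to estimate what erase-cycle budget the drive's wear indicator appears to be scaled against. This assumes Percentage Used grows linearly with the average erase count, which is not confirmed anywhere in this thread; the NVMe health log defines the field only as a vendor-specific estimate that is allowed to exceed 100%.

```python
# Values reported in this thread.
avg_nand_erase_cycles = 3148
percentage_used = 105  # percent

# Assumption: percentage_used = 100 * avg_cycles / rated_cycles (linear scaling).
implied_rated_cycles = avg_nand_erase_cycles / (percentage_used / 100)

print(f"implied rated erase cycles: {implied_rated_cycles:.0f}")
```

Under that assumption the indicator would correspond to a budget of roughly 3,000 erase cycles per block, so 3148 would sit slightly past it. That is an inference from two reported values, not a published specification for the 750.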

TVond
New Contributor

While I understand the basic theory behind erase cycles, I don't see any details in the 750 specs about what the limits are, so I can't judge whether 3148 is high or not 😞

That being said, it'd be surprising (and sad) if the 750 SSD wore out so quickly - I've only had it for ~6 months, and while I occasionally do write-heavy benchmarking (I'm a database engineer), I do have a bunch of S3700 drives that I've used for the same thing, and those seem to be perfectly fine after a few years.