I am using an Intel DC P3700 400GB SSD. Following the Intel data sheet, I tested the SSD on Windows using Iometer with a 64K request block size and an I/O queue depth of 128, and got close to 2800 MB/s sequential read performance.
When I put the same SSD in a Linux system and benchmark it with fio, I get a sequential read performance of only 812 MB/s. The parameters used for the fio job were direct=1, invalidate=1, ioengine=libaio, iodepth=128, rw=read, blocksize=64K.
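For reference, a fio job with those parameters would look like the following (the device path is a placeholder; point --filename at your SSD's block device):

```shell
# Sequential-read job matching the parameters above.
# /dev/nvme0n1 is a placeholder for the SSD's block device.
fio --name=seqread --filename=/dev/nvme0n1 \
    --ioengine=libaio --direct=1 --invalidate=1 \
    --rw=read --blocksize=64k --iodepth=128
```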
The Linux drivers are the mainline NVMe drivers (drivers/block/nvme-core.c and drivers/block/nvme-scsi.c, version 0.8). The I/O queue depth is fixed at 128, and I am not able to change it via the nr_requests sysfs file for the SSD's block device. The driver decides q_depth in nvme-core.c, in the function nvme_dev_map, at the following line of code:
dev->q_depth = min_t(int, NVME_CAP_MQES(cap) + 1, NVME_Q_DEPTH);
Increasing NVME_Q_DEPTH does not help in this case; the queue depth remains at 128.
How can I increase the NVMe I/O queue depth for this driver?
Is it feasible to change the device's queue depth capability by modifying the Intel SSD's properties with the nvme user utilities?
Or do I need to correct the benchmarking procedure I used to test this SSD?
Thank you for sharing all this information.
I am going to research this based on all the information you provided, and I will be back with updates as soon as possible.
Can you try a block size of 128K instead of 64K?
That is the only difference we see between your command line and our internal command line.
The full command line we use internally is:
fio --output=RD_128K_Seq_Read.txt --ioengine=libaio --direct=1 --norandommap --randrepeat=0 --blocksize=128K --rw=read --iodepth=128
Let me know.
Thanks for your reply. The issue has now been root-caused to the PCIe slot running at Gen 2.0 (sorry about that!). We reconfigured the slot to Gen 3.0, and with the same command line I used before I am now getting ~2100 MB/s for 64K sequential reads and ~2460 MB/s for 128K sequential reads.
But then, PCIe Gen 2.0 x4 should support up to 2000 MB/s. Having the slot downgraded to Gen 2.0 dropped the throughput from ~2100 MB/s to ~700 MB/s in my earlier results. What would be the reason for this? Do you have results from your tests on a PCIe Gen 2.0 slot?
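For reference, the Gen 2.0 x4 ceiling cited above comes from straightforward arithmetic: 5 GT/s per lane with 8b/10b encoding gives 500 MB/s per lane, so 2000 MB/s across four lanes, before any TLP/protocol overhead. A quick sketch:

```python
def pcie_link_mbps(gt_per_s: float, lanes: int, encoding_efficiency: float) -> float:
    """Theoretical one-direction payload bandwidth in MB/s, ignoring
    TLP/protocol overhead (which shaves off a few more percent)."""
    return gt_per_s * encoding_efficiency * 1000 / 8 * lanes  # Gb/s -> MB/s

gen2_x4 = pcie_link_mbps(5.0, 4, 8 / 10)      # Gen 2.0: 8b/10b encoding
gen3_x4 = pcie_link_mbps(8.0, 4, 128 / 130)   # Gen 3.0: 128b/130b encoding
print(f"Gen2 x4: {gen2_x4:.0f} MB/s, Gen3 x4: {gen3_x4:.0f} MB/s")
# Gen2 x4: 2000 MB/s, Gen3 x4: 3938 MB/s
```

The ~2460 MB/s measured at Gen 3.0 sits comfortably under the ~3938 MB/s link ceiling, while the ~700 MB/s result at Gen 2.0 is well below even the 2000 MB/s theoretical limit, which is why it looks surprising.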
- Murali Mohan
I am afraid we have not done any performance testing on PCIe 2.0. We did a little compatibility testing, but since the board is spec'd for 3.0 we did not see any value in performance-testing 2.0. I will pass this information along, though.