I just got an NVMe PCIe SSD 750 (400GB) and am using fio to test its performance under Linux.
Surprisingly, I see random read performance of ~640K IOPS (during the test it varied between 450K and 800K; 640K is the overall result).
However, according to the spec, random read performance should be about 430K IOPS.
My system's info:
CPU: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz (6 physical cores, 12 with HT)
OS: Linux 3.13.0-43-generic # 72-Ubuntu SMP
The following is the command I used for the test (fio-2.1.3).
Basically, I do random reads on the raw disk, bypassing the page cache (direct I/O), with AIO, 4K blocks, an I/O depth of 32, and 4 concurrent jobs. The test ran for more than 25 minutes.
sudo fio --filename=/dev/nvme0n1p1 --direct=1 --rw=randread --ioengine=libaio --bs=4k --iodepth=32 --numjobs=4 --size=1024G --runtime=2400 --group_reporting --name test
The result is as follows:
test: (groupid=0, jobs=4): err= 0: pid=4730: Tue Jun 23 00:29:45 2015
read : io=4471.4GB, bw=2508.8MB/s, iops=642235, runt=1825081msec
slat (usec): min=1, max=4053, avg= 2.93, stdev= 1.60
clat (usec): min=0, max=7372, avg=194.05, stdev=128.28
lat (usec): min=6, max=7377, avg=197.12, stdev=128.34
clat percentiles (usec):
| 1.00th=[ 10], 5.00th=[ 17], 10.00th=[ 24], 20.00th=[ 45],
| 30.00th=[ 131], 40.00th=[ 207], 50.00th=[ 241], 60.00th=[ 253],
| 70.00th=[ 262], 80.00th=[ 274], 90.00th=[ 290], 95.00th=[ 302],
| 99.00th=[ 510], 99.50th=[ 644], 99.90th=[ 1288], 99.95th=[ 1320],
| 99.99th=[ 2128]
bw (KB /s): min=341352, max=1369744, per=25.10%, avg=644916.05, stdev=95629.97
lat (usec) : 2=0.01%, 4=0.01%, 10=0.71%, 20=6.03%, 50=14.66%
lat (usec) : 100=6.69%, 250=29.55%, 500=41.31%, 750=0.57%, 1000=0.08%
lat (msec) : 2=0.38%, 4=0.01%, 10=0.01%
cpu : usr=34.05%, sys=50.87%, ctx=85646027, majf=0, minf=248
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued : total=r=1172131080/w=0/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
READ: io=4471.4GB, aggrb=2508.8MB/s, minb=2508.8MB/s, maxb=2508.8MB/s, mint=1825081msec, maxt=1825081msec
Disk stats (read/write):
nvme0n1: ios=1172101701/0, merge=0/0, ticks=188947512/0, in_queue=213525656, util=100.00%
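As a rough sanity check, the headline figures in this output can be compared against each other. The sketch below assumes only the values printed above (IOPS, 4K block size, 4 jobs at I/O depth 32, average total latency) and uses awk for the arithmetic. It derives the bandwidth from IOPS and estimates IOPS from the number of outstanding I/Os divided by the average latency (Little's law).

```shell
# Values copied from the fio summary above.
iops=642235        # reported IOPS
bs=4096            # --bs=4k, in bytes
jobs=4             # --numjobs
qd=32              # --iodepth
lat_us=197.12      # average total latency ("lat"), in microseconds

# Bandwidth in MiB/s = IOPS * block size / 2^20
bw_mib=$(awk -v i="$iops" -v b="$bs" 'BEGIN { printf "%.1f", i * b / 1048576 }')

# Little's law: throughput ~= outstanding I/Os / average latency
est_iops=$(awk -v j="$jobs" -v q="$qd" -v l="$lat_us" \
  'BEGIN { printf "%.0f", j * q / (l / 1e6) }')

echo "bandwidth from IOPS: ${bw_mib} MiB/s"
echo "IOPS implied by queue depth and latency: ${est_iops}"
```

Both results land close to what fio reports (about 2509 MiB/s and roughly 649K IOPS), so the summary is at least internally consistent; the open question is why the drive responds this quickly, not whether fio is miscounting.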
Any idea what's the problem?
We are reviewing the information you provided and will get back to you with details soon. There may be something in your configuration or test parameters that causes the SSD performance results to be higher than the specification.
Please allow us more time to investigate.
We checked with our additional resources and would like to let you know the following information:
- There is no problem with your system. The test results depend on the test settings and on how the test exercises the performance characteristics of the drive.
- The advertised performance is obtained using IOMeter on a Windows-compatible system. We have not tested or published performance values using FIO and Linux.
- In the labs, the SSDs are pre-conditioned so that they have data on them before the actual test is performed. If the LBAs are blank, the test results will vary significantly. You can pre-condition the drive by doing one or more full sequential writes across the drive before the read test.
- In your test, you used --size=1024G as the span, but the drive is 400GB. This might affect the results as well.
- For your reference, here is a command used for a read test of an 800GB Intel NVMe drive.
fio --output=test_result.txt --name=myjob --filename=/dev/nvme0n1 --ioengine=libaio --direct=1 --norandommap --randrepeat=0 --runtime=600 --blocksize=4K --rw=randread --iodepth=32 --numjobs=4 --group_reporting
Thanks. Pre-conditioning was the problem. After doing a full sequential write and using your fio command, I got performance close to the advertised random read figure of 430K IOPS.
I thought pre-conditioning only affected write performance significantly, not read performance. Apparently, I was wrong.
As for fio's --size=1024G option, my understanding is that it specifies the total amount of I/O to perform, not the span. I guess the --filesize option is for the span?
I will run all the sequential/random read/write tests later and report the results here.
I am glad to know we were able to help, and that now you are getting the results closer to the expected values.
Feel free to let us know the results of other tests, or in case you need additional information.
I ran the performance tests in the order shown below.
1. Sequential Write
sudo fio --filename=/dev/nvme0n1 --rw=write --ioengine=libaio --direct=1 --blocksize=128K --size=370G --iodepth=32 --group_reporting --name=myjob
2. Sequential Read
sudo fio --filename=/dev/nvme0n1 --rw=read --ioengine=libaio --direct=1 --blocksize=128K --runtime=300 --iodepth=32 --group_reporting --name=myjob
3. Random Write
sudo fio --filename=/dev/nvme0n1 --ioengine=libaio --direct=1 --norandommap --randrepeat=0 --runtime=1800 --blocksize=4K --rw=randwrite --iodepth=32 --numjobs=8 --group_reporting --name=myjob
4. Random Read
sudo fio --filename=/dev/nvme0n1 --ioengine=libaio --direct=1 --norandommap --randrepeat=0 --runtime=600 --blocksize=4K --rw=randread --iodepth=32 --numjobs=8 --group_reporting --name=myjob
The performance I got is:
Sequential Read (128K)   2243 MB/s
Sequential Write (128K)  963 MB/s
Random Read (4K)         436K IOPS
Random Write (4K)        32K IOPS
The first three numbers are close to the advertised numbers.
However, random write performance is much lower than advertised (32K vs. 230K IOPS).
Even on a *clean* SSD, I only see ~240K IOPS for random writes in the first few minutes; after running for 30 minutes, the overall figure is about 32K.
What random write performance do you see using fio under Linux?
Is there any problem with my measurement procedure?
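One way to see how the random-write rate moves from the fresh-drive figure toward the long-run figure is to have fio log IOPS over time. The sketch below takes the random-write command from step 3 and adds fio's --write_iops_log and --log_avg_msec options. The log prefix "randwrite_trace", the 1-second averaging window, and the DEV variable are assumptions made for illustration. The command is only printed here, because running it overwrites the target device.

```shell
# Sketch only: DEV and the log prefix "randwrite_trace" are assumptions.
# Random writes to a raw device destroy its contents.
DEV=/dev/nvme0n1

# Random-write job from step 3, plus an IOPS log averaged over 1-second windows.
TRACE_CMD="fio --name=myjob --filename=$DEV --ioengine=libaio --direct=1 \
--norandommap --randrepeat=0 --runtime=1800 --blocksize=4K --rw=randwrite \
--iodepth=32 --numjobs=8 --group_reporting \
--write_iops_log=randwrite_trace --log_avg_msec=1000"

# Print the command for review; run it with: sudo $TRACE_CMD
echo "$TRACE_CMD"
```

The resulting log starts each line with a timestamp in milliseconds followed by the measured value, so plotting the first two columns should show where the rate drops. The exact log file names vary between fio versions.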
FIO and Linux are not used in our lab to test performance. The advertised performance is measured using IOMeter* with the Intel-provided NVMe driver under Windows. The values are published after multiple test runs on different systems.
Since the other three tests you ran show results close to the advertised values, there may be a setting in the random write command that is causing this result. Since FIO on Linux is not the standard we use for testing, we will need more time to check on this.
In the Intel® SSD 750 Product Specification (http://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/ssd-750-spec.pdf), you can find some of the conditions used for benchmarking:
Queue Depth 128 (QD=32, workers=4).
Measurements are performed on a contiguous 8GB span of the drive on a full SSD. Power mode set at 25W.
Compared with the command you used, you may want to reduce the number of jobs (workers) to 4. You might also want to test with a shorter runtime, such as 300 or 600 seconds, as in your other tests.
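For illustration, the spec conditions quoted above (QD=32, 4 workers, a contiguous 8GB span) could be approximated in fio roughly as follows. This is a sketch under assumptions, not the lab procedure. It relies on fio's documented behaviour that --size limits the region a job operates on when it is smaller than the device, and on --time_based to keep the job running for the whole --runtime. The job name "spec_like" and the DEV variable are made up for this example. The command is printed, not executed, because random writes overwrite the device.

```shell
# Sketch only: mirrors the quoted spec conditions; not the vendor's test.
# DEV and the job name "spec_like" are assumptions. Writes destroy data on DEV.
DEV=/dev/nvme0n1

# 4 workers at I/O depth 32, confined to the first 8G of the device,
# looping over that region for a fixed 600 seconds.
SPEC_CMD="fio --name=spec_like --filename=$DEV --ioengine=libaio --direct=1 \
--norandommap --randrepeat=0 --blocksize=4K --rw=randwrite \
--iodepth=32 --numjobs=4 --size=8G \
--time_based --runtime=600 --group_reporting"

# Print the command for review; run it with: sudo $SPEC_CMD
echo "$SPEC_CMD"
```

Since the spec measures on a full SSD, the drive would still need a full sequential write beforehand for the comparison to be meaningful.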
Also, remember to do TRIM on the SSD to discard any unused blocks, as this improves write performance. You can find more information about this in the following thread: