Serious performance regression with DC P3700 2TB AIC drive

RStup
New Contributor

Hi,

We've got a couple of servers, each with a DC P3700 2TB AIC drive, and we see a serious performance regression after a couple of hours.

Initially, before we ran the application tests, we quickly tested the I/O performance using `dd`, writing 100 8 GB files at a consistent rate of 2 GB/s. After that, another `dd` run read these 100 files with direct I/O at a consistent rate of 1.1 GB/s. The file system is XFS, but we also tested ext4. (We're aware this is not a rigorous benchmark, but it's good enough to indicate that the drive delivers consistent write and read throughput.)
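For reference, the quick test was roughly along these lines (mount point, block size and file count are illustrative rather than the exact values we used):

```
# write 100 files of 8 GB each (sequential, buffered writes)
for i in $(seq 1 100); do
    dd if=/dev/zero of=/mnt/nvme/testfile.$i bs=1M count=8192
done

# read them back with direct I/O, bypassing the page cache
for i in $(seq 1 100); do
    dd if=/mnt/nvme/testfile.$i of=/dev/null bs=1M iflag=direct
done
```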

The actual test is an application that usually writes at 100 MB/s with no reads, periodically peaking at 600 MB/s writes and 150 MB/s reads; everything is sequential I/O. This works for a couple of hours, but then I/O performance degrades to a few MB/s. The degradation persists even after the application has been stopped: the same `dd` tests then show write throughput of maybe 200 MB/s and read throughput of 100 MB/s.

We would have expected the drive to eventually degrade a bit, but not to 200/100 MB/s.

ext4 resulted in generally worse performance than XFS, but the overall behaviour (throughput regression) is the same on all machines.

We can also reproduce kernel panics in combination with isdct. One way to trigger one is to issue `isdct delete -intelssd`; the command completes, but shortly afterwards the kernel panics.
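To be concrete, the reproduction looks roughly like this (the drive index is illustrative; there is only one Intel SSD in each machine):

```
# secure-erase via the Intel SSD Data Center Tool; index 0 is an example
isdct delete -intelssd 0
# the command returns successfully, but the kernel panics shortly afterwards
```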

Do you have any idea what may cause this behaviour and how to fix it?


idata
Esteemed Contributor III

BreakStuff,

You're correct, we do expect our drives to perform as advertised, although in some real-life applications it's normal for drives to show results at around 80% of the advertised numbers. The reason for this is that when we test our SSDs in the lab, this is done on clean drives under ideal conditions. In the real world the SSD may already contain data, or the test may not be able to access the full bandwidth because background tasks or OS services are also using the drive while you measure its performance. This and many other factors can affect how reliable your benchmark results turn out.

As far as over-provisioning goes, I wouldn't worry too much about this. Our data center drives already come with a 20% over-provisioned area that is automatically used as NAND blocks fail. This can be monitored under the "E9" SMART attribute (the "Media Wear-out Indicator"). Because of this, manually under-using the drive or making adjustments to the Host Protected Area is not usually necessary.

Unfortunately, I don't have much of an answer as to why one firmware has Interrupt Coalescing enabled and the other one doesn't. OEM firmware versions are modified by the OEM (Lenovo*, in your case) to address specific issues and often have a completely different release cadence than the ones we publish. They may add or disable features depending on what they believe works best with their systems.

Best regards,
Carlos A.
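P.S. As a rough example of monitoring this with the isdct tool (the drive index 0 below is only an example; use the index the tool reports for your drive):

```
# list the Intel SSDs the tool detects, with their indexes
isdct show -intelssd

# show the SMART attributes (including E9, the media wear-out indicator) for drive 0
isdct show -smart -intelssd 0
```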

RStup
New Contributor

Hi Carlos,

I cannot find any special Intel NVMe driver in the downloads. The ZIP files only contain Windows or VMware drivers, and the PDF refers to the standard Linux kernel sources, which is probably exactly the driver we currently use.

idata
Esteemed Contributor III

Hello BreakStuff,

It's normal for SSD performance to decrease at least to some extent as the drive fills up, but this should only become noticeable once the drive is nearing full capacity. If you'd like, you may review the evaluation guide for the P3700 that we created for Fujitsu*:

- http://manuals.ts.fujitsu.com/file/12176/fujitsu_intel-ssd-dc-pcie-eg-en.pdf

NOTE: Any links provided for third-party sites are offered for your convenience and should not be viewed as an endorsement by Intel® of the content, products, or services offered there.

Additionally, you may want to review the following document. There's a section on how to bypass the buffer (i.e. using Direct I/O) in Linux*, which may help with your performance issues:

- http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ssd-server-storage-applicat... Intel® Solid-State Drives in Server Storage Applications. Section 3.2, page 16.

We hope this information helps.

Best regards,
Carlos A.
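P.S. As a quick command-line illustration of bypassing the buffer cache (a sketch only; the path and block size are examples, and the white paper covers the O_DIRECT details):

```
# sequential write with direct I/O (O_DIRECT), bypassing the Linux page cache
dd if=/dev/zero of=/mnt/nvme/direct_test bs=1M count=8192 oflag=direct

# sequential read of the same file with direct I/O
dd if=/mnt/nvme/direct_test of=/dev/null bs=1M iflag=direct
```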

RStup
New Contributor

Yeah, that document definitely helps. Thanks for the link!

I still have no idea what happens in that stack, though. At least the kernel panic after an `isdct delete ...` does not happen with recent Linux kernels, only with 3.12.59 in SLES 12.1.

Another thing (which we have not yet been able to cross-check with a newer kernel) is that writes (and fstrim) seem to be prioritized over reads. For example, with two `dd`s, one writing and one reading, the writing `dd` can write at full speed while the reading one only manages a few MB/s (neither is maxing out IOPS, though).
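The pattern looks roughly like this (paths and device name are illustrative; we watch the device with iostat in a second shell):

```
# writer: sustains close to full sequential write speed
dd if=/dev/zero of=/mnt/nvme/write_test bs=1M count=65536 oflag=direct &

# reader (of a previously written file): drops to a few MB/s while the writer is active
dd if=/mnt/nvme/read_test of=/dev/null bs=1M iflag=direct &

# per-device throughput and utilisation, refreshed every second
iostat -xm 1 /dev/nvme0n1
```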

Thanks for the useful help so far!

I might keep you busy

idata
Esteemed Contributor III

Hello BreakStuff,

It's very difficult for copy commands to be a fair benchmarking tool. Generally speaking, solid-state drives have much faster read speeds than write speeds. If a file is copied or moved within the SSD, this write operation will be optimized. If a file is moved from one disk to another, then you will only be measuring the slower of the two drives. However, I'm not familiar enough with how the `dd` command is actually executed, so I cannot say whether it's a good benchmarking reference or not.

Speaking of fstrim, keep in mind that our data center drives perform garbage collection automatically at the firmware level. If additional trimming is necessary, we recommend scheduling the command so it isn't executed at a time when it may affect your drive's performance.

As for the kernel panics, I'm not sure what exactly could be the cause. We've been looking into this but were unable to reproduce the issue. It might be a good idea to contact SUSE* support or post to their forums on this subject.

Best regards,
Carlos A.
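P.S. For example, instead of trimming ad hoc, fstrim could be scheduled for an off-peak window via cron (the mount point and schedule below are only examples; many distributions also ship a systemd fstrim.timer that serves the same purpose):

```
# /etc/cron.d/fstrim-nvme: trim the NVMe file system every Sunday at 03:00
0 3 * * 0  root  /sbin/fstrim -v /mnt/nvme
```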