Novice

S3610 SSDs have failed "READ/WRITE FPDMA QUEUED" ATA commands, frozen, then link reset

Hi,

I have a new Linux machine with two DC S3610 1.6TB SSDs. It's Debian jessie, so kernel 3.16 (3.16.0-4-amd64). About a month after installation, these errors started appearing:

Jul 30 16:30:59 snaps kernel: [186914.249429] ata1.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x6 frozen

Jul 30 16:30:59 snaps kernel: [186914.250465] ata1.00: failed command: WRITE FPDMA QUEUED

Jul 30 16:30:59 snaps kernel: [186914.251505] ata1.00: cmd 61/08:00:39:db:8e/00:00:09:00:00/40 tag 0 ncq 4096 out

Jul 30 16:30:59 snaps kernel: [186914.251505] res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

Jul 30 16:30:59 snaps kernel: [186914.253613] ata1.00: status: { DRDY }

Jul 30 16:30:59 snaps kernel: [186914.254781] ata1.00: failed command: WRITE FPDMA QUEUED

Jul 30 16:30:59 snaps kernel: [186914.255810] ata1.00: cmd 61/08:08:71:fc:4e/00:00:66:00:00/40 tag 1 ncq 4096 out

Jul 30 16:30:59 snaps kernel: [186914.255810] res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

Jul 30 16:30:59 snaps kernel: [186914.257940] ata1.00: status: { DRDY }

Jul 30 16:30:59 snaps kernel: [186914.259086] ata1: hard resetting link

Jul 30 16:31:00 snaps kernel: [186914.577366] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)

Jul 30 16:31:00 snaps kernel: [186914.578307] ata1.00: configured for UDMA/133

Jul 30 16:31:00 snaps kernel: [186914.578310] ata1.00: device reported invalid CHS sector 0

Jul 30 16:31:00 snaps kernel: [186914.578311] ata1.00: device reported invalid CHS sector 0

Jul 30 16:31:00 snaps kernel: [186914.578316] ata1: EH complete

The error is always the same, and the only thing on ata1.00 is one of the SSDs. I switched the two SSDs around and the problem followed the same SSD.

I can't force the error to happen on demand, it just seems to happen every other day or so, though not at the same time of day. All IO is held up briefly while the link is reset. The drive passes a SMART long self-test.
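Since the events are sporadic, it can help to tally them from a saved kernel log rather than by eye. The following is a minimal sketch, assuming the log lines look like the libata messages quoted above; the function name `summarize` and the `sample_log` list are illustrative, not part of any tool.

```python
import re
from collections import Counter

# Patterns follow the libata messages quoted in this thread.
FAILED_RE = re.compile(r"(ata\d+\.\d+): failed command: (.+)$")
RESET_RE = re.compile(r"(ata\d+): hard resetting link")


def summarize(lines):
    """Count failed commands per (device, command) and hard resets per port."""
    failed = Counter()
    resets = Counter()
    for line in lines:
        m = FAILED_RE.search(line)
        if m:
            failed[(m.group(1), m.group(2).strip())] += 1
            continue
        m = RESET_RE.search(line)
        if m:
            resets[m.group(1)] += 1
    return failed, resets


# Lines copied from the log excerpt above.
sample_log = [
    "Jul 30 16:30:59 snaps kernel: [186914.250465] ata1.00: failed command: WRITE FPDMA QUEUED",
    "Jul 30 16:30:59 snaps kernel: [186914.254781] ata1.00: failed command: WRITE FPDMA QUEUED",
    "Jul 30 16:30:59 snaps kernel: [186914.259086] ata1: hard resetting link",
    "Jul 30 16:31:00 snaps kernel: [186914.578316] ata1: EH complete",
]

failed, resets = summarize(sample_log)
for (device, command), count in sorted(failed.items()):
    print(device, command, count)
for port, count in sorted(resets.items()):
    print(port, "hard resets:", count)
```

To run it over a real log, pass an open file instead of the list, for example `summarize(open("/var/log/kern.log"))`; that path is an assumption about where the system stores kernel messages.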

So is this drive faulty? If not, what can I try to fix this? If so, is there an easy way to prove it for RMA purposes?

Jul 27 05:59:30 snaps kernel: [ 33.054376] ata1.00: ATA-9: INTEL SSDSC2BX016T4, G2010110, max UDMA/133

Jul 27 05:59:30 snaps kernel: [ 33.054474] ata1.00: 3125627568 sectors, multi 1: LBA48 NCQ (depth 31/32)

Jul 27 05:59:30 snaps kernel: [ 33.054567] ata2.00: ATA-9: INTEL SSDSC2BX016T4, G2010110, max UDMA/133

Jul 27 05:59:30 snaps kernel: [ 33.054657] ata2.00: 3125627568 sectors, multi 1: LBA48 NCQ (depth 31/32)

$ sudo smartctl -i /dev/sda

smartctl 6.4 2014-10-07 r4002 [x86_64-linux-3.16.0-4-amd64] (local build)

Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===

Device Model: INTEL SSDSC2BX016T4

Serial Number: BTHC511604V41P6PGN

LU WWN Device Id: 5 5cd2e4 04b7b1bfa

Firmware Version: G2010110

User Capacity: 1,600,321,314,816 bytes [1.60 TB]

Sector Sizes: 512 bytes logical, 4096 bytes physical

Rotation Rate: Solid State Device

Form Factor: 2.5 inches

Device is: Not in smartctl database [for details use: -P showall]

ATA Version is: ACS-2 T13/2015-D revision 3

SATA Version is: SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s)

Local Time is: Fri Jul 31 11:04:09 2015 UTC

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

$ sudo smartctl -i /dev/sdb

smartctl 6.4 2014-10-07 r4002 [x86_64-linux-3.16.0-4-amd64] (local build)

Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===

Device Model: INTEL SSDSC2BX016T4

Serial Number: BTHC511604SD1P6PGN

LU WWN Device Id: 5 5cd2e4 04b7b1ba2

Firmware Version: G2010110

User Capacity: 1,600,321,314,816 bytes [1.60 TB]

Sector Sizes: 512 bytes logical, 4096 bytes physical

Rotation Rate: Solid State Device

Form Factor: 2.5 inches

Device is: Not in smartctl database [for details use: -P showall]

ATA Version is: ACS-2 T13/2015-D revision 3

SATA Version is: SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s)

Local Time is: Fri Jul 31 11:04:35 2015 UTC

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

Message was edited by: Andy Smith. Now seeing the same problems with the other SSD, so this is not restricted to a single drive.

45 Replies
Honored Contributor II

Hello grifferz,

We are going to check on this and will provide you a reply as soon as possible.

Honored Contributor II

Hello grifferz,

Please make sure the BIOS of your system is up-to-date, and that you are using the drivers recommended by the system manufacturer.

If the issue persists, please let us know the following:

- Smart Attributes output (smartctl -A)

- PC make and model

- Motherboard model

- BIOS version

- Type of Storage controller where the drive is plugged into.

Novice

Hi Jonathan,

> Please make sure the BIOS of your system is up-to-date

Yes, it is the latest BIOS.

> and that you are using the drivers recommended by the system manufacturer.

Well, this is a Debian Linux 8.0 system, with the latest kernel package, so I don't think there are any other recommended drivers.

> Smart Attributes output (smartctl -A)

$ sudo smartctl -A /dev/sda

smartctl 6.4 2014-10-07 r4002 [x86_64-linux-3.16.0-4-amd64] (local build)

Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===

SMART Attributes Data Structure revision number: 1

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032  099   099   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032  100   100   000    Old_age  Always  -           630
 12 Power_Cycle_Count       0x0032  100   100   000    Old_age  Always  -           17
170 Unknown_Attribute       0x0033  100   100   010    Pre-fail Always  -           0
171 Unknown_Attribute       0x0032  100   100   000    Old_age  Always  -           0
172 Unknown_Attribute       0x0032  100   100   000    Old_age  Always  -           0
174 Unknown_Attribute       0x0032  100   100   000    Old_age  Always  -           2
175 Program_Fail_Count_Chip 0x0033  100   100   010    Pre-fail Always  -           5164180714
183 Runtime_Bad_Block       0x0032  100   100   000    Old_age  Always  -           0
184 End-to-End_Error        0x0033  100   100   090    Pre-fail Always  -           0
187 Reported_Uncorrect      0x0032  100   100   000    Old_age  Always  -           0
190 Airflow_Temperature_Cel 0x0022  076   071   000    Old_age  Always  -           24 (Min/Max 24/30)
192 Power-Off_Retract_Count 0x0032  100   100   000    Old_age  Always  -           2
194 Temperature_Celsius     0x0022  100   100   000    Old_age  Always  -           24
197 Current_Pending_Sector  0x0012  100   100   000    Old_age  Always  -           0
199 UDMA_CRC_Error_Count    0x003e  100   100   000    Old_age  Always  -           0
225 Unknown_SSD_Attribute   0x0032  100   100   000    Old_age  Always  -           80726
226 Unknown_SSD_Attribute   0x0032  100   100   000    Old_age  Always  -           20
227 Unknown_SSD_Attribute   0x0032  100   100   000    Old_age  Always  -           62
228 Power-off_Retract_Count 0x0032  100   100   000    Old_age  Always  -           37677
232 Available_Reservd_Space 0x0033  100   100   010    Pre-fail Always  -           0
233 Media_Wearout_Indicator 0x0032  100   100   000    Old_age  Always  -           0
234 Unknown_Attribute       0x0032  100   100   000    Old_age  Always  -           0
241 Total_LBAs_Written      0x0032  100   100   000    Old_age  Always  -           80726
242 Total_LBAs_Read         0x0032  100   100   000    Old_age  Always  -           131822
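For readers comparing several drives, the attribute dump above can be parsed mechanically. This is a rough sketch, assuming the ten-column `smartctl -A` layout shown here; the function names and the choice of attribute IDs to watch (5, 187, 197, 199) are illustrative assumptions, not an Intel recommendation.

```python
def parse_smart_attributes(text):
    """Parse `smartctl -A` attribute rows into a dict keyed by attribute ID."""
    attrs = {}
    for line in text.splitlines():
        # Ten columns; the last one (RAW_VALUE) may itself contain spaces.
        parts = line.split(None, 9)
        if len(parts) == 10 and parts[0].isdigit():
            attrs[int(parts[0])] = {
                "name": parts[1],
                "value": int(parts[3]),
                "worst": int(parts[4]),
                "thresh": int(parts[5]),
                "raw": parts[9].strip(),
            }
    return attrs


def nonzero_raw(attrs, watch_ids=(5, 187, 197, 199)):
    """Return the watched attribute IDs whose raw value is present and not 0."""
    return [i for i in watch_ids if i in attrs and attrs[i]["raw"] != "0"]


# A few rows copied from the output above.
sample = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0032 099 099 000 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 076 071 000 Old_age Always - 24 (Min/Max 24/30)
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x003e 100 100 000 Old_age Always - 0
"""

attrs = parse_smart_attributes(sample)
print(nonzero_raw(attrs))
```

On the rows posted here the watched raw values are all zero, which matches the observation that the drive passes its self-test even though commands time out.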

> PC make and model

A Supermicro server

> Motherboard model

Supermicro X10SDV-F

> BIOS version

AMI BIOS R 1.0a

> Type of Storage controller where the drive is plugged into.

Directly into motherboard SATA.

Beginner

Same issue for me, but with brand-new S3710s. Seemingly all our samples are 'defective' and tend to reset the bus once or twice per day under a very moderate workload. S3700 and S3500 drives previously worked flawlessly in the same place (same SATA port, motherboard revision and BIOS version). I had to ask both Supermicro and Intel support privately about possible actions, though most likely the issue is specific to a 22nm SSD generation.

Edit: I would be very grateful for RMA hints as well, possibly involving direct communication with the retailer. The risk of using these devices is too high right now; we'd prefer to return the entire batch as defective, replace it with the well-known S3700, and then start a detailed investigation on selected samples.

Honored Contributor II

Hello andreykorolyov,

If the new SSDs are not working as expected in your system and you would like to exchange them, we advise you to contact the place of purchase, especially if you obtained them as samples or for testing purposes.

Please take into consideration that for warranty issues you should Contact Support (http://www.intel.com/p/en_US/support/contactsupport) to engage a support agent in your nearest support center.

Beginner

Thank you Jonathan, I will contact the support center tomorrow.

Would the Intel engineering team be interested in further investigation of the issue? I can easily reproduce the problem with a 20-minute FIO test run on any SSD from the set we bought. The firmware updater says that the running version is the latest, so I suppose the problem is bound to the specific SSD hardware. Again, the issue affects at least ten disks from our batch and I strongly believe that the rest are affected as well, so I'd like to help fix this issue instead of just returning them. For now it looks like both C602 and C220 chipsets are affected, and I can confirm that both SATA and SAS downlinks expose the issue on C602.

Novice

I've since seen the same problems with the other drive in the pair, so I now find it hard to believe that this is a single faulty drive. I've edited the post title to reflect this.

I do not now know how to proceed. I need to know if the problem is a bug in the Linux kernel, in the SATA chipset or in the drives themselves.

It seems I can make the problem go away by disabling NCQ, but this reduces the performance of the drive to around 25% of max IOPS so is not a long term solution.

This server has an Intel C220 SATA chipset:

00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 05) (prog-if 01 [AHCI 1.0])

Subsystem: Super Micro Computer Inc Device 086d

Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+

Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR-

Latency: 0

Interrupt: pin A routed to IRQ 164

Region 0: I/O ports at f070 [size=8]

Region 1: I/O ports at f060 [size=4]

Region 2: I/O ports at f050 [size=8]

Region 3: I/O ports at f040 [size=4]

Region 4: I/O ports at f020 [size=32]

Region 5: Memory at fb312000 (32-bit, non-prefetchable) [size=2K]

Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-

Address: fee002b8 Data: 0000

Capabilities: [70] Power Management version 3

Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)

Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-

Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004

Kernel driver in use: ahci

00: 86 80 02 8c 07 04 b0 02 05 01 06 01 00 00 00 00

10: 71 f0 00 00 61 f0 00 00 51 f0 00 00 41 f0 00 00

20: 21 f0 00 00 00 20 31 fb 00 00 00 00 d9 15 6d 08

30: 00 00 00 00 80 00 00 00 00 00 00 00 0b 01 00 00

Are you aware of any problems with C220 chipset and S3610 drives?

Honored Contributor II

We are very interested in this issue and we'll need to do more research about it. We will contact you via Private Message individually with further details and to request any additional information.

Beginner

Hi guys,

Please post here when you have some progress on the subject.

I am having a similar problem with S3710 800G drives, connected to an LSI MegaRAID SAS 9271-4i via a Supermicro expander backplane. The problems appear at almost zero load.

I have 4x800G S3710 in RAID10 array.

On two of the ports I was getting errors like this (errors are from LSI storage manager):

Aug 2 05:07:06 h19 MR_MONITOR[3772]: Controller ID: 0 PD Reset: PD # 012= -:-:3, Critical # 012= 3, Path =# 012 0x5003048000F3BE0F# 012Event ID:268

Aug 2 05:07:07 h19 MR_MONITOR[3772]: Controller ID: 0 Command timeout on PD: PD # 012= -:-:3No addtional sense information, CDB =0x48 0xd0 0xc0 0x00 0x00 0x00 0x00 0x00 0x08 0x00, Sense = , Path =# 012 0x5003048000F3BE0F# 012Event ID:267

Aug 2 05:07:07 h19 MR_MONITOR[3772]: Controller ID: 0 Command timeout on PD: PD # 012= -:-:3No addtional sense information, CDB =0x58 0xd0 0xc0 0x00 0x00 0x00 0x00 0x00 0x08 0x00, Sense = , Path =# 012 0x5003048000F3BE0F# 012Event ID:267

Aug 2 05:07:07 h19 MR_MONITOR[3772]: Controller ID: 0 Unexpected sense: PD # 012= -:-:3Power on, reset, or bus device reset occurred, CDB =0x2a 0x00 0x00 0xc0 0xd0 0x58 0x00 0x00 0x08 0x00, Sense =0x70 0x00 0x06 0x00 0x00 0x00 0x00 0x0a 0x00 0x00 0x00 0x00 0x29 0x00 0x00 0x00 0x00 0x00

We contacted our vendor and they recommended flashing the firmware of the SSDs. However, just to make sure that everything was OK with the backplane, we swapped the bays of all four disks (port0 with port2, port1 with port3), and the problem somehow disappeared, at least for the last 3-4 days.

Honored Contributor II

Hello dchepishev,

We are looking into this issue and an update will be provided once we have more information.

Please keep us informed in case the issue reappears.

Novice

We're seeing this same issue on 5 identical servers with Supermicro X10SLM+-LN4F motherboards in Supermicro 813MT-350CB 1U chassis, with one S3610 SSD in each machine, connected via the hot-swap backplane on the chassis to the onboard Intel C224 chipset 6 Gbps SATA ports.

One of the 5 machines has one additional spare S3610 - this has not shown any failures/resets, but there's no I/O being performed on it.

These S3610 SSDs were installed in the beginning of July, and the first command failure/bus reset occurred within a couple of days. It doesn't occur every day, and at the most a couple of times per day (currently we only have logs for 4-5 weeks back). I/O load is not high.

The machines also have DC S3500 series SSDs, which have been working flawlessly for the last year.

OS: Debian 7 (Wheezy), 64-bit

BIOS: AMI BIOS, version 1.1a.

There is currently no SMART status monitoring running on these machines.

Excerpt from Linux kernel log output:

Aug 14 11:07:06 hotel kernel: [3273761.737966] ata2.00: exception Emask 0x0 SAct 0x30000000 SErr 0x0 action 0x6 frozen

Aug 14 11:07:06 hotel kernel: [3273761.738054] ata2.00: failed command: WRITE FPDMA QUEUED

Aug 14 11:07:06 hotel kernel: [3273761.738103] ata2.00: cmd 61/10:e0:c0:70:05/00:00:10:00:00/40 tag 28 ncq 8192 out

Aug 14 11:07:06 hotel kernel: [3273761.738105] res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)

Aug 14 11:07:06 hotel kernel: [3273761.738238] ata2.00: status: { DRDY }

Aug 14 11:07:06 hotel kernel: [3273761.738281] ata2.00: failed command: WRITE FPDMA QUEUED

Aug 14 11:07:06 hotel kernel: [3273761.738334] ata2.00: cmd 61/10:e8:c0:70:25/00:00:13:00:00/40 tag 29 ncq 8192 out

Aug 14 11:07:06 hotel kernel: [3273761.738336] res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

Aug 14 11:07:06 hotel kernel: [3273761.738467] ata2.00: status: { DRDY }

Aug 14 11:07:06 hotel kernel: [3273761.738512] ata2: hard resetting link

Aug 14 11:07:07 hotel kernel: [3273762.057688] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)

Aug 14 11:07:07 hotel kernel: [3273762.058814] ata2.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded

Aug 14 11:07:07 hotel kernel: [3273762.058825] ata2.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out

Aug 14 11:07:07 hotel kernel: [3273762.058833] ata2.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out

Aug 14 11:07:07 hotel kernel: [3273762.060141] ata2.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded

Aug 14 11:07:07 hotel kernel: [3273762.060149] ata2.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out

Aug 14 11:07:07 hotel kernel: [3273762.060155] ata2.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out

Aug 14 11:07:07 hotel kernel: [3273762.060500] ata2.00: configured for UDMA/133

Aug 14 11:07:07 hotel kernel: [3273762.060510] ata2.00: device reported invalid CHS sector 0

Aug 14 11:07:07 hotel kernel: [3273762.060515] ata2.00: device reported invalid CHS sector 0

Aug 14 11:07:07 hotel kernel: [3273762.060528] ata2: EH complete
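The `SAct` value and the `cmd` lines in these excerpts can be decoded to see which queued commands were outstanding when the port froze. The sketch below assumes libata's usual taskfile print order (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and the 512-byte logical sectors reported by smartctl in this thread; the function names are illustrative.

```python
import re

CMD_RE = re.compile(r"cmd ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2}) tag (\d+)")


def sact_tags(sact):
    """Return the NCQ tag numbers whose bits are set in an SAct value."""
    return [bit for bit in range(32) if (sact >> bit) & 1]


def decode_ncq_cmd(line, sector_size=512):
    """Decode a libata 'cmd ...' line for an FPDMA (NCQ) command."""
    m = CMD_RE.search(line)
    if m is None:
        return None
    (command, feature, nsect, lbal, lbam, lbah,
     hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah,
     device) = [int(x, 16) for x in re.split(r"[/:]", m.group(1))]
    # For NCQ commands the sector count travels in the FEATURE registers
    # and the tag sits in bits 7:3 of the sector-count register.
    sectors = (hob_feature << 8) | feature
    lba = ((hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
           | (lbah << 16) | (lbam << 8) | lbal)
    return {
        "opcode": command,          # 0x61 is WRITE FPDMA QUEUED
        "tag": nsect >> 3,
        "reported_tag": int(m.group(2)),
        "sectors": sectors,
        "bytes": sectors * sector_size,
        "lba": lba,
    }


line = ("Aug 14 11:07:06 hotel kernel: [3273761.738103] ata2.00: "
        "cmd 61/10:e0:c0:70:05/00:00:10:00:00/40 tag 28 ncq 8192 out")
decoded = decode_ncq_cmd(line)
print(sact_tags(0x30000000), decoded)
```

Decoded this way, `SAct 0x30000000` corresponds to tags 28 and 29, which are exactly the two failed writes listed, and the sector count agrees with the `ncq 8192` byte size printed by the kernel.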

SMART info & attributes (from one of the machines):

root@hotel:~# smartctl -iA /dev/sdb

smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-4-amd64] (local build)

Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===

Device Model: INTEL SSDSC2BX400G4

Serial Number: BTHC514101W7400VGN

LU WWN Device Id: 5 5cd2e4 04b7ca92d

Firmware Version: G2010110

User Capacity: 400,088,457,216 bytes [400 GB]

Sector Sizes: 512 bytes logical, 4096 bytes physical

Device is: Not in smartctl database [for details use: -P showall]

ATA Version is: 8

ATA Standard is: ACS-2 revision 3

Local Time is: Tue Aug 18 15:28:28 2015 UTC

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===

SMART Attributes Data Structure revision number: 1

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032  100   100   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032  100   100   000    Old_age  Always  -           1028
 12 Power_Cycle_Count       0x0032  100   100   000    Old_age  Always  -           2
170 Unknown_Attribute       0x0033  100   100   010    Pre-fail Always  -           0
171 Unknown_Attribute       0x0032  100   100   000    Old_age  Always  -           0
172 Unknown_Attribute       0x0032  100   100   000    Old_age  Always  -           0
174 Unknown_Attribute       0x0032  100   100   000    Old_age  Always  -           1
175 Program_Fail_Count_Chip 0x0033  100   100   010    Pre-fail Always  -           21563578034
183 Runtime_Bad_Block       0x0032  100   100   000    Old_age  Always  -           0
184 End-to-End_Error        0x0033  100   100   090    Pre-fail Always  -           0
187 Reported_Uncorrect      0x0032  100   100   000    Old_age  Always  -           0
190 Airflow_Temperature_Cel 0x0022  079   077   000    Old_age  Always  -           21 (Min/Max 20/24)
192 Power-Off_Retract_Count 0x0032  100   100   000    Old_age  Always  -           1
194 Temperature_Celsius     0x0022  100   100   000    Old_age  Always  -           21
197 Current_Pending_Sector  0x0012  100   100   000    Old_age  Always  -           0
199 UDMA_CRC_Error_Count    0x003e  100   100   000    Old_age  Always  -           0
225 Load_Cycle_Count        0x0032  100   100   000    Old_age  Always  -           3207
226 Load-in_Time            0x0032  100   100   000    Old_age  Always  -           0
227 Torq-amp_Count          0x0032  100   100   000    Old_age  Always  -           13
228 Power-off_Retract_Count 0x0032  100   100   000    Old_age  Always  -           61697
232 Available_Reservd_Space 0x0033  100   100   010    Pre-fail Always  -           0
233 Media_Wearout_Indicator 0x0032  100   100   000    Old_age  Always  -           0
234 Unknown_Attribute       0x0032  100   100   000    Old_age  Always  -           0
241 Total_LBAs_Written      0x0032  100   100   000    Old_age  Always  -           3207
242 Total_LBAs_Read         0x0032  100   100   000    Old_age  Always  -           506

PCI info for SATA controller (from lspci):

00:1f.2 SATA controller: Intel Corporation Lynx Point 6-port SATA Controller 1 [AHCI mode] (rev 05) (prog-if 01 [AHCI 1.0])

Subsystem: Super Micro Computer Inc Device 0806

Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+

Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR-

Latency: 0

Interrupt: pin B routed to IRQ 51

Region 0: I/O ports at f050 [size=8]

Region 1: I/O ports at f040 [size=4]

Region 2: I/O ports at f030 [size=8]

Region 3: I/O ports at f020 [size=4]

Region 4: I/O ports at f000 [size=32]...

Beginner

In the meantime you can disable NCQ via libata with libata.force=X:noncq for the specific link (X being the ATA ID that libata prints, e.g. 1.00 for ata1.00). Reducing the queue depth to 1 did not help for me; you probably need to eliminate the possibility of issuing NCQ tags completely. Hopefully the new firmware with the fix will be released publicly this week and this ugly hack can be thrown out.
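As a small illustration of the workaround, the kernel parameter can be derived from the `ataN.NN` names that appear in the error messages. This is only a sketch: the helper name `noncq_param` is made up for this example, and it relies on the documented `libata.force` format of comma-separated `PORT[.DEVICE]:VAL` entries.

```python
import re


def noncq_param(ata_ids):
    """Build a libata.force= kernel parameter that disables NCQ per device.

    ata_ids are names as printed by libata, e.g. "ata1.00" or "ata2".
    """
    entries = []
    for ata_id in ata_ids:
        m = re.fullmatch(r"ata(\d+)(?:\.(\d+))?", ata_id)
        if m is None:
            raise ValueError("not a libata ID: %r" % ata_id)
        port, device = m.group(1), m.group(2)
        target = port if device is None else "%s.%s" % (port, device)
        entries.append(target + ":noncq")
    return "libata.force=" + ",".join(entries)


print(noncq_param(["ata1.00", "ata2.00"]))
```

On a Debian system the resulting string would typically be appended to `GRUB_CMDLINE_LINUX` in `/etc/default/grub`, followed by `update-grub` and a reboot. After that, the boot-time line for the drive should report NCQ as not used instead of `NCQ (depth 31/32)`.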

Novice

Hi Andrey,

Have you had any indication that there is a fix for this in forthcoming firmware then?

So far I've had no response to asking for updates on this and I need to make a decision as to whether I'm going to wait or return for refund and buy something else.

Beginner

Hi Andy,

In a phone conversation a week ago, the support engineer indicated the end of that week as the approximate date of the firmware release, though it could obviously be delayed a little. The corresponding ticket is still open, as I asked to hold it until complete resolution. I am relatively fine with the "workaround" for now because our hot caches are not likely to ever generate more than 2K IOPS per caching device, so I changed my mind about using the buggy devices as-is without issuing an RMA. Over the next couple of months the tiering scheme in our datacenter is subject to change, and a single-queued SSD will no longer be an option for tier-1 "IOPS dampeners". Please share your RMA experience if you decide not to wait for the FW release.

Novice

According to:

https://downloadmirror.intel.com/18455/eng/Intel_SSD_Toolbox_3_3_1_Release_Notes_325993-020US.pdf

"This release of the Intel® SSD Toolbox includes firmware updates for the Intel® SSD Pro 2500 and Intel® SSD 535 Series and the Intel® SSD DC S3710, DC S3610, DC S3510, DC S3500 M.2 and DC S3500 HD Series products"

This appears to be a new firmware update released this week.

However, SSD toolbox is Windows-only software. Intel SSD Data Center Tool which I would normally use on Linux does not yet show an update.

1) Is an update coming for this?

2) If not, is there some way to download the firmware update and place it somewhere that ISDCT will find it?

3) Is there any known fix in this firmware update for the problem we are discussing in this thread?

Thanks,

Andy

Honored Contributor II

Hello,

Intel® Solid-State Drive Toolbox version 3.3.1 (https://downloadcenter.intel.com/download/18455/Intel-Solid-State-Drive-Toolbox) was recently released. It contains firmware updates to prevent the behavior mentioned in this thread.

Currently, you need a computer with the Intel® SSD Toolbox installed to perform the update. We expect future versions of Intel® Solid-State Drive Data Center Tool and Intel® Firmware Update Tool to contain the firmware updates as well; however, we do not have a specific date for this yet.

Novice

So can you give me a solution for a firmware upgrade that works in Linux please?

Thanks,

Andy

Novice

We need a solution that does not depend on Windows.

Thank you.

Regards,

Daniel

Beginner

Hi Jonathan,

"It contains firmware updates to prevent the behavior mentioned in this thread."

I see nothing in the release notes at https://downloadmirror.intel.com/18455/eng/Intel_SSD_Toolbox_3_3_1_Release_Notes_325993-020US.pdf to support this statement. Can you clarify whether you're referring to a change which isn't mentioned in the release notes, or, if it is in there, which change in the release notes indicates a fix?

Thanks

James
