WSier
Beginner
1,971 Views

Intel 530 SSD mSATA 240GB in Intel DQ77MK Motherboard disconnects

An Intel 530-Series mSATA SSD 240GB is installed in an Intel DQ77MK motherboard mounted in an Antec Sonata-II tower case. It has FreeBSD 10.1-Release (amd64) installed on it.

The SSD randomly disconnects, resulting in a series of AHCI timeout messages and, ultimately, a kernel panic.

Usually, if not always, the SSD is no longer visible in the BIOS on reboot until the system is power-cycled. (One small variation: there are other, non-SSD SATA disks attached, and if the system is powered down before FreeBSD panics, the device boot order is retained in the BIOS. If the system is allowed to panic and reboot, the boot order is lost and the SSD is no longer listed as the first boot device.)

Initially, in an attempt to mitigate this problem, the SATA channel for the SSD was configured for reduced-speed operation in FreeBSD (in /boot/loader.conf):

hint.ahcich.4.sata_rev=1

(kernel log, /var/log/messages):

kernel: ada2 at ahcich4 bus 0 scbus5 target 0 lun 0

kernel: ada2: ATA-9 SATA 3.x device

kernel: ada2: Serial Number CVDA414203E8240M

kernel: ada2: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)

kernel: ada2: Command Queueing enabled

kernel: ada2: 228936MB (468862128 512 byte sectors: 16H 63S/T 16383C)

kernel: ada2: Previously was known as ad12

This is still the current setting, but the issue persists.
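
As a cross-check that the sata_rev hint actually took effect, the negotiated link speed can be pulled out of the kernel log. The following is only a minimal Python sketch: the function name negotiated_speed and the regular expression are assumptions, written against nothing more than the log line format quoted above.

```python
import re

# Matches lines such as:
#   kernel: ada2: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
# The pattern is an assumption based only on the excerpt in this post.
_SPEED_RE = re.compile(
    r"(?P<dev>ada\d+): (?P<rate>\d+(?:\.\d+)?)MB/s transfers \(SATA (?P<rev>\d+)\.x"
)


def negotiated_speed(log_text, device="ada2"):
    """Return (rate_mb_s, sata_revision) from the last matching line, or None."""
    result = None
    for line in log_text.splitlines():
        m = _SPEED_RE.search(line)
        if m and m.group("dev") == device:
            # Later lines overwrite earlier ones, so the most recent boot wins.
            result = (float(m.group("rate")), int(m.group("rev")))
    return result


sample = "kernel: ada2: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)"
print(negotiated_speed(sample))  # prints (150.0, 1)
```

In practice the text would come from /var/log/messages rather than a sample string. A result of (150.0, 1) corresponds to the SATA 1.x setting requested by the hint.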

The system was also booted into Linux from a DVD. Creating an ext3 filesystem on the SSD and writing to it was similarly unsuccessful.

Since finding the other reports of problems with this device, I have looked at the temperatures using 'smartctl'. There may indeed be the overheating issue that has been identified for this device, as evidenced by this excerpt:

# smartctl -l scttemp /dev/ada2

Current Temperature: 34 Celsius

Power Cycle Min/Max Temperature: -20/67 Celsius

Lifetime Min/Max Temperature: -20/76 Celsius

Under/Over Temperature Limit Count: 0/0

The SSD sits in the exhaust airflow from the graphics card heatsink/fan. A thermometer showed this exhaust air to be around 45 C. The SSD probably does not benefit greatly from the existing case cooling (PSU exhaust fan and 120mm rear case exhaust fan) because of its position within the case and its position and orientation on the motherboard: it lies parallel and close to the board, and air crossflow is low due to the large internal volume of the case and is also significantly obstructed around the SSD by SATA connectors etc.

I have now installed (loosely sat) an 80mm fan to blow air from the bottom of the case directly onto the SSD. From occasional checks of the "Current" temperature using 'smartctl', the fan seems to be effectively reducing the average temperature of the SSD. (For some reason the temperature history log displayed by the 'smartctl -l scttemp' command no longer seems to update reliably. However, the last four temperatures shown in the attached ada2_smartctl_scttemp.log were recorded after the additional fan was installed and seem to reflect the success of the additional cooling.)
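
Because the drive's own temperature history is not updating reliably, one option is to record the readings externally. The following is a minimal Python sketch, assuming the output format shown in the excerpt above. The function names parse_scttemp and read_scttemp and the dictionary keys are hypothetical, and only the 'smartctl -l scttemp' invocation itself comes from this post.

```python
import re
import subprocess

# Patterns assume the line format of the smartctl excerpt quoted in this post.
_CURRENT_RE = re.compile(r"^Current Temperature:\s*(-?\d+)\s*Celsius", re.MULTILINE)
_RANGE_RE = re.compile(
    r"^(Power Cycle|Lifetime) Min/Max Temperature:\s*(-?\d+)/(-?\d+)\s*Celsius",
    re.MULTILINE,
)


def parse_scttemp(text):
    """Extract temperatures (degrees Celsius) from 'smartctl -l scttemp' output."""
    temps = {}
    m = _CURRENT_RE.search(text)
    if m:
        temps["current"] = int(m.group(1))
    for label, low, high in _RANGE_RE.findall(text):
        # "Power Cycle" -> "power_cycle", "Lifetime" -> "lifetime"
        temps[label.lower().replace(" ", "_")] = (int(low), int(high))
    return temps


def read_scttemp(device="/dev/ada2"):
    """Run smartctl (normally requires root) and parse what it prints."""
    out = subprocess.run(
        ["smartctl", "-l", "scttemp", device],
        capture_output=True, text=True,
    ).stdout
    return parse_scttemp(out)
```

Calling read_scttemp() at a fixed interval (from cron, for example) and appending the "current" value with a timestamp to a file on a non-SSD disk would give a temperature record that survives a disconnect.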

Despite the additional cooling, the problem persists. It recurred about 24 hours after installing the fan, and there was a succession of faults today, triggered repeatedly by the same action: using the FreeBSD 'pkg' command to update/install a particular port which required a ~38MB download. Each attempt to run the pkg command to install that port resulted in the SSD disconnecting before the 'fetch' of the pkg file completed, at perhaps 50-80% completion. After three consecutive failures, the 'pkg fetch' command was used to write the downloaded pkg file to a different partition (the /usr partition rather than the default /var partition), but the SSD again disconnected. The port was ultimately installed successfully by downloading the pkg file to a non-SSD drive and completing the installation from there.

The most curious aspect of this scenario is that each download attempt of the ~38MB file ran at ~150kB/s, with a correspondingly low average write rate to the SSD. In the final, successful install of the pkg file, over 200MB of files were written to the SSD very rapidly, yet the drive did not fault. Besides, many dozens of other port installs have completed without incident.
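
To put rough numbers on the failed downloads, here is a small sketch of the arithmetic in Python. It assumes decimal units (1 MB = 1000 kB) and a constant 150 kB/s, both of which are approximations of the figures quoted above, and the function name seconds_to_fraction is made up for illustration.

```python
def seconds_to_fraction(size_mb, rate_kb_s, fraction):
    """Seconds needed to fetch `fraction` of a file at a constant rate (1 MB = 1000 kB)."""
    return size_mb * 1000.0 * fraction / rate_kb_s


total = seconds_to_fraction(38, 150, 1.0)   # whole ~38MB file
early = seconds_to_fraction(38, 150, 0.5)   # 50% complete
late = seconds_to_fraction(38, 150, 0.8)    # 80% complete

print(f"full download: about {total:.0f} s")
print(f"failure window: about {early:.0f} to {late:.0f} s after the fetch starts")
```

On those assumptions the full fetch would take a little over four minutes, and each disconnect came roughly two to three and a half minutes in, after only about 19 to 30 MB had been written.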

One final point of interest is that prior to the acquisition of the Intel SSD, a KingSpec 32GB device was installed and exhibited essentially the same symptoms, although probably much more rapidly. At the time I put this down to it being a poor quality product, but the experience with the Intel device perhaps suggests that something else is at play?

I have a SATA->mSATA adapter on order and will try the SSD with that to see whether eliminating the mSATA port provides any improvement. In the meantime, is there anything else I can do to validate the condition of the SSD or resolve this problem?

Thanks.

7 Replies
ASouz7
Honored Contributor II

Hello ovirt,

We are going to try to recreate this behavior. Please answer the questions below.

1 - When the drive is working properly (still seen by the OS), can you put the system to sleep or restart it, and is the drive still seen after the system wakes or the restart finishes?

2 - What BIOS version do you have installed?

WSier
Beginner

Hi Aleki,

1. "Sleep" mode isn't used for this system. So long as the SSD has not "disconnected", the drive remains visible after any kind of restart (O/S reboot, hard reset, power removed).

2. BIOS version is 0067; see the excerpt below from the FreeBSD "kenv" command.

smbios.bios.reldate="07/03/2014"

smbios.bios.vendor="Intel Corp."

smbios.bios.version="MKQ7710H.86A.0067.2014.0703.1149"

smbios.memory.enabled="16777216"

smbios.planar.maker="Intel Corporation"

smbios.planar.product="DQ77MK"

smbios.planar.serial="BTMK22301K9J"

smbios.planar.version="AAG39642-400"

smbios.socket.enabled="1"

smbios.socket.populated="1"

smbios.version="2.7"

ASouz7
Honored Contributor II

Hello ovirt,

We are going to analyze it and will get back to you soon.

ASouz7
Honored Contributor II

Hello ovirt,

Please check your private messages.

WSier
Beginner

Another incident yesterday.

Approximately 206 hours of continuous uptime had elapsed since the previous incident. This one occurred between approximately 13:30 and 14:30 local time, while the PC was not being actively used. The desktop (gnome3) applications running were the Chromium browser and Evolution email. There is no obvious reason for any particularly large writes to the SSD at that time, although this is difficult to assess because of potential background activity, particularly from gnome3.

Also, not surprisingly, the "pkg" command that I originally reported as triggering the SSD disconnects has since been tried again twice and did not trigger the disconnect. In all probability that was simply a fortuitous coincidence.

ASouz7
Honored Contributor II

Hello ovirt,

Thank you for your last update. We have escalated this to our engineering department. We will get back to you soon.

DSarf
Beginner

Did you ever get this solved?

I have a similar issue, but mine is with a regular 2.5" SSD installed in the system. It appears that only the system disk loses connectivity; the system then restarts and displays "non-system disk or disk error". If I power cycle the system it works fine. In my case I'm running Windows Server 2012 R2, and I'm booting in BIOS mode (not UEFI).

Was wondering if you were able to get this resolved... I know this was from a while ago but I've been struggling with this issue for a very long time.

Thanks!