Known issues

From ata Wiki

OBSOLETE CONTENT

This wiki has been archived and the content is no longer updated.


Hardware compatibility issues

Drives which perform frequent head unloads under Linux

Problem description

Some ATA hard drives perform very frequent head unloads under Linux, significantly shortening their lifespans.

Root cause

The inactivity timer for head unload is configured too aggressively, either via the ATA APM (Advanced Power Management) feature or by other non-standard means. Such aggressive settings are very sensitive to changes in the IO pattern, and under Linux many such drives unload their heads only to reload them shortly afterwards. Note that this relentless unload/reload cycle can also be triggered under Windows by installing programs which alter the IO pattern (e.g. certain antivirus programs which run in the background).

How to determine whether a machine has this problem

Many drives make a small clunking noise when they unload and/or reload their heads, so if you hear such a noise frequently (say, several times a minute), it is worth investigating. Drives usually implement a Start_Stop_Count or similarly named counter in their SMART attributes, which can be read by running "smartctl -a /dev/[hs]dX" where "/dev/[hs]dX" is the drive in question. If the counter is too high compared to the hours the drive has been running (which is also often available in the SMART output), your hardware is likely to have this problem.
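The comparison described above can be sketched in Python. This is only an illustration: it assumes the usual ten-column attribute table printed by "smartctl -a", with the raw value in the last column, and the function names are made up for this example. The counter name defaults to Start_Stop_Count as mentioned above; on many drives the relevant counter is reported as Load_Cycle_Count instead, so check the actual output of your drive.

```python
import re


def parse_smart_attributes(smartctl_output):
    """Return {attribute_name: raw_value} from the attribute table of `smartctl -a`."""
    attributes = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns;
        # the raw value is the tenth column.
        if len(fields) >= 10 and fields[0].isdigit():
            match = re.match(r"\d+", fields[9])
            if match:
                attributes[fields[1]] = int(match.group())
    return attributes


def cycles_per_hour(smartctl_output, counter="Start_Stop_Count"):
    """Return counter / Power_On_Hours, or None if either value is missing."""
    attrs = parse_smart_attributes(smartctl_output)
    hours = attrs.get("Power_On_Hours")
    cycles = attrs.get(counter)
    if not hours or cycles is None:
        return None
    return cycles / hours
```

The text passed in would be the captured output of "smartctl -a /dev/[hs]dX", which normally has to be run as root.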

Note that modern laptop drives are supposed to unload frequently to save power. Unless the unloading is excessive, disabling power saving is not a good idea. It seems that most modern drives are rated for 600,000 load/unload cycles, which translates to about two years of uptime at 35 unloads per hour. Even assuming 12 hours of use every day, the drive will only reach its rated load/unload cycle limit after about four years, so it shouldn't be considered malfunctioning. Please only report cases where the expected uptime is significantly lower than two years.
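The arithmetic behind these figures can be reproduced with a few lines of Python. The 600,000-cycle rating and the 35 unloads per hour are the numbers quoted above; the function name and its parameters are only illustrative.

```python
def years_to_rated_limit(unloads_per_hour, hours_per_day=24.0, rated_cycles=600_000):
    """Years until the rated load/unload cycle count is reached."""
    hours_of_uptime = rated_cycles / unloads_per_hour
    return hours_of_uptime / (hours_per_day * 365)


print(round(years_to_rated_limit(35), 2))      # continuous uptime: 1.96
print(round(years_to_rated_limit(35, 12), 2))  # 12 hours of use per day: 3.91
```

Plugging in the unload rate measured on your own drive gives a rough idea of whether the case is worth reporting.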

Known affected devices

This list is far from complete. If you have hardware affected by this problem, please write to linux-ide and attach the outputs of "dmidecode", "hdparm -I /dev/[hs]dX" and "smartctl -a /dev/[hs]dX" where /dev/[hs]dX is the affected drive. Also, please include how many times per hour the drive unloads its heads under normal usage without any adjustment.


device | dmidecode | hdparm -I | report | workaround
IBM ThinkPad T43 | dmidecode | hdparm -I | report | set APM to 255
Lenovo ThinkPad T60 w/ Hitachi drives | dmidecode | hdparm -I | | set APM to 255
HP Compaq 2710p | in report | in report (Toshiba MK8009GAH with APM 128) | report | set APM to 254
HP Compaq nx6325 | dmidecode | hdparm -i | report | set APM to 255
HP Compaq nx6325 (w/ different drive) | dmidecode | hdparm -i | report | set APM to 255
HP Compaq nx6325 (w/ different drive) | dmidecode | hdparm -i | report | set APM to 254
HP Compaq nx7400 | dmidecode | hdparm -i | report | set APM to 255
HP Pavilion dv6500 Notebook PC | dmidecode | hdparm -i | | set APM to 255
HP Pavilion dv9500 Notebook PC | dmidecode | hdparm -I sda, hdparm -I sdb | report | set APM to 254
Dell e1505 | dmidecode | hdparm -I | | set APM to 255
Western Digital Green Drives WD5000AACS, WD7500AYPS, WD1000FYPS | | | | Requires the vendor-specific tool wdidle3.exe to change the configuration. The tool can only officially be obtained via WD support.
Western Digital Scorpio WD400UE-22HCT0 | | | | Requires the vendor-specific tool wdidle3.exe to change the configuration. The tool can only officially be obtained via WD support.
Dell Vostro 1400 | dmidecode | hdparm -I | report | set APM to 254 (or 252)
Dell Vostro 1500 | dmidecode | hdparm -I | report | set APM to 254
Dell XPS M1330 | dmidecode | hdparm -I | report | set APM to 254
Dell XPS M1530 | dmidecode | hdparm -I | report | set APM to 254
Dell Inspiron 1318 | dmidecode | hdparm -I | report | set APM to 254
Dell Inspiron 1525 | dmidecode | -I | report | set APM to 255
Dell Inspiron 1705 | dmidecode | hdparm -I | report | set APM to 255
Dell Latitude E6410 | dmidecode | hdparm -I | report | set APM to 255
Samsung Q45 | dmidecode | hdparm -I | report | set APM to 254
Mac mini 1,1 | dmidecode | hdparm -I | report | set APM to 255
Acer Aspire 1691wlmi | dmidecode | hdparm -I | report | set APM to 255
Acer Aspire 2930z | dmidecode | hdparm -I | report | set APM to 255
MSI Notebook EX600 | dmidecode | hdparm -I | report | set APM to 254
MSI Notebook S425 | dmidecode | hdparm -I | report | set APM to 254
MSI Wind U-100 | dmidecode | hdparm -I | report | set APM to 254
ASUS F6S | dmidecode | hdparm -I | report | set APM to 254
ASUS M50SV | dmidecode | hdparm -I | report | set APM to 254
Sony VGN-FW31E | dmidecode | hdparm -I | report | set APM to 254

storage-fixup

storage-fixup is a script which uses dmidecode and hdparm outputs to match blacklisted devices and execute the appropriate workaround. The script should be run early during boot and whenever the system comes out of sleep. Please read the comments at the top of the script and the configuration file.
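The matching idea can be illustrated with a short Python sketch. This is not the storage-fixup script or its configuration format: the rule table, the function name and the way the machine and drive names are passed in are all assumptions made for this example. The single rule is taken from the HP Compaq 2710p entry in the list of known affected devices.

```python
import re

# Hypothetical rule table: (machine name pattern, drive model pattern, APM level).
RULES = [
    (r"HP Compaq 2710p", r"MK8009GAH", 254),
]


def apm_command(dmi_product, drive_model, device, rules=RULES):
    """Return the hdparm argument list for the first matching rule, or None."""
    for product_re, model_re, apm in rules:
        if (re.search(product_re, dmi_product, re.IGNORECASE)
                and re.search(model_re, drive_model, re.IGNORECASE)):
            # "hdparm -B <level> <device>" sets the APM level of the drive.
            return ["hdparm", "-B", str(apm), device]
    return None
```

The returned list could be handed to subprocess.run() by a boot or resume hook; the product name and drive model would come from the dmidecode and hdparm outputs.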

Note: all the links to storage-fixup are broken as of 2013 Jan 7. As a hacky fix on my HP Compaq 2710p notebook running CentOS 6.3, I added

 hdparm -B 254 /dev/sda

to /etc/rc.d/rc.local

TSSTCorp TS-L632D

Problem description

There is a known hardware issue when libata is used with certain TSSTcorp TS-L632D drives. The user might periodically experience random system freezes lasting a few minutes. The problem mostly occurs on Acer and Asus laptop computers.

Root cause

It seems that the firmware of the TS-L632D stops responding after being continuously polled by hald-addon-storage.

Affected firmware versions

  • Acer: All firmware versions including AC00 and AC01
  • Asus: All firmware versions including AS05 and AS99
  • Samsung: SC02

Good firmware versions

Samsung SC03

Workaround solution

  1. keep a CD in the TS-L632D drive or
  2. kill the hald-addon-storage process or
  3. cross-flash the drive firmware of TS-L632D to the SC03 version.

How to cross-flash

Be warned that you cross-flash the firmware of the drive at your own risk. It might also void the warranty.

  1. Download the Samsung firmware update utility
  2. Download the Samsung firmware
  3. Run "sfdnwin -nocheck" to crossflash.

Seagate hard drives which time out FLUSH CACHE when NCQ is being used

Problem description

On certain Seagate drives sold during late 2008, FLUSH CACHE sometimes times out when used in combination with NCQ commands.

Root cause

Firmware bug.

Affected devices

capacity | model | product number | firmware revisions | remarks
1.5TB | ST31500341AS | 9JU138 | SD1[5-9] | problem happens most frequently on this model
1.0TB | ST31000333AS | 9FZ136 | SD1[5-9] | happens less frequently
640GB | ST3640623AS | 9FZ164 | SD1[5-9] | happens less frequently
640GB | ST3640323AS | 9FZ134 | SD1[5-9] | happens less frequently
320GB | ST3320813AS | 9FZ182 | SD1[5-9] | happens less frequently
320GB | ST3320613AS | 9FZ162 | SD1[5-9] | happens less frequently

Solution

libata will automatically disable NCQ and emit a warning message when any of the above devices is detected with any of the above firmware revisions. A firmware update solves the problem. Contact Seagate for details.
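To check a drive by hand against the table above, the model and firmware patterns can be expressed as a small Python sketch. This is not libata's blacklist code; the names are made up for illustration, and the model and firmware strings are assumed to be those reported by "hdparm -I" or "smartctl -a".

```python
import re

# Models and firmware pattern copied from the table of affected devices.
AFFECTED_MODELS = {
    "ST31500341AS", "ST31000333AS", "ST3640623AS",
    "ST3640323AS", "ST3320813AS", "ST3320613AS",
}
AFFECTED_FIRMWARE = re.compile(r"SD1[5-9]")


def flush_ncq_bug_suspected(model, firmware):
    """True if the model and firmware revision both match the table."""
    return (model.strip() in AFFECTED_MODELS
            and AFFECTED_FIRMWARE.fullmatch(firmware.strip()) is not None)
```

For example, an ST31500341AS with firmware SD17 matches, while the same model with any revision outside SD15 to SD19 does not.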


SATA hard drives which show poor performance with sequential reads (e.g. hdparm -t)

Problem description

Poor sequential read performance - "hdparm -t" measurements reach only 20% to 90% of the expected speed. Similar performance problems have been observed during software RAID checks and rebuilds etc.
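A quick way to put a number on this is to compare the "hdparm -t" result with the expected speed. The sketch below assumes the usual "... = NN.NN MB/sec" result line printed by "hdparm -t"; the function name is illustrative, and the expected speed has to be supplied by the user (for instance from the vendor data sheet).

```python
import re


def fraction_of_expected(hdparm_t_output, expected_mb_per_s):
    """Return measured/expected throughput, or None if no result line is found."""
    match = re.search(r"=\s*([\d.]+)\s*MB/sec", hdparm_t_output)
    if match is None:
        return None
    return float(match.group(1)) / expected_mb_per_s
```

A result well below 1.0 on an otherwise idle system suggests the drive may be affected.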

Root cause

Firmware bug. The problem only occurs with NCQ enabled in AHCI mode.

Affected devices

vendor | model name | capacity | family | actual speed | expected speed | affected firmware revision | patched firmware revision | affected NCQ depths
Seagate | ST3320620AS | 320GB | Barracuda 7200.10 | 56MB/s | 76MB/s | 3.AAE | N/A | ?
Seagate | ST3320620AS | 320GB | Barracuda 7200.10 | 65MB/s | 72MB/s | 3.AAK | 3.AAM | ?
Seagate | ST3500830AS | 500GB | Barracuda 7200.10 | 43MB/s | 78MB/s | 3.AFD | N/A | ?
Seagate | ST3320613AS | 320GB | Barracuda 7200.11 | 43MB/s | >110MB/s | SD22 | SD2B | ?
Seagate | ST3500320AS | 500GB | Barracuda 7200.11 | 53MB/s | >100MB/s | SD15 | SD1A | ?
Seagate | ST3250310NS | 250GB | Barracuda ES.2 | ~45MB/s | >100MB/s | MA08(dell) | ? | >2 (MA08)
Seagate | ST3750330NS | 750GB | Barracuda ES.2 | 50MB/s | >100MB/s | SN04 | SN05 | ?
Seagate | ST31000340NS | 1TB | Barracuda ES.2 | ~50MB/s | >100MB/s | SN04 MA0D | SN05 SN16 AN05 | >2 (MA0D)
Western Digital | WDC WD800JD-00MSA1 | 80GB | Caviar SE SATAII | 26MB/s | 59MB/s | 10.01E01 | N/A | ?
Western Digital | WDC WD1600JS-22NCB1 | 160GB | Caviar SE SATAII | 29MB/s | 60MB/s | 10.02E02 | N/A | ?
Western Digital | WDC WD740ADFD-00NLR5 | 74GB | Raptor EL150 | 18MB/s | 85MB/s | 21.07QR5 | ? | all

Solution

To restore performance, update to the patched firmware, or request patched firmware from the hard drive vendor if it is not already available. Disabling or limiting NCQ with "echo Number > /sys/block/sdX/device/queue_depth" also works around the problem but may degrade performance in other scenarios. Experiment with different values, e.g. 2 or 1.
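The same sysfs write can be done from Python, which is convenient when trying several depths in a row. This is a minimal sketch: the function name and the sysfs_root parameter (there only so the function can be exercised without touching a real disk) are assumptions for this example, and writing to the real file requires root.

```python
from pathlib import Path


def set_queue_depth(disk, depth, sysfs_root="/sys"):
    """Write the NCQ queue depth for e.g. disk="sdX" and return the value read back."""
    if depth < 1:
        raise ValueError("queue depth must be at least 1")
    path = Path(sysfs_root) / "block" / disk / "device" / "queue_depth"
    path.write_text(f"{depth}\n")
    # Read the file back to see the value that is now in effect.
    return int(path.read_text())
```

Measuring with "hdparm -t" after each change shows which depth, if any, restores the expected speed.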

Samsung HD501LJ may freeze

Problem description

The drive can hang randomly, but very infrequently: with 20 identical drives you can expect to see about one hang per month.

Root cause

Firmware bug. Running smartctl may possibly alleviate it.

Affected devices

SAMSUNG HD501LJ (CW026) CR100-12

SAMSUNG HD321KJ (DN133) CP100-12

Solution

Update firmware to CR100-13 / CP100-13.

WD MyBook Studio Edition does not work with JMicron eSATA

The Western Digital MyBook Studio Edition external hard drive does not work (it is not detected) when connected to a JMicron adapter like JMB363. See kernel bug 9913.

The workaround is to switch the eSATA link speed down to 1.5Gbps.

Recent HP laptops fail disk detection after resuming from suspend

Problem description

As of 2009-05-18, many recent HP laptops fail to detect the disk after resuming from suspend if the ATA controller is in AHCI mode.

The problem has been reported to HP, and it seems that HP has already released firmware updates which fix the problem on some of the affected machines. A libata developer contacted HP but was unable to obtain the information needed to work around the problem. Please read bko#12276 for details.

Affected devices

The problem has been reported on HP Pavilion dv5, DV5t-1000, HDX16t, dv3507, HDX18, dv6-1030us, dv5-1120el, DV4 1050el and dv7. Note that not all reports have been verified and the list may not be complete.

Solution

The solution is to update the BIOS of the affected HP machines. HP is rolling out newer releases at the time of writing (end of May 2009).

BIOS versions that fix the issue (the list is not complete, add other models):

  • HP dv5-1xxx (Intel): F.16A
  • HP dv4-1050el: F.34A
  • HP HDX18t: F23

Samsung Spinpoint F3 and Spinpoint F3EG are not detected on AMD SB850

Problem description

The drive will clunk every 2 seconds for 10 seconds and then power itself off when attached to the AMD SB850 southbridge.

Root cause

Firmware bug.

Affected devices

Samsung Spinpoint F3

Samsung Spinpoint F3EG

Solution

Update disk firmware.

Samsung Spinpoint F4EG suffers from silent data corruption

Problem description

Disks will develop corrupted sectors from write errors. The errors occur if an IDENTIFY DEVICE command is issued (by smartctl or hdparm, for example) while writes are occurring with NCQ enabled. No error messages are logged anywhere despite the actual corruption.

Root cause

Firmware bug. The problem only occurs with NCQ enabled.

Affected devices

Samsung Spinpoint F4EG HD155UI, HD155UI/Z4, HD155UI/UZ4 (1.5TB); HD204UI, HD204UI/Z4, HD204UI/UZ4 (2TB); and HD204UI/JP1, HD204UI/JP2 (2TB)

Solution

Update disk firmware.

WARNING: Patched firmware cannot be identified because it shares the same revision number with the original broken firmware! All owners of these disks must run the update utility to make sure that their disk runs the patched firmware.

NOTICE: Disks will suffer data corruption even if their owner does not use smartctl or hdparm explicitly, as those utilities are shipped with Linux distributions and smartctl is used at every boot in order to protect users from failing or failed disks.

Attached storage devices may not be detected on Intel ICH5 series southbridge

Problem description

When all six storage device ports are populated on ICH5 (4 PATA and 2 SATA), only four of the devices will be detected by the BIOS and OS on some mainboards (e.g. Albatron 865 with ICH5).

Root cause

An odd BIOS implementation uses legacy mode with compatible configuration option 3, setting the PATA and SATA controllers in combined mode. This limit was introduced in the name of BIOS code forward compatibility with future designs such as ICH6, which has 4 SATA ports.

Affected southbridge

ICH5

Solution

None. Contact the mainboard manufacturer and request a BIOS update that provides an option to use ICH5 in non-combined mode with enhanced configuration, by setting the PATA controller in legacy mode and the SATA controller in native mode.

SATA 3.0 (6Gb/s) devices may not be detected on Intel 5 series and mobile 5 series southbridge

Problem description

The handshake between a SATA device and the SATA controller on the Ibex Peak PCH southbridge may not complete after booting from ACPI G2 or G3 state, or after resuming from ACPI S3 or S4 state. The device then does not down-shift to 3Gb/s and is left undetected.

Root cause

Southbridge erratum #21.

Affected southbridge

3400, 3420, 3450, B55, H55, HM55, P55, PM55, H57, HM57, Q57, QM57, QS57

Solution

None. Reset the system until the device is detected.

SATA devices may not be detected at the 3 Gb/s ports of B2 stepping Intel 6 series and mobile 6 series southbridge

Problem description

SATA link performance may degrade over time on some B2 stepping Cougar Point PCH southbridges. Links will develop increased bit error rates, and failed transfers have to be retried upon error detection by the SATA controller. As the wear-out continues, performance will get worse because the SATA controller spends more time retrying failed transfers than sending actual data. At some point, attached devices will be disconnected because the link has become unreliable due to an unstable clock, and they will not be detected at all.

Root cause

Southbridge erratum #14.

Affected southbridge

H67, P67, HM65, HM67

Solution

None. Use only the two 6 Gb/s ports until the manufacturer replaces existing B2 PCH systems exhibiting problems at the 3 Gb/s SATA ports with B3 stepping ones. B3 stepping is expected to be available in late April.

Hardware design issues

IDE port poor performance

Problem description

IDE disks have poor performance when installed on a couple of MSI boards.

Root cause

Odd board design, meant to overcome a southbridge limitation (no IDE bus). While there are 5 unused PCIe ports on the southbridge, a USB to IDE IC (JM20335) is used instead of a decent PCIe to IDE IC. USB 2.0 is a half-duplex bus that can only sustain up to 30MB/s. What is worse, if any USB device is attached to the front USB ports (5, 6, 7), that device will share the bandwidth with the JM20335, which is attached to port 8. If multiple USB devices are working at the same time, the performance penalty gets bigger.

Affected boards

vendor model
MSI G965M (MS-7241)
MSI G965MDH (MS-7276)

Solution

A PCI or PCIe adapter will provide the performance needed for recent disks. A PATA to SATA bridge will also do the job if only one disk is used.

SATA port sequence shift on different SATA modes

Problem description

Some Intel chipsets change SATA port sequence while switching from AHCI or RAID to IDE mode and vice versa. This change can break settings based on specific configurations when disks are attached at the affected ports. For example it can alter the boot order which leads to either wrong disk booting or boot failure.

Root cause

Odd hardware design: affected chips place the primary slave at SATA port 2 and the secondary master at SATA port 1.

Affected chipsets

chipset | SATA port ordering in IDE mode | SATA port ordering in AHCI or RAID mode
ICH6R | 0, 2, 1, 3 | 0, 1, 2, 3
ICH7R, ICH7DH | 0, 2, 1, 3 | 0, 1, 2, 3
631xESB, 632xESB, IICH(3100) | 0, 2, 1, 3 | 0, 1, 2, 3, 4, 5
ICH8 | 0, 2, 1, 3 | 0, 1, 2, 3
ICH8R, ICH8DH, ICH8DO | 0, 2, 1, 3, 4, 5 | 0, 1, 2, 3, 4, 5
ICH9R, ICH9DH, ICH9DO | 0, 2, 1, 3, 4, 5 | 0, 1, 2, 3, 4, 5
ICH10R, ICH10D, ICH10DO | 0, 2, 1, 3, 4, 5 | 0, 1, 2, 3, 4, 5
PCH(3420, 3450, B55, H55, HM55, P55, PM55, H57, HM57, Q57, QM57, QS57) | 0, 2, 1, 3, 4, 5 | 0, 1, 2, 3, 4, 5
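To see which enumeration positions are affected by a mode switch, the two orderings of a table row can be compared directly. The following Python sketch is only an illustration; the function name is made up, and it compares only the positions present in both lists.

```python
def enumeration_changes(ide_order, ahci_order):
    """Map each position whose SATA port differs to (port in IDE mode, port in AHCI/RAID mode)."""
    return {
        position: (ide_port, ahci_port)
        for position, (ide_port, ahci_port) in enumerate(zip(ide_order, ahci_order))
        if ide_port != ahci_port
    }


# ICH6R row: the second and third positions swap between modes.
print(enumeration_changes([0, 2, 1, 3], [0, 1, 2, 3]))  # {1: (2, 1), 2: (1, 2)}
```

Disks attached to ports 0 and 3 keep their position; only disks on ports 1 and 2 are enumerated differently.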

Solution

Use device nodes based on UUID or label, which are the default on most modern distros anyway. If traditional device nodes need to be used, choose the desired mode and stick to it.
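On systems using udev, UUID-based nodes are symlinks under /dev/disk/by-uuid. A short Python sketch (the function name is illustrative) can list which traditional device node each UUID currently points to, which makes it easy to see that the UUIDs stay the same while the sdX names move after a mode switch.

```python
import os


def uuid_to_device(by_uuid_dir="/dev/disk/by-uuid"):
    """Return {uuid: resolved device path} for every symlink in by_uuid_dir."""
    mapping = {}
    for name in sorted(os.listdir(by_uuid_dir)):
        # Each entry is a symlink such as ../../sda1; resolve it to a full path.
        mapping[name] = os.path.realpath(os.path.join(by_uuid_dir, name))
    return mapping
```

Running it once in each SATA mode and comparing the results shows which traditional device names changed.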

ATA 4 KiB sector issues

This issue is discussed in a separate page - ATA 4 KiB sector issues.
