Known issues
OBSOLETE CONTENT
This wiki has been archived and the content is no longer updated.
Hardware compatibility issues
Drives which perform frequent head unloads under Linux
Problem description
Some ATA hard drives perform very frequent head unloads under Linux, significantly shortening their lifespans.
Root cause
The inactivity timer for head unload is configured too aggressively, either via the ATA APM (Advanced Power Management) feature or by other non-standard means. Such aggressive settings are very sensitive to changes in the IO pattern, and under Linux many such drives unload their heads only to reload them shortly afterwards. Note that this relentless unloading/reloading cycle can also be triggered under Windows by installing programs which alter the IO pattern (e.g. certain antivirus programs which run in the background).
How to determine whether a machine has this problem
Many drives make a small clunking noise when they unload and/or reload their heads, so if you hear such a noise frequently (say, several times a minute), it is worth investigating. Drives usually implement Load_Cycle_Count or a similarly named counter in their SMART attributes, which can be read by running "smartctl -a /dev/[hs]dX" where "/dev/[hs]dX" is the drive in question. If the counter is too high compared to the hours the drive has been running (which is also often available in the SMART output), your hardware is likely to have this problem.
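The comparison between the counter and the running hours can be sketched as a small shell function. It assumes the usual smartctl attribute table layout (attribute name in column 2, raw value in column 10) and that the drive reports the counter as Load_Cycle_Count and the running time as a plain number in Power_On_Hours; attribute names and raw formats differ between vendors, so treat it as a starting point only. The function name unloads_per_hour is made up for this example.

```shell
#!/bin/sh
# Sketch: estimate head unloads per hour from smartctl attribute output read
# on stdin. Assumes attribute name in column 2 and raw value in column 10.
unloads_per_hour() {
    awk '
        $2 == "Load_Cycle_Count" { cycles = $10 }
        $2 == "Power_On_Hours"   { hours  = $10 }
        END {
            # Without a usable hour count there is nothing to divide by.
            if (hours > 0) printf "%.1f\n", cycles / hours
            else           print "unknown"
        }
    '
}

# Typical use (needs root and smartmontools installed):
#   smartctl -a /dev/sda | unloads_per_hour
```

The result is an average over the whole life of the drive, so a drive that only recently started misbehaving will look better than it is.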
Note that modern laptop drives are supposed to unload frequently to save power. Unless the unloading is excessive, disabling power saving is not a good idea. Most modern drives appear to be rated for 600,000 load/unload cycles, which translates to about two years of continuous uptime at 35 unloads per hour. Assuming 12 hours of usage every day, the drive will only reach its rated load/unload cycle limit after about four years, so such a drive shouldn't be considered malfunctioning. Please only report cases where the expected uptime is significantly lower than two years.
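The arithmetic behind these figures can be checked with a short shell sketch. The function name years_to_limit and its argument order are invented for this example; the 600,000 cycle rating and the 35 unloads per hour are the figures quoted above.

```shell
#!/bin/sh
# years_to_limit RATED_CYCLES UNLOADS_PER_HOUR HOURS_OF_USE_PER_DAY
# Prints how many years pass before the rated cycle count is reached.
years_to_limit() {
    awk -v rated="$1" -v rate="$2" -v daily="$3" \
        'BEGIN { printf "%.1f\n", rated / rate / daily / 365 }'
}

years_to_limit 600000 35 24   # continuous uptime: prints 2.0
years_to_limit 600000 35 12   # 12 hours of use per day: prints 3.9
```

Plugging in the measured unloads per hour of a suspect drive gives a quick indication of whether it falls significantly below the two-year mark.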
Known affected devices
This list is far from complete. If you have hardware affected by this problem, please write to linux-ide and attach the outputs of "dmidecode", "hdparm -I /dev/[hs]dX" and "smartctl -a /dev/[hs]dX" where /dev/[hs]dX is the affected drive. Also, please include how many times the drive unloads its heads per hour under nominal usage without any adjustment.
machine | dmidecode | hdparm -I | report | workaround |
---|---|---|---|---|
IBM ThinkPad T43 | dmidecode | hdparm -I | report | set APM to 255 |
Lenovo ThinkPad T60 w/ Hitachi drives | dmidecode | hdparm -I | | set APM to 255 |
HP Compaq 2710p | in report | in report (Toshiba MK8009GAH with APM 128) | report | set APM to 254 |
HP Compaq nx6325 | dmidecode | hdparm -i | report | set APM to 255 |
HP Compaq nx6325 (w/ different drive) | dmidecode | hdparm -i | report | set APM to 255 |
HP Compaq nx6325 (w/ different drive) | dmidecode | hdparm -i | report | set APM to 254 |
HP Compaq nx7400 | dmidecode | hdparm -i | report | set APM to 255 |
HP Pavilion dv6500 Notebook PC | dmidecode | hdparm -i | | set APM to 255 |
HP Pavilion dv9500 Notebook PC | dmidecode | hdparm -I sda hdparm -I sdb | report | set APM to 254 |
Dell e1505 | dmidecode | hdparm -I | | set APM to 255 |
Western Digital Green Drives | WD5000AACS, WD7500AYPS, WD1000FYPS | | | Requires the vendor-specific tool wdidle3.exe to change the configuration. The tool can only officially be obtained via WD support. |
Western Digital Scorpio | WD400UE-22HCT0 | | | Requires the vendor-specific tool wdidle3.exe to change the configuration. The tool can only officially be obtained via WD support. |
Dell Vostro 1400 | dmidecode | hdparm -I | report | set APM to 254 (or 252) |
Dell Vostro 1500 | dmidecode | hdparm -I | report | set APM to 254 |
Dell XPS M1330 | dmidecode | hdparm -I | report | set APM to 254 |
Dell XPS M1530 | dmidecode | hdparm -I | report | set APM to 254 |
Dell Inspiron 1318 | dmidecode | hdparm -I | report | set APM to 254 |
Dell Inspiron 1525 | dmidecode | hdparm -I | report | set APM to 255 |
Dell Inspiron 1705 | dmidecode | hdparm -I | report | set APM to 255 |
Dell Latitude E6410 | dmidecode | hdparm -I | report | set APM to 255 |
Samsung Q45 | dmidecode | hdparm -I | report | set APM to 254 |
Mac mini 1,1 | dmidecode | hdparm -I | report | set APM to 255 |
Acer Aspire 1691wlmi | dmidecode | hdparm -I | report | set APM to 255 |
Acer Aspire 2930z | dmidecode | hdparm -I | report | set APM to 255 |
MSI Notebook EX600 | dmidecode | hdparm -I | report | set APM to 254 |
MSI Notebook S425 | dmidecode | hdparm -I | report | set APM to 254 |
MSI Wind U-100 | dmidecode | hdparm -I | report | set APM to 254 |
ASUS F6S | dmidecode | hdparm -I | report | set APM to 254 |
ASUS M50SV | dmidecode | hdparm -I | report | set APM to 254 |
Sony VGN-FW31E | dmidecode | hdparm -I | report | set APM to 254 |
storage-fixup
storage-fixup is a script which uses dmidecode and hdparm outputs to match blacklisted devices and execute the appropriate workaround. The script should be run early during boot and whenever the system comes out of sleep. Please read the comments at the top of the script and the configuration file.
Note: all the links to storage-fixup are broken as of 2013 Jan 7. As a hacky fix on my HP Compaq 2710p notebook running CentOS 6.3, I added
hdparm -B 254 /dev/sda
to /etc/rc.d/rc.local
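A slightly more defensive version of that one-liner is sketched below. It is not the storage-fixup script. The function name set_apm and the HDPARM override variable are assumptions made for this example; only "hdparm -B LEVEL DEVICE" itself comes from hdparm. The same function would have to be called again from whatever resume hook the distribution provides, since the setting may be lost across sleep.

```shell
#!/bin/sh
# HDPARM can be overridden (e.g. HDPARM=echo) to preview the commands without
# touching any drive. This variable is an assumption of the sketch; hdparm
# itself does not read it.
HDPARM=${HDPARM:-hdparm}

# set_apm LEVEL DEVICE...
# Applies the given APM level (1-255, where 255 disables APM) to each device.
set_apm() {
    level=$1
    shift
    case $level in
        ''|*[!0-9]*) echo "invalid APM level: $level" >&2; return 1 ;;
    esac
    if [ "$level" -lt 1 ] || [ "$level" -gt 255 ]; then
        echo "APM level out of range (1-255): $level" >&2
        return 1
    fi
    for dev in "$@"; do
        "$HDPARM" -B "$level" "$dev" || return 1
    done
}

# Example for rc.local, matching the workaround above:
#   set_apm 254 /dev/sda
```

Validating the level first avoids handing a typo straight to hdparm during boot, when nobody is watching the output.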
TSSTcorp TS-L632D
Problem description
There is a known hardware issue when libata is used with certain TSSTcorp TS-L632D drives. The user might periodically experience random system freezes lasting a few minutes. The problem mostly occurs on Acer and Asus laptop computers.
Root cause
It seems that the firmware of the TS-L632D stops responding after being continuously polled by hald-addon-storage.
Affected firmware versions
- Acer: All firmware versions including AC00 and AC01
- Asus: All firmware versions including AS05 and AS99
- Samsung: SC02
Good firmware versions
- Samsung: SC03
Workaround solution
- keep a CD in the TS-L632D drive or
- kill the hald-addon-storage process or
- cross-flash the drive firmware of TS-L632D to the SC03 version.
How to cross-flash
Be warned that you cross-flash the drive firmware at your own risk. It might also void the warranty.
- Download the Samsung firmware update utility
- Download the Samsung firmware
- Run "sfdnwin -nocheck" to crossflash.
Seagate hard drives which time out FLUSH CACHE when NCQ is being used
Problem description
On certain Seagate drives sold during late 2008, FLUSH CACHE sometimes times out when used in combination with NCQ commands.
Root cause
Firmware bug.
Affected devices
capacity | model | product number | firmware revisions | remarks |
---|---|---|---|---|
1.5TB | ST31500341AS | 9JU138 | SD1[5-9] | problem happens most frequently on this model |
1.0TB | ST31000333AS | 9FZ136 | SD1[5-9] | happens less frequently |
640GB | ST3640623AS | 9FZ164 | SD1[5-9] | happens less frequently |
640GB | ST3640323AS | 9FZ134 | SD1[5-9] | happens less frequently |
320GB | ST3320813AS | 9FZ182 | SD1[5-9] | happens less frequently |
320GB | ST3320613AS | 9FZ162 | SD1[5-9] | happens less frequently |
Solution
libata will automatically disable NCQ and emit a warning message when any of the above devices is detected with any of the above firmware revisions. A firmware update solves the problem. Contact Seagate for details.
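To check a drive by hand against the table above, the model and firmware match can be sketched as a shell function. This only mirrors the table; it is not how libata implements its own in-kernel check, and the function name is_affected is invented for the example. The model and firmware strings would typically be read from the "hdparm -I" output.

```shell
#!/bin/sh
# is_affected MODEL FIRMWARE
# Returns 0 if the model/firmware pair appears in the table of affected
# Seagate drives, 1 otherwise.
is_affected() {
    case $1 in
        ST31500341AS|ST31000333AS|ST3640623AS|ST3640323AS|ST3320813AS|ST3320613AS)
            # Firmware revisions SD15 through SD19 are listed as affected.
            case $2 in
                SD1[5-9]) return 0 ;;
            esac
            ;;
    esac
    return 1
}

# Example:
#   is_affected ST31500341AS SD17 && echo "affected: update the firmware"
```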
SATA hard drives which show poor performance with sequential reads (e.g. hdparm -t)
Problem description
Poor sequential read performance - "hdparm -t" measurements reach only 20% to 90% of the expected speed. Similar performance problems have been observed during software RAID checks and rebuilds etc.
Root cause
Firmware bug. The problem only occurs with NCQ enabled in AHCI mode.
Affected devices
vendor | model name | capacity | family | actual speed | expected speed | affected firmware revision | patched firmware revision | affected NCQ depths |
---|---|---|---|---|---|---|---|---|
Seagate | ST3320620AS | 320GB | Barracuda 7200.10 | 56MB/s | 76MB/s | 3.AAE | N/A | ? |
Seagate | ST3320620AS | 320GB | Barracuda 7200.10 | 65MB/s | 72MB/s | 3.AAK | 3.AAM | ? |
Seagate | ST3500830AS | 500GB | Barracuda 7200.10 | 43MB/s | 78MB/s | 3.AFD | N/A | ? |
Seagate | ST3320613AS | 320GB | Barracuda 7200.11 | 43MB/s | >110MB/s | SD22 | SD2B | ? |
Seagate | ST3500320AS | 500GB | Barracuda 7200.11 | 53MB/s | >100MB/s | SD15 | SD1A | ? |
Seagate | ST3250310NS | 250GB | Barracuda ES.2 | ~45MB/s | >100MB/s | MA08(dell) | ? | >2 (MA08) |
Seagate | ST3750330NS | 750GB | Barracuda ES.2 | 50MB/s | >100MB/s | SN04 | SN05 | ? |
Seagate | ST31000340NS | 1TB | Barracuda ES.2 | ~50MB/s | >100MB/s | SN04 MA0D | SN05 SN16 AN05 | >2 (MA0D) |
Western Digital | WDC WD800JD-00MSA1 | 80GB | Caviar SE SATAII | 26MB/s | 59MB/s | 10.01E01 | N/A | ? |
Western Digital | WDC WD1600JS-22NCB1 | 160GB | Caviar SE SATAII | 29MB/s | 60MB/s | 10.02E02 | N/A | ? |
Western Digital | WDC WD740ADFD-00NLR5 | 74GB | Raptor EL150 | 18MB/s | 85MB/s | 21.07QR5 | ? | all |
Solution
To restore performance, update to the patched firmware, or request patched firmware from the hard drive vendor if it is not already available. Alternatively, disable or limit NCQ with "echo Number > /sys/block/sdX/device/queue_depth", but note that this may degrade performance in other scenarios. Experiment with different values, e.g. 2 or 1.
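The queue depth experiment can be wrapped in a small helper that shows the previous value before changing it, so it is easy to restore. The function name set_queue_depth and the SYSFS_ROOT variable are assumptions made for this sketch; only the sysfs path comes from the text above. Writing to the real file requires root.

```shell
#!/bin/sh
# SYSFS_ROOT is a hypothetical override so the function can be tried against a
# scratch directory; on a real system it stays at /sys.
SYSFS_ROOT=${SYSFS_ROOT:-/sys}

# set_queue_depth DISK DEPTH   (e.g. set_queue_depth sda 1)
set_queue_depth() {
    path="$SYSFS_ROOT/block/$1/device/queue_depth"
    if [ ! -w "$path" ]; then
        echo "cannot write $path" >&2
        return 1
    fi
    # Report the old value so the change can be undone later.
    echo "previous queue_depth for $1: $(cat "$path")"
    echo "$2" > "$path"
}

# Example: compare "hdparm -t /dev/sda" results after each of
#   set_queue_depth sda 2
#   set_queue_depth sda 1
```

The setting does not survive a reboot, so a value that helps would have to be reapplied from a boot script.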
Samsung HD501LJ may freeze
Problem description
The drive can hang randomly, but very infrequently: with 20 identical drives you can expect about one hang per month.
Root cause
Firmware bug. Running smartctl may possibly alleviate the problem.
Affected devices
SAMSUNG HD501LJ (CW026) CR100-12
SAMSUNG HD321KJ (DN133) CP100-12
Solution
Update firmware to CR100-13 / CP100-13.
WD MyBook Studio Edition does not work with JMicron eSATA
The Western Digital MyBook Studio Edition external hard drive does not work (it is not detected) when connected to a JMicron adapter like JMB363. See kernel bug 9913.
The problem can be worked around by limiting the eSATA link speed to 1.5Gbps.
Recent HP laptops fail disk detection after resuming from suspend
Problem description
As of 2009-05-18, many recent HP laptops fail to detect the disk after resuming from suspend if the ATA controller is in AHCI mode.
The problem has been reported to HP, and it seems that HP has already released firmware updates which fix the problem on some of the affected machines. A libata developer contacted HP but has not succeeded in getting the information needed to work around the problem. Please read bko#12276 for details.
Affected devices
The problem has been reported on HP Pavilion dv5, DV5t-1000, HDX16t, dv3507, HDX18, dv6-1030us, dv5-1120el, DV4 1050el and dv7. Note that not all reports have been verified and the list may not be complete.
Solution
Update the BIOS of the affected HP machines. As of the end of May 2009, HP is rolling out newer releases.
BIOS versions that fix the issue (the list is not complete; add other models):
- HP dv5-1xxx (Intel): F.16A
- HP dv4-1050el: F.34A
- HP HDX18t: F23
Samsung Spinpoint F3 and Spinpoint F3EG are not detected on AMD SB850
Problem description
The drive will clunk every 2 seconds for 10 seconds and then power itself off when attached to the AMD SB850 southbridge.
Root cause
Firmware bug.
Affected devices
Samsung Spinpoint F3
Samsung Spinpoint F3EG
Solution
Update disk firmware.
Samsung Spinpoint F4EG suffers from silent data corruption
Problem description
Disks will develop corrupted sectors from write errors. The errors occur if an IDENTIFY DEVICE command is issued (by smartctl or hdparm, for example) while writes are occurring with NCQ enabled. No error messages are logged anywhere despite the actual corruption.
Root cause
Firmware bug. The problem only occurs with NCQ enabled.
Affected devices
Samsung Spinpoint F4EG HD155UI, HD155UI/Z4, HD155UI/UZ4 (1.5TB); HD204UI, HD204UI/Z4, HD204UI/UZ4 (2TB); and HD204UI/JP1, HD204UI/JP2 (2TB)
Solution
Update disk firmware.
WARNING: The patched firmware cannot be identified, as it shares the same revision number with the original broken firmware! All owners of these disks must run the update utility to verify that their disk runs the patched firmware.
NOTICE: Disks will suffer data corruption even if their owner does not run smartctl or hdparm manually, as those utilities are shipped with Linux distributions and smartctl is used at every boot in order to protect users from failing or failed disks.
Attached storage devices may not be detected on Intel ICH5 series southbridge
Problem description
When all six storage devices are populated on ICH5 (4 PATA and 2 SATA), only four of them will be detected by the BIOS and OS on some mainboards (e.g. Albatron 865 with ICH5).
Root cause
An odd BIOS implementation uses legacy mode with compatible configuration option 3, which puts the PATA and SATA controllers in combined mode. This limit was introduced in the name of BIOS code forward compatibility with future designs such as ICH6, which has 4 SATA ports.
Affected southbridge
ICH5
Solution
None. Contact the mainboard manufacturer and request a BIOS update that provides an option to use ICH5 in non-combined mode with enhanced configuration, by setting the PATA controller in legacy mode and the SATA controller in native mode.
SATA 3.0 (6Gb/s) devices may not be detected on Intel 5 series and mobile 5 series southbridge
Problem description
The handshake between a SATA device and the SATA controller on the Ibex Peak PCH southbridge may not complete after booting from ACPI G2 or G3 state, or after resuming from ACPI S3 or S4 state. The device then does not down-shift to 3Gb/s and is not detected.
Root cause
Southbridge erratum #21.
Affected southbridge
3400, 3420, 3450, B55, H55, HM55, P55, PM55, H57, HM57, Q57, QM57, QS57
Solution
None. Reset the system until the device is detected.
SATA devices may not be detected at the 3 Gb/s ports of B2 stepping Intel 6 series and mobile 6 series southbridge
Problem description
SATA link performance may degrade over time on some B2 stepping Cougar Point PCH southbridges. Links develop increased bit error rates, and failed transfers have to be retried upon error detection by the SATA controller. As the wear-out continues, performance gets worse because the SATA controller spends more time retrying failed transfers than sending actual data. Eventually the attached devices are disconnected because of the unreliable link caused by an unstable clock, and are no longer detected at all.
Root cause
Southbridge erratum #14.
Affected southbridge
H67, P67, HM65, HM67
Solution
None. Use only the two 6 Gb/s ports until the manufacturer replaces B2 PCH systems exhibiting problems at the 3 Gb/s SATA ports with B3 stepping ones. B3 stepping is expected to be available in late April.
Hardware design issues
IDE port poor performance
Problem description
IDE disks have poor performance when installed on a couple of MSI boards.
Root cause
Odd board design to overcome a southbridge limitation (no IDE bus). Although there are 5 unused PCIe ports on the southbridge, a USB to IDE IC (JM20335) is used instead of a decent PCIe to IDE IC. USB 2.0 is a half duplex bus that can only sustain up to 30MB/s. What is worse, any USB device attached to the front USB ports (5, 6, 7) shares bandwidth with the JM20335, which is attached at port 8. If multiple USB devices are working at the same time, the performance penalty gets bigger.
Affected boards
vendor | model |
---|---|
MSI | G965M (MS-7241) |
MSI | G965MDH (MS-7276) |
Solution
A PCI or PCIe adapter will provide the performance needed for recent disks. A PATA to SATA bridge will do the same job if only one disk is used.
SATA port sequence shift on different SATA modes
Problem description
Some Intel chipsets change SATA port sequence while switching from AHCI or RAID to IDE mode and vice versa. This change can break settings based on specific configurations when disks are attached at the affected ports. For example it can alter the boot order which leads to either wrong disk booting or boot failure.
Root cause
Odd hardware design: affected chips map the primary slave to SATA port 2 and the secondary master to SATA port 1.
Affected chipsets
chipset | SATA port ordering in IDE mode | SATA port ordering in AHCI or RAID mode |
---|---|---|
ICH6R | 0, 2, 1, 3 | 0, 1, 2, 3 |
ICH7R, ICH7DH | 0, 2, 1, 3 | 0, 1, 2, 3 |
631xESB, 632xESB, IICH(3100) | 0, 2, 1, 3 | 0, 1, 2, 3, 4, 5 |
ICH8 | 0, 2, 1, 3 | 0, 1, 2, 3 |
ICH8R, ICH8DH, ICH8DO | 0, 2, 1, 3, 4, 5 | 0, 1, 2, 3, 4, 5 |
ICH9R, ICH9DH, ICH9DO | 0, 2, 1, 3, 4, 5 | 0, 1, 2, 3, 4, 5 |
ICH10R, ICH10D, ICH10DO | 0, 2, 1, 3, 4, 5 | 0, 1, 2, 3, 4, 5 |
PCH(3420, 3450, B55, H55, HM55, P55, PM55, H57, HM57, Q57, QM57, QS57) | 0, 2, 1, 3, 4, 5 | 0, 1, 2, 3, 4, 5 |
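Read as a mapping, the IDE mode column says that the disk seen in position N sits on the port returned by the sketch below: positions 1 and 2 are swapped and everything else stays in place. The function name ide_mode_port is made up for this illustration, and positions 4 and 5 only exist on the six-port chipsets.

```shell
#!/bin/sh
# ide_mode_port POSITION
# Prints the SATA port number found at the given position in IDE mode,
# following the "0, 2, 1, 3, 4, 5" ordering in the table above.
ide_mode_port() {
    case $1 in
        1) echo 2 ;;
        2) echo 1 ;;
        0|3|4|5) echo "$1" ;;
        *) echo "no such position: $1" >&2; return 1 ;;
    esac
}

# Example: list the IDE mode ordering of a four-port chipset.
#   for pos in 0 1 2 3; do ide_mode_port "$pos"; done
```

In AHCI or RAID mode the mapping is the identity, which is why a disk on port 1 or 2 appears to move when the mode is switched.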
Solution
Use UUID-based or label-based device nodes, which are the default on most modern distros anyway. If traditional device nodes need to be used, choose the desired mode and stick to it.
ATA 4 KiB sector issues
This issue is discussed in a separate page - ATA 4 KiB sector issues.