Libata error messages

From ata Wiki
Revision as of 03:37, 28 March 2008 by Jgarzik

Overview

All libata error messages produced by the kernel use a standard format:

ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
         res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3.00: status: { DRDY }

Prefix

The prefix

ata3.00:

decodes as

ata prefix, indicating this is a libata port or device message
3 port number, counting from one (1)
00 device number, usually zero unless Port Multiplier or PATA master/slave is involved
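As an illustration of the decoding above, the following minimal sketch splits the prefix into its port and device numbers. The helper name `parse_prefix` is hypothetical and not part of libata; the sketch also assumes that port-level messages may omit the `.NN` device part.

```python
import re

# Hypothetical helper for illustration only; not part of libata.
# Assumes the form "ata<port>.<device>:" with an optional device part.
PREFIX_RE = re.compile(r"^ata(\d+)(?:\.(\d+))?:")

def parse_prefix(line):
    """Return (port, device) from a libata message prefix, or None.

    device is None when the prefix carries no device number.
    """
    m = PREFIX_RE.match(line)
    if m is None:
        return None
    port = int(m.group(1))  # port number, counting from one
    device = int(m.group(2)) if m.group(2) is not None else None
    return port, device

print(parse_prefix("ata3.00: status: { DRDY }"))
```

For the sample message this yields port 3, device 0.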

Exception line

The exception line gives an overview of the EH (Error Handler) state.

exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Emask Error classification bitmask (AC_ERR_xxx in source code)
SAct SATA SActive register
SErr SATA SError register
action ATA_EH_xxx actions, like revalidate, softreset, hardreset (see source code)
frozen if present, indicates the port was frozen for EH
t<number> number of retries
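The fields above can be pulled out of an exception line with a small parser. This is a sketch under the assumption that the line follows the sample format exactly; `parse_exception` and the dictionary keys are hypothetical names chosen for illustration.

```python
import re

# Matches the four hex fields in the order shown in the sample line.
# Anything after "action 0x.." (such as "frozen" or "t<number>") is
# captured as trailing words.
EXC_RE = re.compile(
    r"exception Emask (0x[0-9a-fA-F]+) SAct (0x[0-9a-fA-F]+) "
    r"SErr (0x[0-9a-fA-F]+) action (0x[0-9a-fA-F]+)(.*)"
)

def parse_exception(line):
    """Return a dict of the exception-line fields, or None if no match."""
    m = EXC_RE.search(line)
    if m is None:
        return None
    emask, sact, serr, action = (int(g, 16) for g in m.groups()[:4])
    extras = m.group(5).split()
    tries = None
    for word in extras:
        if re.fullmatch(r"t\d+", word):
            tries = int(word[1:])
    return {
        "emask": emask,
        "sact": sact,
        "serr": serr,
        "action": action,
        "frozen": "frozen" in extras,
        "tries": tries,  # None when no t<number> field is present
    }

info = parse_exception(
    "ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen"
)
print(info)
```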

Input taskfile

The "cmd" line gives the ATA command (taskfile) sent to the device:

cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0

This lists ATA registers in the following order:

Command
(separator)
Feature
NSect
LBA L
LBA M
LBA H
(separator)
HOB Feature
HOB NSect
HOB LBA L
HOB LBA M
HOB LBA H
(separator)
Device
tag NCQ tag number, or listed as zero if NCQ is not active/applicable.
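The register order above can be applied mechanically. The sketch below splits a "cmd" line into named registers; `parse_cmd` and the field names are hypothetical, and the last byte after the final separator (`a0` in the sample) is taken here to be the Device register.

```python
import re

# Register names in the order they appear on the "cmd" line.
# The names are illustrative, not taken from the kernel source.
CMD_FIELDS = (
    "command",
    "feature", "nsect", "lbal", "lbam", "lbah",
    "hob_feature", "hob_nsect", "hob_lbal", "hob_lbam", "hob_lbah",
    "device",
)

def parse_cmd(line):
    """Return a dict of input taskfile registers plus the tag, or None."""
    m = re.search(r"cmd (\S+) tag (\d+)", line)
    if m is None:
        return None
    raw = re.split(r"[/:]", m.group(1))  # both separators delimit bytes
    if len(raw) != len(CMD_FIELDS):
        return None
    regs = {name: int(value, 16) for name, value in zip(CMD_FIELDS, raw)}
    regs["tag"] = int(m.group(2))
    return regs

tf = parse_cmd("ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0")
print(hex(tf["command"]), hex(tf["device"]), tf["tag"])
```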

Output taskfile, error summary

The next line contains a current dump of the ATA device's registers, along with an error summary:

res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)

In order:

Status
(separator)
Error
NSect
LBA L
LBA M
LBA H
(separator)
HOB Error
HOB NSect
HOB LBA L
HOB LBA M
HOB LBA H
(separator)
Device
Emask ATA command's internal error mask (AC_ERR_xxx in source code)
(summary) An English summary of the error, such as
  • timeout
  • HSM violation
  • media error

See "Error classes" below for a full list.
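A "res" line can be decoded the same way as the "cmd" line, with the error mask and summary read from the tail. In this sketch `parse_res` and the field names are hypothetical, and the final byte of the register dump is assumed to be the Device register.

```python
import re

# Register names in the order they appear on the "res" line.
RES_FIELDS = (
    "status",
    "error", "nsect", "lbal", "lbam", "lbah",
    "hob_error", "hob_nsect", "hob_lbal", "hob_lbam", "hob_lbah",
    "device",
)

RES_RE = re.compile(r"res (\S+) Emask (0x[0-9a-fA-F]+) \((.*?)\)")

def parse_res(line):
    """Return output taskfile registers, Emask and summary, or None."""
    m = RES_RE.search(line)
    if m is None:
        return None
    raw = re.split(r"[/:]", m.group(1))
    if len(raw) != len(RES_FIELDS):
        return None
    out = {name: int(value, 16) for name, value in zip(RES_FIELDS, raw)}
    out["emask"] = int(m.group(2), 16)
    out["summary"] = m.group(3)
    return out

res = parse_res(
    "         res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)"
)
print(hex(res["status"]), hex(res["emask"]), res["summary"])
```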

ATA status expansion

The final line

status: { DRDY }

expands the ATA status register returned in the output taskfile into its component bits:

Busy Device busy (all other bits invalid)
DRDY Device ready. Normally 1, when all is OK.
DRQ Data ready to be sent/received via PIO
DF Device fault
ERR Error (see Error register for more info)
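The expansion of the status byte can be sketched as a bit test per flag. The bit positions used here are the conventional ATA status register assignments and are not stated on this page, so treat them as an assumption; `expand_status` is a hypothetical name.

```python
# Conventional ATA status register bits (assumed, not taken from this page).
ATA_BUSY = 0x80
STATUS_BITS = (
    (0x40, "DRDY"),
    (0x08, "DRQ"),
    (0x20, "DF"),
    (0x01, "ERR"),
)

def expand_status(status):
    """Return the names of the flags set in an ATA status byte."""
    if status & ATA_BUSY:
        # While the device is busy, all other bits are invalid.
        return ["Busy"]
    return [name for bit, name in STATUS_BITS if status & bit]

# The sample output taskfile starts with status byte 0x40.
print("status: { " + " ".join(expand_status(0x40)) + " }")
```

With the sample status byte 0x40 this prints `status: { DRDY }`, matching the message shown above.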

Error classes

These are the possible values for the (summary) in each error message, above.

host bus error Host<->chip bus error (e.g. PCI, if on a PCI bus)
ATA bus error Host<->device bus error
timeout Controller failed to respond to an active ATA command. There are many possible causes. Most often the cause is an unrelated interrupt subsystem bug that fails to deliver an interrupt libata was expecting from the hardware (try booting with 'pci=nomsi', 'acpi=off', or 'noapic').
HSM violation Hardware failed to respond in an expected manner
internal error Hardware flagged an impossible condition, most likely due to software misprogramming.
media error Software detected a media error
invalid argument Software marked ATA command as invalid, for some reason
device error Hardware indicates an error with last command
unknown error Uncategorized error (should never happen)
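The mapping from an Emask value to its summary can be sketched as a first-match lookup over the AC_ERR_xxx bits. The bit values and the lookup order below are recalled from the kernel's libata header and are an assumption here; check them against the source of the kernel in use. `summarize_emask` is a hypothetical name.

```python
# (bit, summary) pairs, checked in order; the first matching bit wins.
# Bit values are assumed from the kernel's AC_ERR_xxx definitions and
# may differ between kernel versions.
AC_ERR_SUMMARIES = (
    (1 << 5, "host bus error"),
    (1 << 4, "ATA bus error"),
    (1 << 2, "timeout"),
    (1 << 1, "HSM violation"),
    (1 << 6, "internal error"),
    (1 << 3, "media error"),
    (1 << 7, "invalid argument"),
    (1 << 0, "device error"),
)

def summarize_emask(emask):
    """Return the English summary for an error mask."""
    for bit, text in AC_ERR_SUMMARIES:
        if emask & bit:
            return text
    return "unknown error"

print(summarize_emask(0x4))
```

Under these assumed values, the sample `Emask 0x4` maps to `timeout`, which agrees with the summary printed in the sample message.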