| APEI Error INJection |
| ~~~~~~~~~~~~~~~~~~~~ |
| |
| EINJ provides a hardware error injection mechanism. |
| It is very useful for debugging and testing of other APEI and RAS features. |
| |
| To use EINJ, make sure the following are enabled in your kernel |
| configuration: |
| |
| CONFIG_DEBUG_FS |
| CONFIG_ACPI_APEI |
| CONFIG_ACPI_APEI_EINJ |
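| |
| As a rough sketch (not part of the kernel tree), the options above can be |
| checked against a kernel configuration file. The helper name |
| check_einj_config is hypothetical, and the location of the configuration |
| file varies by distribution. |
| |
```shell
#!/bin/sh
# check_einj_config FILE
# Hypothetical helper: report which of the options required for EINJ are
# neither built in (=y) nor modular (=m) in the given kernel config file.
check_einj_config() {
    config_file=$1
    missing=""
    for opt in CONFIG_DEBUG_FS CONFIG_ACPI_APEI CONFIG_ACPI_APEI_EINJ; do
        if ! grep -Eq "^${opt}=(y|m)\$" "$config_file"; then
            # Append with a separating space once the list is non-empty.
            missing="${missing}${missing:+ }${opt}"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing: $missing"
        return 1
    fi
    echo "ok"
}
```
| |
| For instance, on a distribution that installs its configuration under |
| /boot: check_einj_config "/boot/config-$(uname -r)" |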
| |
| The user interface of EINJ is in the debug file system (debugfs), under the |
| directory apei/einj. The following files are provided. |
| |
| - available_error_type |
| Reading this file returns the error injection capability of the |
| platform, that is, which error types are supported. The error types |
| are defined as follows; the left field is the error type value and the |
| right field is the error description. |
| |
| 0x00000001 Processor Correctable |
| 0x00000002 Processor Uncorrectable non-fatal |
| 0x00000004 Processor Uncorrectable fatal |
| 0x00000008 Memory Correctable |
| 0x00000010 Memory Uncorrectable non-fatal |
| 0x00000020 Memory Uncorrectable fatal |
| 0x00000040 PCI Express Correctable |
| 0x00000080 PCI Express Uncorrectable non-fatal |
| 0x00000100 PCI Express Uncorrectable fatal |
| 0x00000200 Platform Correctable |
| 0x00000400 Platform Uncorrectable non-fatal |
| 0x00000800 Platform Uncorrectable fatal |
| |
| The file contents use the format above, except that only the lines for |
| the available error types are present. |
| |
| - error_type |
| This file is used to set the error type value. The error type values |
| are defined in the description of available_error_type above. |
| |
| - error_inject |
| Write any integer to this file to trigger the error |
| injection. Before this, please specify all necessary error |
| parameters. |
| |
| - flags |
| Present for kernel version 3.13 and above. Used to specify which |
| of param{1..4} are valid and should be used by the BIOS during injection. |
| The value is a bitmask as specified in the ACPI 5.0 specification for the |
| SET_ERROR_TYPE_WITH_ADDRESS data structure: |
| Bit 0 - Processor APIC field valid (see param3 below) |
| Bit 1 - Memory address and mask valid (param1 and param2) |
| Bit 2 - PCIe (seg,bus,dev,fn) valid (param4 below) |
| If set to zero, legacy behaviour is used, in which the injection type has |
| just one bit set and param1 is multiplexed (its meaning depends on the |
| error type). |
| |
| - param1 |
| This file is used to set the first error parameter value. Its effect |
| depends on the error_type specified. For example, if the error type is |
| memory related, param1 should be a valid physical memory address. |
| [Unless "flags" is set - see above] |
| |
| - param2 |
| This file is used to set the second error parameter value. Its effect |
| depends on the error_type specified. For example, if the error type is |
| memory related, param2 should be a physical memory address mask. |
| Linux requires page or narrower granularity, for example |
| 0xfffffffffffff000. |
| |
| - param3 |
| Used when the 0x1 bit is set in "flags" to specify the APIC ID. |
| |
| - param4 |
| Used when the 0x4 bit is set in "flags" to specify the target PCIe device. |
| |
| - notrigger |
| The EINJ mechanism is a two-step process: first inject the error, then |
| perform some actions to trigger it. Setting "notrigger" to 1 skips the |
| trigger phase, which *may* allow the user to cause the error in some other |
| context by a simple access to the CPU, memory location, or device that is |
| the target of the error injection. Whether this actually works depends |
| on what operations the BIOS actually includes in the trigger phase. |
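| |
| A minimal sketch of a memory error injection that uses the flags file, |
| assuming a kernel that provides it (3.13 or later) and a BIOS that honours |
| it. The function name einj_inject_memory and the EINJ_DIR variable are |
| hypothetical; EINJ_DIR defaults to the usual debugfs mount point. |
| |
```shell
#!/bin/sh
# Directory holding the EINJ files. EINJ_DIR is an illustrative override,
# not something the kernel defines.
EINJ_DIR=${EINJ_DIR:-/sys/kernel/debug/apei/einj}

# einj_inject_memory TYPE ADDR MASK
# Set up a memory error with an explicit address and mask, then inject it.
einj_inject_memory() {
    echo "$1" > "$EINJ_DIR/error_type" || return 1
    echo 0x2  > "$EINJ_DIR/flags"      || return 1  # bit 1: address and mask valid
    echo "$2" > "$EINJ_DIR/param1"     || return 1  # physical memory address
    echo "$3" > "$EINJ_DIR/param2"     || return 1  # address mask
    echo 1    > "$EINJ_DIR/error_inject"            # all parameters set: inject
}
```
| |
| For instance: einj_inject_memory 0x8 0x12345000 0xfffffffffffff000 |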
| |
| BIOS versions based on the ACPI 4.0 specification have limited options |
| to control where the errors are injected. Your BIOS may support an |
| extension (enabled with the param_extension=1 module parameter, or |
| boot command line einj.param_extension=1). This allows the address |
| and mask for memory injections to be specified by the param1 and |
| param2 files in apei/einj. |
| |
| BIOS versions using the ACPI 5.0 specification have more control over |
| the target of the injection. For processor-related errors (type 0x1, |
| 0x2 and 0x4) the APIC ID of the target should be provided using the |
| param1 file in apei/einj. For memory errors (type 0x8, 0x10 and 0x20) |
| the address is set using param1 with a mask in param2 (0x0 is equivalent |
| to all ones). For PCI express errors (type 0x40, 0x80 and 0x100) the |
| segment, bus, device and function are specified using param1: |
| |
|  31     24 23    16 15    11 10      8  7        0  |
| +-------------------------------------------------+ |
| | segment |   bus  | device | function | reserved | |
| +-------------------------------------------------+ |
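| |
| The layout above can be packed with shell arithmetic. This is a minimal |
| sketch; the helper name einj_pcie_param is hypothetical, and out-of-range |
| inputs are silently masked to the width of each field. |
| |
```shell
#!/bin/sh
# einj_pcie_param SEGMENT BUS DEVICE FUNCTION
# Hypothetical helper: print the param1 value for a PCI Express injection.
# Field positions: segment 31:24, bus 23:16, device 15:11, function 10:8,
# bits 7:0 reserved (left as zero).
einj_pcie_param() {
    printf '0x%08x\n' $(( (($1 & 0xff) << 24) | (($2 & 0xff) << 16) | (($3 & 0x1f) << 11) | (($4 & 0x7) << 8) ))
}
```
| |
| For instance, segment 0, bus 0x3a, device 0x1f, function 2 gives |
| 0x003afa00, which can then be written to param1. |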
| |
| An ACPI 5.0 BIOS may also allow vendor-specific errors to be injected. |
| In this case a file named vendor will contain identifying information |
| from the BIOS that hopefully will allow an application wishing to use |
| the vendor-specific extension to tell that it is running on a BIOS |
| that supports it. All vendor extensions have the 0x80000000 bit set in |
| error_type. A file vendor_flags controls the interpretation of param1 |
| and param2 (1 = PROCESSOR, 2 = MEMORY, 4 = PCI). See your BIOS vendor |
| documentation for details (and expect changes to this API if vendors' |
| creativity in using this feature expands beyond our expectations). |
| |
| Example: |
| # cd /sys/kernel/debug/apei/einj |
| # cat available_error_type # See which errors can be injected |
| 0x00000002 Processor Uncorrectable non-fatal |
| 0x00000008 Memory Correctable |
| 0x00000010 Memory Uncorrectable non-fatal |
| # echo 0x12345000 > param1 # Set memory address for injection |
| # echo 0xfffffffffffff000 > param2 # Mask - anywhere in this page |
| # echo 0x8 > error_type # Choose correctable memory error |
| # echo 1 > error_inject # Inject now |
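| |
| When scripting the sequence above, the error type can first be checked |
| against available_error_type. This is a sketch only: einj_type_available |
| and EINJ_DIR are hypothetical names, and the parsing assumes the |
| "value description" line format shown earlier. |
| |
```shell
#!/bin/sh
# Directory holding the EINJ files. EINJ_DIR is an illustrative override,
# not something the kernel defines.
EINJ_DIR=${EINJ_DIR:-/sys/kernel/debug/apei/einj}

# einj_type_available TYPE
# Succeed if TYPE (for example 0x8) is listed in available_error_type.
einj_type_available() {
    # Normalise to the 0x%08x form used in the file.
    want=$(printf '0x%08x' "$1") || return 2
    while read -r value _description; do
        if [ "$value" = "$want" ]; then
            return 0
        fi
    done < "$EINJ_DIR/available_error_type"
    return 1
}
```
| |
| For instance: einj_type_available 0x8 && echo 0x8 > "$EINJ_DIR/error_type" |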
| |
| |
| For more information about EINJ, please refer to the ACPI specification |
| version 4.0, section 17.5 and ACPI 5.0, section 18.6. |