Intel Architecture Software Developer's Manual
Volume 3 : System Programming
CHAPTER 5 INTERRUPT AND EXCEPTION HANDLING
This chapter describes the processor's interrupt and exception-handling mechanism, when operating
in protected mode. Most of the information provided here also applies to the interrupt and
exception mechanism used in real-address or virtual-8086 mode. Refer to Chapter 16, 8086
Emulation, for a description of the differences in the interrupt and exception mechanism for real-address
and virtual-8086 mode.
5.1. INTERRUPT AND EXCEPTION OVERVIEW
Interrupts and exceptions are forced transfers of execution from the currently running program
or task to a special procedure or task called a handler. Interrupts typically occur at random times
during the execution of a program, in response to signals from hardware. They are used to handle
events external to the processor, such as requests to service peripheral devices. Software can also
generate interrupts by executing the INT n instruction. Exceptions occur when the processor
detects an error condition while executing an instruction, such as division by zero. The processor
detects a variety of error conditions including protection violations, page faults, and internal
machine faults. The machine-check architecture of the P6 family and Pentium® processors
also permits a machine-check exception to be generated when internal hardware errors and bus
errors are detected.
The processor's interrupt and exception-handling mechanism allows interrupts and exceptions
to be handled transparently to application programs and the operating system or executive.
When an interrupt is received or an exception is detected, the currently running procedure or
task is automatically suspended while the processor executes an interrupt or exception handler.
When execution of the handler is complete, the processor resumes execution of the interrupted
procedure or task. The resumption of the interrupted procedure or task happens without loss of
program continuity, unless recovery from an exception was not possible or an interrupt caused
the currently running program to be terminated.
A detailed description of the exceptions and the conditions that cause
them to be generated is given at the end of this chapter.
5.1.1. Sources of Interrupts
The processor receives interrupts from two sources:
External (hardware generated) interrupts.
Software-generated interrupts.
5.1.1.1. EXTERNAL INTERRUPTS
External interrupts are received through pins on the processor or through the local APIC serial
bus. The primary interrupt pins on a P6 family or Pentium® processor are the LINT[1:0] pins,
which are connected to the local APIC (refer to Section 7.5., "Advanced Programmable Interrupt
Controller (APIC)" in Chapter 7, Multiple-Processor Management). When the local APIC
is disabled, these pins are configured as INTR and NMI pins, respectively. Asserting the INTR
pin signals the processor that an external interrupt has occurred, and the processor reads from
the system bus the interrupt vector number provided by an external interrupt controller, such as
an 8259A (refer to Section 5.2., "Exception and Interrupt Vectors"). Asserting the NMI pin
signals a nonmaskable interrupt (NMI), which is assigned to interrupt vector 2.
When the local APIC is enabled, the LINT[1:0] pins can be programmed through the APIC's
vector table to be associated with any of the processor's exception or interrupt vectors.
The processor's local APIC can be connected to a system-based I/O APIC. Here, external interrupts
received at the I/O APIC's pins can be directed to the local APIC through the APIC serial
bus (pins PICD[1:0]). The I/O APIC determines the vector number of the interrupt and sends
this number to the local APIC. When a system contains multiple processors, processors can also
send interrupts to one another by means of the APIC serial bus.
The LINT[1:0] pins are not available on the Intel486™ processor and the earlier Pentium®
processors that do not contain an on-chip local APIC. Instead these processors have dedicated
NMI and INTR pins. With these processors, external interrupts are typically generated by a
system-based interrupt controller (8259A), with the interrupts being signaled through the INTR
pin.
Note that several other pins on the processor cause a processor interrupt to occur; however, these
interrupts are not handled by the interrupt and exception mechanism described in this chapter.
These pins include the RESET#, FLUSH#, STPCLK#, SMI#, R/S#, and INIT# pins. Which of
these pins are included on a particular Intel Architecture processor is implementation dependent.
The functions of these pins are described in the data books for the individual processors. The
SMI# pin is also described in Chapter 12, System Management Mode (SMM).
5.1.1.2. MASKABLE HARDWARE INTERRUPTS
Any external interrupt that is delivered to the processor by means of the INTR pin or through
the local APIC is called a maskable hardware interrupt. The maskable hardware interrupts
that can be delivered through the INTR pin include all Intel Architecture defined interrupt
vectors from 0 through 255; those that can be delivered through the local APIC include interrupt
vectors 16 through 255.
All maskable hardware interrupts can be masked as a group. Use the single IF flag in the
EFLAGS register (refer to Section 5.6.1., "Masking Maskable Hardware Interrupts") to mask
these maskable interrupts. Note that when interrupts 0 through 15 are delivered through the local
APIC, the APIC indicates the receipt of an illegal vector.
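The vector ranges and the IF gating described above can be summarized in a minimal C sketch. The names delivery_path, vector_deliverable, and maskable_interrupt_serviced are hypothetical and chosen only for illustration; the bit position of the IF flag (bit 9) comes from the EFLAGS register layout rather than from this section.

```c
#include <assert.h>
#include <stdbool.h>

/* Delivery paths for maskable hardware interrupts (illustrative names). */
enum delivery_path { PATH_INTR_PIN, PATH_LOCAL_APIC };

/* Returns true if `vector` is legal for the given delivery path:
 * 0..255 through the INTR pin, 16..255 through the local APIC. */
bool vector_deliverable(enum delivery_path path, unsigned vector)
{
    assert(path == PATH_INTR_PIN || path == PATH_LOCAL_APIC);
    if (vector > 255)
        return false;
    if (path == PATH_LOCAL_APIC)
        return vector >= 16;    /* 0..15 are flagged as illegal vectors */
    return true;
}

/* A maskable hardware interrupt is serviced only while EFLAGS.IF (bit 9)
 * is set. */
bool maskable_interrupt_serviced(unsigned long eflags)
{
    return (eflags & (1UL << 9)) != 0;
}
```

For example, vector_deliverable(PATH_LOCAL_APIC, 15) is false in this model, which corresponds to the illegal-vector indication described above.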
5.1.1.3. SOFTWARE-GENERATED INTERRUPTS
The INT n instruction permits interrupts to be generated from within software by supplying the
interrupt vector number as an operand. For example, the INT 35 instruction forces an implicit
call to the interrupt handler for interrupt 35.
Any of the interrupt vectors from 0 to 255 can be used as a parameter in this instruction. If the
processor's predefined NMI vector is used, however, the response of the processor will not be
the same as it would be from an NMI interrupt generated in the normal manner. If vector number
2 (the NMI vector) is used in this instruction, the NMI interrupt handler is called, but the
processor's NMI-handling hardware is not activated.
Note that interrupts generated in software with the INT n instruction cannot be masked by the
IF flag in the EFLAGS register.
5.1.2. Sources of Exceptions
The processor receives exceptions from three sources:
Processor-detected program-error exceptions.
Software-generated exceptions.
Machine-check exceptions.
5.1.2.1. PROGRAM-ERROR EXCEPTIONS
The processor generates one or more exceptions when it detects program errors during the
execution of an application program or the operating system or executive. The Intel Architecture
defines a vector number for each processor-detectable exception. The exceptions are further
classified as faults, traps, and aborts (refer to Section 5.3., "Exception Classifications").
5.1.2.2. SOFTWARE-GENERATED EXCEPTIONS
The INTO, INT 3, and BOUND instructions permit exceptions to be generated in software.
These instructions allow checks for specific exception conditions to be performed at specific
points in the instruction stream. For example, the INT 3 instruction causes a breakpoint exception
to be generated.
The INT n instruction can be used to emulate a specific exception in software, with one limitation.
If the n operand in the INT n instruction contains a vector for one of the Intel Architecture
exceptions, the processor will generate an interrupt to that vector, which will in turn invoke the
exception handler associated with that vector. Because this is actually an interrupt, however, the
processor does not push an error code onto the stack, even if a hardware-generated exception for
that vector normally produces one. For those exceptions that produce an error code, the exception
handler will attempt to pop an error code from the stack while handling the exception. If the
INT n instruction was used to emulate the generation of an exception, the handler will pop off
and discard the EIP (in place of the missing error code), sending the return to the wrong location.
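The error-code limitation can be illustrated with a toy model of the stack. The sketch below is not processor code: toy_stack, enter_handler, and resume_eip_after_handler are hypothetical names, and the EFLAGS, CS, and EIP values are arbitrary sample values. It assumes a handler that always pops one word as the error code before returning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SAMPLE_EFLAGS 0x00000202u   /* arbitrary sample values */
#define SAMPLE_CS     0x00000008u
#define SAMPLE_EIP    0x00001000u

/* Toy stack: `top` counts the slots in use. */
struct toy_stack { uint32_t slot[8]; int top; };

static void push(struct toy_stack *s, uint32_t v)
{
    assert(s->top < 8);
    s->slot[s->top++] = v;
}

static uint32_t pop(struct toy_stack *s)
{
    assert(s->top > 0);
    return s->slot[--s->top];
}

/* Build the frame left for the handler: EFLAGS, CS, EIP, and (only for a
 * hardware-generated exception that has one) an error code. */
static void enter_handler(struct toy_stack *s, bool push_error_code)
{
    push(s, SAMPLE_EFLAGS);
    push(s, SAMPLE_CS);
    push(s, SAMPLE_EIP);
    if (push_error_code)
        push(s, 0);             /* error code */
}

/* Returns the value the return sequence would load into EIP. */
uint32_t resume_eip_after_handler(bool raised_by_int_n)
{
    struct toy_stack s = { {0}, 0 };
    enter_handler(&s, !raised_by_int_n);
    (void)pop(&s);              /* handler discards the "error code" */
    return pop(&s);             /* the return loads EIP from the next slot */
}
```

In this model, resume_eip_after_handler(false) yields the saved EIP, while resume_eip_after_handler(true) yields the saved CS value, because the handler has already discarded the EIP.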
5.1.2.3. MACHINE-CHECK EXCEPTIONS
The P6 family and Pentium® processors provide both internal and external machine-check
mechanisms for checking the operation of the internal chip hardware and bus transactions.
These mechanisms constitute extended (implementation dependent) exception mechanisms.
When a machine-check error is detected, the processor signals a machine-check exception
(vector 18) and returns an error code. Refer to "Interrupt 18-Machine Check Exception
(#MC)" at the end of this chapter and Chapter 13, Machine-Check Architecture, for a detailed
description of the machine-check mechanism.
5.2. EXCEPTION AND INTERRUPT VECTORS
The processor associates an identification number, called a vector, with each exception and
interrupt. Table 5-1 shows the assignment of exception and interrupt vectors. This table also
gives the exception type for each vector, indicates whether an error code is saved on the stack
for an exception, and gives the source of the exception or interrupt.
The vectors in the range 0 through 31 are assigned to the exceptions and the NMI interrupt. Not
all of these vectors are currently used by the processor. Unassigned vectors in this range are
reserved for possible future uses. Do not use the reserved vectors.
The vectors in the range 32 to 255 are designated as user-defined interrupts. These interrupts are
not reserved by the Intel Architecture and are generally assigned to external I/O devices to
permit them to signal the processor through one of the external hardware interrupt mechanisms
described in Section 5.1.1., "Sources of Interrupts".
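The split of the vector space can be expressed as a one-line check. The following is a minimal sketch; the names vector_class and classify_vector are hypothetical.

```c
#include <assert.h>

enum vector_class {
    VECTOR_ARCH_DEFINED,    /* 0..31: exceptions, NMI, and reserved vectors */
    VECTOR_USER_DEFINED     /* 32..255: available for user-defined interrupts */
};

/* Classifies a vector number according to the ranges given above. */
enum vector_class classify_vector(unsigned vector)
{
    assert(vector <= 255);
    return vector <= 31 ? VECTOR_ARCH_DEFINED : VECTOR_USER_DEFINED;
}
```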
5.3. EXCEPTION CLASSIFICATIONS
Exceptions are classified as faults, traps, or aborts depending on the way they are reported and
whether the instruction that caused the exception can be restarted with no loss of program or task
continuity.
Faults
A fault is an exception that can generally be corrected and that, once corrected,
allows the program to be restarted with no loss of continuity. When a fault is
reported, the processor restores the machine state to the state prior to the beginning
of execution of the faulting instruction. The return address (saved contents
of the CS and EIP registers) for the fault handler points to the faulting instruction,
rather than the instruction following the faulting instruction.
Note: There is a small subset of exceptions that are normally reported as
faults but that, in architectural corner cases, are not restartable, and some
processor context will be lost. An example of these cases is the execution of the
POPAD instruction where the stack frame crosses over the end of the stack
segment. The exception handler will see that the CS:EIP has been restored as
if the POPAD instruction had not executed; however, internal processor state
(general-purpose registers) will have been modified. These corner cases are
considered programming errors, and an application causing this class of
exceptions will likely be terminated by the operating system.
Traps
A trap is an exception that is reported immediately following the execution of
the trapping instruction. Traps allow execution of a program or task to be
continued without loss of program continuity. The return address for the trap
handler points to the instruction to be executed after the trapping instruction.
Aborts
An abort is an exception that does not always report the precise location of the
instruction causing the exception and does not allow restart of the program or
task that caused the exception. Aborts are used to report severe errors, such as
hardware errors and inconsistent or illegal values in system tables.
NOTES:
1. The UD2 instruction was introduced in the Pentium® Pro processor.
2. Intel Architecture processors after the Intel386™ processor do not generate this exception.
3. This exception was introduced in the Intel486™ processor.
4. This exception was introduced in the Pentium® processor and enhanced in the P6 family processors.
5. This exception was introduced in the Pentium® III processor.
5.4. PROGRAM OR TASK RESTART
To allow the restarting of a program or task following the handling of an exception or an interrupt, all
exceptions except aborts are guaranteed to report the exception on a precise instruction
boundary, and all interrupts are guaranteed to be taken on an instruction boundary.
For fault-class exceptions, the return instruction pointer that the processor saves when it generates
the exception points to the faulting instruction. So, when a program or task is restarted
following the handling of a fault, the faulting instruction is restarted (re-executed). Restarting
the faulting instruction is commonly used to handle exceptions that are generated when access
to an operand is blocked. The most common example of a fault is a page-fault exception (#PF)
that occurs when a program or task references an operand in a page that is not in memory. When
a page-fault exception occurs, the exception handler can load the page into memory and resume
execution of the program or task by restarting the faulting instruction. To ensure that this instruction
restart is handled transparently to the currently executing program or task, the processor
saves the necessary registers and stack pointers to allow it to restore itself to its state prior to the
execution of the faulting instruction.
For trap-class exceptions, the return instruction pointer points to the instruction following the
trapping instruction. If a trap is detected during an instruction which transfers execution, the
return instruction pointer reflects the transfer. For example, if a trap is detected while executing
a JMP instruction, the return instruction pointer points to the destination of the JMP instruction,
not to the next address past the JMP instruction. All trap exceptions allow program or task restart
with no loss of continuity. For example, the overflow exception is a trapping exception. Here,
the return instruction pointer points to the instruction following the INTO instruction that tested
the OF (overflow) flag in the EFLAGS register. The trap handler for this exception resolves the
overflow condition. Upon return from the trap handler, program or task execution continues at
the next instruction following the INTO instruction.
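The difference between the return pointers saved for faults and traps can be sketched as follows. The function saved_return_eip and its parameters are hypothetical; next_eip stands for the address of the instruction that would execute next, which is the branch target when the trapping instruction transfers execution.

```c
#include <assert.h>
#include <stdint.h>

enum exception_class { EXC_FAULT, EXC_TRAP };

/* insn_eip: address of the faulting or trapping instruction.
 * next_eip: address of the instruction that executes after it
 *           (sequential successor, or the destination of a JMP). */
uint32_t saved_return_eip(enum exception_class cls,
                          uint32_t insn_eip, uint32_t next_eip)
{
    assert(cls == EXC_FAULT || cls == EXC_TRAP);
    /* A fault re-executes the instruction; a trap continues after it. */
    return cls == EXC_FAULT ? insn_eip : next_eip;
}
```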
The abort-class exceptions do not support reliable restarting of the program or task. Abort
handlers generally are designed to collect diagnostic information about the state of the processor
when the abort exception occurred and then shut down the application and system as gracefully
as possible.
Interrupts rigorously support restarting of interrupted programs and tasks without loss of continuity.
The return instruction pointer saved for an interrupt points to the next instruction to be
executed at the instruction boundary where the processor took the interrupt. If the instruction
just executed has a repeat prefix, the interrupt is taken at the end of the current iteration with the
registers set to execute the next iteration.
The ability of a P6 family processor to speculatively execute instructions does not affect the
taking of interrupts by the processor. Interrupts are taken at instruction boundaries located
during the retirement phase of instruction execution; so they are always taken in the "in-order"
instruction stream. Refer to Chapter 2, Introduction to the Intel Architecture, in the Intel Architecture
Software Developer's Manual, Volume 1, for more information about the P6 family
processors' microarchitecture and its support for out-of-order instruction execution.
Note that the Pentium® processor and earlier Intel Architecture processors also perform varying
amounts of prefetching and preliminary decoding of instructions; however, here also exceptions
and interrupts are not signaled until actual "in-order" execution of the instructions. For a given
code sample, the signaling of exceptions will occur uniformly when the code is executed on any
family of Intel Architecture processors (except where new exceptions or new opcodes have been
defined).
5.5. NONMASKABLE INTERRUPT (NMI)
The nonmaskable interrupt (NMI) can be generated in either of two ways:
External hardware asserts the NMI pin.
The processor receives a message on the APIC serial bus of delivery mode NMI.
When the processor receives an NMI from either of these sources, the processor handles it immediately
by calling the NMI handler pointed to by interrupt vector number 2. The processor also
invokes certain hardware conditions to ensure that no other interrupts, including NMI interrupts,
are received until the NMI handler has completed executing (refer to Section 5.5.1., "Handling
Multiple NMIs").
Also, when an NMI is received from either of the above sources, it cannot be masked by the IF
flag in the EFLAGS register.
It is possible to issue a maskable hardware interrupt (through the INTR pin) to vector 2 to invoke
the NMI interrupt handler; however, this interrupt will not truly be an NMI interrupt. A true NMI
interrupt that activates the processor's NMI-handling hardware can only be delivered through
one of the mechanisms listed above.
5.5.1. Handling Multiple NMIs
While an NMI interrupt handler is executing, the processor disables additional calls to the NMI
handler until the next IRET instruction is executed. This blocking of subsequent NMIs prevents
stacking up calls to the NMI handler. It is recommended that the NMI interrupt handler be
accessed through an interrupt gate to disable maskable hardware interrupts (refer to Section
5.6.1., "Masking Maskable Hardware Interrupts").
5.6. ENABLING AND DISABLING INTERRUPTS
The processor inhibits the generation of some interrupts, depending on the state of the processor
and of the IF and RF flags in the EFLAGS register, as described in the following sections.
5.6.1. Masking Maskable Hardware Interrupts
The IF flag can disable the servicing of maskable hardware interrupts received on the
processor's INTR pin or through the local APIC (refer to Section 5.1.1.2., "Maskable Hardware
Interrupts"). When the IF flag is clear, the processor inhibits interrupts delivered to the INTR
pin or through the local APIC from generating an internal interrupt request; when the IF flag is
set, interrupts delivered to the INTR pin or through the local APIC are processed as normal
external interrupts. The IF flag does not affect nonmaskable interrupts (NMIs) delivered to the
NMI pin or delivery mode NMI messages delivered through the APIC serial bus, nor does it
affect processor generated exceptions. As with the other flags in the EFLAGS register, the
processor clears the IF flag in response to a hardware reset.
The fact that the group of maskable hardware interrupts includes the reserved interrupt and
exception vectors 0 through 31 can potentially cause confusion. Architecturally, when the IF
flag is set, an interrupt for any of the vectors from 0 through 31 can be delivered to the processor
through the INTR pin and any of the vectors from 16 through 31 can be delivered through the
local APIC. The processor will then generate an interrupt and call the interrupt or exception
handler pointed to by the vector number. So for example, it is possible to invoke the page-fault
handler through the INTR pin (by means of vector 14); however, this is not a true page-fault
exception. It is an interrupt. As with the INT n instruction (refer to Section 5.1.2.2., "Software-
Generated Exceptions"), when an interrupt is generated through the INTR pin to an exception
vector, the processor does not push an error code on the stack, so the exception handler may not
operate correctly.
The IF flag can be set or cleared with the STI (set interrupt-enable flag) and CLI (clear interrupt-enable
flag) instructions, respectively. These instructions may be executed only if the CPL is
equal to or less than the IOPL. A general-protection exception (#GP) is generated if they are
executed when the CPL is greater than the IOPL. (The effect of the IOPL on these instructions
is modified slightly when the virtual mode extension is enabled by setting the VME flag in
control register CR4; refer to Section 16.3., "Interrupt and Exception Handling in Virtual-8086
Mode" in Chapter 16, 8086 Emulation.)
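The privilege rule for STI and CLI can be modeled in a few lines of C. This is a simplified sketch that ignores the VME modification mentioned above. The names are hypothetical, and the bit positions (IF at bit 9, IOPL at bits 12 and 13) are taken from the EFLAGS register layout rather than from this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_IF         (1u << 9)   /* interrupt-enable flag */
#define EFLAGS_IOPL_SHIFT 12          /* IOPL occupies bits 12..13 */

static unsigned eflags_iopl(uint32_t eflags)
{
    return (eflags >> EFLAGS_IOPL_SHIFT) & 3u;
}

/* Returns true if STI or CLI may execute at privilege level `cpl`;
 * false corresponds to a general-protection exception (#GP). */
bool cli_sti_permitted(unsigned cpl, uint32_t eflags)
{
    assert(cpl <= 3);
    return cpl <= eflags_iopl(eflags);
}

/* Returns the EFLAGS image after STI (enable = true) or CLI
 * (enable = false), assuming the instruction was permitted. */
uint32_t with_interrupt_flag(uint32_t eflags, bool enable)
{
    return enable ? (eflags | EFLAGS_IF) : (eflags & ~EFLAGS_IF);
}
```

With IOPL equal to 0, only code running at CPL 0 passes the check; with IOPL equal to 3, code at any privilege level does.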
The IF flag is also affected by the following operations:
The PUSHF instruction stores all flags on the stack, where they can be examined and
modified. The POPF instruction can be used to load the modified flags back into the
EFLAGS register.
Task switches and the POPF and IRET instructions load the EFLAGS register; therefore,
they can be used to modify the setting of the IF flag.
When an interrupt is handled through an interrupt gate, the IF flag is automatically cleared,
which disables maskable hardware interrupts. (If an interrupt is handled through a trap
gate, the IF flag is not cleared.)
Refer to the descriptions of the CLI, STI, PUSHF, POPF, and IRET instructions in Chapter 3,
Instruction Set Reference, of the Intel Architecture Software Developer's Manual, Volume 2, for
a detailed description of the operations these instructions are allowed to perform on the IF flag.
5.6.2. Masking Instruction Breakpoints
The RF (resume) flag in the EFLAGS register controls the response of the processor to instruction-
breakpoint conditions (refer to the description of the RF flag in Section 2.3., "System Flags
and Fields in the EFLAGS Register" in Chapter 2, System Architecture Overview). When set, it
prevents an instruction breakpoint from generating a debug exception (#DB); when clear,
instruction breakpoints will generate debug exceptions. The primary function of the RF flag is
to prevent the processor from going into a debug exception loop on an instruction-breakpoint.
Refer to Section 15.3.1.1., "Instruction-Breakpoint Exception Condition", in Chapter 15,
Debugging and Performance Monitoring, for more information on the use of this flag.
5.6.3. Masking Exceptions and Interrupts When Switching Stacks
To switch to a different stack segment, software often uses a pair of instructions, for example:
MOV SS, AX
MOV ESP, StackTop
If an interrupt or exception occurs after the segment selector has been loaded into the SS register
but before the ESP register has been loaded, these two parts of the logical address into the stack
space are inconsistent for the duration of the interrupt or exception handler.
To prevent this situation, the processor inhibits interrupts, debug exceptions, and single-step trap
exceptions after either a MOV to SS instruction or a POP to SS instruction, until the instruction
boundary following the next instruction is reached. All other faults may still be generated. If the
LSS instruction is used to modify the contents of the SS register (which is the recommended
method of modifying this register), this problem does not occur.
5.7. PRIORITY AMONG SIMULTANEOUS EXCEPTIONS AND INTERRUPTS
If more than one exception or interrupt is pending at an instruction boundary, the processor
services them in a predictable order. Table 5-3 shows the priority among classes of exception
and interrupt sources. While priority among these classes is consistent throughout the architecture,
exceptions within each class are implementation-dependent and may vary from processor
to processor. The processor first services a pending exception or interrupt from the class which
has the highest priority, transferring execution to the first instruction of the handler. Lower
priority exceptions are discarded; lower priority interrupts are held pending. Discarded exceptions
are re-generated when the interrupt handler returns execution to the point in the program
or task where the exceptions and/or interrupts occurred.
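The selection among pending classes can be sketched as a scan over a bit mask. The encoding below is an assumption made for illustration only: bit (k - 1) of pending_mask is set when the class with priority k in Table 5-3 is pending. The function name highest_priority_class is hypothetical.

```c
#include <assert.h>

/* Returns the highest-priority pending class (1 = highest, 8 = lowest)
 * or 0 if nothing is pending. Bit (k - 1) of `pending_mask` marks
 * priority class k as pending. */
int highest_priority_class(unsigned pending_mask)
{
    assert((pending_mask & ~0xFFu) == 0);   /* only classes 1..8 exist */
    for (int cls = 1; cls <= 8; cls++) {
        if (pending_mask & (1u << (cls - 1)))
            return cls;
    }
    return 0;
}
```

For example, if an external interrupt (class 5) and a fault on executing an instruction (class 8) are pending together, the sketch selects class 5. The priority among exceptions within one class is implementation dependent and is not modeled here.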
The Pentium® III processor added the SIMD floating-point execution unit. The SIMD floating-point
execution unit can generate exceptions as well. Since the SIMD floating-point execution
unit utilizes a 4-wide register set, an exception may result from more than one operand within a
SIMD floating-point register. Hence the Pentium® III processor handles these exceptions
according to a predetermined precedence. When a sub-operand of a packed instruction generates
two or more exception conditions, the exception precedence sometimes results in the higher
priority exception being handled and the lower priority exceptions being ignored. Prioritization
of exceptions is performed only on a sub-operand basis, and not between sub-operands. For
example, an invalid exception generated by one sub-operand will not prevent the reporting of a
divide-by-zero exception generated by another sub-operand. Table 5-2 shows the precedence for
Streaming SIMD Extensions numeric exceptions. The table reflects the order in which interrupts
are handled upon simultaneous recognition by the processor (for example, when multiple interrupts
are pending at an instruction boundary). However, the table does not necessarily reflect the
order in which interrupts will be recognized by the processor if received simultaneously at the
processor pins.
1. Though this is not an exception, the handling of a QNaN operand has precedence over lower priority
exceptions. For example, a QNaN divided by zero results in a QNaN, not a zero-divide exception.
2. If masked, then instruction execution continues, and a lower priority exception can occur as well.
5.8. INTERRUPT DESCRIPTOR TABLE (IDT)
The interrupt descriptor table (IDT) associates each exception or interrupt vector with a gate
descriptor for the procedure or task used to service the associated exception or interrupt. Like
the GDT and LDTs, the IDT is an array of 8-byte descriptors (in protected mode). Unlike the
GDT, the first entry of the IDT may contain a descriptor. To form an index into the IDT, the
processor scales the exception or interrupt vector by eight (the number of bytes in a gate
descriptor). Because there are only 256 interrupt or exception vectors, the IDT need not contain
more than 256 descriptors. It can contain fewer than 256 descriptors, because descriptors are
required only for the interrupt and exception vectors that may occur. All empty descriptor slots
in the IDT should have the present flag for the descriptor set to 0.
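The index calculation described above amounts to a multiplication by eight. The following is a minimal sketch with hypothetical names (IDT_GATE_SIZE, idt_entry_offset, idt_entry_address).

```c
#include <assert.h>
#include <stdint.h>

#define IDT_GATE_SIZE 8u    /* bytes per gate descriptor in protected mode */

/* Byte offset of the descriptor for `vector` from the start of the IDT. */
uint32_t idt_entry_offset(unsigned vector)
{
    assert(vector <= 255);
    return vector * IDT_GATE_SIZE;
}

/* Linear address of the descriptor, given the IDT base address. */
uint32_t idt_entry_address(uint32_t idt_base, unsigned vector)
{
    return idt_base + idt_entry_offset(vector);
}
```

For the page-fault vector (14), the descriptor starts at byte offset 112 of the table.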
Table 5-3. Priority Among Simultaneous Exceptions and Interrupts
Priority | Descriptions
1 (Highest) | Hardware Reset and Machine Checks: RESET, Machine Check
2 | Trap on Task Switch: T flag in TSS is set
3 | External Hardware Interventions: FLUSH, STOPCLK, SMI, INIT
4 | Traps on the Previous Instruction: Breakpoints, Debug Trap Exceptions (TF flag set or data/I-O breakpoint)
5 | External Interrupts: NMI Interrupts, Maskable Hardware Interrupts
6 | Faults from Fetching Next Instruction: Code Breakpoint Fault, Code-Segment Limit Violation (see note 1), Code Page Fault (see note 1)
7 | Faults from Decoding the Next Instruction: Instruction length > 15 bytes, Illegal Opcode, Coprocessor Not Available
8 (Lowest) | Faults on Executing an Instruction: Floating-point exception, Overflow, Bound error, Invalid TSS, Segment Not Present, Stack fault, General Protection, Data Page Fault, Alignment Check, SIMD floating-point exception
NOTE:
1. For the Pentium® and Intel486™ processors, the Code Segment Limit Violation and the Code Page Fault
exceptions are assigned to the priority 7.
The base address of the IDT should be aligned on an 8-byte boundary to maximize performance
of cache line fills. The limit value is expressed in bytes and is added to the base address
to get the address of the last valid byte. A limit value of 0 results in exactly 1 valid byte. Because
IDT entries are always eight bytes long, the limit should always be one less than an integral
multiple of eight (that is, 8N - 1).
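The 8N - 1 rule for the limit can be checked with a short sketch. The helper names idt_limit_for and idt_limit_is_well_formed are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Limit value for an IDT that holds `descriptor_count` 8-byte descriptors:
 * the offset of the last valid byte, that is, 8N - 1. */
uint16_t idt_limit_for(unsigned descriptor_count)
{
    assert(descriptor_count >= 1 && descriptor_count <= 256);
    return (uint16_t)(descriptor_count * 8u - 1u);
}

/* True if `limit` is one less than an integral multiple of eight. */
bool idt_limit_is_well_formed(uint16_t limit)
{
    return (limit + 1u) % 8u == 0;
}
```

A full table of 256 descriptors gives a limit of 2047 (7FFH). A limit of 0 is not well formed in this sense, because it covers only one byte of the first descriptor.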
The IDT may reside anywhere in the linear address space. As shown in Figure 5-1, the processor
locates the IDT using the IDTR register. This register holds both a 32-bit base address and 16-bit
limit for the IDT.
Figure 5-1
The LIDT (load IDT register) and SIDT (store IDT register) instructions load and store the
contents of the IDTR register, respectively. The LIDT instruction loads the IDTR register with
the base address and limit held in a memory operand. This instruction can be executed only
when the CPL is 0. It normally is used by the initialization code of an operating system when
creating an IDT. An operating system also may use it to change from one IDT to another. The
SIDT instruction copies the base and limit value stored in IDTR to memory. This instruction can
be executed at any privilege level.
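The memory operand used by LIDT and SIDT in 32-bit protected mode is six bytes long. The sketch below assumes the usual layout, with the 16-bit limit in the low-order two bytes followed by the 32-bit base address; confirm this against the LIDT and SIDT instruction descriptions before relying on it. All function names are hypothetical, and the sketch only builds the byte image. It does not execute LIDT.

```c
#include <assert.h>
#include <stdint.h>

/* Packs base and limit into a 48-bit image: bits 0..15 hold the limit,
 * bits 16..47 hold the base address. */
uint64_t idtr_image(uint32_t base, uint16_t limit)
{
    return ((uint64_t)base << 16) | limit;
}

uint16_t idtr_image_limit(uint64_t image)
{
    return (uint16_t)(image & 0xFFFFu);
}

uint32_t idtr_image_base(uint64_t image)
{
    return (uint32_t)(image >> 16);
}

/* Writes the image as six little-endian bytes, the order in which the
 * operand would be laid out in memory. */
void store_idtr_image(uint8_t out[6], uint64_t image)
{
    assert((image >> 48) == 0);
    for (int i = 0; i < 6; i++)
        out[i] = (uint8_t)(image >> (8 * i));
}
```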
If a vector references a descriptor beyond the limit of the IDT, a general-protection exception
(#GP) is generated.
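The limit check can be sketched as follows, assuming that the whole 8-byte descriptor must lie at or below the limit. The function name idt_vector_within_limit is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the descriptor for `vector` lies entirely within an IDT
 * whose limit is `limit`; false corresponds to the #GP case above. */
bool idt_vector_within_limit(unsigned vector, uint16_t limit)
{
    assert(vector <= 255);
    /* The last byte of the descriptor is at offset vector * 8 + 7. */
    return vector * 8u + 7u <= limit;
}
```

With a limit of 255 (32 descriptors), vector 31 passes the check and vector 32 does not.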
5.9. IDT DESCRIPTORS
The IDT may contain any of three kinds of gate descriptors:
Task-gate descriptor
Interrupt-gate descriptor
Trap-gate descriptor
Figure 5-2 shows the formats for the task-gate, interrupt-gate, and trap-gate descriptors. The
format of a task gate used in an IDT is the same as that of a task gate used in the GDT or an LDT
(refer to Section 6.2.4., "Task-Gate Descriptor" in Chapter 6, Task Management). The task gate
contains the segment selector for a TSS for an exception and/or interrupt handler task.
Interrupt and trap gates are very similar to call gates (refer to Section 4.8.3., "Call Gates" in
Chapter 4, Protection). They contain a far pointer (segment selector and offset) that the
processor uses to transfer execution to a handler procedure in an exception- or interrupt-handler
code segment. These gates differ in the way the processor handles the IF flag in the EFLAGS
register (refer to Section 5.10.1.2., "Flag Usage By Exception- or Interrupt-Handler Procedure").
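As a rough illustration, the following sketch packs the fields of a 32-bit gate descriptor into a 64-bit value. The bit layout used here (offset bits 15..0 in bits 0..15, segment selector in bits 16..31, present flag, DPL, and type in bits 40..47, offset bits 31..16 in bits 48..63) is the commonly documented one and should be checked against Figure 5-2. The names gate_type and make_gate are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Type field values for 32-bit gates (assumed from the standard layout). */
enum gate_type {
    GATE_TASK         = 0x5,
    GATE_INTERRUPT_32 = 0xE,
    GATE_TRAP_32      = 0xF
};

/* Builds a present (P = 1) gate descriptor. For a task gate the offset
 * is not used, so it is forced to zero. */
uint64_t make_gate(enum gate_type type, uint16_t selector,
                   uint32_t offset, unsigned dpl)
{
    assert(dpl <= 3);
    if (type == GATE_TASK)
        offset = 0;
    uint64_t access = 0x80u | (dpl << 5) | (unsigned)type;  /* P, DPL, type */
    return (uint64_t)(offset & 0xFFFFu)
         | ((uint64_t)selector << 16)
         | (access << 40)
         | ((uint64_t)(offset >> 16) << 48);
}
```

In this encoding the interrupt-gate and trap-gate types differ in a single bit, which matches the statement that the two gate kinds differ only in how the IF flag is handled.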
5.10. EXCEPTION AND INTERRUPT HANDLING
The processor handles calls to exception- and interrupt-handlers similar to the way it handles
calls with a CALL instruction to a procedure or a task. When responding to an exception or interrupt,
the processor uses the exception or interrupt vector as an index to a descriptor in the IDT.
If the index points to an interrupt gate or trap gate, the processor calls the exception or interrupt
handler in a manner similar to a CALL to a call gate (refer to Section 4.8.2., "Gate Descriptors"
through Section 4.8.6., "Returning from a Called Procedure" in Chapter 4, Protection). If the index
points to a task gate, the processor executes a task switch to the exception- or interrupt-handler
task in a manner similar to a CALL to a task gate (refer to Section 6.3., "Task Switching" in
Chapter 6, Task Management).
5.10.1. Exception- or Interrupt-Handler Procedures
An interrupt gate or trap gate references an exception- or interrupt-handler procedure that runs
in the context of the currently executing task (refer to Figure 5-3). The segment selector for the
gate points to a segment descriptor for an executable code segment in either the GDT or the
current LDT. The offset field of the gate descriptor points to the beginning of the exception- or
interrupt-handling procedure.
When the processor performs a call to the exception- or interrupt-handler procedure, it saves the
current states of the EFLAGS register, CS register, and EIP register on the stack (refer to Figure
5-4). (The CS and EIP registers provide a return instruction pointer for the handler.) If an exception
causes an error code to be saved, it is pushed on the stack after the EIP value.
If the handler procedure is going to be executed at the same privilege level as the interrupted
procedure, the handler uses the current stack.
If the handler procedure is going to be executed at a numerically lower privilege level, a stack
switch occurs. When a stack switch occurs, a stack pointer for the stack to be returned to is also
saved on the stack. (The SS and ESP registers provide a return stack pointer for the handler.)
The segment selector and stack pointer for the stack to be used by the handler are obtained from
the TSS for the currently executing task. The processor copies the EFLAGS, SS, ESP, CS, EIP,
and error code information from the interrupted procedure's stack to the handler's stack.
To return from an exception- or interrupt-handler procedure, the handler must use the IRET (or
IRETD) instruction. The IRET instruction is similar to the RET instruction except that it restores
the saved flags into the EFLAGS register. The IOPL field of the EFLAGS register is restored
only if the CPL is 0. The IF flag is changed only if the CPL is less than or equal to the IOPL.
Refer to "IRET/IRETD-Interrupt Return" in Chapter 3 of the Intel Architecture Software
Developer's Manual, Volume 2, for the complete operation performed by the IRET instruction.
If a stack switch occurred when calling the handler procedure, the IRET instruction switches
back to the interrupted procedure's stack on the return.
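The stack contents described above can be modeled in C as follows. This is a sketch that assumes a 32-bit interrupt or trap gate, so every item occupies one doubleword; the names interrupt_frame and frame_bytes are hypothetical and are used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Largest frame a handler can find at its stack pointer on entry:
 * a stack switch occurred and an error code was pushed.
 * Members are listed from the lowest address upward, which is the
 * reverse of the order in which the processor pushes them. */
typedef struct {
    uint32_t error_code; /* pushed last, only for some exceptions    */
    uint32_t eip;        /* return instruction pointer               */
    uint32_t cs;         /* selector in the low 16 bits              */
    uint32_t eflags;     /* flags of the interrupted procedure       */
    uint32_t esp;        /* present only if a stack switch occurred  */
    uint32_t ss;         /* present only if a stack switch occurred  */
} interrupt_frame;

static_assert(sizeof(interrupt_frame) == 24, "six doublewords");

/* Number of bytes pushed on the handler's stack for a given case. */
unsigned frame_bytes(bool stack_switch, bool has_error_code)
{
    unsigned dwords = 3;              /* EFLAGS, CS, EIP are always saved */
    if (stack_switch)   dwords += 2;  /* SS and ESP of the old stack      */
    if (has_error_code) dwords += 1;  /* error code after the EIP value   */
    return dwords * 4;
}
```

A handler entered at the same privilege level without an error code therefore sees 12 bytes; the full frame with stack switch and error code is 24 bytes.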
Figure 5-3
5.10.1.1. PROTECTION OF EXCEPTION- AND INTERRUPT-HANDLER
PROCEDURES
The privilege-level protection for exception- and interrupt-handler procedures is similar to that
used for ordinary procedure calls when called through a call gate (refer to Section 4.8.4.,
"Accessing a Code Segment Through a Call Gate" in Chapter 4, Protection). The processor does
not permit transfer of execution to an exception- or interrupt-handler procedure in a less privileged
code segment (numerically greater privilege level) than the CPL. An attempt to violate this
rule results in a general-protection exception (#GP). The protection mechanism for exception-
and interrupt-handler procedures is different in the following ways:
Because interrupt and exception vectors have no RPL, the RPL is not checked on implicit
calls to exception and interrupt handlers.
The processor checks the DPL of the interrupt or trap gate only if an exception or interrupt
is generated with an INT n, INT 3, or INTO instruction. Here, the CPL must be less than or
equal to the DPL of the gate. This restriction prevents application programs or procedures
running at privilege level 3 from using a software interrupt to access critical exception
handlers, such as the page-fault handler, providing that those handlers are placed in more
privileged code segments (numerically lower privilege level). For hardware-generated
interrupts and processor-detected exceptions, the processor ignores the DPL of interrupt
and trap gates.
Because exceptions and interrupts generally do not occur at predictable times, these privilege
rules effectively impose restrictions on the privilege levels at which exception- and interrupt-handling
procedures can run. Either of the following techniques can be used to avoid privilege-level
violations.
The exception or interrupt handler can be placed in a conforming code segment. This
technique can be used for handlers that only need to access data available on the stack (for
example, divide error exceptions). If the handler needs data from a data segment, the data
segment needs to be accessible from privilege level 3, which would make it unprotected.
The handler can be placed in a nonconforming code segment with privilege level 0. This
handler would always run, regardless of the CPL that the interrupted program or task is
running at.
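The privilege rules in this section can be summarized with a small C model. It is a sketch only: the structure handler_gate and the functions handler_call_allowed and handler_cpl are hypothetical names, and the model covers just the checks named in the text (the gate DPL check for software-generated interrupts, and the rule that the handler's code segment may not be less privileged than the CPL).

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    unsigned gate_dpl;   /* DPL of the interrupt or trap gate         */
    unsigned code_dpl;   /* DPL of the handler's code segment         */
    bool     conforming; /* true if that code segment is conforming   */
} handler_gate;

/* Returns true if the transfer is permitted, false if the model
 * predicts a general-protection exception (#GP). */
bool handler_call_allowed(const handler_gate *g, unsigned cpl,
                          bool software_int)
{
    assert(cpl <= 3 && g->gate_dpl <= 3 && g->code_dpl <= 3);

    /* The gate DPL is checked only for INT n, INT 3, and INTO. */
    if (software_int && cpl > g->gate_dpl)
        return false;

    /* No transfer to a less privileged (numerically greater) segment. */
    if (g->code_dpl > cpl)
        return false;

    return true;
}

/* Privilege level the handler runs at once the transfer is allowed. */
unsigned handler_cpl(const handler_gate *g, unsigned cpl)
{
    /* A conforming segment runs at the caller's CPL; a nonconforming
     * segment runs at its own DPL. */
    return g->conforming ? cpl : g->code_dpl;
}
```

With a page-fault handler whose gate DPL and code-segment DPL are both 0, the model rejects an INT n from privilege level 3 but accepts a processor-detected page fault raised at the same level.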
5.10.1.2. FLAG USAGE BY EXCEPTION- OR INTERRUPT-HANDLER
PROCEDURE
When accessing an exception or interrupt handler through either an interrupt gate or a trap gate,
the processor clears the TF flag in the EFLAGS register after it saves the contents of the
EFLAGS register on the stack. (On calls to exception and interrupt handlers, the processor also
clears the VM, RF, and NT flags in the EFLAGS register, after they are saved on the stack.)
Clearing the TF flag prevents instruction tracing from affecting interrupt response. A subsequent
IRET instruction restores the TF (and VM, RF, and NT) flags to the values in the saved contents
of the EFLAGS register on the stack.
The only difference between an interrupt gate and a trap gate is the way the processor handles
the IF flag in the EFLAGS register. When accessing an exception- or interrupt-handling procedure
through an interrupt gate, the processor clears the IF flag to prevent other interrupts from
interfering with the current interrupt handler. A subsequent IRET instruction restores the IF flag
to its value in the saved contents of the EFLAGS register on the stack. Accessing a handler
procedure through a trap gate does not affect the IF flag.
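The flag handling described in this section can be expressed as a short C function. This is an illustrative sketch; the function name eflags_on_handler_entry is hypothetical, and the bit positions used (TF bit 8, IF bit 9, NT bit 14, RF bit 16, VM bit 17) are the architectural EFLAGS positions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_TF (1u << 8)   /* trap flag             */
#define EFLAGS_IF (1u << 9)   /* interrupt enable flag */
#define EFLAGS_NT (1u << 14)  /* nested task flag      */
#define EFLAGS_RF (1u << 16)  /* resume flag           */
#define EFLAGS_VM (1u << 17)  /* virtual-8086 mode     */

/* Given the EFLAGS image that was saved on the stack, return the
 * EFLAGS value the handler starts with. */
uint32_t eflags_on_handler_entry(uint32_t saved, bool interrupt_gate)
{
    /* Both gate types clear TF, VM, RF, and NT after saving EFLAGS. */
    uint32_t flags = saved & ~(EFLAGS_TF | EFLAGS_VM | EFLAGS_RF | EFLAGS_NT);

    /* Only an interrupt gate also clears IF. */
    if (interrupt_gate)
        flags &= ~EFLAGS_IF;

    return flags;
}
```

The saved image is left untouched, which is why a subsequent IRET restores the original TF and IF values.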
5.10.2. Interrupt Tasks
When an exception or interrupt handler is accessed through a task gate in the IDT, a task switch
results. Handling an exception or interrupt with a separate task offers several advantages:
The entire context of the interrupted program or task is saved automatically.
A new TSS permits the handler to use a new privilege level 0 stack when handling the
exception or interrupt. If an exception or interrupt occurs when the current privilege level 0
stack is corrupted, accessing the handler through a task gate can prevent a system crash by
providing the handler with a new privilege level 0 stack.
The handler can be further isolated from other tasks by giving it a separate address space.
This is done by giving it a separate LDT.
The disadvantage of handling an interrupt with a separate task is that the amount of machine
state that must be saved on a task switch makes it slower than using an interrupt gate, resulting
in increased interrupt latency.
A task gate in the IDT references a TSS descriptor in the GDT (refer to Figure 5-5). A switch to
the handler task is handled in the same manner as an ordinary task switch (refer to Section 6.3.,
"Task Switching" in Chapter 6, Task Management). The link back to the interrupted task is
stored in the previous task link field of the handler task's TSS. If an exception caused an error
code to be generated, this error code is copied to the stack of the new task.
When exception- or interrupt-handler tasks are used in an operating system, there are actually
two mechanisms that can be used to dispatch tasks: the software scheduler (part of the operating
system) and the hardware scheduler (part of the processor's interrupt mechanism). The software
scheduler needs to accommodate interrupt tasks that may be dispatched when interrupts are
enabled.
5.11. ERROR CODE
When an exception condition is related to a specific segment, the processor pushes an error code
onto the stack of the exception handler (whether it is a procedure or task). The error code has
the format shown in Figure 5-6. The error code resembles a segment selector; however, instead
of a TI flag and RPL field, the error code contains 3 flags:
EXT
External event (bit 0). When set, indicates that an event external to the
program caused the exception, such as a hardware interrupt.
IDT
Descriptor location (bit 1). When set, indicates that the index portion of the
error code refers to a gate descriptor in the IDT; when clear, indicates that the
index refers to a descriptor in the GDT or the current LDT.
TI
GDT/LDT (bit 2). Only used when the IDT flag is clear. When set, the TI flag
indicates that the index portion of the error code refers to a segment or gate
descriptor in the LDT; when clear, it indicates that the index refers to a
descriptor in the current GDT.
The segment selector index field provides an index into the IDT, GDT, or current LDT to the
segment or gate selector being referenced by the error code. In some cases the error code is null
(that is, all bits in the lower word are clear). A null error code indicates that the error was not
caused by a reference to a specific segment or that a null segment descriptor was referenced in
an operation.
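A handler might decode the error code fields described above as in the following C sketch. The structure and function names (error_code_fields, decode_error_code) are hypothetical; the bit positions are those given in the field descriptions, with the index assumed to occupy bits 3 through 15 as in a segment selector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     ext;   /* bit 0: external event caused the exception     */
    bool     idt;   /* bit 1: index refers to a gate in the IDT       */
    bool     ti;    /* bit 2: LDT (1) or GDT (0); used if idt is 0    */
    uint16_t index; /* bits 3-15: index into the IDT, GDT, or LDT     */
} error_code_fields;

/* Split a selector-style error code into its fields. Only the lower
 * word is examined; the upper half is reserved. */
error_code_fields decode_error_code(uint32_t code)
{
    error_code_fields f;
    f.ext   = (code & 0x1u) != 0;
    f.idt   = (code & 0x2u) != 0;
    f.ti    = (code & 0x4u) != 0;
    f.index = (uint16_t)((code >> 3) & 0x1FFFu);
    return f;
}
```

For example, the value 6BH decodes to EXT set, IDT set, and index 13, while 2CH decodes to index 5 in the current LDT. This decoding does not apply to the page-fault error code, which has a different format.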
The format of the error code is different for page-fault exceptions (#PF); refer to "Interrupt
14-Page-Fault Exception (#PF)" in this chapter.
The error code is pushed on the stack as a doubleword or word (depending on the default interrupt,
trap, or task gate size). To keep the stack aligned for doubleword pushes, the upper half of
the error code is reserved. Note that the error code is not popped when the IRET instruction is
executed to return from an exception handler, so the handler must remove the error code before
executing a return.
Error codes are not pushed on the stack for exceptions that are generated externally (with the
INTR or LINT[1:0] pins) or the INT n instruction, even if an error code is normally produced
for those exceptions.
5.12. EXCEPTION AND INTERRUPT REFERENCE
The following sections describe conditions which generate exceptions and interrupts. They are
arranged in the order of vector numbers. The information contained in these sections is as
follows:
Exception Class
Indicates whether the exception class is a fault, trap, or abort type.
Some exceptions can be either a fault or trap type, depending on
when the error condition is detected. (This section is not applicable
to interrupts.)
Description
Gives a general description of the purpose of the exception or interrupt
type. It also describes how the processor handles the exception
or interrupt.
Exception Error Code
Indicates whether an error code is saved for the exception. If one is
saved, the contents of the error code are described. (This section is
not applicable to interrupts.)
Saved Instruction Pointer
Describes which instruction the saved (or return) instruction pointer
points to. It also indicates whether the pointer can be used to restart
a faulting instruction.
Program State Change
Describes the effects of the exception or interrupt on the state of the
currently running program or task and the possibilities of restarting
the program or task without loss of continuity.
Interrupt 0-Divide Error Exception (#DE)
Exception Class Fault.
Description
Indicates the divisor operand for a DIV or IDIV instruction is 0 or that the result cannot be represented
in the number of bits specified for the destination operand.
Exception Error Code
None.
Saved Instruction Pointer
Saved contents of CS and EIP registers point to the instruction that generated the exception.
Program State Change
A program-state change does not accompany the divide error, because the exception occurs
before the faulting instruction is executed.
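The two conditions named in the description can be checked in software before a divide is issued. The following C sketch models the 32-bit unsigned DIV form, which divides a 64-bit dividend (EDX:EAX) by a 32-bit operand; the function name div32_raises_de is hypothetical, and the signed IDIV case is not covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a 32-bit DIV of the given 64-bit dividend by the
 * given divisor would raise a divide error (#DE). */
bool div32_raises_de(uint64_t dividend, uint32_t divisor)
{
    /* Condition 1: the divisor operand is 0. */
    if (divisor == 0)
        return true;

    /* Condition 2: the quotient does not fit in 32 bits. That is the
     * case exactly when the high doubleword of the dividend is greater
     * than or equal to the divisor. */
    return (uint32_t)(dividend >> 32) >= divisor;
}
```

For example, dividing 100000000H by 1 would produce a 33-bit quotient and is reported as a fault, whereas dividing it by 2 is not.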
Interrupt 1-Debug Exception (#DB)
Exception Class
Trap or Fault. The exception handler can distinguish between traps and
faults by examining the contents of DR6 and the other debug registers.
Description
Indicates that one or more of several debug-exception conditions has been detected. Whether the
exception is a fault or a trap depends on the condition, as shown below:
Refer to Chapter 15, Debugging and Performance Monitoring, for detailed information about
the debug exceptions.
Exception Error Code
None. An exception handler can examine the debug registers to determine which condition
caused the exception.
Saved Instruction Pointer
Fault-Saved contents of CS and EIP registers point to the instruction that generated the
exception.
Trap-Saved contents of CS and EIP registers point to the instruction following the instruction
that generated the exception.
Program State Change
Fault-A program-state change does not accompany the debug exception, because the exception
occurs before the faulting instruction is executed. The program can resume normal execution
upon returning from the debug exception handler.
Trap-A program-state change does accompany the debug exception, because the instruction or
task switch being executed is allowed to complete before the exception is generated. However,
the new state of the program is not corrupted and execution of the program can continue reliably.
Interrupt 2-NMI Interrupt
Exception Class Not applicable.
Description
The nonmaskable interrupt (NMI) is generated externally by asserting the processor's NMI pin
or through an NMI request set by the I/O APIC to the local APIC on the APIC serial bus. This
interrupt causes the NMI interrupt handler to be called.
Exception Error Code
Not applicable.
Saved Instruction Pointer
The processor always takes an NMI interrupt on an instruction boundary. The saved contents of
CS and EIP registers point to the next instruction to be executed at the point the interrupt is
taken. Refer to Section 5.4., "Program or Task Restart" for more information about when the
processor takes NMI interrupts.
Program State Change
The instruction executing when an NMI interrupt is received is completed before the NMI is
generated. A program or task can thus be restarted upon returning from an interrupt handler
without loss of continuity, provided the interrupt handler saves the state of the processor before
handling the interrupt and restores the processor's state prior to a return.
Interrupt 3-Breakpoint Exception (#BP)
Exception Class Trap.
Description
Indicates that a breakpoint instruction (INT 3) was executed, causing a breakpoint trap to be
generated. Typically, a debugger sets a breakpoint by replacing the first opcode byte of an
instruction with the opcode for the INT 3 instruction. (The INT 3 instruction is one byte long,
which makes it easy to replace an opcode in a code segment in RAM with the breakpoint
opcode.) The operating system or a debugging tool can use a data segment mapped to the same
physical address space as the code segment to place an INT 3 instruction in places where it is
desired to call the debugger.
With the P6 family, Pentium®, Intel486™, and Intel386™ processors, it is more convenient to
set breakpoints with the debug registers. (Refer to Section 15.3.2., "Breakpoint Exception
(#BP)-Interrupt Vector 3", in Chapter 15, Debugging and Performance Monitoring, for information
about the breakpoint exception.) If more breakpoints are needed beyond what the debug
registers allow, the INT 3 instruction can be used.
The breakpoint (#BP) exception can also be generated by executing the INT n instruction with
an operand of 3. The action of this two-byte form (opcode CD 03) is slightly different from that
of the one-byte INT 3 instruction (opcode CC); refer to "INTn/INTO/INT3-Call to Interrupt
Procedure" in Chapter 3 of the Intel Architecture Software Developer's Manual, Volume 2.
Exception Error Code
None.
Saved Instruction Pointer
Saved contents of CS and EIP registers point to the instruction following the INT 3 instruction.
Program State Change
Even though the EIP points to the instruction following the breakpoint instruction, the state of
the program is essentially unchanged because the INT 3 instruction does not affect any register
or memory locations. The debugger can thus resume the suspended program by replacing the
INT 3 instruction that caused the breakpoint with the original opcode and decrementing the
saved contents of the EIP register. Upon returning from the debugger, program execution
resumes with the replaced instruction.
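The breakpoint technique described above (patch the first opcode byte, then restore it and back up the saved EIP) can be sketched in C against an ordinary byte buffer that stands in for the code segment. The names breakpoint, set_breakpoint, and clear_breakpoint are hypothetical; a real debugger would write through a data-segment alias of the code and would edit the EIP image saved on the handler's stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCCu /* one-byte INT 3 instruction */

typedef struct {
    size_t  offset;     /* offset of the patched instruction       */
    uint8_t saved_byte; /* original first opcode byte at that spot */
} breakpoint;

/* Replace the first opcode byte at 'offset' with INT 3 and remember
 * the byte that was there. */
breakpoint set_breakpoint(uint8_t *code, size_t offset)
{
    assert(code != NULL);
    breakpoint bp = { offset, code[offset] };
    code[offset] = INT3_OPCODE;
    return bp;
}

/* Restore the original byte and return the EIP value to resume at.
 * 'saved_eip' is the value saved by the trap, which points to the
 * byte following the INT 3 instruction. */
uint32_t clear_breakpoint(uint8_t *code, const breakpoint *bp,
                          uint32_t saved_eip)
{
    assert(code != NULL && bp != NULL);
    code[bp->offset] = bp->saved_byte;
    return saved_eip - 1; /* back up over the one-byte INT 3 */
}
```

Because INT 3 is a single byte, decrementing the saved EIP by one always lands on the start of the restored instruction.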
Interrupt 4-Overflow Exception (#OF)
Exception Class Trap.
Description
Indicates that an overflow trap occurred when an INTO instruction was executed. The INTO
instruction checks the state of the OF flag in the EFLAGS register. If the OF flag is set, an overflow
trap is generated.
Some arithmetic instructions (such as the ADD and SUB) perform both signed and unsigned
arithmetic. These instructions set the OF and CF flags in the EFLAGS register to indicate signed
overflow and unsigned overflow, respectively. When performing arithmetic on signed operands,
the OF flag can be tested directly or the INTO instruction can be used. The benefit of using the
INTO instruction is that if the overflow exception is detected, an exception handler can be called
automatically to handle the overflow condition.
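The distinction between signed overflow (OF) and unsigned overflow (CF) for an ADD can be illustrated in C. This sketch models only the two flags discussed here for a 32-bit addition; the names add_flags and add32 are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t result; /* low 32 bits of a + b                     */
    bool     cf;     /* carry flag: unsigned overflow            */
    bool     of;     /* overflow flag: signed overflow           */
} add_flags;

/* Model the result, CF, and OF of a 32-bit ADD. */
add_flags add32(uint32_t a, uint32_t b)
{
    add_flags r;
    r.result = a + b; /* unsigned arithmetic wraps modulo 2^32 */

    /* Unsigned overflow: the wrapped sum is smaller than an operand. */
    r.cf = r.result < a;

    /* Signed overflow: both operands have the same sign and the
     * result's sign differs from it. */
    r.of = (((a ^ r.result) & (b ^ r.result)) >> 31) != 0;
    return r;
}
```

Adding 1 to 7FFFFFFFH sets OF but not CF, so an INTO placed after that ADD would trap; adding 1 to FFFFFFFFH sets CF but not OF, so INTO would fall through.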
Exception Error Code
None.
Saved Instruction Pointer
The saved contents of CS and EIP registers point to the instruction following the INTO
instruction.
Program State Change
Even though the EIP points to the instruction following the INTO instruction, the state of the
program is essentially unchanged because the INTO instruction does not affect any register or
memory locations. The program can thus resume normal execution upon returning from the
overflow exception handler.
CHAPTER 6 TASK MANAGEMENT
This chapter describes the Intel Architecture's task management facilities. These facilities are
only available when the processor is running in protected mode.
6.1. TASK MANAGEMENT OVERVIEW
A task is a unit of work that a processor can dispatch, execute, and suspend. It can be used to
execute a program, a task or process, an operating-system service utility, an interrupt or exception
handler, or a kernel or executive utility.
The Intel Architecture provides a mechanism for saving the state of a task, for dispatching tasks
for execution, and for switching from one task to another. When operating in protected mode,
all processor execution takes place from within a task. Even simple systems must define at least
one task. More complex systems can use the processor's task management facilities to support
multitasking applications.
6.1.1. Task Structure
A task is made up of two parts: a task execution space and a task-state segment (TSS). The task
execution space consists of a code segment, a stack segment, and one or more data segments
(refer to Figure 6-1). If an operating system or executive uses the processor's privilege-level
protection mechanism, the task execution space also provides a separate stack for each privilege
level.
The TSS specifies the segments that make up the task execution space and provides a storage
place for task state information. In multitasking systems, the TSS also provides a mechanism for
linking tasks.
NOTE
This chapter describes primarily 32-bit tasks and the 32-bit TSS structure.
For information on 16-bit tasks and the 16-bit TSS structure, refer to Section
6.6., "16-Bit Task-State Segment (TSS)".
A task is identified by the segment selector for its TSS. When a task is loaded into the processor
for execution, the segment selector, base address, limit, and segment descriptor attributes for the
TSS are loaded into the task register (refer to Section 2.4.4., "Task Register (TR)" in Chapter 2,
System Architecture Overview).
If paging is implemented for the task, the base address of the page directory used by the task is
loaded into control register CR3.
Figure 6-1
6.1.2. Task State
The following items define the state of the currently executing task:
The task's current execution space, defined by the segment selectors in the segment
registers (CS, DS, SS, ES, FS, and GS).
The state of the general-purpose registers.
The state of the EFLAGS register.
The state of the EIP register.
The state of control register CR3.
The state of the task register.
The state of the LDTR register.
The I/O map base address and I/O map (contained in the TSS).
Stack pointers to the privilege 0, 1, and 2 stacks (contained in the TSS).
Link to previously executed task (contained in the TSS).
Prior to dispatching a task, all of these items are contained in the task's TSS, except the state of
the task register. Also, the complete contents of the LDTR register are not contained in the TSS,
only the segment selector for the LDT.
6.1.3. Executing a Task
Software or the processor can dispatch a task for execution in one of the following ways:
An explicit call to a task with the CALL instruction.
An explicit jump to a task with the JMP instruction.
An implicit call (by the processor) to an interrupt-handler task.
An implicit call to an exception-handler task.
A return (initiated with an IRET instruction) when the NT flag in the EFLAGS register is
set.
All of these methods of dispatching a task identify the task to be dispatched with a segment
selector that points either to a task gate or the TSS for the task. When dispatching a task with a
CALL or JMP instruction, the selector in the instruction may select either the TSS directly or a
task gate that holds the selector for the TSS. When dispatching a task to handle an interrupt or
exception, the IDT entry for the interrupt or exception must contain a task gate that holds the
selector for the interrupt- or exception-handler TSS.
When a task is dispatched for execution, a task switch automatically occurs between the
currently running task and the dispatched task. During a task switch, the execution environment
of the currently executing task (called the task's state or context) is saved in its TSS and execution
of the task is suspended. The context for the dispatched task is then loaded into the processor
and execution of that task begins with the instruction pointed to by the newly loaded EIP
register. If the task has not been run since the system was last initialized, the EIP will point to
the first instruction of the task's code; otherwise, it will point to the next instruction after the last
instruction that the task executed when it was last active.
If the currently executing task (the calling task) called the task being dispatched (the called task),
the TSS segment selector for the calling task is stored in the TSS of the called task to provide a
link back to the calling task.
For all Intel Architecture processors, tasks are not recursive. A task cannot call or jump to itself.
Interrupts and exceptions can be handled with a task switch to a handler task. Here, the processor
not only can perform a task switch to handle the interrupt or exception, but it can automatically
switch back to the interrupted task upon returning from the interrupt- or exception-handler task.
This mechanism can handle interrupts that occur during interrupt tasks.
As part of a task switch, the processor can also switch to another LDT, allowing each task to have
a different logical-to-physical address mapping for LDT-based segments. The page-directory base
register (CR3) also is reloaded on a task switch, allowing each task to have its own set of page
tables. These protection facilities help isolate tasks and prevent them from interfering with one
another. If one or both of these protection mechanisms are not used, the processor provides no
protection between tasks. This is true even with operating systems that use multiple privilege
levels for protection. Here, a task running at privilege level 3 that uses the same LDT and page
tables as other privilege-level-3 tasks can access code and corrupt data and the stack of other
tasks.
Use of task management facilities for handling multitasking applications is optional. Multitasking
can be handled in software, with each software-defined task executed in the context of
a single Intel Architecture task.
6.2. TASK MANAGEMENT DATA STRUCTURES
The processor defines five data structures for handling task-related activities:
Task-state segment (TSS).
Task-gate descriptor.
TSS descriptor.
Task register.
NT flag in the EFLAGS register.
When operating in protected mode, a TSS and TSS descriptor must be created for at least one
task, and the segment selector for the TSS must be loaded into the task register (using the LTR
instruction).
6.2.1. Task-State Segment (TSS)
The processor state information needed to restore a task is saved in a system segment called the
task-state segment (TSS). Figure 6-2 shows the format of a TSS for tasks designed for 32-bit
CPUs. (Compatibility with 16-bit Intel 286 processor tasks is provided by a different kind of
TSS; refer to Figure 6-9.) The fields of a TSS are divided into two main categories: dynamic
fields and static fields.
The processor updates the dynamic fields when a task is suspended during a task switch. The
following are dynamic fields:
General-purpose register fields
State of the EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI registers prior to
the task switch.
Segment selector fields
Segment selectors stored in the ES, CS, SS, DS, FS, and GS registers prior to
the task switch.
EFLAGS register field
State of the EFLAGS register prior to the task switch.
EIP (instruction pointer) field
State of the EIP register prior to the task switch.
Previous task link field
Contains the segment selector for the TSS of the previous task (updated on a
task switch that was initiated by a call, interrupt, or exception). This field
(which is sometimes called the back link field) permits a task switch back to
the previous task to be initiated with an IRET instruction.
The processor reads the static fields, but does not normally change them. These fields are set up
when a task is created. The following are static fields:
LDT segment selector field
Contains the segment selector for the task's LDT.
CR3 control register field
Contains the base physical address of the page directory to be used by the task.
Control register CR3 is also known as the page-directory base register (PDBR).
Privilege level-0, -1, and -2 stack pointer fields
These stack pointers consist of a logical address made up of the segment
selector for the stack segment (SS0, SS1, and SS2) and an offset into the stack
(ESP0, ESP1, and ESP2). Note that the values in these fields are static for a
particular task, whereas the SS and ESP values will change if stack switching
occurs within the task.
T (debug trap) flag (byte 100, bit 0)
When set, the T flag causes the processor to raise a debug exception when a
task switch to this task occurs (refer to Section 15.3.1.5., "Task-Switch Exception
Condition", in Chapter 15, Debugging and Performance Monitoring).
I/O map base address field
Contains a 16-bit offset from the base of the TSS to the I/O permission bit map
and interrupt redirection bitmap. When present, these maps are stored in the
TSS at higher addresses. The I/O map base address points to the beginning of
the I/O permission bit map and the end of the interrupt redirection bit map.
Refer to Chapter 9, Input/Output, in the Intel Architecture Software Developer's
Manual, Volume 1, for more information about the I/O permission bit
map. Refer to Section 16.3., "Interrupt and Exception Handling in Virtual-
8086 Mode" in Chapter 16, 8086 Emulation for a detailed description of the
interrupt redirection bit map.
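The dynamic and static fields listed above can be collected into a C structure. The following is a sketch of the 32-bit TSS, assuming the field order of the 32-bit TSS format with each 16-bit selector padded to a doubleword by a reserved word; the type name tss32 and the member names are hypothetical. The static assertion checks the 104-byte size that the processor is said to access during a task switch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t prev_task_link, reserved0;  /* offset 0: back link       */
    uint32_t esp0;                       /* privilege level-0 stack   */
    uint16_t ss0, reserved1;
    uint32_t esp1;                       /* privilege level-1 stack   */
    uint16_t ss1, reserved2;
    uint32_t esp2;                       /* privilege level-2 stack   */
    uint16_t ss2, reserved3;
    uint32_t cr3;                        /* page-directory base       */
    uint32_t eip;
    uint32_t eflags;
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint16_t es, reserved4;
    uint16_t cs, reserved5;
    uint16_t ss, reserved6;
    uint16_t ds, reserved7;
    uint16_t fs, reserved8;
    uint16_t gs, reserved9;
    uint16_t ldt_selector, reserved10;
    uint16_t debug_trap;                 /* byte 100, bit 0: T flag   */
    uint16_t io_map_base;                /* offset from the TSS base  */
} tss32;

/* Every member is naturally aligned, so no padding is inserted. */
static_assert(sizeof(tss32) == 104, "32-bit TSS is 104 bytes");
```

An operating system that appends an I/O permission bit map or private data would place it after these 104 bytes and set io_map_base accordingly.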
If paging is used, care should be taken to avoid placing a page boundary within the part of the
TSS that the processor reads during a task switch (the first 104 bytes). If a page boundary is
placed within this part of the TSS, the pages on either side of the boundary must be present at
the same time and contiguous in physical memory. The reason for this restriction is that when
accessing a TSS during a task switch, the processor reads and writes into the first 104 bytes of
each TSS from contiguous physical addresses beginning with the physical address of the first
byte of the TSS. It may not perform address translations at a page boundary if one occurs within
this area. So, after the TSS access begins, if a part of the 104 bytes is not both present and physically
contiguous, the processor will access incorrect TSS information, without generating a
page-fault exception. The reading of this incorrect information will generally lead to an unrecoverable
exception later in the task switch process.
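An operating system can check for this condition when it allocates a TSS. The following C sketch tests whether the first 104 bytes of a TSS straddle a page boundary; it assumes 4-KByte pages, and the function name tss_crosses_page is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TSS_MIN_BYTES 104u   /* part of the TSS read on a task switch */
#define PAGE_SIZE_4K  4096u  /* assumption: 4-KByte pages             */

/* Returns true if the first 104 bytes of a TSS that starts at the
 * given linear address span two pages. */
bool tss_crosses_page(uint32_t tss_linear_base)
{
    uint32_t offset_in_page = tss_linear_base & (PAGE_SIZE_4K - 1u);
    return offset_in_page + TSS_MIN_BYTES > PAGE_SIZE_4K;
}
```

When the function returns true, the allocator can either move the TSS or make sure that both pages are present and physically contiguous, as required above.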
Also, if paging is used, the pages corresponding to the previous task's TSS, the current task's
TSS, and the descriptor table entries for each should be marked as read/write. The task switch
will be carried out faster if the pages containing these structures are also present in memory
before the task switch is initiated.
6.2.2. TSS Descriptor
The TSS, like all other segments, is defined by a segment descriptor. Figure 6-3 shows the
format of a TSS descriptor. TSS descriptors may only be placed in the GDT; they cannot be
placed in an LDT or the IDT. An attempt to access a TSS using a segment selector with its TI
flag set (which indicates the current LDT) causes a general-protection exception (#GP) to be
generated. A general-protection exception is also generated if an attempt is made to load a
segment selector for a TSS into a segment register.
The busy flag (B) in the type field indicates whether the task is busy. A busy task is currently
running or is suspended. A type field with a value of 1001B indicates an inactive task; a value
of 1011B indicates a busy task. Tasks are not recursive. The processor uses the busy flag to
detect an attempt to call a task whose execution has been interrupted. To ensure that only
one busy flag is associated with a task, each TSS should have only one TSS descriptor that points
to it.
Figure 6-3
The base, limit, and DPL fields and the granularity and present flags have functions similar to
their use in data-segment descriptors (refer to Section 3.4.3., "Segment Descriptors" in Chapter
3, Protected-Mode Memory Management). The limit field must have a value equal to or greater
than 67H (for a 32-bit TSS), one byte less than the minimum size of a TSS. Attempting to switch
to a task whose TSS descriptor has a limit less than 67H generates an invalid-TSS exception
(#TS). A larger limit is required if an I/O permission bit map is included in the TSS. An even
larger limit would be required if the operating system stores additional data in the TSS. The
processor does not check for a limit greater than 67H on a task switch; however, it does when
accessing the I/O permission bit map or interrupt redirection bit map.
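The type-field values and the minimum limit described in this section can be captured in a few C helpers. This is a sketch: the macro and function names are hypothetical, and only the 32-bit TSS values quoted in the text (types 1001B and 1011B, minimum limit 67H) are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TSS_TYPE_AVAILABLE 0x9u  /* 1001B: inactive (not busy) task */
#define TSS_TYPE_BUSY      0xBu  /* 1011B: busy task                */
#define TSS_BUSY_FLAG      0x2u  /* the B flag within the type      */
#define TSS32_MIN_LIMIT    0x67u /* 104 bytes minus one             */

/* True if the descriptor type marks the task as busy. */
bool tss_is_busy(unsigned type)
{
    return type == TSS_TYPE_BUSY;
}

/* Type value after the busy flag has been set. */
unsigned tss_mark_busy(unsigned type)
{
    return type | TSS_BUSY_FLAG;
}

/* True if the limit is large enough for a 32-bit TSS; a smaller
 * limit leads to an invalid-TSS exception (#TS) on a task switch. */
bool tss_limit_valid(uint32_t limit)
{
    return limit >= TSS32_MIN_LIMIT;
}
```

A limit of exactly 67H covers only the 104 architectural bytes; a TSS that carries an I/O permission bit map needs a correspondingly larger limit.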
Any program or procedure with access to a TSS descriptor (that is, whose CPL is numerically
equal to or less than the DPL of the TSS descriptor) can dispatch the task with a call or a jump.
In most systems, the DPLs of TSS descriptors should be set to values less than 3, so that only
privileged software can perform task switching. However, in multitasking applications, DPLs
for some TSS descriptors can be set to 3 to allow task switching at the application (or user) privilege
level.
6.2.3. Task Register
The task register holds the 16-bit segment selector and the entire segment descriptor (32-bit base
address, 16-bit segment limit, and descriptor attributes) for the TSS of the current task (refer to
Figure 2-4 in Chapter 2, System Architecture Overview). This information is copied from the
TSS descriptor in the GDT for the current task. Figure 6-4 shows the path the processor uses to
access the TSS, using the information in the task register.
The task register has both a visible part (that can be read and changed by software) and an invisible
part (that is maintained by the processor and is inaccessible by software). The segment
selector in the visible portion points to a TSS descriptor in the GDT. The processor uses the
invisible portion of the task register to cache the segment descriptor for the TSS. Caching these
values in a register makes execution of the task more efficient, because the processor does not
need to fetch these values from memory to reference the TSS of the current task.
The LTR (load task register) and STR (store task register) instructions load and read the visible
portion of the task register. The LTR instruction loads a segment selector (source operand) into
the task register that points to a TSS descriptor in the GDT, and then loads the invisible portion
of the task register with information from the TSS descriptor. This instruction is a privileged
instruction that may be executed only when the CPL is 0. The LTR instruction generally is used
during system initialization to put an initial value in the task register. Afterwards, the contents
of the task register are changed implicitly when a task switch occurs.
The STR (store task register) instruction stores the visible portion of the task register in a
general-purpose register or memory. This instruction can be executed by code running at any
privilege level, to identify the currently running task; however, it is normally used only by operating
system software.
On power up or reset of the processor, the segment selector and base address are set to the default
value of 0 and the limit is set to FFFFH.
6.2.4. Task-Gate Descriptor
A task-gate descriptor provides an indirect, protected reference to a task. Figure 6-5 shows the
format of a task-gate descriptor. A task-gate descriptor can be placed in the GDT, an LDT, or the
IDT.
The TSS segment selector field in a task-gate descriptor points to a TSS descriptor in the GDT.
The RPL in this segment selector is not used.
The DPL of a task-gate descriptor controls access to the TSS descriptor during a task switch.
When a program or procedure makes a call or jump to a task through a task gate, the CPL and
the RPL field of the gate selector pointing to the task gate must be less than or equal to the DPL
of the task-gate descriptor. (Note that when a task gate is used, the DPL of the destination TSS
descriptor is not used.)
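The privilege rule for a call or jump through a task gate can be written as a single comparison. The Python sketch below is only a model of the rule stated above; the function name and argument names are hypothetical.

```python
def task_gate_access_allowed(cpl, gate_selector_rpl, gate_dpl):
    """Return True if a CALL or JMP through a task gate passes the DPL check.

    Privilege levels run from 0 (most privileged) to 3 (least privileged).
    The DPL of the destination TSS descriptor is deliberately not an input,
    because it is not used when the access goes through a task gate.
    """
    for level in (cpl, gate_selector_rpl, gate_dpl):
        if level not in (0, 1, 2, 3):
            raise ValueError("privilege levels must be in the range 0..3")
    # Both the CPL and the RPL must be numerically <= the gate's DPL.
    return max(cpl, gate_selector_rpl) <= gate_dpl
```

With these rules, code running at CPL 3 can reach a task through a gate whose DPL is 3, but not through a gate whose DPL is 0.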
Figure 6-4
Figure 6-5
A task can be accessed either through a task-gate descriptor or a TSS descriptor. Both of these
structures are provided to satisfy the following needs:
The need for a task to have only one busy flag. Because the busy flag for a task is stored in
the TSS descriptor, each task should have only one TSS descriptor. There may, however,
be several task gates that reference the same TSS descriptor.
The need to provide selective access to tasks. Task gates fill this need, because they can
reside in an LDT and can have a DPL that is different from the TSS descriptor's DPL. A
program or procedure that does not have sufficient privilege to access the TSS descriptor
for a task in the GDT (which usually has a DPL of 0) may be allowed access to the task
through a task gate with a higher DPL. Task gates give the operating system greater
latitude for limiting access to specific tasks.
The need for an interrupt or exception to be handled by an independent task. Task gates
may also reside in the IDT, which allows interrupts and exceptions to be handled by
handler tasks. When an interrupt or exception vector points to a task gate, the processor
switches to the specified task.
Figure 6-6 illustrates how a task gate in an LDT, a task gate in the GDT, and a task gate in the
IDT can all point to the same task.
6.3. TASK SWITCHING
The processor transfers execution to another task in any of four cases:
The current program, task, or procedure executes a JMP or CALL instruction to a TSS
descriptor in the GDT.
The current program, task, or procedure executes a JMP or CALL instruction to a task-gate
descriptor in the GDT or the current LDT.
An interrupt or exception vector points to a task-gate descriptor in the IDT.
The current task executes an IRET when the NT flag in the EFLAGS register is set.
The JMP, CALL, and IRET instructions, as well as interrupts and exceptions, are all generalized
mechanisms for redirecting a program. The referencing of a TSS descriptor or a task gate (when
calling or jumping to a task) or the state of the NT flag (when executing an IRET instruction)
determines whether a task switch occurs.
The processor performs the following operations when switching to a new task:
1. Obtains the TSS segment selector for the new task as the operand of the JMP or CALL
instruction, from a task gate, or from the previous task link field (for a task switch initiated
with an IRET instruction).
Figure 6-6. Task Gates Referencing the Same Task
2. Checks that the current (old) task is allowed to switch to the new task. Data-access
privilege rules apply to JMP and CALL instructions. The CPL of the current (old) task and
the RPL of the segment selector for the new task must be less than or equal to the DPL of
the TSS descriptor or task gate being referenced. Exceptions, interrupts (except for
interrupts generated by the INT n instruction), and the IRET instruction are permitted to
switch tasks regardless of the DPL of the destination task-gate or TSS descriptor. For
interrupts generated by the INT n instruction, the DPL is checked.
3. Checks that the TSS descriptor of the new task is marked present and has a valid limit
(greater than or equal to 67H).
4. Checks that the new task is available (call, jump, exception, or interrupt) or busy (IRET
return).
5. Checks that the current (old) TSS, new TSS, and all segment descriptors used in the task
switch are paged into system memory.
6. If the task switch was initiated with a JMP or IRET instruction, the processor clears the
busy (B) flag in the current (old) task's TSS descriptor; if initiated with a CALL
instruction, an exception, or an interrupt, the busy (B) flag is left set. (Refer to Table 6-2.)
7. If the task switch was initiated with an IRET instruction, the processor clears the NT flag
in a temporarily saved image of the EFLAGS register; if initiated with a CALL or JMP
instruction, an exception, or an interrupt, the NT flag is left unchanged in the saved
EFLAGS image.
8. Saves the state of the current (old) task in the current task's TSS. The processor finds the
base address of the current TSS in the task register and then copies the states of the
following registers into the current TSS: all the general-purpose registers, segment
selectors from the segment registers, the temporarily saved image of the EFLAGS register,
and the instruction pointer register (EIP).
NOTE
At this point, if all checks and saves have been carried out successfully, the
processor commits to the task switch. If an unrecoverable error occurs in
steps 1 through 8, the processor does not complete the task switch and ensures
that the processor is returned to its state prior to the execution of the
instruction that initiated the task switch. If an unrecoverable error occurs after
the commit point (in steps 9 through 14), the processor completes the task
switch (without performing additional access and segment availability
checks) and generates the appropriate exception prior to beginning execution
of the new task. If exceptions occur after the commit point, the exception
handler must finish the task switch itself before allowing the processor to
begin executing the task. Refer to Chapter 5, Interrupt and Exception
Handling for more information about the effect of exceptions on a task when
they occur after the commit point of a task switch.
9. If the task switch was initiated with a CALL instruction, an exception, or an interrupt, the
processor sets the NT flag in the EFLAGS image stored in the new task's TSS; if initiated
with an IRET instruction, the processor restores the NT flag from the EFLAGS image
stored on the stack. If initiated with a JMP instruction, the NT flag is left unchanged.
(Refer to Table 6-2.)
10. If the task switch was initiated with a CALL instruction, JMP instruction, an exception, or
an interrupt, the processor sets the busy (B) flag in the new task's TSS descriptor; if
initiated with an IRET instruction, the busy (B) flag is left set.
11. Sets the TS flag in the control register CR0 image stored in the new task's TSS.
12. Loads the task register with the segment selector and descriptor for the new task's TSS.
13. Loads the new task's state from its TSS into the processor. Any errors associated with the
loading and qualification of segment descriptors in this step occur in the context of the new
task. The task state information that is loaded here includes the LDTR register, the PDBR
(control register CR3), the EFLAGS register, the EIP register, the general-purpose
registers, and the segment descriptor parts of the segment registers.
14. Begins executing the new task. (To an exception handler, the first instruction of the new
task appears not to have been executed.)
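The checks in steps 2 through 4 can be sketched as follows. This is a simplified Python model, not a description of processor internals: the function name, the initiator strings, and the returned reason strings are all hypothetical. The checks run in the order of the list above, whereas the order used by a real processor is model specific.

```python
MIN_TSS_LIMIT = 0x67  # smallest valid limit for a 32-bit TSS (step 3)

# Initiators for which the DPL is checked (step 2).
DPL_CHECKED = {"JMP", "CALL", "INT_N"}
ALL_INITIATORS = DPL_CHECKED | {"EXCEPTION", "INTERRUPT", "IRET"}


def failed_precommit_check(initiator, cpl, rpl, dpl, present, limit, busy):
    """Return None if the switch may proceed, else the name of the failed check.

    cpl, rpl : privilege of the old task and of the selector for the new task
    dpl      : DPL of the TSS descriptor or task gate being referenced
    present, limit, busy : fields of the new task's TSS descriptor
    """
    if initiator not in ALL_INITIATORS:
        raise ValueError("unknown initiator: %r" % (initiator,))
    # Step 2: exceptions, hardware interrupts, and IRET bypass the DPL check.
    if initiator in DPL_CHECKED and max(cpl, rpl) > dpl:
        return "privilege"
    # Step 3: descriptor must be present and large enough.
    if not present:
        return "not present"
    if limit < MIN_TSS_LIMIT:
        return "limit"
    # Step 4: IRET expects a busy task; every other initiator an available one.
    if busy != (initiator == "IRET"):
        return "busy state"
    return None
```

In this model a hardware interrupt taken at CPL 3 may switch to a task whose gate has a DPL of 0, while a CALL from the same code fails the privilege check.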
The state of the currently executing task is always saved when a successful task switch occurs.
If the task is resumed, execution starts with the instruction pointed to by the saved EIP value,
and the registers are restored to the values they held when the task was suspended.
When switching tasks, the new task does not inherit its privilege level from
the suspended task. The new task begins executing at the privilege level specified in the CPL
field of the CS register, which is loaded from the TSS. Because tasks are isolated by their separate
address spaces and TSSs and because privilege rules control access to a TSS, software does
not need to perform explicit privilege checks on a task switch.
Table 6-1 shows the exception conditions that the processor checks for when switching tasks. It
also shows the exception that is generated for each check if an error is detected and the segment
that the error code references. (The order of the checks in the table is the order used in the P6
family processors. The exact order is model specific and may be different for other Intel Architecture
processors.) Exception handlers designed to handle these exceptions may be subject to
recursive calls if they attempt to reload the segment selector that generated the exception. The
cause of the exception (or the first of multiple causes) should be fixed before reloading the
selector.
NOTES:
1. #NP is segment-not-present exception, #GP is general-protection exception, #TS is invalid-TSS exception,
and #SF is stack-fault exception.
2. The error code contains an index to the segment descriptor referenced in this column.
3. A segment selector is valid if it is in a compatible type of table (GDT or LDT), occupies an address within
the table's segment limit, and refers to a compatible type of descriptor (for example, a segment selector in
the CS register only is valid when it points to a code-segment descriptor).
The TS (task switched) flag in the control register CR0 is set every time a task switch occurs.
System software uses the TS flag to coordinate the actions of the floating-point unit when generating
floating-point exceptions with the rest of the processor. The TS flag indicates that the
context of the floating-point unit may be different from that of the current task. Refer to Section
2.5., "Control Registers" in Chapter 2, System Architecture Overview for a detailed description
of the function and use of the TS flag.
6.4. TASK LINKING
The previous task link field of the TSS (sometimes called the "backlink") and the NT flag in the
EFLAGS register are used to return execution to the previous task. The NT flag indicates
whether the currently executing task is nested within the execution of another task, and the
previous task link field of the current task's TSS holds the TSS selector for the higher-level task
in the nesting hierarchy, if there is one (refer to Figure 6-7).
When a CALL instruction, an interrupt, or an exception causes a task switch, the processor
copies the segment selector for the current TSS into the previous task link field of the TSS for
the new task, and then sets the NT flag in the EFLAGS register. The NT flag indicates that the
previous task link field of the TSS has been loaded with a saved TSS segment selector. If software
uses an IRET instruction to suspend the new task, the processor uses the value in the
previous task link field and the NT flag to return to the previous task; that is, if the NT flag is
set, the processor performs a task switch to the task specified in the previous task link field.
When a JMP instruction causes a task switch, the new task is not nested; that
is, the NT flag is set to 0 and the previous task link field is not used. A JMP
instruction is used to dispatch a new task when nesting is not desired.
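The nesting behavior described above can be modeled with a few lines of Python. The class and method names are hypothetical, and the model tracks only the previous task link field and the NT flag; the busy flag and the rest of the task state are left out.

```python
class NestedTask:
    def __init__(self, name):
        self.name = name
        self.previous_task_link = None  # backlink field in the task's TSS
        self.nt = False                 # NT flag in the task's EFLAGS


class TaskChain:
    def __init__(self, initial_task):
        self.current = initial_task

    def call(self, new_task):
        """CALL, interrupt, or exception: nest new_task under the current task."""
        new_task.previous_task_link = self.current
        new_task.nt = True
        self.current = new_task

    def jmp(self, new_task):
        """JMP: dispatch new_task without nesting; the backlink is not written."""
        self.current = new_task

    def iret(self):
        """IRET with NT set: switch back to the task named by the backlink."""
        if not self.current.nt:
            raise RuntimeError("NT is clear: IRET does not cause a task switch")
        finished = self.current
        finished.nt = False  # NT is cleared in the saved EFLAGS image
        self.current = finished.previous_task_link
```

Calling task B from task A and then task C from task B builds the chain C, B, A; two IRET instructions unwind it back to A.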
Figure 6-7
Table 6-2 summarizes the uses of the busy flag (in the TSS segment descriptor), the NT flag, the
previous task link field, and TS flag (in control register CR0) during a task switch. Note that the
NT flag may be modified by software executing at any privilege level. It is possible for a
program to set its NT flag and execute an IRET instruction, which would have the effect of
invoking the task specified in the previous link field of the current task's TSS. To keep spurious
task switches from succeeding, the operating system should initialize the previous task link field
for every TSS it creates to 0.
6.4.1. Use of Busy Flag To Prevent Recursive Task Switching
A TSS allows only one context to be saved for a task; therefore, once a task is called
(dispatched), a recursive (or re-entrant) call to the task would cause the current state of the task
to be lost. The busy flag in the TSS segment descriptor is provided to prevent re-entrant task
switching and subsequent loss of task state information. The processor manages the busy flag as
follows:
1. When dispatching a task, the processor sets the busy flag of the new task.
2. If during a task switch, the current task is placed in a nested chain (the task switch is being
generated by a CALL instruction, an interrupt, or an exception), the busy flag for the
current task remains set.
3. When switching to the new task (initiated by a CALL instruction, interrupt, or exception),
the processor generates a general-protection exception (#GP) if the busy flag of the new
task is already set. (If the task switch is initiated with an IRET instruction, the exception is
not raised because the processor expects the busy flag to be set.)
4. When a task is terminated by a jump to a new task (initiated with a JMP instruction in the
task code) or by an IRET instruction in the task code, the processor clears the busy flag,
returning the task to the "not busy" state.
In this manner the processor prevents recursive task switching by preventing a task from
switching to itself or to any task in a nested chain of tasks. The chain of nested suspended tasks
may grow to any length, due to multiple calls, interrupts, or exceptions. The busy flag prevents
a task from being invoked if it is in this chain.
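The four busy-flag rules can be exercised with a small Python model. Everything named here (the exception class, `TaskModel`, `switch_task`, and the initiator strings) is hypothetical and exists only to illustrate the rules; it does not model the locked bus cycles or any other task state.

```python
class GeneralProtectionFault(Exception):
    """Stands in for the #GP exception in this model."""


class TaskModel:
    def __init__(self, name, busy=False):
        self.name = name
        self.busy = busy  # busy (B) flag in the task's TSS descriptor


def switch_task(current, new, initiator):
    """Apply the busy-flag rules for one task switch; return the new current task."""
    if initiator == "IRET":
        current.busy = False   # rule 4: the returning task becomes not busy
        return new             # the target is expected to be busy already
    if initiator not in ("CALL", "JMP", "INTERRUPT", "EXCEPTION"):
        raise ValueError("unknown initiator: %r" % (initiator,))
    if new.busy:
        raise GeneralProtectionFault(new.name)  # rule 3: no re-entrant dispatch
    if initiator == "JMP":
        current.busy = False   # rule 4: a JMP terminates the old task
    new.busy = True            # rule 1 (rule 2: a nested old task stays busy)
    return new
```

If task A calls task B, an attempt by B to call A raises the model's #GP stand-in because A is still busy, which is the recursion the busy flag is meant to prevent.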
The busy flag may be used in multiprocessor configurations, because the processor follows a
LOCK protocol (on the bus or in the cache) when it sets or clears the busy flag. This lock keeps
two processors from invoking the same task at the same time. (Refer to Section 7.1.2.1., "Automatic
Locking" in Chapter 7, Multiple-Processor Management for more information about
setting the busy flag in multiprocessor applications.)
6.4.2. Modifying Task Linkages
In a uniprocessor system, in situations where it is necessary to remove a task from a chain of
linked tasks, use the following procedure to remove the task:
1. Disable interrupts.
2. Change the previous task link field in the TSS of the pre-empting task (the task that
suspended the task to be removed). It is assumed that the pre-empting task is the next task
(newer task) in the chain from the task to be removed. The previous task link field
should be changed to point to the TSS of the next oldest or to an even older task in the chain.
3. Clear the busy (B) flag in the TSS segment descriptor for the task being removed from the
chain. If more than one task is being removed from the chain, the busy flag for each task
being removed must be cleared.
4. Enable interrupts.
In a multiprocessing system, additional synchronization and serialization operations must be
added to this procedure to ensure that the TSS and its segment descriptor are both locked when
the previous task link field is changed and the busy flag is cleared.
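Steps 2 and 3 of the uniprocessor procedure amount to unlinking one node from a singly linked list. The Python sketch below models only that bookkeeping. The dictionary layout and the function name are hypothetical, and disabling and enabling interrupts (steps 1 and 4) is outside what such a model can show.

```python
def remove_from_chain(tasks, preempting, removed):
    """Unlink `removed` from a chain of nested tasks.

    tasks maps a task name to {"backlink": name or None, "busy": bool}.
    `preempting` must be the next newer task, whose backlink names `removed`.
    """
    if tasks[preempting]["backlink"] != removed:
        raise ValueError("preempting task does not link to the task to remove")
    # Step 2: point the pre-empting task at the next oldest task in the chain.
    tasks[preempting]["backlink"] = tasks[removed]["backlink"]
    # Step 3: clear the busy flag in the removed task's TSS descriptor.
    tasks[removed]["busy"] = False
```

For a chain C, B, A (newest first), removing B leaves C linked directly to A, with B no longer busy.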
6.5. TASK ADDRESS SPACE
The address space for a task consists of the segments that the task can access. These segments
include the code, data, stack, and system segments referenced in the TSS and any other segments
accessed by the task code. These segments are mapped into the processor's linear address space,
which is in turn mapped into the processor's physical address space (either directly or through
paging).
The LDT segment field in the TSS can be used to give each task its own LDT. Giving a task its
own LDT allows the task address space to be isolated from other tasks by placing the segment
descriptors for all the segments associated with the task in the task's LDT.
It also is possible for several tasks to use the same LDT. This is a simple and memory-efficient
way to allow some tasks to communicate with or control each other, without dropping the
protection barriers for the entire system.
Because all tasks have access to the GDT, it also is possible to create shared segments accessed
through segment descriptors in this table.
If paging is enabled, the CR3 register (PDBR) field in the TSS allows each task to also have
its own set of page tables for mapping linear addresses to physical addresses. Or, several tasks
can share the same set of page tables.
6.5.1. Mapping Tasks to the Linear and Physical Address Spaces
Tasks can be mapped to the linear address space and physical address space in either of two
ways:
One linear-to-physical address space mapping is shared among all tasks. When paging is
not enabled, this is the only choice. Without paging, all linear addresses map to the same
physical addresses. When paging is enabled, this form of linear-to-physical address space
mapping is obtained by using one page directory for all tasks. The linear address space
may exceed the available physical space if demand-paged virtual memory is supported.
Each task has its own linear address space that is mapped to the physical address space.
This form of mapping is accomplished by using a different page directory for each task.
Because the PDBR (control register CR3) is loaded on each task switch, each task may
have a different page directory.
The linear address spaces of different tasks may map to completely distinct physical addresses.
If the entries of different page directories point to different page tables and the page tables point
to different pages of physical memory, then the tasks do not share any physical addresses.
With either method of mapping task linear address spaces, the TSSs for all tasks must lie in a
shared area of the physical space, which is accessible to all tasks. This mapping is required so
that the mapping of TSS addresses does not change while the processor is reading and updating
the TSSs during a task switch. The linear address space mapped by the GDT also should be
mapped to a shared area of the physical space; otherwise, the purpose of the GDT is defeated.
Figure 6-8 shows how the linear address spaces of two tasks can overlap in the physical space
by sharing page tables.
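The effect of per-task page directories can be illustrated with a toy translation routine in Python. This sketch assumes 4-KByte pages and the two-level split of a 32-bit linear address into a 10-bit directory index, a 10-bit table index, and a 12-bit offset. Dictionaries stand in for the page directory and page tables, and all names and addresses are hypothetical.

```python
def translate(page_directory, linear):
    """Translate a 32-bit linear address using dict-based paging structures.

    page_directory maps a directory index to a page table; a page table maps
    a table index to the physical base address of a 4-KByte page.
    """
    dir_index = (linear >> 22) & 0x3FF
    table_index = (linear >> 12) & 0x3FF
    offset = linear & 0xFFF
    page_table = page_directory[dir_index]   # KeyError models a missing entry
    return page_table[table_index] + offset


# Two tasks with their own page directories. Directory entry 0 is private to
# each task; directory entry 1 points to one page table shared by both.
shared_table = {0: 0x5000}
task_a_directory = {0: {0: 0x1000}, 1: shared_table}
task_b_directory = {0: {0: 0x2000}, 1: shared_table}
```

Here linear address 00000123H maps to different physical addresses in the two tasks, while linear address 00400010H maps to the same physical address in both because the two directories share a page table.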
Figure 6-8
6.5.2. Task Logical Address Space
To allow the sharing of data among tasks, use any of the following techniques to create shared
logical-to-physical address-space mappings for data segments:
Through the segment descriptors in the GDT. All tasks must have access to the segment
descriptors in the GDT. If some segment descriptors in the GDT point to segments in the
linear-address space that are mapped into an area of the physical-address space common to
all tasks, then all tasks can share the data and code in those segments.
Through a shared LDT. Two or more tasks can use the same LDT if the LDT fields in their
TSSs point to the same LDT. If some segment descriptors in a shared LDT point to
segments that are mapped to a common area of the physical address space, the data and
code in those segments can be shared among the tasks that share the LDT. This method of
sharing is more selective than sharing through the GDT, because the sharing can be limited
to specific tasks. Other tasks in the system may have different LDTs that do not give them
access to the shared segments.
Through segment descriptors in distinct LDTs that are mapped to common addresses in the
linear address space. If this common area of the linear address space is mapped to the same
area of the physical address space for each task, these segment descriptors permit the tasks
to share segments. Such segment descriptors are commonly called aliases. This method of
sharing is even more selective than those listed above, because other segment descriptors
in the LDTs may point to independent linear addresses which are not shared.
6.6. 16-BIT TASK-STATE SEGMENT (TSS)
The 32-bit Intel Architecture processors also recognize a 16-bit TSS format like the one used in
Intel 286 processors (refer to Figure 6-9). It is supported for compatibility with software written
to run on these earlier Intel Architecture processors.
The following additional information is important to know about the 16-bit TSS.
Do not use a 16-bit TSS to implement a virtual-8086 task.
The valid segment limit for a 16-bit TSS is 2CH.
The 16-bit TSS does not contain a field for the base address of the page directory, which is
loaded into control register CR3. Therefore, a separate set of page tables for each task is
not supported for 16-bit tasks. If a 16-bit task is dispatched, the page-table structure for the
previous task is used.
The I/O base address is not included in the 16-bit TSS, so none of the functions of the I/O
map are supported.
When task state is saved in a 16-bit TSS, the upper 16 bits of the EFLAGS register and the
EIP register are lost.
When the general-purpose registers are loaded or saved from a 16-bit TSS, the upper 16
bits of the registers are modified and not maintained.
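The loss of the upper 16 bits can be shown with simple masking. The Python sketch below is a model only; the function name, the dictionary keys, and the sample register values are hypothetical.

```python
def save_to_16bit_tss(eflags, eip):
    """Model saving EFLAGS and EIP into the 16-bit fields of a 16-bit TSS.

    Only the low 16 bits of each register fit, so the upper halves are lost.
    """
    return {"flags": eflags & 0xFFFF, "ip": eip & 0xFFFF}


saved = save_to_16bit_tss(0x00010202, 0x12345678)
```

In this example the saved image holds 0202H and 5678H; bit 16 of the flags value and the upper half of the instruction pointer cannot be recovered when the task is resumed.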
Figure 6-9
CHAPTER 8 PROCESSOR MANAGEMENT AND INITIALIZATION
This chapter describes the facilities provided for managing processor-wide functions and for
initializing the processor. The subjects covered include: processor initialization, FPU initialization,
processor configuration, feature determination, mode switching, the MSRs (in the
Pentium® and P6 family processors), and the MTRRs (in the P6 family processors).
8.1. INITIALIZATION OVERVIEW
Following power-up or an assertion of the RESET# pin, each processor on the system bus
performs a hardware initialization of the processor (known as a hardware reset) and an optional
built-in self-test (BIST). A hardware reset sets each processor's registers to a known state and
places the processor in real-address mode. It also invalidates the internal caches, translation
lookaside buffers (TLBs) and the branch target buffer (BTB). At this point, the action taken
depends on the processor family:
P6 family processors-All the processors on the system bus (including a single processor
in a uniprocessor system) execute the multiple processor (MP) initialization protocol
across the APIC bus. The processor that is selected through this protocol as the bootstrap
processor (BSP) then immediately starts executing software-initialization code in the
current code segment beginning at the offset in the EIP register. The application (non-BSP)
processors (AP) go into a halt state while the BSP is executing initialization code. Refer to
Section 7.7., "Multiple-Processor (MP) Initialization Protocol" in Chapter 7, Multiple-
Processor Management for more details. Note that in a uniprocessor system, the single P6
family processor automatically becomes the BSP.
Pentium® processors-In either a single- or dual-processor system, a single Pentium®
processor is always pre-designated as the primary processor. Following a reset, the primary
processor behaves as follows in both single- and dual-processor systems. Using the dual-processor
(DP) ready initialization protocol, the primary processor immediately starts
executing software-initialization code in the current code segment beginning at the offset
in the EIP register. The secondary processor (if there is one) goes into a halt state. (Refer to
Section 7.6., "Dual-Processor (DP) Initialization Protocol" in Chapter 7, Multiple-
Processor Management for more details.)
Intel486™ processor-The primary processor (or single processor in a uniprocessor
system) immediately starts executing software-initialization code in the current code
segment beginning at the offset in the EIP register. (The Intel486™ does not automatically
execute a DP or MP initialization protocol to determine which processor is the primary
processor.)
The software-initialization code performs all system-specific initialization of the BSP or
primary processor and the system logic.
At this point, for MP (or DP) systems, the BSP (or primary) processor wakes up each AP (or
secondary) processor to enable those processors to execute self-configuration code.
When all processors are initialized, configured, and synchronized, the BSP or primary processor
begins executing an initial operating-system or executive task.
The floating-point unit (FPU) is also initialized to a known state during hardware reset. FPU
software initialization code can then be executed to perform operations such as setting the precision
of the FPU and the exception masks. No special initialization of the FPU is required to
switch operating modes.
Asserting the INIT# pin on the processor invokes a similar response to a hardware reset. The
major difference is that during an INIT, the internal caches, MSRs, MTRRs, and FPU state are
left unchanged (although the TLBs and BTB are invalidated as with a hardware reset). An INIT
provides a method for switching from protected to real-address mode while maintaining the
contents of the internal caches.
8.1.1. Processor State After Reset
Table 8-1 shows the state of the flags and other registers following power-up for the Pentium®
Pro, Pentium®, and Intel486™ processors. The state of control register CR0 is 60000010H (refer
to Figure 8-1), which places the processor in real-address mode with paging disabled.
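Decoding the reset value of CR0 bit by bit shows why the processor starts in real-address mode with paging disabled. The Python sketch below assumes the standard CR0 flag positions (PE in bit 0, ET in bit 4, NW in bit 29, CD in bit 30, PG in bit 31, and so on); the table and function names are hypothetical.

```python
# Assumed CR0 flag positions (bit numbers).
CR0_FLAGS = {
    "PE": 0, "MP": 1, "EM": 2, "TS": 3, "ET": 4, "NE": 5,
    "WP": 16, "AM": 18, "NW": 29, "CD": 30, "PG": 31,
}


def cr0_flags_set(cr0):
    """Return the set of CR0 flag names whose bits are 1 in the given value."""
    return {name for name, bit in CR0_FLAGS.items() if (cr0 >> bit) & 1}


reset_flags = cr0_flags_set(0x60000010)
```

For 60000010H only CD, NW, and ET are set. PE is clear, which selects real-address mode, and PG is clear, which means paging is disabled.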
8.1.2. Processor Built-In Self-Test (BIST)
Hardware may request that the BIST be performed at power-up. The EAX register is cleared
(0H) if the processor passes the BIST. A nonzero value in the EAX register after the BIST indicates
that a processor fault was detected. If the BIST is not requested, the EAX
register contains 0H after a hardware reset.
The overhead for performing a BIST varies between processor families. For example, the BIST
takes approximately 5.5 million processor clock periods to execute on the Pentium® Pro
processor. (This clock count is model-specific, and Intel reserves the right to change the exact
number of periods, for any of the Intel Architecture processors, without notification.)
Table 8-1. 32-Bit Intel Architecture Processor States Following Power-up, Reset, or INIT
Register | P6 Family Processors | Pentium Processor | Intel486 Processor
EFLAGS (note 1) | 00000002H | 00000002H | 00000002H
EIP | 0000FFF0H | 0000FFF0H | 0000FFF0H
CR0 | 60000010H (note 2) | 60000010H (note 2) | 60000010H (note 2)
CR2, CR3, CR4 | 00000000H | 00000000H | 00000000H
MXCSR | Pentium III processor only - Pwr up or Reset: 1F80H; FINIT/FNINIT: Unchanged | NA | NA
CS | Selector = F000H, Base = FFFF0000H, Limit = FFFFH, AR = Present, R/W, Accessed | Selector = F000H, Base = FFFF0000H, Limit = FFFFH, AR = Present, R/W, Accessed | Selector = F000H, Base = FFFF0000H, Limit = FFFFH, AR = Present, R/W, Accessed
SS, DS, ES, FS, GS | Selector = 0000H, Base = 00000000H, Limit = FFFFH, AR = Present, R/W, Accessed | Selector = 0000H, Base = 00000000H, Limit = FFFFH, AR = Present, R/W, Accessed | Selector = 0000H, Base = 00000000H, Limit = FFFFH, AR = Present, R/W, Accessed
EDX | 000006xxH | 000005xxH | 000004xxH
EAX | 0 (note 3) | 0 (note 3) | 0 (note 3)
EBX, ECX, ESI, EDI, EBP, ESP | 00000000H | 00000000H | 00000000H
MM0 through MM7 (note 4) | Pentium Pro processor - NA; Pentium II and Pentium III processor - Pwr up or Reset: 0000000000000000H; FINIT/FNINIT: Unchanged | Pwr up or Reset: 0000000000000000H; FINIT/FNINIT: Unchanged | NA
XMM0 through XMM7 (note 5) | Pentium III processor only - Pwr up or Reset: 0000000000000000H; FINIT/FNINIT: Unchanged | NA | NA
ST0 through ST7 (note 4) | Pwr up or Reset: +0.0; FINIT/FNINIT: Unchanged | Pwr up or Reset: +0.0; FINIT/FNINIT: Unchanged | Pwr up or Reset: +0.0; FINIT/FNINIT: Unchanged
FPU Control Word (note 4) | Pwr up or Reset: 0040H; FINIT/FNINIT: 037FH | Pwr up or Reset: 0040H; FINIT/FNINIT: 037FH | Pwr up or Reset: 0040H; FINIT/FNINIT: 037FH
FPU Status Word (note 4) | Pwr up or Reset: 0000H; FINIT/FNINIT: 0000H | Pwr up or Reset: 0000H; FINIT/FNINIT: 0000H | Pwr up or Reset: 0000H; FINIT/FNINIT: 0000H
FPU Tag Word (note 4) | Pwr up or Reset: 5555H; FINIT/FNINIT: FFFFH | Pwr up or Reset: 5555H; FINIT/FNINIT: FFFFH | Pwr up or Reset: 5555H; FINIT/FNINIT: FFFFH
FPU Data Operand and CS Seg. Selectors (note 4) | Pwr up or Reset: 0000H; FINIT/FNINIT: 0000H | Pwr up or Reset: 0000H; FINIT/FNINIT: 0000H | Pwr up or Reset: 0000H; FINIT/FNINIT: 0000H
FPU Data Operand and Inst. Pointers (note 4) | Pwr up or Reset: 00000000H; FINIT/FNINIT: 00000000H | Pwr up or Reset: 00000000H; FINIT/FNINIT: 00000000H | Pwr up or Reset: 00000000H; FINIT/FNINIT: 00000000H
GDTR, IDTR | Base = 00000000H, Limit = FFFFH, AR = Present, R/W | Base = 00000000H, Limit = FFFFH, AR = Present, R/W | Base = 00000000H, Limit = FFFFH, AR = Present, R/W
LDTR, Task Register | Selector = 0000H, Base = 00000000H, Limit = FFFFH, AR = Present, R/W | Selector = 0000H, Base = 00000000H, Limit = FFFFH, AR = Present, R/W | Selector = 0000H, Base = 00000000H, Limit = FFFFH, AR = Present, R/W
DR0, DR1, DR2, DR3 | 00000000H | 00000000H | 00000000H
DR6 | FFFF0FF0H | FFFF0FF0H | FFFF1FF0H
DR7 | 00000400H | 00000400H | 00000000H
Time-Stamp Counter | Power up or Reset: 0H; INIT: Unchanged | Power up or Reset: 0H; INIT: Unchanged | Not Implemented
Perf. Counters and Event Select | Power up or Reset: 0H; INIT: Unchanged | Power up or Reset: 0H; INIT: Unchanged | Not Implemented
All Other MSRs | Pwr up or Reset: Undefined; INIT: Unchanged | Pwr up or Reset: Undefined; INIT: Unchanged | Not Implemented
Data and Code Cache, TLBs | Invalid | Invalid | Invalid
Fixed MTRRs | Pwr up or Reset: Disabled; INIT: Unchanged | Not Implemented | Not Implemented
Variable MTRRs | Pwr up or Reset: Disabled; INIT: Unchanged | Not Implemented | Not Implemented
Machine-Check Architecture | Pwr up or Reset: Undefined; INIT: Unchanged | Not Implemented | Not Implemented
APIC | Pwr up or Reset: Enabled; INIT: Unchanged | Pwr up or Reset: Enabled; INIT: Unchanged | Not Implemented
NOTES:
1. The 10 most-significant bits of the EFLAGS register are undefined following a reset. Software should not
depend on the states of any of these bits.
2. The CD and NW flags are unchanged, bit 4 is set to 1, all other bits are cleared.
3. If Built-In Self-Test (BIST) is invoked on power up or reset, EAX is 0 only if all tests passed. (BIST cannot
be invoked during an INIT.)
4. The FPU state and the state of the MMX™ registers are not changed by the execution of an INIT.
5. Available in the Pentium® III processor and Pentium® III Xeon™ processor only. The state of the SIMD
floating-point registers is not changed by the execution of an INIT.
Figure 8-1
8.1.3. Model and Stepping Information
Following a hardware reset, the EDX register contains component identification and revision
information (refer to Figure 8-2). The device ID field is set to the value 6H, 5H, 4H, or 3H to
indicate a Pentium® Pro, Pentium®, Intel486™, or Intel386™ processor, respectively. Different
values may be returned for the various members of these Intel Architecture families. For
example, the Intel386™ SX processor returns 23H in the device ID field. Binary object code can
be made compatible with other Intel processors by using this number to select the correct initialization
software.
Figure 8-2
The stepping ID field contains a unique identifier for the processor's stepping ID or revision
level. The upper word of EDX is reserved following reset.
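Initialization code that selects a code path from the reset value of EDX might decode it as in the Python sketch below. The bit layout used here (family in bits 11 through 8, model in bits 7 through 4, stepping ID in bits 3 through 0) is an assumption that matches the 000006xxH, 000005xxH, and 000004xxH patterns in Table 8-1; the function name and the sample value are hypothetical.

```python
FAMILY_NAMES = {6: "Pentium Pro", 5: "Pentium", 4: "Intel486", 3: "Intel386"}


def decode_reset_edx(edx):
    """Split the low word of the reset-time EDX value into its assumed fields."""
    return {
        "family": (edx >> 8) & 0xF,   # assumed: bits 11..8
        "model": (edx >> 4) & 0xF,    # assumed: bits 7..4
        "stepping": edx & 0xF,        # assumed: bits 3..0
    }


info = decode_reset_edx(0x00000619)   # hypothetical sample value
```

The sample value decodes to family 6, model 1, stepping 9, and `FAMILY_NAMES[info["family"]]` gives "Pentium Pro". Values that do not follow this layout, such as the 23H reported by the Intel386™ SX processor, would need their own handling.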
8.1.4. First Instruction Executed
The first instruction that is fetched and executed following a hardware reset is located at physical
address FFFFFFF0H. This address is 16 bytes below the processor's uppermost physical
address. The EPROM containing the software-initialization code must be located at this address.
The address FFFFFFF0H is beyond the 1-MByte addressable range of the processor while in
real-address mode. The processor is initialized to this starting address as follows. The CS
register has two parts: the visible segment selector part and the hidden base address part. In real-address
mode, the base address is normally formed by shifting the 16-bit segment selector value
4 bits to the left to produce a 20-bit base address. However, during a hardware reset, the segment
selector in the CS register is loaded with F000H and the base address is loaded with
FFFF0000H. The starting address is thus formed by adding the base address to the value in the
EIP register (that is, FFFF0000 + FFF0H = FFFFFFF0H).
The first time the CS register is loaded with a new value after a hardware reset, the processor
will follow the normal rule for address translation in real-address mode (that is, [CS base address
= CS segment selector * 16]). To ensure that the base address in the CS register remains
unchanged until the EPROM-based software-initialization code is completed, the code must not
contain a far jump or far call or allow an interrupt to occur (which would cause the CS selector
value to be changed).
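The address arithmetic in this section can be checked with a short model. The following Python sketch is illustrative only: the function and constant names are hypothetical, and it does nothing more than reproduce the two base-address rules stated above.

```python
# Minimal model of how the first-instruction address is formed after reset.
# Names are illustrative; only the arithmetic comes from the text above.

RESET_CS_SELECTOR = 0xF000   # visible part of CS after a hardware reset
RESET_CS_BASE = 0xFFFF0000   # hidden base address loaded during reset
RESET_EIP = 0x0000FFF0       # EIP value after reset


def real_mode_base(selector: int) -> int:
    """Normal real-address mode rule: base = selector * 16."""
    return (selector & 0xFFFF) << 4


def fetch_address(cs_base: int, eip: int) -> int:
    """Address of the next fetch: hidden CS base + EIP, kept to 32 bits."""
    return (cs_base + eip) & 0xFFFFFFFF


# Address of the first instruction fetched after reset.
first_fetch = fetch_address(RESET_CS_BASE, RESET_EIP)

# Address that the same selector:offset pair would reach once CS has been
# reloaded (for example, by a far jump) and the normal rule applies.
after_cs_reload = fetch_address(real_mode_base(RESET_CS_SELECTOR), RESET_EIP)
```

Under these assumptions, reloading CS with the same selector value F000H moves the base from FFFF0000H to 000F0000H, which is why the initialization code avoids far transfers until it is ready to leave the top of the address space.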
8.2. FPU INITIALIZATION
Software-initialization code can determine whether the processor contains or is attached to
an FPU by using the CPUID instruction. The code must then initialize the FPU and set flags in
control register CR0 to reflect the state of the FPU environment.
A hardware reset places the Pentium® processor FPU in the state shown in Table 8-1. This state
is different from the state the processor is placed in when executing an FINIT or FNINIT instruction
(also shown in Table 8-1). If the FPU is to be used, the software-initialization code should
execute an FINIT/FNINIT instruction following a hardware reset. These instructions tag all
data registers as empty, clear all the exception masks, set the TOP-of-stack value to 0, and select
the default rounding and precision controls setting (round to nearest and 64-bit precision).
If the processor is reset by asserting the INIT# pin, the FPU state is not changed.
8.2.1. Configuring the FPU Environment
Initialization code must load the appropriate values into the MP, EM, and NE flags of control
register CR0. These bits are cleared on hardware reset of the processor. Figure 8-2 shows the
suggested settings for these flags, depending on the Intel Architecture processor being
initialized. Initialization code can test for the type of processor present before setting or clearing these
flags.
NOTE:
* The setting of the NE flag depends on the operating system being used.
The EM flag determines whether floating-point instructions are executed by the FPU (EM is
cleared) or generate a device-not-available exception (#NM) so that an exception handler can
emulate the floating-point operation (EM = 1). Ordinarily, the EM flag is cleared when an FPU
or math coprocessor is present and set if neither is present. If the EM flag is set and no FPU,
math coprocessor, or floating-point emulator is present, the system will hang when a floating-point
instruction is executed.
The MP flag determines whether WAIT/FWAIT instructions react to the setting of the TS flag.
If the MP flag is clear, WAIT/FWAIT instructions ignore the setting of the TS flag; if the MP
flag is set, they will generate a device-not-available exception (#NM) if the TS flag is set. Generally,
the MP flag should be set for processors with an integrated FPU and clear for processors
without an integrated FPU and without a math coprocessor present. However, an operating
system can choose to save the floating-point context at every context switch, in which case there
would be no need to set the MP bit.
Table 2-1 in Chapter 2, System Architecture Overview shows the actions taken for floating-point
and WAIT/FWAIT instructions based on the settings of the EM, MP, and TS flags.
The NE flag determines whether unmasked floating-point exceptions are handled by generating
a floating-point error exception internally (NE is set, native mode) or through an external interrupt
(NE is cleared). In systems where an external interrupt controller is used to invoke numeric
exception handlers (such as MS-DOS-based systems), the NE bit should be cleared.
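The interaction of the EM, MP, and TS flags described above can be modeled in a few lines. The following Python sketch is an illustration under stated assumptions, not initialization code: the function names are hypothetical, the bit positions are the architectural positions of the flags in CR0, and the decision logic is intended to follow the behavior summarized in Table 2-1.

```python
# Bit positions of the CR0 flags discussed in this section.
CR0_MP = 1 << 1   # monitor coprocessor
CR0_EM = 1 << 2   # emulation
CR0_TS = 1 << 3   # task switched
CR0_NE = 1 << 5   # numeric error


def configure_fpu_flags(cr0: int, fpu_present: bool, native_errors: bool) -> int:
    """Return a CR0 image with MP, EM, and NE chosen as the text suggests.

    Hypothetical helper: MP is set when an FPU is present, EM is set when
    no FPU is present (emulation), and NE follows the operating system's
    choice of error reporting.
    """
    cr0 &= ~(CR0_MP | CR0_EM | CR0_NE) & 0xFFFFFFFF
    if fpu_present:
        cr0 |= CR0_MP
    else:
        cr0 |= CR0_EM
    if native_errors:
        cr0 |= CR0_NE
    return cr0


def fpu_instruction_action(cr0: int, kind: str) -> str:
    """Return "execute" or "#NM" for a floating-point or WAIT/FWAIT instruction."""
    em = bool(cr0 & CR0_EM)
    mp = bool(cr0 & CR0_MP)
    ts = bool(cr0 & CR0_TS)
    if kind == "float":
        # EM forces emulation; TS forces a lazy context save.
        return "#NM" if (em or ts) else "execute"
    if kind == "wait":
        # WAIT/FWAIT reacts to TS only when MP is set.
        return "#NM" if (mp and ts) else "execute"
    raise ValueError("kind must be 'float' or 'wait'")
```

For example, `fpu_instruction_action(CR0_TS, "wait")` models a WAIT instruction with MP clear and TS set, which executes normally rather than raising #NM.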
8.2.2. Setting the Processor for FPU Software Emulation
Setting the EM flag causes the processor to generate a device-not-available exception (#NM)
and trap to a software exception handler whenever it encounters a floating-point instruction.
(Table 8-2 shows when it is appropriate to use this flag.) Setting this flag has two functions:
• It allows floating-point code to run on an Intel processor that neither has an integrated FPU
nor is connected to an external math coprocessor, by using a floating-point emulator.
• It allows floating-point code to be executed using a special or nonstandard floating-point
emulator, selected for a particular application, regardless of whether an FPU or math
coprocessor is present.
To emulate floating-point instructions, the EM, MP, and NE flags in control register CR0 should
be set as shown in Table 8-3.
Regardless of the value of the EM bit, the Intel486™ SX processor generates a device-not-available
exception (#NM) upon encountering any floating-point instruction.
8.3. CACHE ENABLING
The Intel Architecture processors (beginning with the Intel486™ processor) contain internal
instruction and data caches. These caches are enabled by clearing the CD and NW flags in
control register CR0. (They are set during a hardware reset.) Because all internal cache lines are
invalid following reset initialization, it is not necessary to invalidate the cache before enabling
caching. Any external caches may require initialization and invalidation using a system-specific
initialization and invalidation code sequence.
Depending on the hardware and operating system or executive requirements, additional configuration
of the processor's caching facilities will probably be required. Beginning with the
Intel486™ processor, page-level caching can be controlled with the PCD and PWT flags in
page-directory and page-table entries. For P6 family processors, the memory type range registers
(MTRRs) control the caching characteristics of the regions of physical memory. (For the
Intel486™ and Pentium® processors, external hardware can be used to control the caching characteristics
of regions of physical memory.) Refer to Chapter 9, Memory Cache Control, for
detailed information on configuration of the caching facilities in the P6 family processors and
system memory.
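Enabling the internal caches amounts to clearing two bits in CR0. The sketch below shows the bit manipulation only; the helper names are hypothetical, and the value 60000010H is used as an example CR0 image with CD and NW set, as they are after a hardware reset.

```python
# Architectural positions of the cache-control flags in CR0.
CR0_NW = 1 << 29   # not write-through
CR0_CD = 1 << 30   # cache disable


def enable_internal_caches(cr0: int) -> int:
    """Return a CR0 image with CD and NW cleared; other bits are preserved."""
    return cr0 & ~(CR0_CD | CR0_NW) & 0xFFFFFFFF


def caches_enabled(cr0: int) -> bool:
    """True when neither CD nor NW is set in the given CR0 image."""
    return (cr0 & (CR0_CD | CR0_NW)) == 0


# Example CR0 image with CD and NW set (assumed here for illustration).
example_reset_cr0 = 0x60000010
example_enabled_cr0 = enable_internal_caches(example_reset_cr0)
```

In real initialization code the new image would be written back with a MOV CR0 instruction; any external cache still needs its own system-specific sequence.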
8.4. MODEL-SPECIFIC REGISTERS (MSRS)
The P6 family processors and Pentium® processors contain model-specific registers (MSRs).
These registers are by definition implementation specific; that is, they are not guaranteed to be
supported on future Intel Architecture processors and/or to have the same functions. The MSRs
are provided to control a variety of hardware- and software-related features, including:
• The performance-monitoring counters (refer to Section 15.6., "Performance-Monitoring
Counters", in Chapter 15, Debugging and Performance Monitoring).
• (P6 family processors only.) Debug extensions (refer to Section 15.4., "Last Branch,
Interrupt, and Exception Recording", in Chapter 15, Debugging and Performance
Monitoring).
• (P6 family processors only.) The machine-check exception capability and its accompanying
machine-check architecture (refer to Chapter 13, Machine-Check Architecture).
• (P6 family processors only.) The MTRRs (refer to Section 9.12., "Memory Type Range
Registers (MTRRs)", in Chapter 9, Memory Cache Control).
The MSRs can be read and written to using the RDMSR and WRMSR instructions, respectively.
When performing software initialization of a Pentium® Pro or Pentium® processor, many of the
MSRs will need to be initialized to set up features such as performance-monitoring events, run-time
machine checks, and memory types for physical memory.
Systems configured to implement FRC mode must write all of the processors' internal MSRs to
deterministic values before performing either a read or read-modify-write operation using these
registers. The following is a list of MSRs that are not initialized by the processors' reset
sequences.
• All fixed and variable MTRRs.
• All Machine Check Architecture (MCA) status registers.
• Microcode update signature register.
• All L2 cache initialization MSRs.
The list of available performance-monitoring counters for the Pentium® Pro and Pentium®
processors is given in Appendix A, Performance-Monitoring Events, and the list of available
MSRs for the Pentium® Pro processor is given in Appendix B, Model-Specific Registers. The
references earlier in this section show where the functions of the various groups of MSRs are
described in this manual.
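The write-before-read rule for FRC systems can be illustrated with a small software model. The sketch below is hypothetical throughout: `MsrModel` is not a real interface, and the index 123H is an arbitrary placeholder rather than an actual register assignment. It assumes only the register convention of the RDMSR and WRMSR instructions, in which ECX selects the MSR and EDX:EAX carries the 64-bit value.

```python
class MsrModel:
    """Toy model of an MSR file that tracks which registers were written."""

    def __init__(self):
        self._values = {}

    def wrmsr(self, ecx: int, edx: int, eax: int) -> None:
        """Store EDX:EAX (high:low 32 bits) in the MSR selected by ECX."""
        self._values[ecx] = ((edx & 0xFFFFFFFF) << 32) | (eax & 0xFFFFFFFF)

    def is_initialized(self, ecx: int) -> bool:
        """True once software has written a deterministic value."""
        return ecx in self._values

    def rdmsr(self, ecx: int):
        """Return (EDX, EAX); refuse to read a register never written."""
        if ecx not in self._values:
            raise LookupError("MSR %XH read before being initialized" % ecx)
        value = self._values[ecx]
        return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF


EXAMPLE_MSR_INDEX = 0x123   # placeholder index, not a real MSR assignment

msrs = MsrModel()
msrs.wrmsr(EXAMPLE_MSR_INDEX, 0x00000001, 0x80000000)
edx, eax = msrs.rdmsr(EXAMPLE_MSR_INDEX)
```

A read-modify-write on an MSR that the reset sequence leaves undefined would fail the `is_initialized` check in this model, which mirrors the requirement that such registers be written first.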
8.5. MEMORY TYPE RANGE REGISTERS (MTRRS)
Memory type range registers (MTRRs) were introduced into the Intel Architecture with the
Pentium® Pro processor. They allow the type of caching (or no caching) to be specified in system
memory for selected physical address ranges. They allow memory accesses to be optimized for
various types of memory such as RAM, ROM, frame buffer memory, and memory-mapped I/O
devices.
Initializing the MTRRs is normally handled by the software initialization code or
BIOS and is not an operating system or executive function. At the very least, all the MTRRs
must be cleared to 0, which selects the uncached (UC) memory type. Refer to Section 9.12.,
"Memory Type Range Registers (MTRRs)", in Chapter 9, Memory Cache Control, for detailed
information on the MTRRs.
8.6. SOFTWARE INITIALIZATION FOR REAL-ADDRESS MODE
OPERATION
Following a hardware reset (either through a power-up or the assertion of the RESET# pin) the
processor is placed in real-address mode and begins executing software initialization code from
physical address FFFFFFF0H. Software initialization code must first set up the necessary data
structures for handling basic system functions, such as a real-mode IDT for handling interrupts
and exceptions. If the processor is to remain in real-address mode, software must then load additional
operating-system or executive code modules and data structures to allow reliable execution
of application programs in real-address mode.
If the processor is going to operate in protected mode, software must load the necessary data
structures to operate in protected mode and then switch to protected mode. The protected-mode
data structures that must be loaded are described in Section 8.7., "Software Initialization for
Protected-Mode Operation".
8.6.1. Real-Address Mode IDT
In real-address mode, the only system data structure that must be loaded into memory is the IDT
(also called the "interrupt vector table"). By default, the address of the base of the IDT is physical
address 0H. This address can be changed by using the LIDT instruction to change the base
address value in the IDTR. Software initialization code needs to load interrupt- and exception-handler
pointers into the IDT before interrupts can be enabled.
The actual interrupt- and exception-handler code can be contained either in EPROM or RAM;
however, the code must be located within the 1-MByte addressable range of the processor in
real-address mode. If the handler code is to be stored in RAM, it must be loaded along with the
IDT.
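A real-address mode IDT is simple enough to build by hand. The following Python sketch assumes the standard real-mode layout, in which each of the 256 vectors occupies 4 bytes holding a 16-bit offset followed by a 16-bit segment; the function names and the example handler location are hypothetical.

```python
import struct

IVT_ENTRIES = 256   # one 4-byte far pointer per vector


def build_real_mode_idt(handlers: dict) -> bytearray:
    """Build a 1-KByte table from {vector: (segment, offset)} pairs."""
    table = bytearray(IVT_ENTRIES * 4)
    for vector, (segment, offset) in handlers.items():
        if not 0 <= vector < IVT_ENTRIES:
            raise ValueError("vector out of range")
        # Offset is stored first, then segment, both little-endian.
        struct.pack_into("<HH", table, vector * 4, offset, segment)
    return table


def handler_address(table: bytearray, vector: int) -> int:
    """Physical address of a handler: segment * 16 + offset."""
    offset, segment = struct.unpack_from("<HH", table, vector * 4)
    return (segment << 4) + offset


# Example: an NMI handler (vector 2) at the illustrative location F000:1234.
idt = build_real_mode_idt({2: (0xF000, 0x1234)})
```

Here `handler_address(idt, 2)` evaluates to F1234H, which lies within the 1-MByte range required for real-address mode handlers.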
8.6.2. NMI Interrupt Handling
The NMI interrupt is always enabled (except when multiple NMIs are nested). If the IDT and
the NMI interrupt handler need to be loaded into RAM, there will be a period of time following
hardware reset when an NMI interrupt cannot be handled. During this time, hardware must
provide a mechanism to prevent an NMI interrupt from halting code execution until the IDT and
the necessary NMI handler software are loaded.
Here are two examples of how NMIs can be handled during the initial stages of processor initialization:
• A simple IDT and NMI interrupt handler can be provided in EPROM. This allows an NMI
interrupt to be handled immediately after reset initialization.
• The system hardware can provide a mechanism to enable and disable NMIs by passing the
NMI# signal through an AND gate controlled by a flag in an I/O port. Hardware can clear
the flag when the processor is reset, and software can set the flag when it is ready to handle
NMI interrupts.
8.7. SOFTWARE INITIALIZATION FOR PROTECTED-MODE
OPERATION
The processor is placed in real-address mode following a hardware reset. At this point in the
initialization process, some basic data structures and code modules must be loaded into physical
memory to support further initialization of the processor, as described in Section 8.6., "Software
Initialization for Real-Address Mode Operation". Before the processor can be switched to
protected mode, the software initialization code must load a minimum number of protected-mode
data structures and code modules into memory to support reliable operation of the
processor in protected mode. These data structures include the following:
• A protected-mode IDT.
• A GDT.
• A TSS.
• (Optional.) An LDT.
• If paging is to be used, at least one page directory and one page table.
• A code segment that contains the code to be executed when the processor switches to
protected mode.
• One or more code modules that contain the necessary interrupt and exception handlers.
Software initialization code must also initialize the following system registers before the
processor can be switched to protected mode:
• The GDTR.
• (Optional.) The IDTR. This register can also be initialized immediately after switching to
protected mode, prior to enabling interrupts.
• Control registers CR1 through CR4.
• (Pentium® Pro processor only.) The memory type range registers (MTRRs).
With these data structures, code modules, and system registers initialized, the processor can be
switched to protected mode by loading control register CR0 with a value that sets the PE flag
(bit 0).
8.7.1. Protected-Mode System Data Structures
The contents of the protected-mode system data structures loaded into memory during software
initialization depend largely on the type of memory management the protected-mode operating system
or executive is going to support: flat, flat with paging, segmented, or segmented with
paging.
To implement a flat memory model without paging, software initialization code must at a
minimum load a GDT with one code and one data-segment descriptor. A null descriptor in the
first GDT entry is also required. The stack can be placed in a normal read/write data segment,
so no dedicated descriptor for the stack is required. A flat memory model with paging also
requires a page directory and at least one page table (unless all pages are 4 MBytes in which case
only a page directory is required). Refer to Section 8.7.3., "Initializing Paging".
Before the GDT can be used, the base address and limit for the GDT must be loaded into the
GDTR register using an LGDT instruction.
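The minimal flat-model GDT described above can be sketched as follows. This Python model is illustrative: the helper names and the GDT address 1000H are hypothetical, while the field layout follows the DESC and TABLE_REG structures used later in STARTUP.ASM. The access-byte values 9AH (execute/read code) and 92H (read/write data) and the flag nibble CH (4-KByte granularity, 32-bit default) are the usual choices for a flat 4-GByte segment.

```python
import struct

FLAGS_4K_32BIT = 0xC    # G = 1 (4-KByte granularity), D/B = 1 (32-bit)
ACCESS_CODE_ER = 0x9A   # present, DPL 0, code, execute/read
ACCESS_DATA_RW = 0x92   # present, DPL 0, data, read/write


def encode_descriptor(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack a segment descriptor using the DESC field order."""
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                               # lim_0_15
        base & 0xFFFF,                                # bas_0_15
        (base >> 16) & 0xFF,                          # bas_16_23
        access & 0xFF,                                # access
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF), # gran
        (base >> 24) & 0xFF,                          # bas_24_31
    )


def build_flat_gdt() -> bytes:
    """Null descriptor, one code descriptor, one data descriptor."""
    null = bytes(8)
    code = encode_descriptor(0, 0xFFFFF, ACCESS_CODE_ER, FLAGS_4K_32BIT)
    data = encode_descriptor(0, 0xFFFFF, ACCESS_DATA_RW, FLAGS_4K_32BIT)
    return null + code + data


def gdtr_operand(gdt_linear_base: int, gdt: bytes) -> bytes:
    """6-byte memory operand for LGDT: 16-bit limit, then 32-bit base."""
    return struct.pack("<HI", len(gdt) - 1, gdt_linear_base)


flat_gdt = build_flat_gdt()
lgdt_operand = gdtr_operand(0x1000, flat_gdt)   # 1000H is an example address
```

The data descriptor produced here consists of the two doublewords 0000FFFFH and 00CF9200H, the same values that STARTUP.ASM defines as LINEAR_PROTO_LO and LINEAR_PROTO_HI for its temporary GDT.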
A multisegmented model may require additional segments for the operating system, as well as
segments and LDTs for each application program. LDTs require segment descriptors in the
GDT. Some operating systems allocate new segments and LDTs as they are needed. This
provides maximum flexibility for handling a dynamic programming environment. However,
many operating systems use a single LDT for all tasks, allocating GDT entries in advance. An
embedded system, such as a process controller, might pre-allocate a fixed number of segments
and LDTs for a fixed number of application programs. This would be a simple and efficient way
to structure the software environment of a real-time system.
8.7.2. Initializing Protected-Mode Exceptions and Interrupts
Software initialization code must at a minimum load a protected-mode IDT with a gate descriptor
for each exception vector that the processor can generate. If interrupt or trap gates are used, the
gate descriptors can all point to the same code segment, which contains the necessary exception
handlers. If task gates are used, one TSS and accompanying code, data, and task segments are
required for each exception handler called with a task gate.
If hardware allows interrupts to be generated, gate descriptors must be provided in the IDT for
one or more interrupt handlers.
Before the IDT can be used, the base address and limit for the IDT must be loaded into the IDTR
register using an LIDT instruction. This operation is typically carried out immediately after
switching to protected mode.
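A protected-mode IDT built from interrupt gates that all point into one code segment can be modeled as shown below. The sketch is an assumption-laden illustration: the helper names, the code-segment selector 08H, and the handler offsets are hypothetical. The gate layout (offset low word, selector, reserved byte, type byte, offset high word) and the type values 8EH and 8FH for present, DPL-0, 32-bit interrupt and trap gates are the architectural encodings.

```python
import struct

GATE_INTERRUPT_32 = 0x8E   # present, DPL 0, 32-bit interrupt gate
GATE_TRAP_32 = 0x8F        # present, DPL 0, 32-bit trap gate


def encode_gate(selector: int, offset: int, type_attr: int) -> bytes:
    """Pack an 8-byte interrupt or trap gate descriptor."""
    return struct.pack(
        "<HHBBH",
        offset & 0xFFFF,           # handler offset, bits 0-15
        selector & 0xFFFF,         # code-segment selector
        0,                         # reserved byte
        type_attr & 0xFF,          # P, DPL, and gate type
        (offset >> 16) & 0xFFFF,   # handler offset, bits 16-31
    )


def build_idt(handler_offsets, code_selector: int) -> bytes:
    """One interrupt gate per handler, all in the same code segment."""
    return b"".join(
        encode_gate(code_selector, offset, GATE_INTERRUPT_32)
        for offset in handler_offsets
    )


def idtr_operand(idt_linear_base: int, idt: bytes) -> bytes:
    """6-byte memory operand for LIDT: 16-bit limit, then 32-bit base."""
    return struct.pack("<HI", len(idt) - 1, idt_linear_base)


# Example: gates for the 32 vectors reserved for exceptions, with
# illustrative handler stubs spaced 16 bytes apart.
example_idt = build_idt([0x1000 + 16 * v for v in range(32)], 0x08)
```

Using trap gates instead only changes the type byte; task gates would additionally require a TSS per handler, as noted above.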
8.7.3. Initializing Paging
Paging is controlled by the PG flag in control register CR0. When this flag is clear (its state
following a hardware reset), the paging mechanism is turned off; when it is set, paging is
enabled. Before setting the PG flag, the following data structures and registers must be initialized:
• Software must load at least one page directory and one page table into physical memory.
The page table can be eliminated if the page directory contains a directory entry pointing to
itself (here, the page directory and page table reside in the same page), or if only 4-MByte
pages are used.
• Control register CR3 (also called the PDBR register) is loaded with the physical base
address of the page directory.
• (Optional) Software may provide one set of code and data descriptors in the GDT or in an
LDT for supervisor mode and another set for user mode.
With this paging initialization complete, paging is enabled and the processor is switched to
protected mode at the same time by loading control register CR0 with an image in which the PG
and PE flags are set. (Paging cannot be enabled before the processor is switched to protected
mode.)
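The paging structures required before PG is set can be sketched with a small model. The following Python example builds one page directory and one page table that identity map the first 4 MBytes, then walks them the way a two-level lookup would. The function names and the physical addresses 1000H and 2000H are hypothetical; the entry format (present bit 0, read/write bit 1, frame address in bits 12 through 31) is the standard 4-KByte page format.

```python
PAGE_SIZE = 4096
ENTRIES = 1024
ENTRY_P = 0x1     # present
ENTRY_RW = 0x2    # read/write
CR0_PE = 1 << 0
CR0_PG = 1 << 31


def build_identity_mapping(pd_phys: int, pt_phys: int):
    """Return (page_directory, page_table, cr3) mapping the first 4 MBytes."""
    if pd_phys % PAGE_SIZE or pt_phys % PAGE_SIZE:
        raise ValueError("page directory and page table must be 4-KByte aligned")
    page_table = [(i * PAGE_SIZE) | ENTRY_P | ENTRY_RW for i in range(ENTRIES)]
    page_directory = [0] * ENTRIES
    page_directory[0] = pt_phys | ENTRY_P | ENTRY_RW
    return page_directory, page_table, pd_phys


def translate(linear: int, page_directory, tables_by_phys):
    """Two-level lookup; returns the physical address or None if not present."""
    pde = page_directory[(linear >> 22) & 0x3FF]
    if not pde & ENTRY_P:
        return None
    pte = tables_by_phys[pde & 0xFFFFF000][(linear >> 12) & 0x3FF]
    if not pte & ENTRY_P:
        return None
    return (pte & 0xFFFFF000) | (linear & 0xFFF)


def cr0_with_paging(cr0: int) -> int:
    """CR0 image that sets PE and PG together, as described above."""
    return (cr0 | CR0_PE | CR0_PG) & 0xFFFFFFFF


pd, pt, cr3 = build_identity_mapping(0x1000, 0x2000)
```

With these tables, `translate(0x12345, pd, {0x2000: pt})` returns 12345H, confirming the identity mapping that the mode-switching code relies on.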
8.7.4. Initializing Multitasking
If the multitasking mechanism is not going to be used and changes between privilege levels are
not allowed, it is not necessary to load a TSS into memory or to initialize the task register.
If the multitasking mechanism is going to be used and/or changes between privilege levels are
allowed, software initialization code must load at least one TSS and an accompanying TSS
descriptor. (A TSS is required to change privilege levels because pointers to the privilege-level
0, 1, and 2 stack segments and the stack pointers for these stacks are obtained from the TSS.)
TSS descriptors must not be marked as busy when they are created; they should be marked busy
by the processor only as a side-effect of performing a task switch. As with descriptors for LDTs,
TSS descriptors reside in the GDT.
After the processor has switched to protected mode, the LTR instruction can be used to load a
segment selector for a TSS descriptor into the task register. This instruction marks the TSS
descriptor as busy, but does not perform a task switch. The processor can, however, use the TSS
to locate pointers to privilege-level 0, 1, and 2 stacks. The segment selector for the TSS must be
loaded before software performs its first task switch in protected mode, because a task switch
copies the current task state into the TSS.
After the LTR instruction has been executed, further operations on the task register are
performed by task switching. As with other segments and LDTs, TSSs and TSS descriptors can
be either pre-allocated or allocated as needed.
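The effect of LTR on a TSS descriptor can be modeled directly on the descriptor bytes. In the sketch below, the function names and the TSS base address 2000H are hypothetical, and the index 10 is borrowed from TSS_INDEX in STARTUP.ASM. The access byte 89H marks an available 32-bit TSS, setting bit 1 of the type field marks it busy, and 67H is the limit of a minimum-size (104-byte) TSS.

```python
import struct

ACCESS_TSS_AVAILABLE = 0x89   # present, DPL 0, available 32-bit TSS
TSS_BUSY = 0x02               # busy bit within the type field
TSS_MIN_LIMIT = 0x67          # 104-byte TSS, limit = size - 1


def encode_tss_descriptor(base: int, limit: int = TSS_MIN_LIMIT) -> bytes:
    """Pack a TSS descriptor that is NOT marked busy."""
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,
        base & 0xFFFF,
        (base >> 16) & 0xFF,
        ACCESS_TSS_AVAILABLE,
        (limit >> 16) & 0xF,
        (base >> 24) & 0xFF,
    )


def ltr(gdt: bytearray, selector: int) -> int:
    """Model of LTR: check for an available TSS, mark it busy, return TR."""
    access_index = (selector >> 3) * 8 + 5
    # Mask out the DPL bits before comparing the type.
    if gdt[access_index] & 0x9F != ACCESS_TSS_AVAILABLE:
        raise ValueError("selector does not reference an available TSS")
    gdt[access_index] |= TSS_BUSY
    return selector & 0xFFFF


TSS_INDEX = 10                               # as in STARTUP.ASM
gdt = bytearray(8 * (TSS_INDEX + 1))
gdt[TSS_INDEX * 8:(TSS_INDEX + 1) * 8] = encode_tss_descriptor(0x00002000)
task_register = ltr(gdt, TSS_INDEX << 3)     # no task switch is performed
```

After the call, the descriptor's access byte reads 8BH; a second `ltr` on the same selector raises an error in this model, which corresponds to the rule that software must not create TSS descriptors already marked busy.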
8.8. MODE SWITCHING
To use the processor in protected mode, a mode switch must be performed from real-address
mode. Once in protected mode, software generally does not need to return to real-address mode.
To run software written for real-address mode (8086 mode), it is generally more convenient
to use virtual-8086 mode than to switch back to real-address mode.
8.8.1. Switching to Protected Mode
Before switching to protected mode, a minimum set of system data structures and code modules
must be loaded into memory, as described in Section 8.7., "Software Initialization for Protected-
Mode Operation". Once these tables are created, software initialization code can switch into
protected mode.
Protected mode is entered by executing a MOV CR0 instruction that sets the PE flag in the CR0
register. (In the same instruction, the PG flag in register CR0 can be set to enable paging.)
Execution in protected mode begins with a CPL of 0.
The 32-bit Intel Architecture processors have slightly different requirements for switching to
protected mode. To ensure upwards and downwards code compatibility with all 32-bit Intel
Architecture processors, it is recommended that the following steps be performed:
1. Disable interrupts. A CLI instruction disables maskable hardware interrupts. NMI
interrupts can be disabled with external circuitry. (Software must guarantee that no
exceptions or interrupts are generated during the mode switching operation.)
2. Execute the LGDT instruction to load the GDTR register with the base address of the
GDT.
3. Execute a MOV CR0 instruction that sets the PE flag (and optionally the PG flag) in
control register CR0.
4. Immediately following the MOV CR0 instruction, execute a far JMP or far CALL
instruction. (This operation is typically a far jump or call to the next instruction in the
instruction stream.)
The JMP or CALL instruction immediately after the MOV CR0 instruction changes the
flow of execution and serializes the processor.
If paging is enabled, the code for the MOV CR0 instruction and the JMP or CALL
instruction must come from a page that is identity mapped (that is, the linear address before
the jump is the same as the physical address after paging and protected mode are enabled).
The target instruction for the JMP or CALL instruction does not need to be identity
mapped.
5. If a local descriptor table is going to be used, execute the LLDT instruction to load the
segment selector for the LDT in the LDTR register.
6. Execute the LTR instruction to load the task register with a segment selector to the initial
protected-mode task or to a writable area of memory that can be used to store TSS
information on a task switch.
7. After entering protected mode, the segment registers continue to hold the contents they had
in real-address mode. The JMP or CALL instruction in step 4 resets the CS register.
Perform one of the following operations to update the contents of the remaining segment
registers.
- Reload segment registers DS, SS, ES, FS, and GS. If the ES, FS, and/or GS registers
are not going to be used, load them with a null selector.
- Perform a JMP or CALL instruction to a new task, which automatically resets the
values of the segment registers and branches to a new code segment.
8. Execute the LIDT instruction to load the IDTR register with the address and limit of the
protected-mode IDT.
9. Execute the STI instruction to enable maskable hardware interrupts and perform the
necessary hardware operation to enable NMI interrupts.
Random failures can occur if other instructions exist between steps 3 and 4 above. Failures will
be readily seen in some situations, such as when instructions that reference memory are inserted
between steps 3 and 4 while in System Management mode.
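The ordering constraints in the steps above can be expressed as a simple checker. The Python sketch below is purely illustrative: the operation names are informal labels chosen for this example, and the checker covers only the constraints stated in this section (interrupts disabled and GDTR loaded before the MOV CR0, a far transfer immediately after it, and the IDTR loaded before interrupts are re-enabled).

```python
def check_mode_switch(sequence):
    """Return a list of rule violations for a proposed instruction sequence."""
    if "MOV CR0" not in sequence:
        return ["sequence never sets the PE flag"]
    problems = []
    switch = sequence.index("MOV CR0")
    # Steps 1 and 2 must come before the switch.
    for required in ("CLI", "LGDT"):
        if required not in sequence[:switch]:
            problems.append(required + " must precede MOV CR0")
    # Step 4 must follow step 3 with nothing in between.
    following = sequence[switch + 1] if switch + 1 < len(sequence) else None
    if following not in ("FAR JMP", "FAR CALL"):
        problems.append("MOV CR0 must be followed immediately by a far JMP or far CALL")
    # Step 8 must come before step 9.
    if "STI" in sequence and "LIDT" not in sequence[:sequence.index("STI")]:
        problems.append("LIDT must precede STI")
    return problems


RECOMMENDED = [
    "CLI", "LGDT", "MOV CR0", "FAR JMP",
    "LLDT", "LTR", "RELOAD SEGMENTS", "LIDT", "STI",
]
```

For instance, `check_mode_switch(["CLI", "LGDT", "MOV CR0", "MOV DS", "FAR JMP"])` reports the inserted instruction between steps 3 and 4, the case singled out above as a source of random failures.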
8.8.2. Switching Back to Real-Address Mode
The processor switches back to real-address mode if software clears the PE bit in the CR0
register with a MOV CR0 instruction. A procedure that re-enters real-address mode should
perform the following steps:
1. Disable interrupts. A CLI instruction disables maskable hardware interrupts. NMI
interrupts can be disabled with external circuitry.
2. If paging is enabled, perform the following operations:
- Transfer program control to linear addresses that are identity mapped to physical
addresses (that is, linear addresses equal physical addresses).
- Ensure that the GDT and IDT are in identity-mapped pages.
- Clear the PG bit in the CR0 register.
- Move 0H into the CR3 register to flush the TLB.
3. Transfer program control to a readable segment that has a limit of 64 KBytes (FFFFH).
This operation loads the CS register with the segment limit required in real-address mode.
4. Load segment registers SS, DS, ES, FS, and GS with a selector for a descriptor containing
the following values, which are appropriate for real-address mode:
- Limit = 64 KBytes (0FFFFH)
- Byte granular (G = 0)
- Expand up (E = 0)
- Writable (W = 1)
- Present (P = 1)
- Base = any value
The segment registers must be loaded with nonnull segment selectors or the segment
registers will be unusable in real-address mode. Note that if the segment registers are not
reloaded, execution continues using the descriptor attributes loaded during protected
mode.
5. Execute an LIDT instruction to point to a real-address mode interrupt table that is within
the 1-MByte real-address mode address range.
6. Clear the PE flag in the CR0 register to switch to real-address mode.
7. Execute a far JMP instruction to jump to a real-address mode program. This operation
flushes the instruction queue and loads the appropriate base and access rights values in the
CS register.
8. Load the SS, DS, ES, FS, and GS registers as needed by the real-address mode code. If any
of the registers are not going to be used in real-address mode, write 0s to them.
9. Execute the STI instruction to enable maskable hardware interrupts and perform the
necessary hardware operation to enable NMI interrupts.
NOTE
All the code that is executed in steps 1 through 9 must be in a single page and
the linear addresses in that page must be identity mapped to physical
addresses.
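The descriptor attributes listed in step 4 can be verified mechanically before the switch. The following Python sketch decodes an 8-byte descriptor and checks it against those values; the function name is hypothetical, and the decoding assumes the standard descriptor layout (the same field order as the DESC structure in STARTUP.ASM).

```python
import struct


def real_mode_compatible(descriptor: bytes) -> bool:
    """Check a data-segment descriptor against the step 4 requirements."""
    lim_0_15, _bas_0_15, _bas_16_23, access, gran, _bas_24_31 = struct.unpack(
        "<HHBBBB", descriptor
    )
    limit = ((gran & 0x0F) << 16) | lim_0_15
    present = bool(access & 0x80)          # P = 1
    is_data = (access & 0x18) == 0x10      # S = 1, code/data bit = 0
    expand_up = not (access & 0x04)        # E = 0
    writable = bool(access & 0x02)         # W = 1
    byte_granular = not (gran & 0x80)      # G = 0
    # The base address may hold any value, so it is not checked.
    return (present and is_data and expand_up and writable
            and byte_granular and limit == 0xFFFF)


# A 64-KByte read/write data segment passes; a flat 4-GByte one does not.
real_mode_data = struct.pack("<II", 0x0000FFFF, 0x00009200)
flat_data = struct.pack("<II", 0x0000FFFF, 0x00CF9200)
```

Loading the segment registers from a descriptor that fails this check would leave protected-mode limits or attributes in effect after the return to real-address mode.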
8.9. INITIALIZATION AND MODE SWITCHING EXAMPLE
This section provides an initialization and mode switching example that can be incorporated into
an application. This code was originally written to initialize the Intel386™ processor, but it will
execute successfully on the Pentium® Pro, Pentium®, and Intel486™ processors. The code in this
example is intended to reside in EPROM and to run following a hardware reset of the processor.
The function of the code is to do the following:
• Establish a basic real-address mode operating environment.
• Load the necessary protected-mode system data structures into RAM.
• Load the system registers with the necessary pointers to the data structures and the
appropriate flag settings for protected-mode operation.
• Switch the processor to protected mode.
Figure 8-3 shows the physical memory layout for the processor following a hardware reset and
the starting point of this example. The EPROM that contains the initialization code resides at the
upper end of the processor's physical memory address range, starting at address FFFFFFFFH
and going down from there. The address of the first instruction to be executed is at FFFFFFF0H,
the default starting address for the processor following a hardware reset.
The main steps carried out in this example are summarized in Table 8-4. The source listing for
the example (with the filename STARTUP.ASM) is given in Example 8-1. The line numbers
given in Table 8-4 refer to the source listing.
The following are some additional notes concerning this example:
• When the processor is switched into protected mode, the original code segment base-address
value of FFFF0000H (located in the hidden part of the CS register) is retained and
execution continues from the current offset in the EIP register. The processor will thus
continue to execute code in the EPROM until a far jump or call is made to a new code
segment, at which time the base address in the CS register will be changed.
• Maskable hardware interrupts are disabled after a hardware reset and should remain
disabled until the necessary interrupt handlers have been installed. The NMI interrupt is
not disabled following a reset. The NMI# pin must thus be inhibited from being asserted
until an NMI handler has been loaded and made available to the processor.
• The use of a temporary GDT allows simple transfer of tables from the EPROM to
anywhere in the RAM area. A GDT entry is constructed with its base pointing to address 0
and a limit of 4 GBytes. When the DS and ES registers are loaded with this descriptor, the
temporary GDT is no longer needed and can be replaced by the application GDT.
• This code loads one TSS and no LDTs. If more TSSs exist in the application, they must be
loaded into RAM. If there are LDTs, they may be loaded as well.
Figure 8-3
Table 8-4. Main Initialization Steps in STARTUP.ASM Source Listing
STARTUP.ASM Line Numbers: From | To | Description
157 | 157 | Jump (short) to the entry code in the EPROM
162 | 169 | Construct a temporary GDT in RAM with one entry: 0 - null; 1 - R/W data segment, base = 0, limit = 4 GBytes
171 | 172 | Load the GDTR to point to the temporary GDT
174 | 177 | Load CR0 with PE flag set to switch to protected mode
179 | 181 | Jump near to clear real mode instruction queue
184 | 186 | Load DS, ES registers with GDT[1] descriptor, so both point to the entire physical memory space
188 | 195 | Perform specific board initialization that is imposed by the new protected mode
196 | 218 | Copy the applications GDT from ROM into RAM
220 | 238 | Copy the applications IDT from ROM into RAM
241 | 243 | Load applications GDTR
244 | 245 | Load applications IDTR
247 | 261 | Copy the applications TSS from ROM into RAM
263 | 267 | Update TSS descriptor and other aliases in GDT (GDT alias or IDT alias)
277 | 277 | Load the task register (without task switch) using LTR instruction
282 | 286 | Load SS, ESP with the value found in the applications TSS
287 | 287 | Push EFLAGS value found in the applications TSS
288 | 288 | Push CS value found in the applications TSS
289 | 289 | Push EIP value found in the applications TSS
290 | 293 | Load DS, ES with the value found in the applications TSS
296 | 296 | Perform IRET; pop the above values and enter the application code
8.9.1. Assembler Usage
In this example, the Intel assembler ASM386 and build tools BLD386 are used to assemble and
build the initialization code module. The following assumptions apply when using the Intel
ASM386 and BLD386 tools.
• The ASM386 will generate the right operand size opcodes according to the code-segment
attribute. The attribute is assigned either by the ASM386 invocation controls or in the
code-segment definition.
• If a code segment that is going to run in real-address mode is defined, it must be set to a
USE 16 attribute. If a 32-bit operand is used in an instruction in this code segment (for
example, MOV EAX, EBX), the assembler automatically generates an operand prefix for
the instruction that forces the processor to execute a 32-bit operation, even though its
default code-segment attribute is 16-bit.
• Intel's ASM386 assembler allows specific use of the 16- or 32-bit instructions, for
example, LGDTW, LGDTD, IRETD. If the generic instruction LGDT is used, the default-segment
attribute will be used to generate the right opcode.
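The operand-prefix rule can be illustrated by encoding a register-to-register MOV for both segment attributes. The Python sketch below is a simplified model, not a description of ASM386 internals: the function name is hypothetical, and it always uses the 89H form of the instruction, whereas an assembler may equally choose the equivalent 8BH form. The operand-size prefix is the byte 66H.

```python
OPERAND_SIZE_PREFIX = 0x66

# Register numbers used in the ModR/M byte (same for 16- and 32-bit names).
REGISTER_NUMBER = {"AX": 0, "CX": 1, "DX": 2, "BX": 3,
                   "SP": 4, "BP": 5, "SI": 6, "DI": 7}


def encode_mov(dst: str, src: str, segment_bits: int) -> bytes:
    """Encode MOV dst, src (register to register) for a 16- or 32-bit segment."""
    if dst.startswith("E") != src.startswith("E"):
        raise ValueError("operand sizes differ")
    operand_bits = 32 if dst.startswith("E") else 16
    # ModR/M: mod = 11 (register), reg = source, r/m = destination.
    modrm = 0xC0 | (REGISTER_NUMBER[src[-2:]] << 3) | REGISTER_NUMBER[dst[-2:]]
    encoding = [0x89, modrm]
    # A prefix is needed only when the operand size differs from the
    # default implied by the code-segment attribute.
    if operand_bits != segment_bits:
        encoding.insert(0, OPERAND_SIZE_PREFIX)
    return bytes(encoding)
```

In this model, `encode_mov("EAX", "EBX", 16)` yields the bytes 66 89 D8, while the same instruction in a 32-bit segment yields 89 D8 with no prefix.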
8.9.2. STARTUP.ASM Listing
The source code listing to move the processor into protected mode is provided in Example 8-1.
This listing does not include any opcode and offset information.
Example 8-1. STARTUP.ASM
MS-DOS* 5.0(045-N) 386(TM) MACRO ASSEMBLER STARTUP 09:44:51 08/19/92 PAGE 1
MS-DOS 5.0(045-N) 386(TM) MACRO ASSEMBLER V4.0, ASSEMBLY OF MODULE
STARTUP
OBJECT MODULE PLACED IN startup.obj
ASSEMBLER INVOKED BY: f:\386tools\ASM386.EXE startup.a58 pw (132 )
LINE SOURCE
1 NAME STARTUP
2
3 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
4 ;
5 ; ASSUMPTIONS:
6 ;
7 ; 1. The bottom 64K of memory is ram, and can be used for
8 ; scratch space by this module.
9 ;
10 ; 2. The system has sufficient free usable ram to copy the
11 ; initial GDT, IDT, and TSS
12 ;
13 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
14
15 ; configuration data - must match with build definition
16
17 CS_BASE EQU 0FFFF0000H
18
19 ; CS_BASE is the linear address of the segment STARTUP_CODE
20 ; - this is specified in the build language file
21
22 RAM_START EQU 400H
23
24 ; RAM_START is the start of free, usable ram in the linear
25 ; memory space. The GDT, IDT, and initial TSS will be
26 ; copied above this space, and a small data segment will be
27 ; discarded at this linear address. The 32-bit word at
28 ; RAM_START will contain the linear address of the first
29 ; free byte above the copied tables - this may be useful if
30 ; a memory manager is used.
31
32 TSS_INDEX EQU 10
33
34 ; TSS_INDEX is the index of the TSS of the first task to
35 ; run after startup
36
37
38 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
39
40 ; ------------------------- STRUCTURES and EQU ---------------
41 ; structures for system data
42
43 ; TSS structure
44 TASK_STATE STRUC
45 link DW ?
46 link_h DW ?
47 ESP0 DD ?
48 SS0 DW ?
49 SS0_h DW ?
50 ESP1 DD ?
51 SS1 DW ?
52 SS1_h DW ?
53 ESP2 DD ?
54 SS2 DW ?
55 SS2_h DW ?
56 CR3_reg DD ?
57 EIP_reg DD ?
58 EFLAGS_reg DD ?
59 EAX_reg DD ?
60 ECX_reg DD ?
61 EDX_reg DD ?
62 EBX_reg DD ?
63 ESP_reg DD ?
64 EBP_reg DD ?
65 ESI_reg DD ?
66 EDI_reg DD ?
67 ES_reg DW ?
68 ES_h DW ?
69 CS_reg DW ?
70 CS_h DW ?
71 SS_reg DW ?
72 SS_h DW ?
73 DS_reg DW ?
74 DS_h DW ?
75 FS_reg DW ?
76 FS_h DW ?
77 GS_reg DW ?
78 GS_h DW ?
79 LDT_reg DW ?
80 LDT_h DW ?
81 TRAP_reg DW ?
82 IO_map_base DW ?
83 TASK_STATE ENDS
84
85 ; basic structure of a descriptor
86 DESC STRUC
87 lim_0_15 DW ?
88 bas_0_15 DW ?
89 bas_16_23 DB ?
90 access DB ?
91 gran DB ?
92 bas_24_31 DB ?
93 DESC ENDS
94
95 ; structure for use with LGDT and LIDT instructions
96 TABLE_REG STRUC
97 table_lim DW ?
98 table_linear DD ?
99 TABLE_REG ENDS
100
101 ; offset of GDT and IDT descriptors in builder generated GDT
102 GDT_DESC_OFF EQU 1*SIZE(DESC)
103 IDT_DESC_OFF EQU 2*SIZE(DESC)
104
105 ; equates for building temporary GDT in RAM
106 LINEAR_SEL EQU 1*SIZE (DESC)
107 LINEAR_PROTO_LO EQU 00000FFFFH ; LINEAR_ALIAS
108 LINEAR_PROTO_HI EQU 000CF9200H
109
110 ; Protection Enable Bit in CR0
111 PE_BIT EQU 1B
112
113 ; ------------------------------------------------------------
114
115 ; ------------------------- DATA SEGMENT----------------------
116
117 ; Initially, this data segment starts at linear 0, according
118 ; to the processor's power-up state.
119
120 STARTUP_DATA SEGMENT RW
121
122 free_mem_linear_base LABEL DWORD
123 TEMP_GDT LABEL BYTE ; must be first in segment
124 TEMP_GDT_NULL_DESC DESC <>
125 TEMP_GDT_LINEAR_DESC DESC <>
126
127 ; scratch areas for LGDT and LIDT instructions
128 TEMP_GDT_SCRATCH TABLE_REG <>
129 APP_GDT_RAM TABLE_REG <>
130 APP_IDT_RAM TABLE_REG <>
131 ; align end_data
132 fill DW ?
133
134 ; last thing in this segment - should be on a dword boundary
135 end_data LABEL BYTE
136
137 STARTUP_DATA ENDS
138 ; ------------------------------------------------------------
139
140
141 ; ------------------------- CODE SEGMENT----------------------
142 STARTUP_CODE SEGMENT ER PUBLIC USE16
143
144 ; filled in by builder
145 PUBLIC GDT_EPROM
146 GDT_EPROM TABLE_REG <>
147
148 ; filled in by builder
149 PUBLIC IDT_EPROM
150 IDT_EPROM TABLE_REG <>
151
152 ; entry point into startup code - the bootstrap will vector
153 ; here with a near JMP generated by the builder. This
154 ; label must be in the top 64K of linear memory.
155
156 PUBLIC STARTUP
157 STARTUP:
158
159 ; DS,ES address the bottom 64K of flat linear memory
160 ASSUME DS:STARTUP_DATA, ES:STARTUP_DATA
161 ; See Figure 8-4
162 ; load GDTR with temporary GDT
163 LEA EBX,TEMP_GDT ; build the TEMP_GDT in low ram,
164 MOV DWORD PTR [EBX],0 ; where we can address
165 MOV DWORD PTR [EBX]+4,0
166 MOV DWORD PTR [EBX]+8, LINEAR_PROTO_LO
167 MOV DWORD PTR [EBX]+12, LINEAR_PROTO_HI
168 MOV TEMP_GDT_scratch.table_linear,EBX
169 MOV TEMP_GDT_scratch.table_lim,15
170
171 DB 66H ; execute a 32 bit LGDT
172 LGDT TEMP_GDT_scratch
173
174 ; enter protected mode
175 MOV EBX,CR0
176 OR EBX,PE_BIT
177 MOV CR0,EBX
178
179 ; clear prefetch queue
180 JMP CLEAR_LABEL
181 CLEAR_LABEL:
182
183 ; make DS and ES address 4G of linear memory
184 MOV CX,LINEAR_SEL
185 MOV DS,CX
186 MOV ES,CX
187
188 ; do board specific initialization
189 ;
190 ;
191 ; ......
192 ;
193
194
195 ; See Figure 8-5
196 ; copy EPROM GDT to ram at:
197 ; RAM_START + size (STARTUP_DATA)
198 MOV EAX,RAM_START
199 ADD EAX,OFFSET (end_data)
200 MOV EBX,RAM_START
201 MOV ECX, CS_BASE
202 ADD ECX, OFFSET (GDT_EPROM)
203 MOV ESI, [ECX].table_linear
204 MOV EDI,EAX
205 MOVZX ECX, [ECX].table_lim
206 MOV APP_GDT_ram[EBX].table_lim,CX
207 INC ECX
208 MOV EDX,EAX
209 MOV APP_GDT_ram[EBX].table_linear,EAX
210 ADD EAX,ECX
211 REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:[ESI]
212
213 ; fixup GDT base in descriptor
214 MOV ECX,EDX
215 MOV [EDX].bas_0_15+GDT_DESC_OFF,CX
216 ROR ECX,16
217 MOV [EDX].bas_16_23+GDT_DESC_OFF,CL
218 MOV [EDX].bas_24_31+GDT_DESC_OFF,CH
219
220 ; copy EPROM IDT to ram at:
221 ; RAM_START+size(STARTUP_DATA)+SIZE (EPROM GDT)
222 MOV ECX, CS_BASE
223 ADD ECX, OFFSET (IDT_EPROM)
224 MOV ESI, [ECX].table_linear
225 MOV EDI,EAX
226 MOVZX ECX, [ECX].table_lim
227 MOV APP_IDT_ram[EBX].table_lim,CX
228 INC ECX
229 MOV APP_IDT_ram[EBX].table_linear,EAX
230 MOV EBX,EAX
231 ADD EAX,ECX
232 REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:[ESI]
233
234 ; fixup IDT pointer in GDT
235 MOV [EDX].bas_0_15+IDT_DESC_OFF,BX
236 ROR EBX,16
237 MOV [EDX].bas_16_23+IDT_DESC_OFF,BL
238 MOV [EDX].bas_24_31+IDT_DESC_OFF,BH
239
240 ; load GDTR and IDTR
241 MOV EBX,RAM_START
242 DB 66H ; execute a 32 bit LGDT
243 LGDT APP_GDT_ram[EBX]
244 DB 66H ; execute a 32 bit LIDT
245 LIDT APP_IDT_ram[EBX]
246
247 ; move the TSS
248 MOV EDI,EAX
249 MOV EBX,TSS_INDEX*SIZE(DESC)
250 MOV ECX,GDT_DESC_OFF ;build linear address for TSS
251 MOV GS,CX
252 MOV DH,GS:[EBX].bas_24_31
253 MOV DL,GS:[EBX].bas_16_23
254 ROL EDX,16
255 MOV DX,GS:[EBX].bas_0_15
256 MOV ESI,EDX
257 LSL ECX,EBX
258 INC ECX
259 MOV EDX,EAX
260 ADD EAX,ECX
261 REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:[ESI]
262
263 ; fixup TSS pointer
264 MOV GS:[EBX].bas_0_15,DX
265 ROL EDX,16
266 MOV GS:[EBX].bas_24_31,DH
267 MOV GS:[EBX].bas_16_23,DL
268 ROL EDX,16
269 ;save start of free ram at linear location RAMSTART
270 MOV free_mem_linear_base+RAM_START,EAX
271
272 ;assume no LDT used in the initial task - if necessary,
273 ;code to move the LDT could be added, and should resemble
274 ;that used to move the TSS
275
276 ; load task register
277 LTR BX ; No task switch, only descriptor loading
278 ; See Figure 8-6
279 ; load minimal set of registers necessary to simulate task
280 ; switch
281
282
283 MOV AX,[EDX].SS_reg ; start loading registers
284 MOV EDI,[EDX].ESP_reg
285 MOV SS,AX
286 MOV ESP,EDI ; stack now valid
287 PUSH DWORD PTR [EDX].EFLAGS_reg
288 PUSH DWORD PTR [EDX].CS_reg
289 PUSH DWORD PTR [EDX].EIP_reg
290 MOV AX,[EDX].DS_reg
291 MOV BX,[EDX].ES_reg
292 MOV DS,AX ; DS and ES no longer linear memory
293 MOV ES,BX
294
295 ; simulate far jump to initial task
296 IRETD
297
298 STARTUP_CODE ENDS
*** WARNING #377 IN 298, (PASS 2) SEGMENT CONTAINS PRIVILEGED INSTRUCTION(S)
299
300 END STARTUP, DS:STARTUP_DATA, SS:STARTUP_DATA
301
302
ASSEMBLY COMPLETE, 1 WARNING, NO ERRORS.
Figure 8-4. Constructing Temporary GDT and Switching to Protected Mode (Lines 162-172 of List File)
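The temporary descriptor built at lines 166-167 of the list file can be checked by decoding its two dwords. The following is a minimal sketch in Python, not part of the original example; the function name decode_descriptor is hypothetical, and the field positions follow the DESC structure declared at lines 86-93.

```python
LINEAR_PROTO_LO = 0x0000FFFF  # low dword of the descriptor (list file line 107)
LINEAR_PROTO_HI = 0x00CF9200  # high dword of the descriptor (list file line 108)
DESC_SIZE = 8                 # SIZE(DESC): 2 + 2 + 1 + 1 + 1 + 1 bytes


def decode_descriptor(lo, hi):
    """Split the two dwords of a segment descriptor into its fields.

    Hypothetical helper for illustration only.
    """
    limit = (lo & 0xFFFF) | (hi & 0x000F0000)          # limit bits 0-19
    base = (lo >> 16) | ((hi & 0xFF) << 16) | (hi & 0xFF000000)
    access = (hi >> 8) & 0xFF                          # the "access" byte
    granular = bool(hi & 0x00800000)                   # G bit: 4-KByte units
    default_32 = bool(hi & 0x00400000)                 # D/B bit
    # With G set, the limit counts 4-KByte pages, so the last valid
    # offset is the scaled limit with the low 12 bits all ones.
    last_offset = ((limit << 12) | 0xFFF) if granular else limit
    return {
        "base": base,
        "limit": limit,
        "access": access,
        "granular": granular,
        "default_32": default_32,
        "last_offset": last_offset,
    }


linear = decode_descriptor(LINEAR_PROTO_LO, LINEAR_PROTO_HI)
LINEAR_SEL = 1 * DESC_SIZE          # selector of the second GDT entry
TEMP_GDT_LIMIT = 2 * DESC_SIZE - 1  # null descriptor plus linear descriptor
```

Under these assumptions the descriptor decodes to base 0, a last valid offset of 0FFFFFFFFH (the full 4-GByte linear space), and access byte 92H (present, DPL 0, read/write data). TEMP_GDT_LIMIT evaluates to 15, the value stored in table_lim at line 169.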
Figure 8-5. Moving the GDT, IDT and TSS from ROM to RAM (Lines 196-261 of List File)
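The address arithmetic in lines 196-270 of the list file can be sketched as follows. This is an illustrative Python model rather than part of the original example: the function names are hypothetical, and the table limits passed in the usage lines are made-up values, since the real limits are filled in by the builder. The data segment size of 36 bytes is derived from the STARTUP_DATA declarations (two 8-byte descriptors, three 6-byte TABLE_REG structures, and a 2-byte fill).

```python
RAM_START = 0x400        # from the list file (line 22)
STARTUP_DATA_SIZE = 36   # OFFSET(end_data), derived from lines 122-135


def plan_copy(data_size, gdt_limit, idt_limit, tss_limit, ram_start=RAM_START):
    """Return the RAM addresses chosen for the GDT, IDT, TSS and free memory.

    Each table is placed directly after the previous one; a table with
    limit L occupies L + 1 bytes, mirroring the INC ECX / ADD EAX,ECX pairs.
    """
    gdt_base = ram_start + data_size
    idt_base = gdt_base + gdt_limit + 1
    tss_base = idt_base + idt_limit + 1
    free_base = tss_base + tss_limit + 1
    return gdt_base, idt_base, tss_base, free_base


def split_base(linear):
    """Split a 32-bit linear address into bas_0_15, bas_16_23, bas_24_31."""
    return linear & 0xFFFF, (linear >> 16) & 0xFF, (linear >> 24) & 0xFF


def join_base(bas_0_15, bas_16_23, bas_24_31):
    """Rebuild the linear address from the three descriptor base fields."""
    return bas_0_15 | (bas_16_23 << 16) | (bas_24_31 << 24)


# Hypothetical limits: a 16-entry GDT, a 256-entry IDT, a 104-byte TSS.
gdt_base, idt_base, tss_base, free_base = plan_copy(
    STARTUP_DATA_SIZE, gdt_limit=0x7F, idt_limit=0x7FF, tss_limit=0x67
)
gdt_fields = split_base(gdt_base)
```

The split_base step corresponds to the "fixup" sequences (lines 213-218, 234-238, and 263-267), which write the new RAM address of each table back into the base fields of its descriptor.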
Figure 8-6. Task Switching (Lines 282-296 of List File)
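The simulated task switch relies on the order in which lines 287-289 push values and the order in which IRETD pops them. A rough Python model of that stack frame is shown below; it assumes a return to the same privilege level (both tasks in the build file have DPL 0), so only three dwords are involved. The function names and the sample values are hypothetical.

```python
def build_iretd_frame(eflags, cs, eip):
    """Return the dwords in the order STARTUP.ASM pushes them.

    The last list element models the top of the stack.
    """
    return [eflags & 0xFFFFFFFF, cs & 0xFFFFFFFF, eip & 0xFFFFFFFF]


def iretd(stack):
    """Pop EIP, then CS, then EFLAGS, as a same-privilege IRETD does."""
    eip = stack.pop()
    cs = stack.pop() & 0xFFFF  # only the low 16 bits form the selector
    eflags = stack.pop()
    return cs, eip, eflags


# Hypothetical values standing in for the TSS fields CS_reg, EIP_reg
# and EFLAGS_reg of the initial task.
frame = build_iretd_frame(eflags=0x00000002, cs=0x00070070, eip=0x00001000)
new_cs, new_eip, new_eflags = iretd(frame)
```

Because the CS image is pushed as a dword, the upper 16 bits (CS_h in the TASK_STATE structure) travel along on the stack but are discarded when the selector is loaded.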
8.9.3. MAIN.ASM Source Code
The file MAIN.ASM shown in Example 8-2 defines the data and stack segments for this application.
It can be replaced with a main module task written in a high-level language, which is
invoked by the IRETD instruction executed by STARTUP.ASM.
Example 8-2. MAIN.ASM
NAME main_module
data SEGMENT RW
dw 1000 dup(?)
DATA ENDS
stack stackseg 800
CODE SEGMENT ER use32 PUBLIC
main_start:
nop
nop
nop
CODE ENDS
END main_start, ds:data, ss:stack
8.9.4. Supporting Files
The batch file shown in Example 8-3 can be used to assemble the source code files
STARTUP.ASM and MAIN.ASM and build the final application.
Example 8-3. Batch File to Assemble and Build the Application
ASM386 STARTUP.ASM
ASM386 MAIN.ASM
BLD386 STARTUP.OBJ, MAIN.OBJ buildfile(EPROM.BLD) bootstrap(STARTUP) Bootload
BLD386 performs several operations in this example:
It allocates physical memory locations to segments and tables.
It generates tables using the build file and the input files.
It links object files and resolves references.
It generates a boot-loadable file to be programmed into the EPROM.
Example 8-4 shows the build file used as an input to BLD386 to perform the above functions.
Example 8-4. Build File
INIT_BLD_EXAMPLE;
SEGMENT
*SEGMENTS(DPL = 0)
, startup.startup_code(BASE = 0FFFF0000H)
;
TASK
BOOT_TASK(OBJECT = startup, INITIAL,DPL = 0,
NOT INTENABLED)
, PROTECTED_MODE_TASK(OBJECT = main_module,DPL = 0,
NOT INTENABLED)
;
TABLE
GDT (
LOCATION = GDT_EPROM
, ENTRY = (
10: PROTECTED_MODE_TASK
, startup.startup_code
, startup.startup_data
, main_module.data
, main_module.code
, main_module.stack
)
),
IDT (
LOCATION = IDT_EPROM
);
MEMORY
(
RESERVE = (0..3FFFH
-- Area for the GDT, IDT, TSS copied from ROM
, 60000H..0FFFEFFFFH)
, RANGE = (ROM_AREA = ROM (0FFFF0000H..0FFFFFFFFH))
-- Eprom size 64K
, RANGE = (RAM_AREA = RAM (4000H..05FFFFH))
);
END
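The MEMORY section of the build file can be sanity-checked with a short script. The sketch below is illustrative only: the region names reserved_low and reserved_high and the helper functions are hypothetical, while the address ranges are copied from Example 8-4 and CS_BASE from line 17 of the list file.

```python
CS_BASE = 0xFFFF0000  # from STARTUP.ASM (list file line 17)

# Address ranges copied from the MEMORY section of the build file.
MEMORY_MAP = {
    "reserved_low": (0x00000000, 0x00003FFF),
    "RAM_AREA": (0x00004000, 0x0005FFFF),
    "reserved_high": (0x00060000, 0xFFFEFFFF),
    "ROM_AREA": (0xFFFF0000, 0xFFFFFFFF),
}


def region_size(region):
    """Size in bytes of an inclusive (first, last) address range."""
    first, last = region
    return last - first + 1


def covers_4g_without_gaps(regions):
    """True if the regions tile the 4-GByte space with no gap or overlap."""
    next_start = 0
    for first, last in sorted(regions.values()):
        if first != next_start:
            return False
        next_start = last + 1
    return next_start == 1 << 32


rom_size = region_size(MEMORY_MAP["ROM_AREA"])
rom_matches_cs_base = MEMORY_MAP["ROM_AREA"][0] == CS_BASE
map_is_complete = covers_4g_without_gaps(MEMORY_MAP)
```

With these ranges the ROM area is 64 KBytes, as the build file comment states, and it begins at the same address as the BASE given for startup.startup_code and as CS_BASE in STARTUP.ASM.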
8.10. P6 FAMILY MICROCODE UPDATE FEATURE
P6 family processors have the capability to correct specific errata through the loading of an
Intel-supplied data block. This data block is referred to as a microcode update. This section
describes the underlying mechanisms the BIOS needs to provide in order to utilize this feature
during system initialization. It also describes a specification that provides for incorporating
future releases of the microcode update into a system BIOS.
Intel considers the combination of a particular silicon revision and the microcode update as the
equivalent stepping of the processor. Intel does not validate processors without the microcode
update loaded. Intel completes a full-stepping level validation and testing for new releases of
microcode updates.
A microcode update is used to correct specific errata in the processor. The BIOS, which incorporates
an update loader, is responsible for loading the appropriate update on all processors
during system initialization (refer to Figure 8-7). There are effectively two steps to this process.
The first is to incorporate the necessary microcode updates into the BIOS; the second is to
load the appropriate microcode update into the processor.
Figure 8-7. Integrating Processor Specific Updates
8.10.1. Microcode Update
A microcode update consists of an Intel-supplied binary that contains a descriptive header and
data. No executable code resides within the update. This section describes the update and the
structure of its data format.
Each microcode update is tailored for a particular stepping of a P6 family processor. It is
designed such that a mismatch between a stepping of the processor and the update will result in
a failure to load. Thus, a given microcode update is associated with a particular type, family,
model, and stepping of the processor as returned by the CPUID instruction. In addition, the
intended processor platform type must be determined to properly target the microcode update.
The intended processor platform type is determined by reading model-specific register (MSR)
17h (refer to Table 8-6) within the P6 family processor. This is a 64-bit register that may be
read using the RDMSR instruction (refer to Section 3.2., "Instruction Reference" Chapter 3,
Instruction Set Reference, Volume 1 of the Programmer's Reference Manual). The three platform
ID bits, when read as a binary coded decimal (BCD) number, indicate the bit position in the
microcode update header's Processor Flags field that is associated with the installed processor.
Register Name: BBL_CR_OVRD
MSR Address: 017h
Access: Read Only
BBL_CR_OVRD is a 64-bit register accessed only when referenced as a Qword through an
RDMSR instruction.
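The matching rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a loader implementation: the function names are hypothetical, the platform ID is assumed to occupy bits 52:50 of MSR 17h (the bit positions are given in Table 8-6, which is not reproduced here), and the signature layout assumed is the one CPUID returns in EAX (stepping in bits 3:0, model in bits 7:4, family in bits 11:8, type in bits 13:12). The MSR value is passed in as an integer because RDMSR is a privileged instruction.

```python
PLATFORM_ID_SHIFT = 50   # assumption: platform ID in bits 52:50 of MSR 17h
PLATFORM_ID_MASK = 0x7   # three platform ID bits


def platform_id(msr_17h):
    """Extract the three platform ID bits from the 64-bit MSR value."""
    return (msr_17h >> PLATFORM_ID_SHIFT) & PLATFORM_ID_MASK


def processor_flag_mask(msr_17h):
    """Bit in the Processor Flags field that corresponds to this platform."""
    return 1 << platform_id(msr_17h)


def update_targets_platform(processor_flags, msr_17h):
    """True if the update header's Processor Flags include this platform."""
    return bool(processor_flags & processor_flag_mask(msr_17h))


def decode_signature(signature):
    """Split a CPUID signature into (type, family, model, stepping)."""
    return (
        (signature >> 12) & 0x3,
        (signature >> 8) & 0xF,
        (signature >> 4) & 0xF,
        signature & 0xF,
    )


# Hypothetical values: platform ID 3, and an update flagged for bit 3 only.
example_msr = 3 << PLATFORM_ID_SHIFT
example_match = update_targets_platform(0x08, example_msr)
example_signature = decode_signature(0x0633)
```

A loader would accept an update only when both checks agree: the signature in the update matches the one reported by the processor, and the Processor Flags bit selected by the platform ID is set.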
The microcode update is a data block that is exactly 2048 bytes in length. The initial 48 bytes
of the update contain a header with information used to identify the update. The update header
and its reserved fields are interpreted by software based upon the header version. The initial
version of the header is 00000001h. An encoding scheme also guards against tampering with the
update data and provides a means for determining the authenticity of any given update. Table
8-7 defines each of the fields and Figure 8-8 shows the format of the microcode update data
block.
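A first-pass structural check of an update block might look like the sketch below. It is a minimal Python illustration, not the BIOS loader: the function name split_update is hypothetical, and it assumes that the header version is the first doubleword of the header, stored little-endian (the actual field layout is defined in Table 8-7). It checks only the sizes and the header version stated in the text; it does not validate authenticity.

```python
import struct

UPDATE_SIZE = 2048            # total length of a microcode update block
HEADER_SIZE = 48              # leading bytes that form the header
HEADER_VERSION = 0x00000001   # initial header version


def split_update(blob):
    """Return (header, data) for a block that passes the basic checks."""
    if len(blob) != UPDATE_SIZE:
        raise ValueError("update block must be exactly 2048 bytes")
    header, data = blob[:HEADER_SIZE], blob[HEADER_SIZE:]
    # Assumption: header version is at offset 0, little-endian.
    (version,) = struct.unpack_from("<I", header, 0)
    if version != HEADER_VERSION:
        raise ValueError("unsupported header version")
    return header, data


# Hypothetical all-zero block carrying only a valid header version.
sample = struct.pack("<I", HEADER_VERSION) + bytes(UPDATE_SIZE - 4)
sample_header, sample_data = split_update(sample)
```

The remaining 2000 bytes after the header are the update data; software that does not recognize the header version should not attempt to interpret the reserved fields.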