Search     or:     and:
 LINUX 
 Language 
 Kernel 
 Package 
 Book 
 Test 
 OS 
 Forum 
 iakovlev.org 
 OS
 osjournal 
 Protected Mode 
 Hardware 
 Kernels
  Dark Fiber
  BOS
  QNX
  OS Dev
  Lecture notes
  MINIX
  OS
  Solaris
  История UNIX
  История FreeBSD
  Сетунь
  Эльбрус
NEWS
Последние статьи :
  Тренажёр 16.01   
  Эльбрус 05.12   
  Алгоритмы 12.04   
  Rust 07.11   
  Go 25.12   
  EXT4 10.11   
  FS benchmark 15.09   
  Сетунь 23.07   
  Trees 25.06   
  Apache 03.02   
 
TOP 20
 Linux Kernel 2.6...5170 
 Trees...940 
 Максвелл 3...871 
 Go Web ...823 
 William Gropp...803 
 Ethreal 3...787 
 Gary V.Vaughan-> Libtool...774 
 Ethreal 4...771 
 Rodriguez 6...766 
 Ext4 FS...755 
 Clickhouse...755 
 Steve Pate 1...754 
 Ethreal 1...742 
 Secure Programming for Li...732 
 C++ Patterns 3...716 
 Ulrich Drepper...698 
 Assembler...695 
 DevFS...662 
 Стивенс 9...651 
 MySQL & PosgreSQL...632 
 
  01.01.2024 : 3621733 посещений 

iakovlev.org

Intel Architecture Software Developer's Manual

Volume 3 : System Programming

CHAPTER 5 INTERRUPT AND EXCEPTION HANDLING

This chapter describes the processor's interrupt and exception-handling mechanism, when operating in protected mode. Most of the information provided here also applies to the interrupt and exception mechanism used in real-address or virtual-8086 mode. Refer to Chapter 16, 8086 Emulation for a description of the differences in the interrupt and exception mechanism for realaddress and virtual-8086 mode.

5.1. INTERRUPT AND EXCEPTION OVERVIEW

Interrupts and exceptions are forced transfers of execution from the currently running program or task to a special procedure or task called a handler. Interrupts typically occur at random times during the execution of a program, in response to signals from hardware. They are used to handle events external to the processor, such as requests to service peripheral devices. Software can also generate interrupts by executing the INT n instruction. Exceptions occur when the processor detects an error condition while executing an instruction, such as division by zero. The processor detects a variety of error conditions including protection violations, page faults, and internal machine faults. The machine-check architecture of the P6 family and PentiumR processors also permits a machine-check exception to be generated when internal hardware errors and bus errors are detected.

The processor's interrupt and exception-handling mechanism allows interrupts and exceptions to be handled transparently to application programs and the operating system or executive. When an interrupt is received or an exception is detected, the currently running procedure or task is automatically suspended while the processor executes an interrupt or exception handler. When execution of the handler is complete, the processor resumes execution of the interrupted procedure or task. The resumption of the interrupted procedure or task happens without loss of program continuity, unless recovery from an exception was not possible or an interrupt caused the currently running program to be terminated.

This chapter describes the processor's interrupt and exception-handling mechanism, when operating in protected mode. A detailed description of the exceptions and the conditions that cause them to be generated is given at the end of this chapter. Refer to Chapter 16, 8086 Emulation for a description of the interrupt and exception mechanism for real-address and virtual-8086 mode.

5.1.1. Sources of Interrupts

The processor receives interrupts from two sources:

 External (hardware generated) interrupts.

 Software-generated interrupts.

5.1.1.1. EXTERNAL INTERRUPTS

External interrupts are received through pins on the processor or through the local APIC serial bus. The primary interrupt pins on a P6 family or PentiumR processor are the LINT[1:0] pins, which are connected to the local APIC (refer to Section 7.5., "Advanced Programmable Interrupt Controller (APIC)" in Chapter 7, Multiple-Processor Management). When the local APIC is disabled, these pins are configured as INTR and NMI pins, respectively. Asserting the INTR pin signals the processor that an external interrupt has occurred, and the processor reads from the system bus the interrupt vector number provided by an external interrupt controller, such as an 8259A (refer to Section 5.2., "Exception and Interrupt Vectors"). Asserting the NMI pin signals a nonmaskable interrupt (NMI), which is assigned to interrupt vector 2.

When the local APIC is enabled, the LINT[1:0] pins can be programmed through the APIC's vector table to be associated with any of the processor's exception or interrupt vectors.

The processor's local APIC can be connected to a system-based I/O APIC. Here, external interrupts received at the I/O APIC's pins can be directed to the local APIC through the APIC serial bus (pins PICD[1:0]). The I/O APIC determines the vector number of the interrupt and sends this number to the local APIC. When a system contains multiple processors, processors can also send interrupts to one another by means of the APIC serial bus.

The LINT[1:0] pins are not available on the Intel486T processor and the earlier PentiumR processors that do not contain an on-chip local APIC. Instead these processors have dedicated NMI and INTR pins. With these processors, external interrupts are typically generated by a system-based interrupt controller (8259A), with the interrupts being signaled through the INTR pin.

Note that several other pins on the processor cause a processor interrupt to occur; however, these interrupts are not handled by the interrupt and exception mechanism described in this chapter. These pins include the RESET#, FLUSH#, STPCLK#, SMI#, R/S#, and INIT# pins. Which of these pins are included on a particular Intel Architecture processor is implementation dependent. The functions of these pins are described in the data books for the individual processors. The SMI# pin is also described in Chapter 12, System Management Mode (SMM).

5.1.1.2. MASKABLE HARDWARE INTERRUPTS

Any external interrupt that is delivered to the processor by means of the INTR pin or through the local APIC is called a maskable hardware interrupt. The maskable hardware interrupts that can be delivered through the INTR pin include all Intel Architecture defined interrupt vectors from 0 through 255; those that can be delivered through the local APIC include interrupt vectors 16 through 255.

All maskable hardware interrupts can be masked as a group. Use the single IF flag in the EFLAGS register (refer to Section 5.6.1., "Masking Maskable Hardware Interrupts") to mask these maskable interrupts. Note that when interrupts 0 through 15 are delivered through the local APIC, the APIC indicates the receipt of an illegal vector.

5.1.1.3. SOFTWARE-GENERATED INTERRUPTS

The INT n instruction permits interrupts to be generated from within software by supplying the interrupt vector number as an operand. For example, the INT 35 instruction forces an implicit call to the interrupt handler for interrupt 35.

Any of the interrupt vectors from 0 to 255 can be used as a parameter in this instruction. If the processor's predefined NMI vector is used, however, the response of the processor will not be the same as it would be from an NMI interrupt generated in the normal manner. If vector number 2 (the NMI vector) is used in this instruction, the NMI interrupt handler is called, but the processor's NMI-handling hardware is not activated.

Note that interrupts generated in software with the INT n instruction cannot be masked by the IF flag in the EFLAGS register.

5.1.2. Sources of Exceptions

The processor receives exceptions from three sources:

 Processor-detected program-error exceptions.

 Software-generated exceptions.

 Machine-check exceptions.

5.1.2.1. PROGRAM-ERROR EXCEPTIONS

The processor generates one or more exceptions when it detects program errors during the execution in an application program or the operating system or executive. The Intel Architecture defines a vector number for each processor-detectable exception. The exceptions are further classified as faults, traps, and aborts (refer to Section 5.3., "Exception Classifications").

5.1.2.2. SOFTWARE-GENERATED EXCEPTIONS

The INTO, INT 3, and BOUND instructions permit exceptions to be generated in software. These instructions allow checks for specific exception conditions to be performed at specific points in the instruction stream. For example, the INT 3 instruction causes a breakpoint exception to be generated.

The INT n instruction can be used to emulate a specific exception in software, with one limitation. If the n operand in the INT n instruction contains a vector for one of the Intel Architecture exceptions, the processor will generate an interrupt to that vector, which will in turn invoke the exception handler associated with that vector. Because this is actually an interrupt, however, the processor does not push an error code onto the stack, even if a hardware-generated exception for that vector normally produces one. For those exceptions that produce an error code, the exception handler will attempt to pop an error code from the stack while handling the exception. If the INT n instruction was used to emulate the generation of an exception, the handler will pop off and discard the EIP (in place of the missing error code), sending the return to the wrong location.

5.1.2.3. MACHINE-CHECK EXCEPTIONS

The P6 family and PentiumR processors provide both internal and external machine-check mechanisms for checking the operation of the internal chip hardware and bus transactions. These mechanisms constitute extended (implementation dependent) exception mechanisms. When a machine-check error is detected, the processor signals a machine-check exception (vector 18) and returns an error code. Refer to "Interrupt 18-Machine Check Exception (#MC)" at the end of this chapter and Chapter 13, Machine-Check Architecture, for a detailed description of the machine-check mechanism.

5.2. EXCEPTION AND INTERRUPT VECTORS

The processor associates an identification number, called a vector, with each exception and interrupt. Table 5-1 shows the assignment of exception and interrupt vectors. This table also gives the exception type for each vector, indicates whether an error code is saved on the stack for an exception, and gives the source of the exception or interrupt.

The vectors in the range 0 through 31 are assigned to the exceptions and the NMI interrupt. Not all of these vectors are currently used by the processor. Unassigned vectors in this range are reserved for possible future uses. Do not use the reserved vectors.

The vectors in the range 32 to 255 are designated as user-defined interrupts. These interrupts are not reserved by the Intel Architecture and are generally assigned to external I/O devices and to permit them to signal the processor through one of the external hardware interrupt mechanisms described in Section 5.1.1., "Sources of Interrupts"

5.3. EXCEPTION CLASSIFICATIONS

Exceptions are classified as faults, traps, or aborts depending on the way they are reported and whether the instruction that caused the exception can be restarted with no loss of program or task continuity.

Faults

A fault is an exception that can generally be corrected and that, once corrected, allows the program to be restarted with no loss of continuity. When a fault is reported, the processor restores the machine state to the state prior to the beginning of execution of the faulting instruction. The return address (saved contents of the CS and EIP registers) for the fault handler points to the faulting instruction, rather than the instruction following the faulting instruction.

Note: There are a small subset of exceptions that are normally reported as faults, but under architectural corner cases, they are not restartable and some processor context will be lost. An example of these cases is the execution of the POPAD instruction where the stack frame crosses over the the end of the stack segment. The exception handler will see that the CS:EIP has been restored as if the POPAD instruction had not executed however internal processor state (general purpose registers) will have been modified. These corner cases are considered programming errors and an application causeing this class of exceptions will likely be terminated by the operating system.

Traps

A trap is an exception that is reported immediately following the execution of the trapping instruction. Traps allow execution of a program or task to be continued without loss of program continuity. The return address for the trap handler points to the instruction to be executed after the trapping instruction.

Aborts

An abort is an exception that does not always report the precise location of the instruction causing the exception and does not allow restart of the program or task that caused the exception. Aborts are used to report severe errors, such as hardware errors and inconsistent or illegal values in system tables.

NOTES:

1. The UD2 instruction was introduced in the PentiumR Pro processor.

2. Intel Architecture processors after the Intel386T processor do not generate this exception.

3. This exception was introduced in the Intel486T processor.

4. This exception was introduced in the PentiumR processor and enhanced in the P6 family processors.

5. This exception was introduced in the PentiumR III processor.

5.4. PROGRAM OR TASK RESTART

To allow restarting of program or task following the handling of an exception or an interrupt, all exceptions except aborts are guaranteed to report the exception on a precise instruction boundary, and all interrupts are guaranteed to be taken on an instruction boundary.

For fault-class exceptions, the return instruction pointer that the processor saves when it generates the exception points to the faulting instruction. So, when a program or task is restarted following the handling of a fault, the faulting instruction is restarted (re-executed). Restarting the faulting instruction is commonly used to handle exceptions that are generated when access to an operand is blocked. The most common example of a fault is a page-fault exception (#PF) that occurs when a program or task references an operand in a page that is not in memory. When a page-fault exception occurs, the exception handler can load the page into memory and resume execution of the program or task by restarting the faulting instruction. To insure that this instruction restart is handled transparently to the currently executing program or task, the processor saves the necessary registers and stack pointers to allow it to restore itself to its state prior to the execution of the faulting instruction.

For trap-class exceptions, the return instruction pointer points to the instruction following the trapping instruction. If a trap is detected during an instruction which transfers execution, the return instruction pointer reflects the transfer. For example, if a trap is detected while executing a JMP instruction, the return instruction pointer points to the destination of the JMP instruction, not to the next address past the JMP instruction. All trap exceptions allow program or task restart with no loss of continuity. For example, the overflow exception is a trapping exception. Here, the return instruction pointer points to the instruction following the INTO instruction that tested the OF (overflow) flag in the EFLAGS register. The trap handler for this exception resolves the overflow condition. Upon return from the trap handler, program or task execution continues at the next instruction following the INTO instruction.

The abort-class exceptions do not support reliable restarting of the program or task. Abort handlers generally are designed to collect diagnostic information about the state of the processor when the abort exception occurred and then shut down the application and system as gracefully as possible.

Interrupts rigorously support restarting of interrupted programs and tasks without loss of continuity. The return instruction pointer saved for an interrupt points to the next instruction to be executed at the instruction boundary where the processor took the interrupt. If the instruction just executed has a repeat prefix, the interrupt is taken at the end of the current iteration with the registers set to execute the next iteration.

The ability of a P6 family processor to speculatively execute instructions does not affect the taking of interrupts by the processor. Interrupts are taken at instruction boundaries located during the retirement phase of instruction execution; so they are always taken in the "in-order" instruction stream. Refer to Chapter 2, Introduction to the Intel Architecture, in the Intel Architecture Software Developer's Manual, Volume 1, for more information about the P6 family processors' microarchitecture and its support for out-of-order instruction execution.

Note that the PentiumR processor and earlier Intel Architecture processors also perform varying amounts of prefetching and preliminary decoding of instructions; however, here also exceptions and interrupts are not signaled until actual "in-order" execution of the instructions. For a given code sample, the signaling of exceptions will occur uniformly when the code is executed on any family of Intel Architecture processors (except where new exceptions or new opcodes have been defined).

5.5. NONMASKABLE INTERRUPT (NMI)

The nonmaskable interrupt (NMI) can be generated in either of two ways:

 External hardware asserts the NMI pin.

 The processor receives a message on the APIC serial bus of delivery mode NMI.

When the processor receives a NMI from either of these sources, the processor handles it immediately by calling the NMI handler pointed to by interrupt vector number 2. The processor also invokes certain hardware conditions to insure that no other interrupts, including NMI interrupts, are received until the NMI handler has completed executing (refer to Section 5.5.1., "Handling Multiple NMIs").

Also, when an NMI is received from either of the above sources, it cannot be masked by the IF flag in the EFLAGS register.

It is possible to issue a maskable hardware interrupt (through the INTR pin) to vector 2 to invoke the NMI interrupt handler; however, this interrupt will not truly be an NMI interrupt. A true NMI interrupt that activates the processor's NMI-handling hardware can only be delivered through one of the mechanisms listed above.

5.5.1. Handling Multiple NMIs

While an NMI interrupt handler is executing, the processor disables additional calls to the NMI handler until the next IRET instruction is executed. This blocking of subsequent NMIs prevents stacking up calls to the NMI handler. It is recommended that the NMI interrupt handler be accessed through an interrupt gate to disable maskable hardware interrupts (refer to Section 5.6.1., "Masking Maskable Hardware Interrupts").

5.6. ENABLING AND DISABLING INTERRUPTS

The processor inhibits the generation of some interrupts, depending on the state of the processor and of the IF and RF flags in the EFLAGS register, as described in the following sections. 5.6.1. Masking Maskable Hardware Interrupts The IF flag can disable the servicing of maskable hardware interrupts received on the processor's INTR pin or through the local APIC (refer to Section 5.1.1.2., "Maskable Hardware Interrupts"). When the IF flag is clear, the processor inhibits interrupts delivered to the INTR pin or through the local APIC from generating an internal interrupt request; when the IF flag is set, interrupts delivered to the INTR or through the local APIC pin are processed as normal external interrupts. The IF flag does not affect nonmaskable interrupts (NMIs) delivered to the NMI pin or delivery mode NMI messages delivered through the APIC serial bus, nor does it affect processor generated exceptions. As with the other flags in the EFLAGS register, the processor clears the IF flag in response to a hardware reset.

The fact that the group of maskable hardware interrupts includes the reserved interrupt and exception vectors 0 through 32 can potentially cause confusion. Architecturally, when the IF flag is set, an interrupt for any of the vectors from 0 through 32 can be delivered to the processor through the INTR pin and any of the vectors from 16 through 32 can be delivered through the local APIC. The processor will then generate an interrupt and call the interrupt or exception handler pointed to by the vector number. So for example, it is possible to invoke the page-fault handler through the INTR pin (by means of vector 14); however, this is not a true page-fault exception. It is an interrupt. As with the INT n instruction (refer to Section 5.1.2.2., "Software- Generated Exceptions"), when an interrupt is generated through the INTR pin to an exception vector, the processor does not push an error code on the stack, so the exception handler may not operate correctly.

The IF flag can be set or cleared with the STI (set interrupt-enable flag) and CLI (clear interruptenable flag) instructions, respectively. These instructions may be executed only if the CPL is equal to or less than the IOPL. A general-protection exception (#GP) is generated if they are executed when the CPL is greater than the IOPL. (The effect of the IOPL on these instructions is modified slightly when the virtual mode extension is enabled by setting the VME flag in control register CR4, refer to Section 16.3., "Interrupt and Exception Handling in Virtual-8086 Mode" in Chapter 16, 8086 Emulation.)

The IF flag is also affected by the following operations:

 The PUSHF instruction stores all flags on the stack, where they can be examined and modified. The POPF instruction can be used to load the modified flags back into the EFLAGS register.

 Task switches and the POPF and IRET instructions load the EFLAGS register; therefore, they can be used to modify the setting of the IF flag.

 When an interrupt is handled through an interrupt gate, the IF flag is automatically cleared, which disables maskable hardware interrupts. (If an interrupt is handled through a trap gate, the IF flag is not cleared.)

Refer to the descriptions of the CLI, STI, PUSHF, POPF, and IRET instructions in Chapter 3, Instruction Set Reference, of the Intel Architecture Software Developer's Manual, Volume 2, for a detailed description of the operations these instructions are allowed to perform on the IF flag.

5.6.2. Masking Instruction Breakpoints

The RF (resume) flag in the EFLAGS register controls the response of the processor to instruction- breakpoint conditions (refer to the description of the RF flag in Section 2.3., "System Flags and Fields in the EFLAGS Register" in Chapter 2, System Architecture Overview). When set, it prevents an instruction breakpoint from generating a debug exception (#DB); when clear, instruction breakpoints will generate debug exceptions. The primary function of the RF flag is to prevent the processor from going into a debug exception loop on an instruction-breakpoint. Refer to Section 15.3.1.1., "Instruction-Breakpoint Exception Condition", in Chapter 15, Debugging and Performance Monitoring, for more information on the use of this flag.

5.6.3. Masking Exceptions and Interrupts When Switching Stacks

To switch to a different stack segment, software often uses a pair of instructions, for example:

MOV SS, AX

MOV ESP, StackTop

If an interrupt or exception occurs after the segment selector has been loaded into the SS register but before the ESP register has been loaded, these two parts of the logical address into the stack space are inconsistent for the duration of the interrupt or exception handler.

To prevent this situation, the processor inhibits interrupts, debug exceptions, and single-step trap exceptions after either a MOV to SS instruction or a POP to SS instruction, until the instruction boundary following the next instruction is reached. All other faults may still be generated. If the LSS instruction is used to modify the contents of the SS register (which is the recommended method of modifying this register), this problem does not occur.

5.7. PRIORITY AMONG SIMULTANEOUS EXCEPTIONS AND INTERRUPTS

If more than one exception or interrupt is pending at an instruction boundary, the processor services them in a predictable order. Table 5-3 shows the priority among classes of exception and interrupt sources. While priority among these classes is consistent throughout the architecture, exceptions within each class are implementation-dependent and may vary from processor to processor. The processor first services a pending exception or interrupt from the class which has the highest priority, transferring execution to the first instruction of the handler. Lower priority exceptions are discarded; lower priority interrupts are held pending. Discarded exceptions are re-generated when the interrupt handler returns execution to the point in the program or task where the exceptions and/or interrupts occurred.

The PentiumR III processor added the SIMD floating-point execution unit. The SIMD floatingpoint execution unit can generate exceptions as well. Since the SIMD floating-point execution unit utilizes a 4-wide register set an exception may result from more than one operand within a SIMD floating-point register. Hence the PentiumR III processor handles these exceptions according to a predetermined precedence. When a sub-operand of a packed instruction generates two or more exception conditions, the exception precedence sometimes results in the higher priority exception being handled and the lower priority exceptions being ignored. Prioritization of exceptions is performed only on a sub-operand basis, and not between suboperands. For example, an invalid exception generated by one sub-operand will not prevent the reporting of a divide-by-zero exception generated by another sub-operand. Table 5-2 shows the precedence for Streaming SIMD Extensions numeric exceptions. The table reflects the order in which interrupts are handled upon simultaneous recognition by the processor (for example, when multiple interrupts are pending at an instruction boundary). However, the table does not necessarily reflect the order in which interrupts will be recognized by the processor if received simultaneously at the processor pins.

1. Though this is not an exception, the handling of a QNaN operand has precedence over lower priority exceptions. For example, a QNaN divided by zero results in a QNaN, not a zero-divide exception.

2. If masked, then instruction execution continues, and a lower priority exception can occur as well.

5.8. INTERRUPT DESCRIPTOR TABLE (IDT)

The interrupt descriptor table (IDT) associates each exception or interrupt vector with a gate descriptor for the procedure or task used to service the associated exception or interrupt. Like the GDT and LDTs, the IDT is an array of 8-byte descriptors (in protected mode). Unlike the GDT, the first entry of the IDT may contain a descriptor. To form an index into the IDT, the processor scales the exception or interrupt vector by eight (the number of bytes in a gate descriptor). Because there are only 256 interrupt or exception vectors, the IDT need not contain more than 256 descriptors. It can contain fewer than 256 descriptors, because descriptors are required only for the interrupt and exception vectors that may occur. All empty descriptor slots in the IDT should have the present flag for the descriptor set to 0.

Table 5-3. Priority Among Simultaneous Exceptions and Interrupts

Priority Descriptions
1 (Highest) Hardware Reset and Machine Checks - RESET - Machine Check
2 Trap on Task Switch - T flag in TSS is set
3 External Hardware Interventions - FLUSH - STOPCLK - SMI - INIT
4 Traps on the Previous Instruction - Breakpoints -Debug Trap Exceptions (TF flag set or data/I-O breakpoint)
5 External Interrupts - NMI Interrupts - Maskable Hardware Interrupts
6 Faults from Fetching Next Instruction -Code Breakpoint Fault -Code-Segment Limit Violation1 -Code Page Fault1
7 Faults from Decoding the Next Instruction - Instruction length > 15 bytes - Illegal Opcode - Coprocessor Not Available
8 (Lowest) Faults on Executing an Instruction - Floating-point exception - Overflow - Bound error - Invalid TSS - Segment Not Present - Stack fault - General Protection - Data Page Fault - Alignment Check - SIMD floating-point exception

NOTE: 1. For the PentiumR and Intel486T processors, the Code Segment Limit Violation and the Code Page Fault exceptions are assigned to the priority 7.

The base addresses of the IDT should be aligned on an 8-byte boundary to maximize performance of cache line fills. The limit value is expressed in bytes and is added to the base address to get the address of the last valid byte. A limit value of 0 results in exactly 1 valid byte. Because

IDT entries are always eight bytes long, the limit should always be one less than an integral multiple of eight (that is, 8N - 1). The IDT may reside anywhere in the linear address space. As shown in Figure 5-1, the processor locates the IDT using the IDTR register. This register holds both a 32-bit base address and 16-bit limit for the IDT.

Figure 5-1

The LIDT (load IDT register) and SIDT (store IDT register) instructions load and store the contents of the IDTR register, respectively. The LIDT instruction loads the IDTR register with the base address and limit held in a memory operand. This instruction can be executed only when the CPL is 0. It normally is used by the initialization code of an operating system when creating an IDT. An operating system also may use it to change from one IDT to another. The SIDT instruction copies the base and limit value stored in IDTR to memory. This instruction can be executed at any privilege level.

If a vector references a descriptor beyond the limit of the IDT, a general-protection exception (#GP) is generated.

5.9. IDT DESCRIPTORS

The IDT may contain any of three kinds of gate descriptors:

 Task-gate descriptor

 Interrupt-gate descriptor

 Trap-gate descriptor

Figure 5-2 shows the formats for the task-gate, interrupt-gate, and trap-gate descriptors. The format of a task gate used in an IDT is the same as that of a task gate used in the GDT or an LDT (refer to Section 6.2.4., "Task-Gate Descriptor" in Chapter 6, Task Management). The task gate contains the segment selector for a TSS for an exception and/or interrupt handler task. Interrupt and trap gates are very similar to call gates (refer to Section 4.8.3., "Call Gates" in Chapter 4, Protection). They contain a far pointer (segment selector and offset) that the processor uses to transfer execution to a handler procedure in an exception- or interrupt-handler code segment. These gates differ in the way the processor handles the IF flag in the EFLAGS register (refer to Section 5.10.1.2., "Flag Usage By Exception- or Interrupt-Handler Procedure").

5.10. EXCEPTION AND INTERRUPT HANDLING

The processor handles calls to exception- and interrupt-handlers similar to the way it handles calls with a CALL instruction to a procedure or a task. When responding to an exception or interrupt, the processor uses the exception or interrupt vector as an index to a descriptor in the IDT. If the index points to an interrupt gate or trap gate, the processor calls the exception or interrupt handler in a manner similar to a CALL to a call gate (refer to Section 4.8.2., "Gate Descriptors" through Section 4.8.6., "Returning from a Called Procedure" in Chapter 4, Protection). If index points to a task gate, the processor executes a task switch to the exception- or interrupt-handler task in a manner similar to a CALL to a task gate (refer to Section 6.3., "Task Switching" in Chapter 6, Task Management).

5.10.1. Exception- or Interrupt-Handler Procedures

An interrupt gate or trap gate references an exception- or interrupt-handler procedure that runs in the context of the currently executing task (refer to Figure 5-3). The segment selector for the gate points to a segment descriptor for an executable code segment in either the GDT or the current LDT. The offset field of the gate descriptor points to the beginning of the exception- or interrupt-handling procedure.

When the processor performs a call to the exception- or interrupt-handler procedure, it saves the current states of the EFLAGS register, CS register, and EIP register on the stack (refer to Figure 5-4). (The CS and EIP registers provide a return instruction pointer for the handler.) If an exception causes an error code to be saved, it is pushed on the stack after the EIP value.

If the handler procedure is going to be executed at the same privilege level as the interrupted procedure, the handler uses the current stack.

If the handler procedure is going to be executed at a numerically lower privilege level, a stack switch occurs. When a stack switch occurs, a stack pointer for the stack to be returned to is also saved on the stack. (The SS and ESP registers provide a return stack pointer for the handler.) The segment selector and stack pointer for the stack to be used by the handler is obtained from the TSS for the currently executing task. The processor copies the EFLAGS, SS, ESP, CS, EIP, and error code information from the interrupted procedure's stack to the handler's stack.

To return from an exception- or interrupt-handler procedure, the handler must use the IRET (or IRETD) instruction. The IRET instruction is similar to the RET instruction except that it restores the saved flags into the EFLAGS register. The IOPL field of the EFLAGS register is restored only if the CPL is 0. The IF flag is changed only if the CPL is less than or equal to the IOPL. Refer to "IRET/IRETD-Interrupt Return" in Chapter 3 of the Intel Architecture Software Developer's Manual, Volume 2, for the complete operation performed by the IRET instruction.

If a stack switch occurred when calling the handler procedure, the IRET instruction switches back to the interrupted procedure's stack on the return.

Figure 5-3

5.10.1.1. PROTECTION OF EXCEPTION- AND INTERRUPT-HANDLER PROCEDURES

The privilege-level protection for exception- and interrupt-handler procedures is similar to that used for ordinary procedure calls when called through a call gate (refer to Section 4.8.4., "Accessing a Code Segment Through a Call Gate" in Chapter 4, Protection). The processor does not permit transfer of execution to an exception- or interrupt-handler procedure in a less privileged code segment (numerically greater privilege level) than the CPL. An attempt to violate this rule results in a general-protection exception (#GP). The protection mechanism for exceptionand interrupt-handler procedures is different in the following ways:

 Because interrupt and exception vectors have no RPL, the RPL is not checked on implicit calls to exception and interrupt handlers.

 The processor checks the DPL of the interrupt or trap gate only if an exception or interrupt is generated with an INT n, INT 3, or INTO instruction. Here, the CPL must be less than or equal to the DPL of the gate. This restriction prevents application programs or procedures running at privilege level 3 from using a software interrupt to access critical exception handlers, such as the page-fault handler, providing that those handlers are placed in more privileged code segments (numerically lower privilege level). For hardware-generated interrupts and processor-detected exceptions, the processor ignores the DPL of interrupt and trap gates.

Because exceptions and interrupts generally do not occur at predictable times, these privilege rules effectively impose restrictions on the privilege levels at which exception and interrupthandling procedures can run. Either of the following techniques can be used to avoid privilegelevel violations.

 The exception or interrupt handler can be placed in a conforming code segment. This technique can be used for handlers that only need to access data available on the stack (for example, divide error exceptions). If the handler needs data from a data segment, the data segment needs to be accessible from privilege level 3, which would make it unprotected.

 The handler can be placed in a nonconforming code segment with privilege level 0. This handler would always run, regardless of the CPL that the interrupted program or task is running at.

5.10.1.2. FLAG USAGE BY EXCEPTION- OR INTERRUPT-HANDLER PROCEDURE

When accessing an exception or interrupt handler through either an interrupt gate or a trap gate, the processor clears the TF flag in the EFLAGS register after it saves the contents of the EFLAGS register on the stack. (On calls to exception and interrupt handlers, the processor also clears the VM, RF, and NT flags in the EFLAGS register, after they are saved on the stack.) Clearing the TF flag prevents instruction tracing from affecting interrupt response. A subsequent IRET instruction restores the TF (and VM, RF, and NT) flags to the values in the saved contents of the EFLAGS register on the stack.

The only difference between an interrupt gate and a trap gate is the way the processor handles the IF flag in the EFLAGS register. When accessing an exception- or interrupt-handling procedure through an interrupt gate, the processor clears the IF flag to prevent other interrupts from interfering with the current interrupt handler. A subsequent IRET instruction restores the IF flag to its value in the saved contents of the EFLAGS register on the stack. Accessing a handler procedure through a trap gate does not affect the IF flag.

5.10.2. Interrupt Tasks

When an exception or interrupt handler is accessed through a task gate in the IDT, a task switch results. Handling an exception or interrupt with a separate task offers several advantages:

 The entire context of the interrupted program or task is saved automatically.

 A new TSS permits the handler to use a new privilege level 0 stack when handling the exception or interrupt. If an exception or interrupt occurs when the current privilege level 0 stack is corrupted, accessing the handler through a task gate can prevent a system crash by providing the handler with a new privilege level 0 stack.

 The handler can be further isolated from other tasks by giving it a separate address space. This is done by giving it a separate LDT.

The disadvantage of handling an interrupt with a separate task is that the amount of machine state that must be saved on a task switch makes it slower than using an interrupt gate, resulting in increased interrupt latency.

A task gate in the IDT references a TSS descriptor in the GDT (refer to Figure 5-5). A switch to the handler task is handled in the same manner as an ordinary task switch (refer to Section 6.3., "Task Switching" in Chapter 6, Task Management). The link back to the interrupted task is stored in the previous task link field of the handler task's TSS. If an exception caused an error code to be generated, this error code is copied to the stack of the new task. When exception- or interrupt-handler tasks are used in an operating system, there are actually two mechanisms that can be used to dispatch tasks: the software scheduler (part of the operating system) and the hardware scheduler (part of the processor's interrupt mechanism). The software scheduler needs to accommodate interrupt tasks that may be dispatched when interrupts are enabled.

5.11. ERROR CODE

When an exception condition is related to a specific segment, the processor pushes an error code onto the stack of the exception handler (whether it is a procedure or task). The error code has the format shown in Figure 5-6. The error code resembles a segment selector; however, instead of a TI flag and RPL field, the error code contains 3 flags:

EXT

External event (bit 0). When set, indicates that an event external to the program caused the exception, such as a hardware interrupt.

IDT

Descriptor location (bit 1). When set, indicates that the index portion of the error code refers to a gate descriptor in the IDT; when clear, indicates that the index refers to a descriptor in the GDT or the current LDT.

TI

GDT/LDT (bit 2). Only used when the IDT flag is clear. When set, the TI flag indicates that the index portion of the error code refers to a segment or gate descriptor in the LDT; when clear, it indicates that the index refers to a descriptor in the current GDT.

The segment selector index field provides an index into the IDT, GDT, or current LDT to the segment or gate selector being referenced by the error code. In some cases the error code is null (that is, all bits in the lower word are clear). A null error code indicates that the error was not caused by a reference to a specific segment or that a null segment descriptor was referenced in an operation.

The format of the error code is different for page-fault exceptions (#PF), refer to "Interrupt 14-Page-Fault Exception (#PF)" in this chapter. The error code is pushed on the stack as a doubleword or word (depending on the default interrupt, trap, or task gate size). To keep the stack aligned for doubleword pushes, the upper half of the error code is reserved. Note that the error code is not popped when the IRET instruction is executed to return from an exception handler, so the handler must remove the error code before executing a return.

Error codes are not pushed on the stack for exceptions that are generated externally (with the INTR or LINT[1:0] pins) or the INT n instruction, even if an error code is normally produced for those exceptions.

5.12. EXCEPTION AND INTERRUPT REFERENCE

The following sections describe conditions which generate exceptions and interrupts. They are arranged in the order of vector numbers. The information contained in these sections are as follows:

Exception Class

Indicates whether the exception class is a fault, trap, or abort type. Some exceptions can be either a fault or trap type, depending on when the error condition is detected. (This section is not applicable to interrupts.)

Description

Gives a general description of the purpose of the exception or interrupt type. It also describes how the processor handles the exception or interrupt.

Exception Error Code

Indicates whether an error code is saved for the exception. If one is saved, the contents of the error code are described. (This section is not applicable to interrupts.)

Saved Instruction Pointer

Describes which instruction the saved (or return) instruction pointer points to. It also indicates whether the pointer can be used to restart a faulting instruction.

Program State Change

Describes the effects of the exception or interrupt on the state of the currently running program or task and the possibilities of restarting the program or task without loss of continuity.

Interrupt 0-Divide Error Exception (#DE)

Exception Class Fault.

Description

Indicates the divisor operand for a DIV or IDIV instruction is 0 or that the result cannot be represented in the number of bits specified for the destination operand.

Exception Error Code

None.

Saved Instruction Pointer

Saved contents of CS and EIP registers point to the instruction that generated the exception.

Program State Change

A program-state change does not accompany the divide error, because the exception occurs before the faulting instruction is executed.

Interrupt 1-Debug Exception (#DB)

Exception Class

Trap or Fault. The exception handler can distinguish between traps or faults by examining the contents of DR6 and the other debug registers.

Description

Indicates that one or more of several debug-exception conditions has been detected. Whether the exception is a fault or a trap depends on the condition, as shown below:

Refer to Chapter 15, Debugging and Performance Monitoring, for detailed information about the debug exceptions.

Exception Error Code

None. An exception handler can examine the debug registers to determine which condition caused the exception.

Saved Instruction Pointer

Fault-Saved contents of CS and EIP registers point to the instruction that generated the exception.

Trap-Saved contents of CS and EIP registers point to the instruction following the instruction that generated the exception.

Program State Change

Fault-A program-state change does not accompany the debug exception, because the exception occurs before the faulting instruction is executed. The program can resume normal execution upon returning from the debug exception handler

Trap-A program-state change does accompany the debug exception, because the instruction or task switch being executed is allowed to complete before the exception is generated. However, the new state of the program is not corrupted and execution of the program can continue reliably.

Interrupt 2-NMI Interrupt

Exception Class Not applicable.

Description

The nonmaskable interrupt (NMI) is generated externally by asserting the processor's NMI pin or through an NMI request set by the I/O APIC to the local APIC on the APIC serial bus. This interrupt causes the NMI interrupt handler to be called.

Exception Error Code

Not applicable.

Saved Instruction Pointer

The processor always takes an NMI interrupt on an instruction boundary. The saved contents of CS and EIP registers point to the next instruction to be executed at the point the interrupt is taken. Refer to Section 5.4., "Program or Task Restart" for more information about when the processor takes NMI interrupts.

Program State Change

The instruction executing when an NMI interrupt is received is completed before the NMI is generated. A program or task can thus be restarted upon returning from an interrupt handler without loss of continuity, provided the interrupt handler saves the state of the processor before handling the interrupt and restores the processor's state prior to a return.

Interrupt 3-Breakpoint Exception (#BP)

Exception Class Trap.

Description

Indicates that a breakpoint instruction (INT 3) was executed, causing a breakpoint trap to be generated. Typically, a debugger sets a breakpoint by replacing the first opcode byte of an instruction with the opcode for the INT 3 instruction. (The INT 3 instruction is one byte long, which makes it easy to replace an opcode in a code segment in RAM with the breakpoint opcode.) The operating system or a debugging tool can use a data segment mapped to the same physical address space as the code segment to place an INT 3 instruction in places where it is desired to call the debugger.

With the P6 family, PentiumR, Intel486T, and Intel386T processors, it is more convenient to set breakpoints with the debug registers. (Refer to Section 15.3.2., "Breakpoint Exception (#BP)-Interrupt Vector 3", in Chapter 15, Debugging and Performance Monitoring, for information about the breakpoint exception.) If more breakpoints are needed beyond what the debug registers allow, the INT 3 instruction can be used.

The breakpoint (#BP) exception can also be generated by executing the INT n instruction with an operand of 3. The action of this instruction (INT 3) is slightly different than that of the INT 3 instruction (refer to "INTn/INTO/INT3-Call to Interrupt Procedure" in Chapter 3 of the Intel Architecture Software Developer's Manual, Volume 2).

Exception Error Code

None.

Saved Instruction Pointer

Saved contents of CS and EIP registers point to the instruction following the INT 3 instruction.

Program State Change

Even though the EIP points to the instruction following the breakpoint instruction, the state of the program is essentially unchanged because the INT 3 instruction does not affect any register or memory locations. The debugger can thus resume the suspended program by replacing the INT 3 instruction that caused the breakpoint with the original opcode and decrementing the saved contents of the EIP register. Upon returning from the debugger, program execution resumes with the replaced instruction.

Interrupt 4-Overflow Exception (#OF)

Exception Class Trap.

Description

Indicates that an overflow trap occurred when an INTO instruction was executed. The INTO instruction checks the state of the OF flag in the EFLAGS register. If the OF flag is set, an overflow trap is generated.

Some arithmetic instructions (such as the ADD and SUB) perform both signed and unsigned arithmetic. These instructions set the OF and CF flags in the EFLAGS register to indicate signed overflow and unsigned overflow, respectively. When performing arithmetic on signed operands, the OF flag can be tested directly or the INTO instruction can be used. The benefit of using the INTO instruction is that if the overflow exception is detected, an exception handler can be called automatically to handle the overflow condition.

Exception Error Code

None.

Saved Instruction Pointer

The saved contents of CS and EIP registers point to the instruction following the INTO instruction.

Program State Change

Even though the EIP points to the instruction following the INTO instruction, the state of the program is essentially unchanged because the INTO instruction does not affect any register or memory locations. The program can thus resume normal execution upon returning from the overflow exception handler.

CHAPTER 6 TASK MANAGEMENT

This chapter describes the Intel Architecture's task management facilities. These facilities are only available when the processor is running in protected mode.

6.1. TASK MANAGEMENT OVERVIEW

A task is a unit of work that a processor can dispatch, execute, and suspend. It can be used to execute a program, a task or process, an operating-system service utility, an interrupt or exception handler, or a kernel or executive utility.

The Intel Architecture provides a mechanism for saving the state of a task, for dispatching tasks for execution, and for switching from one task to another. When operating in protected mode, all processor execution takes place from within a task. Even simple systems must define at least one task. More complex systems can use the processor's task management facilities to support multitasking applications.

6.1.1. Task Structure

A task is made up of two parts: a task execution space and a task-state segment (TSS). The task execution space consists of a code segment, a stack segment, and one or more data segments (refer to Figure 6-1). If an operating system or executive uses the processor's privilege-level protection mechanism, the task execution space also provides a separate stack for each privilege level.

The TSS specifies the segments that make up the task execution space and provides a storage place for task state information. In multitasking systems, the TSS also provides a mechanism for linking tasks.

NOTE

This chapter describes primarily 32-bit tasks and the 32-bit TSS structure. For information on 16-bit tasks and the 16-bit TSS structure, refer to Section 6.6., "16-Bit Task-State Segment (TSS)".

A task is identified by the segment selector for its TSS. When a task is loaded into the processor for execution, the segment selector, base address, limit, and segment descriptor attributes for the TSS are loaded into the task register (refer to Section 2.4.4., "Task Register (TR)" in Chapter 2, System Architecture Overview).

If paging is implemented for the task, the base address of the page directory used by the task is loaded into control register CR3.

Figure 6-1

6.1.2. Task State

The following items define the state of the currently executing task:

 The task's current execution space, defined by the segment selectors in the segment registers (CS, DS, SS, ES, FS, and GS).

 The state of the general-purpose registers.

 The state of the EFLAGS register.

 The state of the EIP register.

 The state of control register CR3.

 The state of the task register.

 The state of the LDTR register.

 The I/O map base address and I/O map (contained in the TSS).

 Stack pointers to the privilege 0, 1, and 2 stacks (contained in the TSS).

 Link to previously executed task (contained in the TSS).

Prior to dispatching a task, all of these items are contained in the task's TSS, except the state of the task register. Also, the complete contents of the LDTR register are not contained in the TSS, only the segment selector for the LDT.

6.1.3. Executing a Task

Software or the processor can dispatch a task for execution in one of the following ways:

 A explicit call to a task with the CALL instruction.

 A explicit jump to a task with the JMP instruction.

 An implicit call (by the processor) to an interrupt-handler task.

 An implicit call to an exception-handler task.

 A return (initiated with an IRET instruction) when the NT flag in the EFLAGS register is set.

All of these methods of dispatching a task identify the task to be dispatched with a segment selector that points either to a task gate or the TSS for the task. When dispatching a task with a CALL or JMP instruction, the selector in the instruction may select either the TSS directly or a task gate that holds the selector for the TSS. When dispatching a task to handle an interrupt or exception, the IDT entry for the interrupt or exception must contain a task gate that holds the selector for the interrupt- or exception-handler TSS.

When a task is dispatched for execution, a task switch automatically occurs between the currently running task and the dispatched task. During a task switch, the execution environment of the currently executing task (called the task's state or context) is saved in its TSS and execution of the task is suspended. The context for the dispatched task is then loaded into the processor and execution of that task begins with the instruction pointed to by the newly loaded EIP register. If the task has not been run since the system was last initialized, the EIP will point to the first instruction of the task's code; otherwise, it will point to the next instruction after the last instruction that the task executed when it was last active.

If the currently executing task (the calling task) called the task being dispatched (the called task), the TSS segment selector for the calling task is stored in the TSS of the called task to provide a link back to the calling task.

For all Intel Architecture processors, tasks are not recursive. A task cannot call or jump to itself.

Interrupts and exceptions can be handled with a task switch to a handler task. Here, the processor not only can perform a task switch to handle the interrupt or exception, but it can automatically switch back to the interrupted task upon returning from the interrupt- or exception-handler task. This mechanism can handle interrupts that occur during interrupt tasks.

As part of a task switch, the processor can also switch to another LDT, allowing each task to have a different logical-to-physical address mapping for LDT-based segments. The page-directory base register (CR3) also is reloaded on a task switch, allowing each task to have its own set of page tables. These protection facilities help isolate tasks and prevent them from interfering with one another. If one or both of these protection mechanisms are not used, the processor provides no protection between tasks. This is true even with operating systems that use multiple privilege levels for protection. Here, a task running at privilege level 3 that uses the same LDT and page tables as other privilege-level-3 tasks can access code and corrupt data and the stack of other tasks.

Use of task management facilities for handling multitasking applications is optional. Multitasking can be handled in software, with each software defined task executed in the context of a single Intel Architecture task.

6.2. TASK MANAGEMENT DATA STRUCTURES

The processor defines five data structures for handling task-related activities:

 Task-state segment (TSS).

 Task-gate descriptor.

 TSS descriptor.

 Task register.

 NT flag in the EFLAGS register.

When operating in protected mode, a TSS and TSS descriptor must be created for at least one task, and the segment selector for the TSS must be loaded into the task register (using the LTR instruction).

6.2.1. Task-State Segment (TSS)

The processor state information needed to restore a task is saved in a system segment called the task-state segment (TSS). Figure 6-2 shows the format of a TSS for tasks designed for 32-bit CPUs. (Compatibility with 16-bit Intel 286 processor tasks is provided by a different kind of TSS, refer to Figure 6-9.) The fields of a TSS are divided into two main categories: dynamic fields and static fields.

The processor updates the dynamic fields when a task is suspended during a task switch. The following are dynamic fields:

General-purpose register fields

State of the EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI registers prior to the task switch.

Segment selector fields

Segment selectors stored in the ES, CS, SS, DS, FS, and GS registers prior to the task switch.

EFLAGS register field

State of the EFAGS register prior to the task switch.

EIP (instruction pointer) field

State of the EIP register prior to the task switch.

Previous task link field

Contains the segment selector for the TSS of the previous task (updated on a task switch that was initiated by a call, interrupt, or exception). This field (which is sometimes called the back link field) permits a task switch back to the previous task to be initiated with an IRET instruction.

The processor reads the static fields, but does not normally change them. These fields are set up when a task is created. The following are static fields:

LDT segment selector field

Contains the segment selector for the task's LDT.

CR3 control register field

Contains the base physical address of the page directory to be used by the task. Control register CR3 is also known as the page-directory base register (PDBR).

Privilege level-0, -1, and -2 stack pointer fields

These stack pointers consist of a logical address made up of the segment selector for the stack segment (SS0, SS1, and SS2) and an offset into the stack (ESP0, ESP1, and ESP2). Note that the values in these fields are static for a particular task; whereas, the SS and ESP values will change if stack switching occurs within the task.

T (debug trap) flag (byte 100, bit 0)

When set, the T flag causes the processor to raise a debug exception when a task switch to this task occurs (refer to Section 15.3.1.5., "Task-Switch Exception Condition", in Chapter 15, Debugging and Performance Monitoring).

I/O map base address field

Contains a 16-bit offset from the base of the TSS to the I/O permission bit map and interrupt redirection bitmap. When present, these maps are stored in the TSS at higher addresses. The I/O map base address points to the beginning of the I/O permission bit map and the end of the interrupt redirection bit map. Refer to Chapter 9, Input/Output, in the Intel Architecture Software Developer's Manual, Volume 1, for more information about the I/O permission bit map. Refer to Section 16.3., "Interrupt and Exception Handling in Virtual- 8086 Mode" in Chapter 16, 8086 Emulation for a detailed description of the interrupt redirection bit map.

If paging is used, care should be taken to avoid placing a page boundary within the part of the TSS that the processor reads during a task switch (the first 104 bytes). If a page boundary is placed within this part of the TSS, the pages on either side of the boundary must be present at the same time and contiguous in physical memory. The reason for this restriction is that when accessing a TSS during a task switch, the processor reads and writes into the first 104 bytes of each TSS from contiguous physical addresses beginning with the physical address of the first byte of the TSS. It may not perform address translations at a page boundary if one occurs within this area. So, after the TSS access begins, if a part of the 104 bytes is not both present and physically contiguous, the processor will access incorrect TSS information, without generating a page-fault exception. The reading of this incorrect information will generally lead to an unrecoverable exception later in the task switch process.

Also, if paging is used, the pages corresponding to the previous task's TSS, the current task's TSS, and the descriptor table entries for each should be marked as read/write. The task switch will be carried out faster if the pages containing these structures are also present in memory before the task switch is initiated.

6.2.2. TSS Descriptor

The TSS, like all other segments, is defined by a segment descriptor. Figure 6-3 shows the format of a TSS descriptor. TSS descriptors may only be placed in the GDT; they cannot be placed in an LDT or the IDT. An attempt to access a TSS using a segment selector with its TI flag set (which indicates the current LDT) causes a general-protection exception (#GP) to be generated. A general-protection exception is also generated if an attempt is made to load a segment selector for a TSS into a segment register.

The busy flag (B) in the type field indicates whether the task is busy. A busy task is currently running or is suspended. A type field with a value of 1001B indicates an inactive task; a value of 1011B indicates a busy task. Tasks are not recursive. The processor uses the busy flag to detect an attempt to call a task whose execution has been interrupted. To insure that there is only one busy flag is associated with a task, each TSS should have only one TSS descriptor that points to it.

Figure 6-3

The base, limit, and DPL fields and the granularity and present flags have functions similar to their use in data-segment descriptors (refer to Section 3.4.3., "Segment Descriptors" in Chapter 3, Protected-Mode Memory Management). The limit field must have a value equal to or greater than 67H (for a 32-bit TSS), one byte less than the minimum size of a TSS. Attempting to switch to a task whose TSS descriptor has a limit less than 67H generates an invalid-TSS exception (#TS). A larger limit is required if an I/O permission bit map is included in the TSS. An even larger limit would be required if the operating system stores additional data in the TSS. The processor does not check for a limit greater than 67H on a task switch; however, it does when accessing the I/O permission bit map or interrupt redirection bit map.

Any program or procedure with access to a TSS descriptor (that is, whose CPL is numerically equal to or less than the DPL of the TSS descriptor) can dispatch the task with a call or a jump. In most systems, the DPLs of TSS descriptors should be set to values less than 3, so that only privileged software can perform task switching. However, in multitasking applications, DPLs for some TSS descriptors can be set to 3 to allow task switching at the application (or user) privilege level.

6.2.3. Task Register

The task register holds the 16-bit segment selector and the entire segment descriptor (32-bit base address, 16-bit segment limit, and descriptor attributes) for the TSS of the current task (refer to Figure 2-4 in Chapter 2, System Architecture Overview). This information is copied from the TSS descriptor in the GDT for the current task. Figure 6-4 shows the path the processor uses to accesses the TSS, using the information in the task register.

The task register has both a visible part (that can be read and changed by software) and an invisible part (that is maintained by the processor and is inaccessible by software). The segment selector in the visible portion points to a TSS descriptor in the GDT. The processor uses the invisible portion of the task register to cache the segment descriptor for the TSS. Caching these values in a register makes execution of the task more efficient, because the processor does not need to fetch these values from memory to reference the TSS of the current task.

The LTR (load task register) and STR (store task register) instructions load and read the visible portion of the task register. The LTR instruction loads a segment selector (source operand) into the task register that points to a TSS descriptor in the GDT, and then loads the invisible portion of the task register with information from the TSS descriptor. This instruction is a privileged instruction that may be executed only when the CPL is 0. The LTR instruction generally is used during system initialization to put an initial value in the task register. Afterwards, the contents of the task register are changed implicitly when a task switch occurs.

The STR (store task register) instruction stores the visible portion of the task register in a general-purpose register or memory. This instruction can be executed by code running at any privilege level, to identify the currently running task; however, it is normally used only by operating system software.

On power up or reset of the processor, the segment selector and base address are set to the default value of 0 and the limit is set to FFFFH.

6.2.4. Task-Gate Descriptor

A task-gate descriptor provides an indirect, protected reference to a task. Figure 6-5 shows the format of a task-gate descriptor. A task-gate descriptor can be placed in the GDT, an LDT, or the IDT.

The TSS segment selector field in a task-gate descriptor points to a TSS descriptor in the GDT. The RPL in this segment selector is not used.

The DPL of a task-gate descriptor controls access to the TSS descriptor during a task switch. When a program or procedure makes a call or jump to a task through a task gate, the CPL and the RPL field of the gate selector pointing to the task gate must be less than or equal to the DPL of the task-gate descriptor. (Note that when a task gate is used, the DPL of the destination TSS descriptor is not used.)

Figure 6-4

Figure 6-5

A task can be accessed either through a task-gate descriptor or a TSS descriptor. Both of these structures are provided to satisfy the following needs:

 The need for a task to have only one busy flag. Because the busy flag for a task is stored in the TSS descriptor, each task should have only one TSS descriptor. There may, however, be several task gates that reference the same TSS descriptor.

 The need to provide selective access to tasks. Task gates fill this need, because they can reside in an LDT and can have a DPL that is different from the TSS descriptor's DPL. A program or procedure that does not have sufficient privilege to access the TSS descriptor for a task in the GDT (which usually has a DPL of 0) may be allowed access to the task through a task gate with a higher DPL. Task gates give the operating system greater latitude for limiting access to specific tasks.

 The need for an interrupt or exception to be handled by an independent task. Task gates may also reside in the IDT, which allows interrupts and exceptions to be handled by handler tasks. When an interrupt or exception vector points to a task gate, the processor switches to the specified task.

Figure 6-6 illustrates how a task gate in an LDT, a task gate in the GDT, and a task gate in the IDT can all point to the same task.

6.3. TASK SWITCHING

The processor transfers execution to another task in any of four cases:

 The current program, task, or procedure executes a JMP or CALL instruction to a TSS descriptor in the GDT.

 The current program, task, or procedure executes a JMP or CALL instruction to a task-gate descriptor in the GDT or the current LDT.

 An interrupt or exception vector points to a task-gate descriptor in the IDT.

 The current task executes an IRET when the NT flag in the EFLAGS register is set. The JMP, CALL, and IRET instructions, as well as interrupts and exceptions, are all generalized mechanisms for redirecting a program. The referencing of a TSS descriptor or a task gate (when calling or jumping to a task) or the state of the NT flag (when executing an IRET instruction) determines whether a task switch occurs.

The processor performs the following operations when switching to a new task:

1. Obtains the TSS segment selector for the new task as the operand of the JMP or CALL instruction, from a task gate, or from the previous task link field (for a task switch initiated with an IRET instruction).

Figure 6-6
Figure 6-6. Task Gates Referencing the Same Task

2. Checks that the current (old) task is allowed to switch to the new task. Data-access privilege rules apply to JMP and CALL instructions. The CPL of the current (old) task and the RPL of the segment selector for the new task must be less than or equal to the DPL of the TSS descriptor or task gate being referenced. Exceptions, interrupts (except for interrupts generated by the INT n instruction), and the IRET instruction are permitted to switch tasks regardless of the DPL of the destination task-gate or TSS descriptor. For interrupts generated by the INT n instruction, the DPL is checked.

3. Checks that the TSS descriptor of the new task is marked present and has a valid limit (greater than or equal to 67H).

4. Checks that the new task is available (call, jump, exception, or interrupt) or busy (IRET return).

5. Checks that the current (old) TSS, new TSS, and all segment descriptors used in the task switch are paged into system memory.

6. If the task switch was initiated with a JMP or IRET instruction, the processor clears the busy (B) flag in the current (old) task's TSS descriptor; if initiated with a CALL instruction, an exception, or an interrupt, the busy (B) flag is left set. (Refer to Table 6-2.)

7. If the task switch was initiated with an IRET instruction, the processor clears the NT flag in a temporarily saved image of the EFLAGS register; if initiated with a CALL or JMP instruction, an exception, or an interrupt, the NT flag is left unchanged in the saved EFLAGS image.

8. Saves the state of the current (old) task in the current task's TSS. The processor finds the base address of the current TSS in the task register and then copies the states of the following registers into the current TSS: all the general-purpose registers, segment selectors from the segment registers, the temporarily saved image of the EFLAGS register, and the instruction pointer register (EIP).

NOTE

At this point, if all checks and saves have been carried out successfully, the processor commits to the task switch. If an unrecoverable error occurs in steps 1 through 8, the processor does not complete the task switch and insures that the processor is returned to its state prior to the execution of the instruction that initiated the task switch. If an unrecoverable error occurs after the commit point (in steps 9 through 14), the processor completes the task switch (without performing additional access and segment availability checks) and generates the appropriate exception prior to beginning execution of the new task. If exceptions occur after the commit point, the exception handler must finish the task switch itself before allowing the processor to begin executing the task. Refer to Chapter 5, Interrupt and Exception Handling for more information about the affect of exceptions on a task when they occur after the commit point of a task switch.

9. If the task switch was initiated with a CALL instruction, an exception, or an interrupt, the processor sets the NT flag in the EFLAGS image stored in the new task's TSS; if initiated with an IRET instruction, the processor restores the NT flag from the EFLAGS image stored on the stack. If initiated with a JMP instruction, the NT flag is left unchanged. (Refer to Table 6-2.)

10. If the task switch was initiated with a CALL instruction, JMP instruction, an exception, or an interrupt, the processor sets the busy (B) flag in the new task's TSS descriptor; if initiated with an IRET instruction, the busy (B) flag is left set.

11. Sets the TS flag in the control register CR0 image stored in the new task's TSS.

12. Loads the task register with the segment selector and descriptor for the new task's TSS.

13. Loads the new task's state from its TSS into processor. Any errors associated with the loading and qualification of segment descriptors in this step occur in the context of the new task. The task state information that is loaded here includes the LDTR register, the PDBR (control register CR3), the EFLAGS register, the EIP register, the general-purpose registers, and the segment descriptor parts of the segment registers.

14. Begins executing the new task. (To an exception handler, the first instruction of the new task appears not to have been executed.)

The state of the currently executing task is always saved when a successful task switch occurs. If the task is resumed, execution starts with the instruction pointed to by the saved EIP value, and the registers are restored to the values they held when the task was suspended.

When switching tasks, the privilege level of the new task does not inherit its privilege level from the suspended task. The new task begins executing at the privilege level specified in the CPL field of the CS register, which is loaded from the TSS. Because tasks are isolated by their separate address spaces and TSSs and because privilege rules control access to a TSS, software does not need to perform explicit privilege checks on a task switch.

Table 6-1 shows the exception conditions that the processor checks for when switching tasks. It also shows the exception that is generated for each check if an error is detected and the segment that the error code references. (The order of the checks in the table is the order used in the P6 family processors. The exact order is model specific and may be different for other Intel Architecture processors.) Exception handlers designed to handle these exceptions may be subject to recursive calls if they attempt to reload the segment selector that generated the exception. The cause of the exception (or the first of multiple causes) should be fixed before reloading the selector.

NOTES:

1. #NP is segment-not-present exception, #GP is general-protection exception, #TS is invalid-TSS exception, and #SF is stack-fault exception.

2. The error code contains an index to the segment descriptor referenced in this column.

3. A segment selector is valid if it is in a compatible type of table (GDT or LDT), occupies an address within the table's segment limit, and refers to a compatible type of descriptor (for example, a segment selector in the CS register only is valid when it points to a code-segment descriptor).

The TS (task switched) flag in the control register CR0 is set every time a task switch occurs. System software uses the TS flag to coordinate the actions of floating-point unit when generating floating-point exceptions with the rest of the processor. The TS flag indicates that the context of the floating-point unit may be different from that of the current task. Refer to Section 2.5., "Control Registers" in Chapter 2, System Architecture Overview for a detailed description of the function and use of the TS flag.

6.4. TASK LINKING

The previous task link field of the TSS (sometimes called the "backlink") and the NT flag in the EFLAGS register are used to return execution to the previous task. The NT flag indicates whether the currently executing task is nested within the execution of another task, and the previous task link field of the current task's TSS holds the TSS selector for the higher-level task in the nesting hierarchy, if there is one (refer to Figure 6-7).

When a CALL instruction, an interrupt, or an exception causes a task switch, the processor copies the segment selector for the current TSS into the previous task link field of the TSS for the new task, and then sets the NT flag in the EFLAGS register. The NT flag indicates that the previous task link field of the TSS has been loaded with a saved TSS segment selector. If software uses an IRET instruction to suspend the new task, the processor uses the value in the previous task link field and the NT flag to return to the previous task; that is, if the NT flag is set, the processor performs a task switch to the task specified in the previous task link field. When a JMP instruction causes a task switch, the new task is not nested; that is, the NT flag is set to 0 and the previous task link field is not used. A JMP instruction is used to dispatch a new task when nesting is not desired.

Figure 6-7

Table 6-2 summarizes the uses of the busy flag (in the TSS segment descriptor), the NT flag, the previous task link field, and TS flag (in control register CR0) during a task switch. Note that the NT flag may be modified by software executing at any privilege level. It is possible for a program to set its NT flag and execute an IRET instruction, which would have the effect of invoking the task specified in the previous link field of the current task's TSS. To keep spurious task switches from succeeding, the operating system should initialize the previous task link field for every TSS it creates to 0.

6.4.1. Use of Busy Flag To Prevent Recursive Task Switching

A TSS allows only one context to be saved for a task; therefore, once a task is called (dispatched), a recursive (or re-entrant) call to the task would cause the current state of the task to be lost. The busy flag in the TSS segment descriptor is provided to prevent re-entrant task switching and subsequent loss of task state information. The processor manages the busy flag as follows:

1. When dispatching a task, the processor sets the busy flag of the new task.

2. If during a task switch, the current task is placed in a nested chain (the task switch is being generated by a CALL instruction, an interrupt, or an exception), the busy flag for the current task remains set.

3. When switching to the new task (initiated by a CALL instruction, interrupt, or exception), the processor generates a general-protection exception (#GP) if the busy flag of the new task is already set. (If the task switch is initiated with an IRET instruction, the exception is not raised because the processor expects the busy flag to be set.)

4. When a task is terminated by a jump to a new task (initiated with a JMP instruction in the task code) or by an IRET instruction in the task code, the processor clears the busy flag, returning the task to the "not busy" state.

In this manner the processor prevents recursive task switching by preventing a task from switching to itself or to any task in a nested chain of tasks. The chain of nested suspended tasks may grow to any length, due to multiple calls, interrupts, or exceptions. The busy flag prevents a task from being invoked if it is in this chain.

The busy flag may be used in multiprocessor configurations, because the processor follows a LOCK protocol (on the bus or in the cache) when it sets or clears the busy flag. This lock keeps two processors from invoking the same task at the same time. (Refer to Section 7.1.2.1., "Automatic Locking" in Chapter 7, Multiple-Processor Management for more information about setting the busy flag in a multiprocessor applications.)

6.4.2. Modifying Task Linkages

In a uniprocessor system, in situations where it is necessary to remove a task from a chain of linked tasks, use the following procedure to remove the task:

1. Disable interrupts.

2. Change the previous task link field in the TSS of the pre-empting task (the task that suspended the task to be removed). It is assumed that the pre-empting task is the next task (newer task) in the chain from the task to be removed. Change the previous task link field should to point to the TSS of the next oldest or to an even older task in the chain.

3. Clear the busy (B) flag in the TSS segment descriptor for the task being removed from the chain. If more than one task is being removed from the chain, the busy flag for each task being remove must be cleared.

4. Enable interrupts.

In a multiprocessing system, additional synchronization and serialization operations must be added to this procedure to insure that the TSS and its segment descriptor are both locked when the previous task link field is changed and the busy flag is cleared.

6.5. TASK ADDRESS SPACE

The address space for a task consists of the segments that the task can access. These segments include the code, data, stack, and system segments referenced in the TSS and any other segments accessed by the task code. These segments are mapped into the processor's linear address space, which is in turn mapped into the processor's physical address space (either directly or through paging).

The LDT segment field in the TSS can be used to give each task its own LDT. Giving a task its own LDT allows the task address space to be isolated from other tasks by placing the segment descriptors for all the segments associated with the task in the task's LDT.

It also is possible for several tasks to use the same LDT. This is a simple and memory-efficient way to allow some tasks to communicate with or control each other, without dropping the protection barriers for the entire system.

Because all tasks have access to the GDT, it also is possible to create shared segments accessed through segment descriptors in this table.

If paging is enabled, the CR3 register (PDBR) field in the TSS allows each task can also have its own set of page tables for mapping linear addresses to physical addresses. Or, several tasks can share the same set of page tables.

6.5.1. Mapping Tasks to the Linear and Physical Address Spaces

Tasks can be mapped to the linear address space and physical address space in either of two ways:

 One linear-to-physical address space mapping is shared among all tasks. When paging is not enabled, this is the only choice. Without paging, all linear addresses map to the same physical addresses. When paging is enabled, this form of linear-to-physical address space mapping is obtained by using one page directory for all tasks. The linear address space may exceed the available physical space if demand-paged virtual memory is supported.

 Each task has its own linear address space that is mapped to the physical address space. This form of mapping is accomplished by using a different page directory for each task. Because the PDBR (control register CR3) is loaded on each task switch, each task may have a different page directory.

The linear address spaces of different tasks may map to completely distinct physical addresses. If the entries of different page directories point to different page tables and the page tables point to different pages of physical memory, then the tasks do not share any physical addresses.

With either method of mapping task linear address spaces, the TSSs for all tasks must lie in a shared area of the physical space, which is accessible to all tasks. This mapping is required so that the mapping of TSS addresses does not change while the processor is reading and updating the TSSs during a task switch. The linear address space mapped by the GDT also should be mapped to a shared area of the physical space; otherwise, the purpose of the GDT is defeated. Figure 6-8 shows how the linear address spaces of two tasks can overlap in the physical space by sharing page tables.

Figure 6-8

6.5.2. Task Logical Address Space

To allow the sharing of data among tasks, use any of the following techniques to create shared logical-to-physical address-space mappings for data segments:

 Through the segment descriptors in the GDT. All tasks must have access to the segment descriptors in the GDT. If some segment descriptors in the GDT point to segments in the linear-address space that are mapped into an area of the physical-address space common to all tasks, then all tasks can share the data and code in those segments.

 Through a shared LDT. Two or more tasks can use the same LDT if the LDT fields in their TSSs point to the same LDT. If some segment descriptors in a shared LDT point to segments that are mapped to a common area of the physical address space, the data and code in those segments can be shared among the tasks that share the LDT. This method of sharing is more selective than sharing through the GDT, because the sharing can be limited to specific tasks. Other tasks in the system may have different LDTs that do not give them access to the shared segments.

 Through segment descriptors in distinct LDTs that are mapped to common addresses in the linear address space. If this common area of the linear address space is mapped to the same area of the physical address space for each task, these segment descriptors permit the tasks to share segments. Such segment descriptors are commonly called aliases. This method of sharing is even more selective than those listed above, because, other segment descriptors in the LDTs may point to independent linear addresses which are not shared.

6.6. 16-BIT TASK-STATE SEGMENT (TSS)

The 32-bit Intel Architecture processors also recognize a 16-bit TSS format like the one used in Intel 286 processors (refer to Figure 6-9). It is supported for compatibility with software written to run on these earlier Intel Architecture processors. The following additional information is important to know about the 16-bit TSS.

 Do not use a 16-bit TSS to implement a virtual-8086 task.

 The valid segment limit for a 16-bit TSS is 2CH.

 The 16-bit TSS does not contain a field for the base address of the page directory, which is loaded into control register CR3. Therefore, a separate set of page tables for each task is not supported for 16-bit tasks. If a 16-bit task is dispatched, the page-table structure for the previous task is used.

 The I/O base address is not included in the 16-bit TSS, so none of the functions of the I/O map are supported.

 When task state is saved in a 16-bit TSS, the upper 16 bits of the EFLAGS register and the EIP register are lost.

 When the general-purpose registers are loaded or saved from a 16-bit TSS, the upper 16 bits of the registers are modified and not maintained.

Figure 6-9

CHAPTER 8 PROCESSOR MANAGEMENT AND INITIALIZATION

This chapter describes the facilities provided for managing processor wide functions and for initializing the processor. The subjects covered include: processor initialization, FPU initialization, processor configuration, feature determination, mode switching, the MSRs (in the PentiumR and P6 family processors), and the MTRRs (in the P6 family processors).

8.1. INITIALIZATION OVERVIEW

Following power-up or an assertion of the RESET# pin, each processor on the system bus performs a hardware initialization of the processor (known as a hardware reset) and an optional built-in self-test (BIST). A hardware reset sets each processor's registers to a known state and places the processor in real-address mode. It also invalidates the internal caches, translation lookaside buffers (TLBs) and the branch target buffer (BTB). At this point, the action taken depends on the processor family:

 P6 family processors-All the processors on the system bus (including a single processor in a uniprocessor system) execute the multiple processor (MP) initialization protocol across the APIC bus. The processor that is selected through this protocol as the bootstrap processor (BSP) then immediately starts executing software-initialization code in the current code segment beginning at the offset in the EIP register. The application (non-BSP) processors (AP) go into a halt state while the BSP is executing initialization code. Refer to Section 7.7., "Multiple-Processor (MP) Initialization Protocol" in Chapter 7, Multiple- Processor Management for more details. Note that in a uniprocessor system, the single P6 family processor automatically becomes the BSP.

 PentiumR processors-In either a single- or dual- processor system, a single PentiumR processor is always pre-designated as the primary processor. Following a reset, the primary processor behaves as follows in both single- and dual-processor systems. Using the dualprocessor (DP) ready initialization protocol, the primary processor immediately starts executing software-initialization code in the current code segment beginning at the offset in the EIP register. The secondary processor (if there is one) goes into a halt state. (Refer to Section 7.6., "Dual-Processor (DP) Initialization Protocol" in Chapter 7, Multiple- Processor Management for more details.)

 Intel486T processor-The primary processor (or single processor in a uniprocessor system) immediately starts executing software-initialization code in the current code segment beginning at the offset in the EIP register. (The Intel486T does not automatically execute a DP or MP initialization protocol to determine which processor is the primary processor.)

The software-initialization code performs all system-specific initialization of the BSP or primary processor and the system logic.

At this point, for MP (or DP) systems, the BSP (or primary) processor wakes up each AP (or secondary) processor to enable those processors to execute self-configuration code.

When all processors are initialized, configured, and synchronized, the BSP or primary processor begins executing an initial operating-system or executive task.

The floating-point unit (FPU) is also initialized to a known state during hardware reset. FPU software initialization code can then be executed to perform operations such as setting the precision of the FPU and the exception masks. No special initialization of the FPU is required to switch operating modes.

Asserting the INIT# pin on the processor invokes a similar response to a hardware reset. The major difference is that during an INIT, the internal caches, MSRs, MTRRs, and FPU state are left unchanged (although, the TLBs and BTB are invalidated as with a hardware reset). An INIT provides a method for switching from protected to real-address mode while maintaining the contents of the internal caches.

8.1.1. Processor State After Reset

Table 8-1 shows the state of the flags and other registers following power-up for the PentiumR Pro, PentiumR, and Intel486T processors. The state of control register CR0 is 60000010H (refer to Figure 8-1), which places the processor is in real-address mode with paging disabled.

8.1.2. Processor Built-In Self-Test (BIST)

Hardware may request that the BIST be performed at power-up. The EAX register is cleared (0H) if the processor passes the BIST. A nonzero value in the EAX register after the BIST indicates that a processor fault was detected. If the BIST is not requested, the contents of the EAX register after a hardware reset is 0H.

The overhead for performing a BIST varies between processor families. For example, the BIST takes approximately 5.5 million processor clock periods to execute on the PentiumR Pro processor. (This clock count is model-specific, and Intel reserves the right to change the exact number of periods, for any of the Intel Architecture processors, without notification.)

Table 8-1. 32-Bit Intel Architecture Processor States Following Power-up, Reset, or INIT

Register P6 Family Processors Pentium Processor Intel486 Processor
EFLAGS1 00000002H 00000002H 00000002H
EIP 0000FFF0H 0000FFF0H 0000FFF0H
CR0 60000010H2 60000010H2 60000010H2
CR2, CR3, CR4 00000000H 00000000H 00000000H
MXCSR Pentium III processor only-Pwr up or Reset: 1F80H FINIT/FNINIT: Unchanged NA NA
CS Selector = F000H Base = FFFF0000H Limit = FFFFH AR = Present, R/W, Accessed Selector = F000H Base = FFFF0000H Limit = FFFFH AR = Present, R/W, Accessed Selector = F000H Base = FFFF0000H Limit = FFFFH AR = Present, R/W, Accessed
SS, DS, ES, FS, GS Selector = 0000H Base = 00000000H Limit = FFFFH AR = Present, R/W, Accessed Selector = 0000H Base = 00000000H Limit = FFFFH AR = Present, R/W, Accessed Selector = 0000H Base = 00000000H Limit = FFFFH AR = Present, R/W, Accessed
EDX 000006xxH 000005xxH 000004xxH
EAX 03 03 03
EBX, ECX, ESI, EDI, EBP, ESP 00000000H 00000000H 00000000H
MM0 through MM74 Pentium Pro processor - NA Pentium II and Pentium III processor -Pwr up or Reset: 0000000000000000H FINIT/FNINIT: Unchanged Pwr up or Reset: 0000000000000000H FINIT/FNINIT: Unchanged NA
XMM0 through XMM75 ST0 through ST74 FPU Control Word4 FPU Status Word4 FPU Tag Word4 FPU Data Operand and CS Seg. Selectors4 Pentium III processor only-Pwr up or Reset: 0000000000000000H FINIT/FNINIT: Unchanged Pwr up or Reset: +0.0 FINIT/FNINIT: Unchanged Pwr up or Reset: 0040H FINIT/FNINIT: 037FH Pwr up or Reset: 0000H FINIT/FNINIT: 0000H Pwr up or Reset: 5555H FINIT/FNINIT: FFFFH Pwr up or Reset: 0000H FINIT/FNINIT: 0000H NA Pwr up or Reset: +0.0 FINIT/FNINIT: Unchanged Pwr up or Reset: 0040H FINIT/FNINIT: 037FH Pwr up or Reset: 0000H FINIT/FNINIT: 0000H Pwr up or Reset: 5555H FINIT/FNINIT: FFFFH Pwr up or Reset: 0000H FINIT/FNINIT: 0000H NA Pwr up or Reset: +0.0 FINIT/FNINIT: Unchanged Pwr up or Reset: 0040H FINIT/FNINIT: 037FH Pwr up or Reset: 0000H FINIT/FNINIT: 0000H Pwr up or Reset: 5555H FINIT/FNINIT: FFFFH Pwr up or Reset: 0000H FINIT/FNINIT: 0000H
FPU Data Operand and Inst. Pointers4 GDTR,IDTR Register Pwr up or Reset: 00000000H FINIT/FNINIT: 00000000H Base = 00000000H Limit = FFFFH AR = Present, R/W P6 Family Processors Pwr up or Reset: 00000000H FINIT/FNINIT: 00000000H Base = 00000000H Limit = FFFFH AR = Present, R/W Pentium Processor Pwr up or Reset: 00000000H FINIT/FNINIT: 00000000H Base = 00000000H Limit = FFFFH AR = Present, R/W Intel486 Processor
LDTR, Task Register Selector = 0000H Base = 00000000H Limit = FFFFH AR = Present, R/W Selector = 0000H Base = 00000000H Limit = FFFFH AR = Present, R/W Selector = 0000H Base = 00000000H Limit = FFFFH AR = Present, R/W
DR0, DR1, DR2, DR3 00000000H 00000000H 00000000H
DR6 FFFF0FF0H FFFF0FF0H FFFF1FF0H
DR7 00000400H 00000400H 00000000H
Time-Stamp Counter Power up or Reset: 0H INIT: Unchanged Power up or Reset: 0H INIT: Unchanged Not Implemented
Perf. Counters and Event Select Power up or Reset: 0H INIT: Unchanged Power up or Reset: 0H INIT: Unchanged Not Implemented
All Other MSRs Pwr up or Reset: Undefined INIT: Unchanged Pwr up or Reset: Undefined INIT: Unchanged Not Implemented
Data and Code Cache, TLBs Invalid Invalid Invalid
Fixed MTRRs Pwr up or Reset: Disabled INIT: Unchanged Not Implemented Not Implemented
Variable MTRRs Pwr up or Reset: Disabled INIT: Unchanged Not Implemented Not Implemented
Machine-Check Architecture Pwr up or Reset: Undefined INIT: Unchanged Not Implemented Not Implemented
APIC Pwr up or Reset: Enabled INIT: Unchanged Pwr up or Reset: Enabled INIT: Unchanged Not Implemented

NOTES:

1. The 10 most-significant bits of the EFLAGS register are undefined following a reset. Software should not depend on the states of any of these bits.

2. The CD and NW flags are unchanged, bit 4 is set to 1, all other bits are cleared.

3. If Built-In Self-Test (BIST) is invoked on power up or reset, EAX is 0 only if all tests passed. (BIST cannot be invoked during an INIT.)

4. The state of the FPU state and MMXT registers is not changed by the execution of an INIT.

5. Available in the PentiumR III processor and PentiumR III XeonT processor only. The state of the SIMD floating-point registers is not changed by the execution of an INIT.

Figure 8-1

8.1.3. Model and Stepping Information

Following a hardware reset, the EDX register contains component identification and revision information (refer to Figure 8-2). The device ID field is set to the value 6H, 5H, 4H, or 3H to indicate a PentiumR Pro, PentiumR, Intel486T, or Intel386T processor, respectively. Different values may be returned for the various members of these Intel Architecture families. For example the Intel386T SX processor returns 23H in the device ID field. Binary object code can be made compatible with other Intel processors by using this number to select the correct initialization software.

Figure 8-2

The stepping ID field contains a unique identifier for the processor's stepping ID or revision level. The upper word of EDX is reserved following reset.

8.1.4. First Instruction Executed

The first instruction that is fetched and executed following a hardware reset is located at physical address FFFFFFF0H. This address is 16 bytes below the processor's uppermost physical address. The EPROM containing the software-initialization code must be located at this address.

The address FFFFFFF0H is beyond the 1-MByte addressable range of the processor while in real-address mode. The processor is initialized to this starting address as follows. The CS register has two parts: the visible segment selector part and the hidden base address part. In realaddress mode, the base address is normally formed by shifting the 16-bit segment selector value 4 bits to the left to produce a 20-bit base address. However, during a hardware reset, the segment selector in the CS register is loaded with F000H and the base address is loaded with FFFF0000H. The starting address is thus formed by adding the base address to the value in the EIP register (that is, FFFF0000 + FFF0H = FFFFFFF0H).

The first time the CS register is loaded with a new value after a hardware reset, the processor will follow the normal rule for address translation in real-address mode (that is, [CS base address = CS segment selector * 16]). To insure that the base address in the CS register remains unchanged until the EPROM based software-initialization code is completed, the code must not contain a far jump or far call or allow an interrupt to occur (which would cause the CS selector value to be changed).

8.2. FPU INITIALIZATION

Software-initialization code can determine the whether the processor contains or is attached to an FPU by using the CPUID instruction. The code must then initialize the FPU and set flags in control register CR0 to reflect the state of the FPU environment.

A hardware reset places the PentiumR processor FPU in the state shown in Table 8-1. This state is different from the state the processor is placed in when executing an FINIT or FNINIT instruction (also shown in Table 8-1). If the FPU is to be used, the software-initialization code should execute an FINIT/FNINIT instruction following a hardware reset. These instructions, tag all data registers as empty, clear all the exception masks, set the TOP-of-stack value to 0, and select the default rounding and precision controls setting (round to nearest and 64-bit precision).

If the processor is reset by asserting the INIT# pin, the FPU state is not changed.

8.2.1. Configuring the FPU Environment

Initialization code must load the appropriate values into the MP, EM, and NE flags of control register CR0. These bits are cleared on hardware reset of the processor. Figure 8-2 shows the suggested settings for these flags, depending on the Intel Architecture processor being initial8- ized. Initialization code can test for the type of processor present before setting or clearing these flags.

NOTE:

* The setting of the NE flag depends on the operating system being used.

The EM flag determines whether floating-point instructions are executed by the FPU (EM is cleared) or generate a device-not-available exception (#NM) so that an exception handler can emulate the floating-point operation (EM = 1). Ordinarily, the EM flag is cleared when an FPU or math coprocessor is present and set if they are not present. If the EM flag is set and no FPU, math coprocessor, or floating-point emulator is present, the system will hang when a floatingpoint instruction is executed.

The MP flag determines whether WAIT/FWAIT instructions react to the setting of the TS flag. If the MP flag is clear, WAIT/FWAIT instructions ignore the setting of the TS flag; if the MP flag is set, they will generate a device-not-available exception (#NM) if the TS flag is set. Generally, the MP flag should be set for processors with an integrated FPU and clear for processors without an integrated FPU and without a math coprocessor present. However, an operating system can choose to save the floating-point context at every context switch, in which case there would be no need to set the MP bit.

Table 2-1 in Chapter 2, System Architecture Overview shows the actions taken for floating-point and WAIT/FWAIT instructions based on the settings of the EM, MP, and TS flags.

The NE flag determines whether unmasked floating-point exceptions are handled by generating a floating-point error exception internally (NE is set, native mode) or through an external interrupt (NE is cleared). In systems where an external interrupt controller is used to invoke numeric exception handlers (such as MS-DOS-based systems), the NE bit should be cleared.

8.2.2. Setting the Processor for FPU Software Emulation

Setting the EM flag causes the processor to generate a device-not-available exception (#NM) and trap to a software exception handler whenever it encounters a floating-point instruction. (Table 8-2 shows when it is appropriate to use this flag.) Setting this flag has two functions:

 It allows floating-point code to run on an Intel processor that neither has an integrated FPU nor is connected to an external math coprocessor, by using a floating-point emulator.

 It allows floating-point code to be executed using a special or nonstandard floating-point emulator, selected for a particular application, regardless of whether an FPU or math coprocessor is present.

To emulate floating-point instructions, the EM, MP, and NE flag in control register CR0 should be set as shown in Table 8-3.

Regardless of the value of the EM bit, the Intel486T SX processor generates a device-not-available exception (#NM) upon encountering any floating-point instruction.

8.3. CACHE ENABLING

The Intel Architecture processors (beginning with the Intel486T processor) contain internal instruction and data caches. These caches are enabled by clearing the CD and NW flags in control register CR0. (They are set during a hardware reset.) Because all internal cache lines are invalid following reset initialization, it is not necessary to invalidate the cache before enabling caching. Any external caches may require initialization and invalidation using a system-specific initialization and invalidation code sequence.

Depending on the hardware and operating system or executive requirements, additional configuration of the processor's caching facilities will probably be required. Beginning with the Intel486T processor, page-level caching can be controlled with the PCD and PWT flags in page-directory and page-table entries. For P6 family processors, the memory type range registers (MTRRs) control the caching characteristics of the regions of physical memory. (For the Intel486T and PentiumR processors, external hardware can be used to control the caching characteristics of regions of physical memory.) Refer to Chapter 9, Memory Cache Control, for detailed information on configuration of the caching facilities in the P6 family processors and system memory.

8.4. MODEL-SPECIFIC REGISTERS (MSRS)

The P6 family processors and PentiumR processors contain model-specific registers (MSRs). These registers are by definition implementation specific; that is, they are not guaranteed to be supported on future Intel Architecture processors and/or to have the same functions. The MSRs are provided to control a variety of hardware- and software-related features, including:

 The performance-monitoring counters (refer to Section 15.6., "Performance-Monitoring Counters", in Chapter 15, Debugging and Performance Monitoring).

 (P6 family processors only.) Debug extensions (refer to Section 15.4., "Last Branch, Interrupt, and Exception Recording", in Chapter 15, Debugging and Performance Monitoring).

 (P6 family processors only.) The machine-check exception capability and its accompanying machine-check architecture (refer to Chapter 13, Machine-Check Architecture).

 (P6 family processors only.) The MTRRs (refer to Section 9.12., "Memory Type Range Registers (MTRRs)", in Chapter 9, Memory Cache Control).

The MSRs can be read and written to using the RDMSR and WRMSR instructions, respectively.

When performing software initialization of a PentiumR Pro or PentiumR processor, many of the MSRs will need to be initialized to set up things like performance-monitoring events, run-time machine checks, and memory types for physical memory.

Systems configured to implement FRC mode must write all of the processors' internal MSRs to deterministic values before performing either a read or read-modify-write operation using these registers. The following is a list of MSRs that are not initialized by the processors' reset sequences.

 All fixed and variable MTRRs.

 All Machine Check Architecture (MCA) status registers.

 Microcode update signature register.

 All L2 cache initialization MSRs.

The list of available performance-monitoring counters for the PentiumR Pro and PentiumR processors is given in Appendix A, Performance-Monitoring Events, and the list of available MSRs for the PentiumR Pro processor is given in Appendix B, Model-Specific Registers. The references earlier in this section show where the functions of the various groups of MSRs are described in this manual.

8.5. MEMORY TYPE RANGE REGISTERS (MTRRS)

Memory type range registers (MTRRs) were introduced into the Intel Architecture with the PentiumR Pro processor. They allow the type of caching (or no caching) to be specified in system memory for selected physical address ranges. They allow memory accesses to be optimized for various types of memory such as RAM, ROM, frame buffer memory, and memory-mapped I/O devices.

In general, initializing the MTRRs is normally handled by the software initialization code or BIOS and is not an operating system or executive function. At the very least, all the MTRRs must be cleared to 0, which selects the uncached (UC) memory type. Refer to Section 9.12., "Memory Type Range Registers (MTRRs)", in Chapter 9, Memory Cache Control, for detailed information on the MTRRs.

8.6. SOFTWARE INITIALIZATION FOR REAL-ADDRESS MODE OPERATION

Following a hardware reset (either through a power-up or the assertion of the RESET# pin) the processor is placed in real-address mode and begins executing software initialization code from physical address FFFFFFF0H. Software initialization code must first set up the necessary data structures for handling basic system functions, such as a real-mode IDT for handling interrupts and exceptions. If the processor is to remain in real-address mode, software must then load additional operating-system or executive code modules and data structures to allow reliable execution of application programs in real-address mode.

If the processor is going to operate in protected mode, software must load the necessary data structures to operate in protected mode and then switch to protected mode. The protected-mode data structures that must be loaded are described in Section 8.7., "Software Initialization for Protected-Mode Operation".

8.6.1. Real-Address Mode IDT

In real-address mode, the only system data structure that must be loaded into memory is the IDT (also called the "interrupt vector table"). By default, the address of the base of the IDT is physical address 0H. This address can be changed by using the LIDT instruction to change the base address value in the IDTR. Software initialization code needs to load interrupt- and exceptionhandler pointers into the IDT before interrupts can be enabled. The actual interrupt- and exception-handler code can be contained either in EPROM or RAM; however, the code must be located within the 1-MByte addressable range of the processor in real-address mode. If the handler code is to be stored in RAM, it must be loaded along with the IDT.

8.6.2. NMI Interrupt Handling

The NMI interrupt is always enabled (except when multiple NMIs are nested). If the IDT and the NMI interrupt handler need to be loaded into RAM, there will be a period of time following hardware reset when an NMI interrupt cannot be handled. During this time, hardware must provide a mechanism to prevent an NMI interrupt from halting code execution until the IDT and the necessary NMI handler software is loaded. Here are two examples of how NMIs can be handled during the initial states of processor initialization:

 A simple IDT and NMI interrupt handler can be provided in EPROM. This allows an NMI interrupt to be handled immediately after reset initialization.

 The system hardware can provide a mechanism to enable and disable NMIs by passing the NMI# signal through an AND gate controlled by a flag in an I/O port. Hardware can clear the flag when the processor is reset, and software can set the flag when it is ready to handle NMI interrupts.

8.7. SOFTWARE INITIALIZATION FOR PROTECTED-MODE OPERATION

The processor is placed in real-address mode following a hardware reset. At this point in the initialization process, some basic data structures and code modules must be loaded into physical memory to support further initialization of the processor, as described in Section 8.6., "Software Initialization for Real-Address Mode Operation". Before the processor can be switched to protected mode, the software initialization code must load a minimum number of protected mode data structures and code modules into memory to support reliable operation of the processor in protected mode. These data structures include the following:

 A protected-mode IDT.

 A GDT.

 A TSS.

 (Optional.) An LDT.

 If paging is to be used, at least one page directory and one page table.

 A code segment that contains the code to be executed when the processor switches to protected mode.

 One or more code modules that contain the necessary interrupt and exception handlers. Software initialization code must also initialize the following system registers before the processor can be switched to protected mode:

 The GDTR.

 (Optional.) The IDTR. This register can also be initialized immediately after switching to protected mode, prior to enabling interrupts.

 Control registers CR1 through CR4.

 (PentiumR Pro processor only.) The memory type range registers (MTRRs). With these data structures, code modules, and system registers initialized, the processor can be switched to protected mode by loading control register CR0 with a value that sets the PE flag (bit 0).

8.7.1. Protected-Mode System Data Structures

The contents of the protected-mode system data structures loaded into memory during software initialization, depend largely on the type of memory management the protected-mode operatingsystem or executive is going to support: flat, flat with paging, segmented, or segmented with paging.

To implement a flat memory model without paging, software initialization code must at a minimum load a GDT with one code and one data-segment descriptor. A null descriptor in the first GDT entry is also required. The stack can be placed in a normal read/write data segment, so no dedicated descriptor for the stack is required. A flat memory model with paging also requires a page directory and at least one page table (unless all pages are 4 MBytes in which case only a page directory is required). Refer to Section 8.7.3., "Initializing Paging"

Before the GDT can be used, the base address and limit for the GDT must be loaded into the GDTR register using an LGDT instruction.

A multisegmented model may require additional segments for the operating system, as well as segments and LDTs for each application program. LDTs require segment descriptors in the GDT. Some operating systems allocate new segments and LDTs as they are needed. This provides maximum flexibility for handling a dynamic programming environment. However, many operating systems use a single LDT for all tasks, allocating GDT entries in advance. An embedded system, such as a process controller, might pre-allocate a fixed number of segments and LDTs for a fixed number of application programs. This would be a simple and efficient way to structure the software environment of a real-time system.

8.7.2. Initializing Protected-Mode Exceptions and Interrupts

Software initialization code must at a minimum load a protected-mode IDT with gate descriptor for each exception vector that the processor can generate. If interrupt or trap gates are used, the gate descriptors can all point to the same code segment, which contains the necessary exception handlers. If task gates are used, one TSS and accompanying code, data, and task segments are required for each exception handler called with a task gate.

If hardware allows interrupts to be generated, gate descriptors must be provided in the IDT for one or more interrupt handlers.

Before the IDT can be used, the base address and limit for the IDT must be loaded into the IDTR register using an LIDT instruction. This operation is typically carried out immediately after switching to protected mode.

8.7.3. Initializing Paging

Paging is controlled by the PG flag in control register CR0. When this flag is clear (its state following a hardware reset), the paging mechanism is turned off; when it is set, paging is enabled. Before setting the PG flag, the following data structures and registers must be initialized:

 Software must load at least one page directory and one page table into physical memory. The page table can be eliminated if the page directory contains a directory entry pointing to itself (here, the page directory and page table reside in the same page), or if only 4-MByte pages are used.

 Control register CR3 (also called the PDBR register) is loaded with the physical base address of the page directory.

 (Optional) Software may provide one set of code and data descriptors in the GDT or in an LDT for supervisor mode and another set for user mode. With this paging initialization complete, paging is enabled and the processor is switched to protected mode at the same time by loading control register CR0 with an image in which the PG and PE flags are set. (Paging cannot be enabled before the processor is switched to protected mode.)

8.7.4. Initializing Multitasking

If the multitasking mechanism is not going to be used and changes between privilege levels are not allowed, it is not necessary load a TSS into memory or to initialize the task register.

If the multitasking mechanism is going to be used and/or changes between privilege levels are allowed, software initialization code must load at least one TSS and an accompanying TSS descriptor. (A TSS is required to change privilege levels because pointers to the privileged-level 0, 1, and 2 stack segments and the stack pointers for these stacks are obtained from the TSS.) TSS descriptors must not be marked as busy when they are created; they should be marked busy by the processor only as a side-effect of performing a task switch. As with descriptors for LDTs, TSS descriptors reside in the GDT.

After the processor has switched to protected mode, the LTR instruction can be used to load a segment selector for a TSS descriptor into the task register. This instruction marks the TSS descriptor as busy, but does not perform a task switch. The processor can, however, use the TSS to locate pointers to privilege-level 0, 1, and 2 stacks. The segment selector for the TSS must be loaded before software performs its first task switch in protected mode, because a task switch copies the current task state into the TSS.

After the LTR instruction has been executed, further operations on the task register are performed by task switching. As with other segments and LDTs, TSSs and TSS descriptors can be either pre-allocated or allocated as needed.

8.8. MODE SWITCHING

To use the processor in protected mode, a mode switch must be performed from real-address mode. Once in protected mode, software generally does not need to return to real-address mode. To run software written to run in real-address mode (8086 mode), it is generally more convenient to run the software in virtual-8086 mode, than to switch back to real-address mode.

8.8.1. Switching to Protected Mode

Before switching to protected mode, a minimum set of system data structures and code modules must be loaded into memory, as described in Section 8.7., "Software Initialization for Protected- Mode Operation". Once these tables are created, software initialization code can switch into protected mode.

Protected mode is entered by executing a MOV CR0 instruction that sets the PE flag in the CR0 register. (In the same instruction, the PG flag in register CR0 can be set to enable paging.) Execution in protected mode begins with a CPL of 0.

The 32-bit Intel Architecture processors have slightly different requirements for switching to protected mode. To insure upwards and downwards code compatibility with all 32-bit Intel Architecture processors, it is recommended that the following steps be performed:

1. Disable interrupts. A CLI instruction disables maskable hardware interrupts. NMI interrupts can be disabled with external circuitry. (Software must guarantee that no exceptions or interrupts are generated during the mode switching operation.)

2. Execute the LGDT instruction to load the GDTR register with the base address of the GDT.

3. Execute a MOV CR0 instruction that sets the PE flag (and optionally the PG flag) in control register CR0.

4. Immediately following the MOV CR0 instruction, execute a far JMP or far CALL instruction. (This operation is typically a far jump or call to the next instruction in the instruction stream.)

The JMP or CALL instruction immediately after the MOV CR0 instruction changes the flow of execution and serializes the processor.

If paging is enabled, the code for the MOV CR0 instruction and the JMP or CALL instruction must come from a page that is identity mapped (that is, the linear address before the jump is the same as the physical address after paging and protected mode is enabled). The target instruction for the JMP or CALL instruction does not need to be identity mapped.

5. If a local descriptor table is going to be used, execute the LLDT instruction to load the segment selector for the LDT in the LDTR register.

6. Execute the LTR instruction to load the task register with a segment selector to the initial protected-mode task or to a writable area of memory that can be used to store TSS information on a task switch.

7. After entering protected mode, the segment registers continue to hold the contents they had in real-address mode. The JMP or CALL instruction in step 4 resets the CS register. Perform one of the following operations to update the contents of the remaining segment registers.

- Reload segment registers DS, SS, ES, FS, and GS. If the ES, FS, and/or GS registers are not going to be used, load them with a null selector.

- Perform a JMP or CALL instruction to a new task, which automatically resets the values of the segment registers and branches to a new code segment.

8. Execute the LIDT instruction to load the IDTR register with the address and limit of the protected-mode IDT.

9. Execute the STI instruction to enable maskable hardware interrupts and perform the necessary hardware operation to enable NMI interrupts.

Random failures can occur if other instructions exist between steps 3 and 4 above. Failures will be readily seen in some situations, such as when instructions that reference memory are inserted between steps 3 and 4 while in System Management mode.

8.8.2. Switching Back to Real-Address Mode

The processor switches back to real-address mode if software clears the PE bit in the CR0 register with a MOV CR0 instruction. A procedure that re-enters real-address mode should perform the following steps:

1. Disable interrupts. A CLI instruction disables maskable hardware interrupts. NMI interrupts can be disabled with external circuitry.

2. If paging is enabled, perform the following operations:

- Transfer program control to linear addresses that are identity mapped to physical addresses (that is, linear addresses equal physical addresses).

- Insure that the GDT and IDT are in identity mapped pages.

- Clear the PG bit in the CR0 register.

- Move 0H into the CR3 register to flush the TLB.

3. Transfer program control to a readable segment that has a limit of 64 KBytes (FFFFH). This operation loads the CS register with the segment limit required in real-address mode.

4. Load segment registers SS, DS, ES, FS, and GS with a selector for a descriptor containing the following values, which are appropriate for real-address mode:

- Limit = 64 KBytes (0FFFFH)

- Byte granular (G = 0)

- Expand up (E = 0)

- Writable (W = 1)

- Present (P = 1)

- Base = any value

The segment registers must be loaded with nonnull segment selectors or the segment registers will be unusable in real-address mode. Note that if the segment registers are not reloaded, execution continues using the descriptor attributes loaded during protected mode.

5. Execute an LIDT instruction to point to a real-address mode interrupt table that is within the 1-MByte real-address mode address range.

6. Clear the PE flag in the CR0 register to switch to real-address mode.

7. Execute a far JMP instruction to jump to a real-address mode program. This operation flushes the instruction queue and loads the appropriate base and access rights values in the CS register.

8. Load the SS, DS, ES, FS, and GS registers as needed by the real-address mode code. If any of the registers are not going to be used in real-address mode, write 0s to them.

9. Execute the STI instruction to enable maskable hardware interrupts and perform the necessary hardware operation to enable NMI interrupts.

NOTE

All the code that is executed in steps 1 through 9 must be in a single page and the linear addresses in that page must be identity mapped to physical addresses.

8.9. INITIALIZATION AND MODE SWITCHING EXAMPLE

This section provides an initialization and mode switching example that can be incorporated into an application. This code was originally written to initialize the Intel386T processor, but it will execute successfully on the PentiumR Pro, PentiumR, and Intel486T processors. The code in this example is intended to reside in EPROM and to run following a hardware reset of the processor. The function of the code is to do the following:

 Establish a basic real-address mode operating environment.

 Load the necessary protected-mode system data structures into RAM.

 Load the system registers with the necessary pointers to the data structures and the appropriate flag settings for protected-mode operation.

 Switch the processor to protected mode.

Figure 8-3 shows the physical memory layout for the processor following a hardware reset and the starting point of this example. The EPROM that contains the initialization code resides at the upper end of the processor's physical memory address range, starting at address FFFFFFFFH and going down from there. The address of the first instruction to be executed is at FFFFFFF0H, the default starting address for the processor following a hardware reset.

The main steps carried out in this example are summarized in Table 8-4. The source listing for the example (with the filename STARTUP.ASM) is given in Example 8-1. The line numbers given in Table 8-4 refer to the source listing. The following are some additional notes concerning this example:

 When the processor is switched into protected mode, the original code segment baseaddress value of FFFF0000H (located in the hidden part of the CS register) is retained and execution continues from the current offset in the EIP register. The processor will thus continue to execute code in the EPROM until a far jump or call is made to a new code segment, at which time, the base address in the CS register will be changed.

 Maskable hardware interrupts are disabled after a hardware reset and should remain disabled until the necessary interrupt handlers have been installed. The NMI interrupt is not disabled following a reset. The NMI# pin must thus be inhibited from being asserted until an NMI handler has been loaded and made available to the processor.

 The use of a temporary GDT allows simple transfer of tables from the EPROM to anywhere in the RAM area. A GDT entry is constructed with its base pointing to address 0 and a limit of 4 GBytes. When the DS and ES registers are loaded with this descriptor, the temporary GDT is no longer needed and can be replaced by the application GDT.

 This code loads one TSS and no LDTs. If more TSSs exist in the application, they must be loaded into RAM. If there are LDTs they may be loaded as well.

Figure 8-3

Table 8-4. Main Initialization Steps in STARTUP.ASM Source Listing

STARTUP.ASM Line Numbers Description
From To
157 157 Jump (short) to the entry code in the EPROM
162 169 Construct a temporary GDT in RAM with one entry: 0 - null 1 - R/W data segment, base = 0, limit = 4 GBytes
171 172 Load the GDTR to point to the temporary GDT
174 177 Load CR0 with PE flag set to switch to protected mode
179 181 Jump near to clear real mode instruction queue
184 186 Load DS, ES registers with GDT[1] descriptor, so both point to the entire physical memory space
188 195 Perform specific board initialization that is imposed by the new protected mode
196 218 Copy the applications GDT from ROM into RAM
220 238 Copy the applications IDT from ROM into RAM
241 243 Load applications GDTR
244 245 Load applications IDTR
247 261 Copy the applications TSS from ROM into RAM
263 267 Update TSS descriptor and other aliases in GDT (GDT alias or IDT alias)
277 277 Load the task register (without task switch) using LTR instruction
282 286 Load SS, ESP with the value found in the applications TSS
287 287 Push EFLAGS value found in the applications TSS
288 288 Push CS value found in the applications TSS
289 289 Push EIP value found in the applications TSS
290 293 Load DS, ES with the value found in the applications TSS
296 296 Perform IRET; pop the above values and enter the application code

8.9.1. Assembler Usage

In this example, the Intel assembler ASM386 and build tools BLD386 are used to assemble and build the initialization code module. The following assumptions are used when using the Intel ASM386 and BLD386 tools.

 The ASM386 will generate the right operand size opcodes according to the code-segment attribute. The attribute is assigned either by the ASM386 invocation controls or in the code-segment definition.

 If a code segment that is going to run in real-address mode is defined, it must be set to a USE 16 attribute. If a 32-bit operand is used in an instruction in this code segment (for example, MOV EAX, EBX), the assembler automatically generates an operand prefix for the instruction that forces the processor to execute a 32-bit operation, even though its default code-segment attribute is 16-bit.

 Intel's ASM386 assembler allows specific use of the 16- or 32-bit instructions, for example, LGDTW, LGDTD, IRETD. If the generic instruction LGDT is used, the defaultsegment attribute will be used to generate the right opcode.

8.9.2. STARTUP.ASM Listing

The source code listing to move the processor into protected mode is provided in Example 8-1. This listing does not include any opcode and offset information.

Example 8-1. STARTUP.ASM

MS-DOS* 5.0(045-N) 386(TM) MACRO ASSEMBLER STARTUP 09:44:51 08/19/92 PAGE 1

MS-DOS 5.0(045-N) 386(TM) MACRO ASSEMBLER V4.0, ASSEMBLY OF MODULE STARTUP

OBJECT MODULE PLACED IN startup.obj

ASSEMBLER INVOKED BY: f:\386tools\ASM386.EXE startup.a58 pw (132 )

 LINE SOURCE
 1 NAME STARTUP
 2
 3 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 4 ;
 5 ; ASSUMPTIONS:
 6 ;
 7 ; 1. The bottom 64K of memory is ram, and can be used for
 8 ; scratch space by this module.
 9 ;
 10 ; 2. The system has sufficient free usable ram to copy the
 11 ; initial GDT, IDT, and TSS
 8-20
 PROCESSOR MANAGEMENT AND INITIALIZATION
 12 ;
 13 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 14
 15 ; configuration data - must match with build definition
 16
 17 CS_BASE EQU 0FFFF0000H
 18
 19 ; CS_BASE is the linear address of the segment STARTUP_CODE
 20 ; - this is specified in the build language file
 21
 22 RAM_START EQU 400H
 23
 24 ; RAM_START is the start of free, usable ram in the linear
 25 ; memory space. The GDT, IDT, and initial TSS will be
 26 ; copied above this space, and a small data segment will be
 27 ; discarded at this linear address. The 32-bit word at
 28 ; RAM_START will contain the linear address of the first
 29 ; free byte above the copied tables - this may be useful if
 30 ; a memory manager is used.
 31
 32 TSS_INDEX EQU 10
 33
 34 ; TSS_INDEX is the index of the TSS of the first task to
 35 ; run after startup
 36
 37
 38 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 39
 40 ; ------------------------- STRUCTURES and EQU ---------------
 41 ; structures for system data
 42
 43 ; TSS structure
 44 TASK_STATE STRUC
 45 link DW ?
 46 link_h DW ?
 47 ESP0 DD ?
 48 SS0 DW ?
 49 SS0_h DW ?
 50 ESP1 DD ?
 51 SS1 DW ?
 52 SS1_h DW ?
 53 ESP2 DD ?
 54 SS2 DW ?
 55 SS2_h DW ?
 56 CR3_reg DD ?
 57 EIP_reg DD ?
 58 EFLAGS_reg DD ?
 59 EAX_reg DD ?
 60 ECX_reg DD ?
 61 EDX_reg DD ?
 62 EBX_reg DD ?
 63 ESP_reg DD ?
 64 EBP_reg DD ?
 65 ESI_reg DD ?
 66 EDI_reg DD ?
 67 ES_reg DW ?
 68 ES_h DW ?
 69 CS_reg DW ?
 70 CS_h DW ?
 71 SS_reg DW ?
 72 SS_h DW ?
 73 DS_reg DW ?
 74 DS_h DW ?
 75 FS_reg DW ?
 76 FS_h DW ?
 77 GS_reg DW ?
 78 GS_h DW ?
 79 LDT_reg DW ?
 80 LDT_h DW ?
 81 TRAP_reg DW ?
 82 IO_map_base DW ?
 83 TASK_STATE ENDS
 84
 85 ; basic structure of a descriptor
 86 DESC STRUC
 87 lim_0_15 DW ?
 88 bas_0_15 DW ?
 89 bas_16_23 DB ?
 90 access DB ?
 91 gran DB ?
 92 bas_24_31 DB ?
 93 DESC ENDS
 94
 95 ; structure for use with LGDT and LIDT instructions
 96 TABLE_REG STRUC
 97 table_lim DW ?
 98 table_linear DD ?
 99 TABLE_REG ENDS
 100
 101 ; offset of GDT and IDT descriptors in builder generated GDT
 102 GDT_DESC_OFF EQU 1*SIZE(DESC)
 103 IDT_DESC_OFF EQU 2*SIZE(DESC)
 104
 105 ; equates for building temporary GDT in RAM
 106 LINEAR_SEL EQU 1*SIZE (DESC)
 107 LINEAR_PROTO_LO EQU 00000FFFFH ; LINEAR_ALIAS
 108 LINEAR_PROTO_HI EQU 000CF9200H
 109
 110 ; Protection Enable Bit in CR0
 111 PE_BIT EQU 1B
 112
 113 ; ------------------------------------------------------------
 114
 115 ; ------------------------- DATA SEGMENT----------------------
 116
 117 ; Initially, this data segment starts at linear 0, according
 118 ; to the processor's power-up state.
 119
 120 STARTUP_DATA SEGMENT RW
 121
 122 free_mem_linear_base LABEL DWORD
 123 TEMP_GDT LABEL BYTE ; must be first in segment
 124 TEMP_GDT_NULL_DESC DESC <>
 125 TEMP_GDT_LINEAR_DESC DESC <>
 126
 127 ; scratch areas for LGDT and LIDT instructions
 128 TEMP_GDT_SCRATCH TABLE_REG <>
 129 APP_GDT_RAM TABLE_REG <>
 130 APP_IDT_RAM TABLE_REG <>
 131 ; align end_data
 132 fill DW ?
 133
 134 ; last thing in this segment - should be on a dword boundary
 135 end_data LABEL BYTE
 136
 137 STARTUP_DATA ENDS
 138 ; ------------------------------------------------------------
 139
 140
 141 ; ------------------------- CODE SEGMENT----------------------
 142 STARTUP_CODE SEGMENT ER PUBLIC USE16
 143
 144 ; filled in by builder
 145 PUBLIC GDT_EPROM
 146 GDT_EPROM TABLE_REG <>
 147
 148 ; filled in by builder
 149 PUBLIC IDT_EPROM
 150 IDT_EPROM TABLE_REG <>
 151
 152 ; entry point into startup code - the bootstrap will vector
 153 ; here with a near JMP generated by the builder. This
 154 ; label must be in the top 64K of linear memory.
 155
 156 PUBLIC STARTUP
 157 STARTUP:
 158
 159 ; DS,ES address the bottom 64K of flat linear memory
 160 ASSUME DS:STARTUP_DATA, ES:STARTUP_DATA
 161 ; See Figure 8-4
 162 ; load GDTR with temporary GDT
 163 LEA EBX,TEMP_GDT ; build the TEMP_GDT in low ram,
 164 MOV DWORD PTR [EBX],0 ; where we can address
 165 MOV DWORD PTR [EBX]+4,0
 166 MOV DWORD PTR [EBX]+8, LINEAR_PROTO_LO
 167 MOV DWORD PTR [EBX]+12, LINEAR_PROTO_HI
 168 MOV TEMP_GDT_scratch.table_linear,EBX
 169 MOV TEMP_GDT_scratch.table_lim,15
 170
 171 DB 66H ; execute a 32 bit LGDT
 172 LGDT TEMP_GDT_scratch
 173
 174 ; enter protected mode
 175 MOV EBX,CR0
 176 OR EBX,PE_BIT
 177 MOV CR0,EBX
 178
 179 ; clear prefetch queue
 180 JMP CLEAR_LABEL
 181 CLEAR_LABEL:
 182
 183 ; make DS and ES address 4G of linear memory
 184 MOV CX,LINEAR_SEL
 185 MOV DS,CX
 186 MOV ES,CX
 187
 188 ; do board specific initialization
 189 ;
 190 ;
 191 ; ......
 192 ;
 193
 194
 195 ; See Figure 8-5
 196 ; copy EPROM GDT to ram at:
 197 ; RAM_START + size (STARTUP_DATA)
 198 MOV EAX,RAM_START
 199 ADD EAX,OFFSET (end_data)
 200 MOV EBX,RAM_START
 201 MOV ECX, CS_BASE
 202 ADD ECX, OFFSET (GDT_EPROM)
 203 MOV ESI, [ECX].table_linear
 204 MOV EDI,EAX
 205 MOVZX ECX, [ECX].table_lim
 206 MOV APP_GDT_ram[EBX].table_lim,CX
 207 INC ECX
 208 MOV EDX,EAX
 209 MOV APP_GDT_ram[EBX].table_linear,EAX
 210 ADD EAX,ECX
 211 REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:[ESI]
 212
 213 ; fixup GDT base in descriptor
 214 MOV ECX,EDX
 215 MOV [EDX].bas_0_15+GDT_DESC_OFF,CX
 216 ROR ECX,16
 217 MOV [EDX].bas_16_23+GDT_DESC_OFF,CL
 218 MOV [EDX].bas_24_31+GDT_DESC_OFF,CH
 219
 220 ; copy EPROM IDT to ram at:
 221 ; RAM_START+size(STARTUP_DATA)+SIZE (EPROM GDT)
 222 MOV ECX, CS_BASE
 223 ADD ECX, OFFSET (IDT_EPROM)
 224 MOV ESI, [ECX].table_linear
 225 MOV EDI,EAX
 226 MOVZX ECX, [ECX].table_lim
 227 MOV APP_IDT_ram[EBX].table_lim,CX
 228 INC ECX
 229 MOV APP_IDT_ram[EBX].table_linear,EAX
 230 MOV EBX,EAX
 231 ADD EAX,ECX
 232 REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:[ESI]
 233
 234 ; fixup IDT pointer in GDT
 235 MOV [EDX].bas_0_15+IDT_DESC_OFF,BX
 236 ROR EBX,16
 237 MOV [EDX].bas_16_23+IDT_DESC_OFF,BL
 238 MOV [EDX].bas_24_31+IDT_DESC_OFF,BH
 239
 240 ; load GDTR and IDTR
 241 MOV EBX,RAM_START
 242 DB 66H ; execute a 32 bit LGDT
 243 LGDT APP_GDT_ram[EBX]
 244 DB 66H ; execute a 32 bit LIDT
 245 LIDT APP_IDT_ram[EBX]
 246
 247 ; move the TSS
 248 MOV EDI,EAX
 249 MOV EBX,TSS_INDEX*SIZE(DESC)
 250 MOV ECX,GDT_DESC_OFF ;build linear address for TSS
 251 MOV GS,CX
 252 MOV DH,GS:[EBX].bas_24_31
 253 MOV DL,GS:[EBX].bas_16_23
 254 ROL EDX,16
 255 MOV DX,GS:[EBX].bas_0_15
 256 MOV ESI,EDX
 257 LSL ECX,EBX
 258 INC ECX
 259 MOV EDX,EAX
 260 ADD EAX,ECX
 261 REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:[ESI]
 262
 263 ; fixup TSS pointer
 264 MOV GS:[EBX].bas_0_15,DX
 265 ROL EDX,16
 266 MOV GS:[EBX].bas_24_31,DH
 267 MOV GS:[EBX].bas_16_23,DL
 268 ROL EDX,16
 269 ;save start of free ram at linear location RAMSTART
 270 MOV free_mem_linear_base+RAM_START,EAX
 271
 272 ;assume no LDT used in the initial task - if necessary,
 273 ;code to move the LDT could be added, and should resemble
 274 ;that used to move the TSS
 275
 276 ; load task register
 277 LTR BX ; No task switch, only descriptor loading
 278 ; See Figure 8-6
 279 ; load minimal set of registers necessary to simulate task
 280 ; switch
 281
 282
 283 MOV AX,[EDX].SS_reg ; start loading registers
 284 MOV EDI,[EDX].ESP_reg
 285 MOV SS,AX
 286 MOV ESP,EDI ; stack now valid
 287 PUSH DWORD PTR [EDX].EFLAGS_reg
 288 PUSH DWORD PTR [EDX].CS_reg
 289 PUSH DWORD PTR [EDX].EIP_reg
 290 MOV AX,[EDX].DS_reg
 291 MOV BX,[EDX].ES_reg
 292 MOV DS,AX ; DS and ES no longer linear memory
 293 MOV ES,BX
 294
 295 ; simulate far jump to initial task
 296 IRETD
 297
 298 STARTUP_CODE ENDS
 *** WARNING #377 IN 298, (PASS 2) SEGMENT CONTAINS PRIVILEGED INSTRUCTION(S)
 299
 300 END STARTUP, DS:STARTUP_DATA, SS:STARTUP_DATA
 301
 302
 ASSEMBLY COMPLETE, 1 WARNING, NO ERRORS.
 

Figure 8-4
Figure 8-4. Constructing Temporary GDT and Switching to Protected Mode (Lines 162-172 of List File)

Figure 8-5
Figure 8-5. Moving the GDT, IDT and TSS from ROM to RAM (Lines 196-261 of List File)

Figure 8-6
Figure 8-6. Task Switching (Lines 282-296 of List File)

8.9.3. MAIN.ASM Source Code

The file MAIN.ASM shown in Example 8-2 defines the data and stack segments for this application and can be substituted with the main module task written in a high-level language that is invoked by the IRET instruction executed by STARTUP.ASM.

Example 8-2. MAIN.ASM

 NAME main_module
 data SEGMENT RW
 dw 1000 dup(?)
 DATA ENDS
 stack stackseg 800
 CODE SEGMENT ER use32 PUBLIC
 main_start:
 nop
 nop
 nop
 CODE ENDS
 END main_start, ds:data, ss:stack
 
8.9.4. Supporting Files

The batch file shown in Example 8-3 can be used to assemble the source code files STARTUP.ASM and MAIN.ASM and build the final application.

Example 8-3. Batch File to Assemble and Build the Application

ASM386 STARTUP.ASM

ASM386 MAIN.ASM

BLD386 STARTUP.OBJ, MAIN.OBJ buildfile(EPROM.BLD) bootstrap(STARTUP) Bootload

BLD386 performs several operations in this example:

 It allocates physical memory location to segments and tables.

 It generates tables using the build file and the input files.

 It links object files and resolves references.

 It generates a boot-loadable file to be programmed into the EPROM.

Example 8-4 shows the build file used as an input to BLD386 to perform the above functions.

Example 8-4. Build File

 INIT_BLD_EXAMPLE;
 SEGMENT
 *SEGMENTS(DPL = 0)
 , startup.startup_code(BASE = 0FFFF0000H)
 ;
 TASK
 BOOT_TASK(OBJECT = startup, INITIAL,DPL = 0,
 NOT INTENABLED)
 , PROTECTED_MODE_TASK(OBJECT = main_module,DPL = 0,
 NOT INTENABLED)
 ;
 TABLE
 GDT (
 LOCATION = GDT_EPROM
 , ENTRY = (
 10: PROTECTED_MODE_TASK
 , startup.startup_code
 , startup.startup_data
 , main_module.data
 , main_module.code
 , main_module.stack
 )
 ),
 IDT (
 LOCATION = IDT_EPROM
 );
 MEMORY
 (
 RESERVE = (0..3FFFH
 -- Area for the GDT, IDT, TSS copied from ROM
 , 60000H..0FFFEFFFFH)
 , RANGE = (ROM_AREA = ROM (0FFFF0000H..0FFFFFFFFH))
 -- Eprom size 64K
 , RANGE = (RAM_AREA = RAM (4000H..05FFFFH))
 );
 END
 

8.10. P6 FAMILY MICROCODE UPDATE FEATURE

P6 family processors have the capability to correct specific errata through the loading of an Intel-supplied data block. This data block is referred to as a microcode update. This chapter describes the underlying mechanisms the BIOS needs to provide in order to utilize this feature during system initialization. It also describes a specification that provides for incorporating future releases of the microcode update into a system BIOS.

Intel considers the combination of a particular silicon revision and the microcode update as the equivalent stepping of the processor. Intel does not validate processors without the microcode update loaded. Intel completes a full-stepping level validation and testing for new releases of microcode updates.

A microcode update is used to correct specific errata in the processor. The BIOS, which incorporates an update loader, is responsible for loading the appropriate update on all processors during system initialization (refer to Figure 8-7). There are effectively two steps to this process. The first is to incorporate the necessary microcode updates into the BIOS, the second is to actually load the appropriate microcode update into the processor.

Figure 8-7
Figure 8-7. Integrating Processor Specific Updates

8.10.1. Microcode Update

A microcode update consists of an Intel-supplied binary that contains a descriptive header and data. No executable code resides within the update. This section describes the update and the structure of its data format.

Each microcode update is tailored for a particular stepping of a P6 family processor. It is designed such that a mismatch between a stepping of the processor and the update will result in a failure to load. Thus, a given microcode update is associated with a particular type, family, model, and stepping of the processor as returned by the CPUID instruction. In addition, the intended processor platform type must be determined to properly target the microcode update. The intended processor platform type is determined by reading a model-specific register MSR (17h) (refer to Table 8-6) within the P6 family processor. This is a 64-bit register that may be read using the RDMSR instruction (refer to Section 3.2., "Instruction Reference" Chapter 3, Instruction Set Reference, Volume 1 of the Programmer's Reference Manual). The three platform ID bits, when read as a binary coded decimal (BCD) number indicate the bit position in the microcode update header's, Processor Flags field, that is associated with the installed processor.

 Register Name:BBL_CR_OVRD
 MSR Address:017h
 Access:Read Only
 BBL_CR_OVRD is a 64-bit register accessed only when referenced as a Qword through a
 RDMSR instruction.
 
The microcode update is a data block that is exactly 2048 bytes in length. The initial 48 bytes of the update contain a header with information used to identify the update. The update header and its reserved fields are interpreted by software based upon the header version. The initial version of the header is 00000001h. An encoding scheme also guards against tampering of the update data and provides a means for determining the authenticity of any given update. Table 8-7 defines each of the fields and Figure 8-8 shows the format of the microcode update data block.
Оставьте свой комментарий !

Ваше имя:
Комментарий:
Оба поля являются обязательными

 Автор  Комментарий к данной статье