Task-Switching: How the x86 processor assists with context-switching among multiple program-threads


Program Model
• Programs consist of data and instructions
• Data consists of constants and variables, which may be ‘persistent’ or ‘transient’
• Instructions may be ‘private’ or ‘shared’
• These observations lead to a conceptual model for the management of programs, and to special processor capabilities that assist in supporting that conceptual model

Conceptual Program-Model
Region            Contents                                     Created
runtime library   Shared Instructions and Data (persistent)
STACK             Private Data (transient)                     during runtime
heap              Private Data (transient)                     during runtime
BSS               Uninitialized Data (persistent)              at compile time
DATA              Initialized Data (persistent)                at compile time
TEXT              Private Instructions (persistent)            at compile time

Task Isolation
• The CPU is designed to assist the system software in isolating the private portions of one program from those of another while both reside in physical memory, while also allowing them to share certain instructions and data in a controlled way
• This ‘sharing’ includes access to the CPU, whereby the tasks take turns at executing

Multi-tasking
[Diagram: in supervisor-space (ring 0), GDTR locates the GDT, IDTR locates the IDT, and TR selects the current Task-State Segment (TSS 1 or TSS 2). In user-space (ring 3), Task #1 and Task #2 each have a STACK (SS:SP), heap, BSS, DATA (DS) and TEXT (CS:IP), plus a shared runtime library.]

Context-Switching
• The CPU can perform a ‘context-switch’ to save the current values of all its registers (in the memory-area referenced by the TR register), and to load new values into all its registers (from the memory-area specified by a new Task-State Segment selector)
• There are four ways to trigger this ‘task-switch’ operation on x86 processors

How to cause a task-switch
• Use an ‘ljmp’ instruction (long jump):
        ljmp    $task_selector, $0
• Use an ‘lcall’ instruction (long call):
        lcall   $task_selector, $0
• Use an ‘int-n’ instruction (with a task-gate):
        int     $0x80
• Use an ‘iret’ instruction (with NT=1):
        iret

‘ljmp’ and ‘lcall’
• These instructions are similar: they both make use of a ‘selector’ for a Task-State Segment descriptor

TSS Descriptor-Format
Upper longword:  Base[31..24] | 0 0 0 AVL | Limit[19..16] | P | DPL | 0 | type | Base[23..16]
Lower longword:  Base[15..0] | Limit[15..0]

type: 16-bit TSS (0x1 = available or 0x3 = busy) or 32-bit TSS (0x9 = available or 0xB = busy)
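
The descriptor layout above can be sketched in C as a bit-packing routine. This is only an illustrative sketch: the names `make_tss_descriptor` and `tss_descriptor_is_busy` are hypothetical (they do not come from the course demo), the descriptor is returned as one 64-bit value with the lower longword in bits 0..31, and the three bits shown as 0 next to AVL are left clear.

```c
#include <assert.h>
#include <stdint.h>

/* Type codes for TSS descriptors, as listed on the slide. */
enum {
    TSS16_AVAILABLE = 0x1,
    TSS16_BUSY      = 0x3,
    TSS32_AVAILABLE = 0x9,
    TSS32_BUSY      = 0xB
};

/*
 * Hypothetical helper: pack a TSS descriptor into a 64-bit value
 * (lower longword in bits 0..31, upper longword in bits 32..63).
 * Assumes a present segment (P = 1) and AVL = 0.
 */
uint64_t make_tss_descriptor(uint32_t base, uint32_t limit,
                             unsigned type, unsigned dpl)
{
    assert(limit <= 0xFFFFFu);            /* the limit field is 20 bits wide */
    assert(type <= 0xFu && dpl <= 3u);

    uint32_t lo = ((base & 0xFFFFu) << 16)    /* Base[15..0]            */
                | (limit & 0xFFFFu);          /* Limit[15..0]           */
    uint32_t hi = (base & 0xFF000000u)        /* Base[31..24]           */
                | (limit & 0x000F0000u)       /* Limit[19..16]          */
                | (1u << 15)                  /* P = 1 (present)        */
                | ((uint32_t)dpl << 13)       /* DPL                    */
                | ((uint32_t)type << 8)       /* type (bit 12 stays 0)  */
                | ((base >> 16) & 0xFFu);     /* Base[23..16]           */

    return ((uint64_t)hi << 32) | lo;
}

/* The busy bit is bit 1 of the type field, which is bit 41 overall. */
int tss_descriptor_is_busy(uint64_t desc)
{
    return (int)((desc >> 41) & 1u);
}
```

For instance, a 32-bit TSS at base 0x00012345 with limit 0x67 (104 bytes minus one) would be built with `make_tss_descriptor(0x00012345, 0x67, TSS32_AVAILABLE, 0)`.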

The two TSS formats
• Intel introduced the Task-State Segment in the 80286 processor (used in the IBM-PC/AT)
• The 80286 CPU had a 16-bit architecture
• Later Intel introduced its 80386 processor, which had a 32-bit architecture requiring a larger and more elaborate format for its Task-State Segment data-structure
• The 286 TSS is now considered ‘obsolete’

The 80286 TSS format (22 words, each 16 bits wide)
Offset   Field   Kind
   0     link    volatile
   2     sp0     static
   4     ss0     static
   6     sp1     static
   8     ss1     static
  10     sp2     static
  12     ss2     static
  14     IP      volatile
  16     FLAGS   volatile
  18     AX      volatile
  20     CX      volatile
  22     DX      volatile
  24     BX      volatile
  26     SP      volatile
  28     BP      volatile
  30     SI      volatile
  32     DI      volatile
  34     ES      volatile
  36     CS      volatile
  38     SS      volatile
  40     DS      volatile
  42     LDTR    static

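The 80386 TSS format (26 longwords, each 32 bits wide)
Offset   Field                              Kind
   0     link                               volatile
   4     esp0                               static
   8     ss0                                static
  12     esp1                               static
  16     ss1                                static
  20     esp2                               static
  24     ss2                                static
  28     PTDB                               static
  32     EIP                                volatile
  36     EFLAGS                             volatile
  40     EAX                                volatile
  44     ECX                                volatile
  48     EDX                                volatile
  52     EBX                                volatile
  56     ESP                                volatile
  60     EBP                                volatile
  64     ESI                                volatile
  68     EDI                                volatile
  72     ES                                 volatile
  76     CS                                 volatile
  80     SS                                 volatile
  84     DS                                 volatile
  88     FS                                 volatile
  92     GS                                 volatile
  96     LDTR                               static
 100     TRAP (low word), IOMAP (high word) static
The upper word of each longword that holds a 16-bit selector (link, ss0, ss1, ss2, ES, CS, SS, DS, FS, GS, LDTR) is ‘reserved’. The IOMAP field gives the offset of the I/O permission bitmap, which follows the TSS.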
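
The 26-longword layout can be written down as a C structure whose field offsets are checked at compile time. This is a minimal sketch: the structure name `tss32` and the `reservedN` field names are assumptions made for illustration, not names taken from the course demo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of the 80386 Task-State Segment (104 bytes). */
struct tss32 {
    uint16_t link, reserved0;        /* offset   0: back-link selector   */
    uint32_t esp0;                   /* offset   4: ring-0 stack pointer */
    uint16_t ss0, reserved1;         /* offset   8 */
    uint32_t esp1;                   /* offset  12 */
    uint16_t ss1, reserved2;         /* offset  16 */
    uint32_t esp2;                   /* offset  20 */
    uint16_t ss2, reserved3;         /* offset  24 */
    uint32_t ptdb;                   /* offset  28: page-table directory base */
    uint32_t eip, eflags;            /* offsets 32, 36 */
    uint32_t eax, ecx, edx, ebx;     /* offsets 40..52 */
    uint32_t esp, ebp, esi, edi;     /* offsets 56..68 */
    uint16_t es, reserved4;          /* offset  72 */
    uint16_t cs, reserved5;          /* offset  76 */
    uint16_t ss, reserved6;          /* offset  80 */
    uint16_t ds, reserved7;          /* offset  84 */
    uint16_t fs, reserved8;          /* offset  88 */
    uint16_t gs, reserved9;          /* offset  92 */
    uint16_t ldtr, reserved10;       /* offset  96 */
    uint16_t trap;                   /* offset 100 */
    uint16_t iomap;                  /* offset 102: I/O bitmap offset */
};

/* 26 longwords of 4 bytes each. */
static_assert(sizeof(struct tss32) == 104, "TSS must be 26 longwords");
static_assert(offsetof(struct tss32, eip) == 32, "EIP belongs at offset 32");
static_assert(offsetof(struct tss32, ldtr) == 96, "LDTR belongs at offset 96");
```

Because every 32-bit field already falls on a 4-byte boundary, no packing attribute should be needed on common compilers; the static assertions catch any toolchain where that assumption fails.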

Which to use: ‘ljmp’ or ‘lcall’?
• Use ‘ljmp’ to switch to a different task if you have no intention of returning
• Use ‘lcall’ to switch to a different task if you want to ‘return’ to this task later
• The CPU treats ‘ljmp’ and ‘lcall’ differently in regard to the TSS, GDT and EFLAGS

No Task Reentrancy!
• Since each task has just one ‘save area’ (in its TSS), a task must not be permitted to be recursively reentered!
• The CPU enforces this prohibition using a ‘busy’ bit within each task’s TSS descriptor
• Whenever the TR register is loaded with a new selector-value, the CPU checks to be sure the task isn’t already ‘busy’; if it’s not, the task is entered, but gets marked ‘busy’

Task-Nesting
• But it’s OK for one task to be nested within another, and another…
[Diagram: the initial task (TSS #1) does an lcall to TSS #2, which does an lcall to TSS #3, which does an lcall to TSS #4, the current TSS selected by TR; the LINK field of each nested TSS points back to the TSS that called it.]

The NT-bit in FLAGS
• When the CPU switches to a new task via an ‘lcall’ instruction, it sets NT=1 in FLAGS (and it leaves the old TSS marked ‘busy’)
• The new task can then ‘return’ to the old task by executing an ‘iret’ instruction (the old task is still ‘busy’, so returning to it with an ‘lcall’ or an ‘ljmp’ wouldn’t be possible)

Task-switch Semantics
Field            ljmp effect    lcall effect   iret effect
new busy-bit     changes to 1   changes to 1   stays = 1
old busy-bit     is cleared     stays = 1      is cleared
new NT-flag      is cleared     is set to 1    no change
old NT-flag      no change      no change      is cleared
new LINK-field   no change      new value      no change
old LINK-field   no change      no change      no change
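
The rules in this table can be exercised with a toy model. The sketch below is not how the CPU is implemented; it simply encodes the table's rows in C, using hypothetical names (`struct machine`, `task_ljmp`, `task_lcall`, `task_iret`) and small integer indices in place of real TSS selectors. Each function returns 0 on success and -1 where the real CPU would raise a fault or perform no task-switch.

```c
#include <assert.h>
#include <stddef.h>

#define NTASKS  4
#define NO_LINK (-1)

struct task {
    int busy;   /* busy bit in the task's TSS descriptor          */
    int nt;     /* NT flag in the task's FLAGS image              */
    int link;   /* LINK field of the task's TSS (NO_LINK if none) */
};

struct machine {
    struct task tasks[NTASKS];
    int current;            /* the task that TR currently selects */
};

/* Task 0 starts out as the running (and therefore busy) task. */
void machine_init(struct machine *m)
{
    assert(m != NULL);
    for (int i = 0; i < NTASKS; i++) {
        m->tasks[i].busy = 0;
        m->tasks[i].nt = 0;
        m->tasks[i].link = NO_LINK;
    }
    m->current = 0;
    m->tasks[0].busy = 1;
}

/* A task may only be entered if it exists and is not already busy. */
static int target_ok(const struct machine *m, int target)
{
    return target >= 0 && target < NTASKS && !m->tasks[target].busy;
}

/* ljmp: old busy-bit cleared, new busy-bit set, new NT cleared. */
int task_ljmp(struct machine *m, int target)
{
    if (!target_ok(m, target))
        return -1;
    m->tasks[m->current].busy = 0;
    m->tasks[target].busy = 1;
    m->tasks[target].nt = 0;
    m->current = target;
    return 0;
}

/* lcall: old task stays busy; new task gets NT=1 and a back-link. */
int task_lcall(struct machine *m, int target)
{
    if (!target_ok(m, target))
        return -1;
    m->tasks[target].busy = 1;
    m->tasks[target].nt = 1;
    m->tasks[target].link = m->current;
    m->current = target;
    return 0;
}

/* iret with NT=1: return to the linked task, which is still busy. */
int task_iret(struct machine *m)
{
    struct task *old = &m->tasks[m->current];
    if (!old->nt || old->link == NO_LINK)
        return -1;          /* NT=0: an ordinary iret, no task-switch */
    int back = old->link;
    old->busy = 0;
    old->nt = 0;
    m->current = back;
    return 0;
}
```

In this model, after `task_lcall(&m, 1)` from task 0, a second `task_lcall(&m, 0)` is refused because task 0 is still busy, while `task_iret(&m)` returns to it, which matches the nesting behavior described on the preceding slides.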

Task-Gate Descriptor
• It is also possible to trigger a task-switch with a software or hardware interrupt, by using a Task-Gate Descriptor in the IDT

Task-Gate Descriptor Format
Upper longword:  (unused) | P | DPL | 0 | type (= 0x5) | (unused)
Lower longword:  Task-State Segment Selector | (unused)
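
A task-gate can be packed in the same way as other descriptors. The following is a hedged sketch: `make_task_gate` and `task_gate_selector` are hypothetical helper names, the gate is returned as one 64-bit value with the lower longword in bits 0..31, and all unused bits are left as zero.

```c
#include <assert.h>
#include <stdint.h>

#define TASK_GATE_TYPE 0x5u     /* the 'type' code for a task-gate */

/*
 * Hypothetical helper: build a present (P = 1) task-gate that refers
 * to the TSS named by 'tss_selector'.
 */
uint64_t make_task_gate(uint16_t tss_selector, unsigned dpl)
{
    assert(dpl <= 3u);

    uint32_t lo = (uint32_t)tss_selector << 16;  /* selector in bits 16..31 */
    uint32_t hi = (1u << 15)                     /* P = 1                   */
                | ((uint32_t)dpl << 13)          /* DPL                     */
                | (TASK_GATE_TYPE << 8);         /* type; bit 12 stays 0    */

    return ((uint64_t)hi << 32) | lo;
}

/* Recover the TSS selector stored in a task-gate. */
uint16_t task_gate_selector(uint64_t gate)
{
    return (uint16_t)(gate >> 16);
}
```

A gate that user code (ring 3) is allowed to invoke with an ‘int-n’ instruction would be built with `dpl` set to 3; a gate meant only for hardware interrupts can keep `dpl` at 0.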

‘Threads’ versus ‘Tasks’
• In some advanced applications, a task can consist of multiple execution-threads
• Like tasks, threads take turns executing (and thus require ‘context-switching’)
• The CPU doesn’t distinguish between ‘threads’ and ‘tasks’: context-switching semantics are the same for both
• The difference lies in the ‘sharing’ of data/code

A task with multiple threads
[Diagram: each thread has its own TSS-segment (TSS 1, TSS 2) in supervisor-space (ring 0). In user-space (ring 3): STACKS (STACK 1, STACK 2; each is thread-private), heap, DATA (DATA 1, DATA 2; some shared, some private), TEXT (CODE 1, CODE 2; some shared, some private).]

Demo program: ‘twotasks.s’
• We have constructed a simple demo that illustrates the CPU’s task-switching ability
• It’s one program, but with two threads
• Everything is in one physical segment, but the segment-descriptors create a number of different overlapping ‘logical’ segments
• One task is the ‘supervisor’ thread: it ‘calls’ a ‘subordinate’ thread (to print a message)

A thread could use an LDT
• To support isolation of memory-segments among distinct tasks or threads, the CPU allows use of ‘private’ descriptor-tables
• Same format for the segment-descriptors
• But selectors use a Table-Indicator bit

Format of a segment-selector (16 bits)
Bits 15..3:  Descriptor-table index field
Bit 2:       TI = Table-Indicator (0 = GDT, 1 = LDT)
Bits 1..0:   RPL = Requested Privilege-Level
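
The selector format can be illustrated with a pair of small C helpers. The names `make_selector`, `decode_selector` and `struct selector_fields` are hypothetical, chosen only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* The three fields of a 16-bit segment-selector. */
struct selector_fields {
    unsigned index;   /* bits 15..3: descriptor-table index  */
    unsigned ti;      /* bit 2: 0 = GDT, 1 = LDT             */
    unsigned rpl;     /* bits 1..0: requested privilege-level */
};

/* Combine index, Table-Indicator and RPL into a selector value. */
uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl)
{
    assert(index < 8192u && ti <= 1u && rpl <= 3u);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}

/* Split a selector value back into its three fields. */
struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = sel >> 3;
    f.ti = (sel >> 2) & 1u;
    f.rpl = sel & 3u;
    return f;
}
```

For example, `make_selector(1, 1, 3)` yields 0x000F: entry 1 of the task’s LDT, requested at ring 3, whereas `make_selector(5, 0, 3)` yields 0x002B, which names entry 5 of the GDT.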

LDT descriptors
• Each Local Descriptor Table is described by its own ‘system’ segment-descriptor in the Global Descriptor Table

LDT Descriptor-Format
Upper longword:  Base[31..24] | 0 0 0 AVL | Limit[19..16] | P | DPL | 0 | type | Base[23..16]
Lower longword:  Base[15..0] | Limit[15..0]

Type-field: the ‘type’ code for any LDT segment-descriptor is 0x2

In-class Exercise #1
• In our ‘twotasks.s’ demo, the two threads both execute at privilege-level zero
• An enhanced version of this demo would have the ‘supervisor’ (Thread #1) execute in ring 0 and the ‘subordinate’ (Thread #2) execute in ring 3
• Can you modify the demo-program so it incorporates that suggested improvement?

More enhancements?
• The demo-program could be made much more interesting if it used more than one subordinate thread, and if the supervisor thread took turns repeatedly making calls to each subordinate (i.e., ‘time-sharing’)
• You can arrange for a thread to be called more than once by placing a ‘jmp’ after the ‘iret’ instruction: the instruction-pointer saved in the thread’s TSS points just past the ‘iret’, so the next call resumes at the ‘jmp’, which re-executes the thread

In-class Exercise #2
• Modify the demo so it has two subordinate threads, each of which prints a message, and each of which can be called again and again (i.e., add a jmp-instruction after iret):

begin:                          # entry-point to the thread
        . . .
        iret
        jmp     begin