  • Slides: 51

Architectures of Digital Information Systems
Part 3: Multitasking and Memory Management
dr. ir. A. C. Verschueren
Eindhoven University of Technology, Faculty of Electrical Engineering
Section of Digital Information Systems

Let’s build a computer...
[Figure: a computer system. The CPU (1 processor + memory) runs my program (task), your program (task), a device driver (task) and the operating system; the disk controller hardware is attached with an interrupt.]

Operating system functions
• Maintain a task list (administration)
  – Know state of all tasks (ready-to-run/waiting/…)
  – Decide which task must run (task switching)
• Manage communication resources
  – Synchronise communicating tasks
• Translate interrupts into task restarts
• Allocate and de-allocate memory for tasks

Playing devil (1)
Unprotected systems are open to attacks!
• Critical instructions can be executed
  – 'DI' (disable interrupts)
  – 'HALT' to crash the whole processor
• Hardware ports can be read and written directly
  – Modify or delete specific files (difficult!)
  – Format a random track with a single command

Playing devil (2)
• Reading / writing of all memory locations possible
  – Modify O.S. task tables
  – Kill communication resources
  – Change another task's data or program (funny)
  – Fill memory with random data
We assume the operating system and device drivers do not allow illegal operations!

Protecting critical operations
• Give the processor hardware 'modes' to run in
  – 'User' mode: critical operations not allowed
  – 'System' mode: critical operations are permitted
  – More than two modes possible for ‘fine tuning’
• Switching from system to user mode need not be protected (we trust the operating system)
• Switching the other way around must be protected!

Intel 80286: more protection levels
PL: Privilege Level
• Kernel: PL 0
• O.S. core: PL 1
• Device drivers: PL 2
• User applications: PL 3
  – Code access: only in same level or higher levels
  – Data access: only in same level or lower levels
  – Stack access: only in same level (separate stacks!)

Switching from user to system mode (1)
• Generally done with a kind of software 'interrupt'
  – Hardware interrupt routines run in system mode too
  – They need the same mode switching logic but interrupts remain enabled here
• Interrupt routine start addresses in protected table
  – Not possible to enter system mode at arbitrary address
  – Called routine is responsible for checking parameters

Switching from user to system mode (2)
• The number of ‘software interrupts’ is limited
  – Signetics 32032 and Zilog Z8000: ONE 'service call'
• Other methods exist for protected switching
  – DEC Alpha: protected library of subroutines
  – Intel 80286: pseudo segments called ‘call gates’

Protecting input/output port access
• Declare 'IN' and 'OUT' instructions critical
  – What to do with the device driver tasks?
• Enable and disable access on a per-port basis
  – Allows fine-tuning of what tasks are allowed to do
• Use memory mapped I/O
  – Let the memory protection handle the I/O protection

Memory read/write protection
• Is the hardest of them all to do: there is a lot of it to protect!
• Must be combined with memory management
• Protection for different access types not enough
  – Read / Write / Execute (visible on bus system)
• Protection for different memory uses is needed
  – Code / Stack / Constants / Private data / Shared data

The single linear memory space model
[Figure: one address space running from '0' to '-1'. In order: program A, subroutine X(A), data A, data X(A), stack A/X(A), shared data A/B, program B, subroutine X(B), data B, data X(B), stack B/X(B). X(A) and X(B) contain the same code for subroutine X! Markers show the allowed addresses for A, the area accessible by A and B, and the allowed addresses for B. Where to place the next task?]

Linear memory model disadvantages
• Needs a 'linking loader' to modify programs
  – Programs are loaded into different memory areas each time they are run
  – Problems with shared subroutine libraries
• Inefficient use of memory
• Protection must be based on memory areas
• Programs are not protected against themselves

The multiple linear memory spaces model
[Figure: several address spaces, each running from '0' to '-1'. Labels: program A, data A, shared data A/B; subroutine X(A), data X(A), shared data A/B, stack A/X(A); program B, data B, shared data A/B, stack B/X(B); subroutine X(B), data X(B), shared data A/B, stack B/X(B).]

Multiple linear memory spaces properties
• Each program can use the complete memory
  – More freedom in code/data placement
  – Higher efficiency by loading once in physical memory
• Protects on a 'need-to-know' basis: what is invisible cannot be accessed!
  – Self-protection still difficult (must be address based)
• Requires ‘logical’ to ‘physical’ address translation

The segmented memory model
[Figure: every segment has its own address space starting at '0': program A, data A, stack A/X(A), program B, data B, stack B/X(B), subroutine X, data X(A), data X(B), shared data A/B.]

Segmented memory properties
• Each memory segment in separate address space
  – Completely avoids the placement problem
  – Dynamically growing and shrinking memory segments (like stacks) are easy
• Protection simple: segment access rights
  – Address checking is a segment boundary check
  – Segments visible on ‘need to know’ basis
• Needs logical to physical address translation

Segmented memory problem
• Requires a major ‘philosophical’ change: ‘the address’ is split in two parts
  – A segment identification
  – An offset within the segment
• Automatic segment selection is partially possible
  – Separate segments for code and stack are obvious
  – Switching between different data segments requires software intervention!

Address translation
• Not needed for linear memory organisation
  – Processor generated (logical) address = real memory (physical) address
  – May be handy to attach access rights to addresses
• Needed for multiple linear address spaces and segmented memories
  – Complex for multiple linear address spaces: the actual address must be checked

Table based direct address translation
[Figure: the logical address indexes a table; each table entry holds the physical address and the access rights.]
• This table grows very large: translating 1 million addresses with 4 access rights bits requires a 3 Megabyte table!
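The 3 Megabyte figure can be checked with a short calculation. This is only a sketch: it assumes "1 million addresses" means 2**20 and that each entry stores a 20 bit physical address next to the 4 access rights bits; the function name `table_bytes` is made up for this illustration.

```python
def table_bytes(entries: int, bits_per_entry: int) -> int:
    """Size of a translation table in bytes, rounded up to whole bytes."""
    return (entries * bits_per_entry + 7) // 8

ADDRESS_BITS = 20   # assumption: "1 million addresses" = 2**20
ACCESS_BITS = 4     # access rights bits per entry (from the slide)

# One entry per logical address: 20 + 4 = 24 bits = 3 bytes each.
direct_table = table_bytes(2**ADDRESS_BITS, ADDRESS_BITS + ACCESS_BITS)
print(direct_table / 2**20, "Megabyte")
```

Every single address gets its own 3 byte entry, which is why the table is three times as large as the memory it describes when that memory is byte wide.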

Address bounds checking (1)
[Figure: the logical address goes through a '>=' compare and a '<' compare; an address range that matches delivers a physical offset, used to form the physical address, and the access rights.]

Address bounds checking (2)
• Parallel comparators are VERY expensive
  – Use a lot of power and chip area
  – Number of address ranges would be limited
• Physical address ranges must have same sizes as the logical address ranges
  – Memory which is organised into large (undividable) blocks is hard to manage
  – Same problem in a purely segmented memory

Paging (1)
[Figure: the logical address is split into a logical page number (<n> bits) and an offset (<p> bits). The 'page table' maps the logical page to a physical page number (<m> bits) plus access rights; physical page and offset together form the physical address.]
• <p> bits of the address are not translated: 2^p words in a page have the same access rights

Paging (2)
• Paging is cheaper than full address translation
  – Translating 1 million addresses with 1024 word pages requires a page table with only 1024 entries
  – With 10 bit physical page numbers and 4 access rights bits, the page table takes less than 2048 bytes!
• Translating 32 bit addresses with 4096 word pages requires a page table with 1 million entries!
  – Not all of these pages will be in use at the same time...
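A minimal sketch of single-level paging with the numbers from this slide. The page table contents, the rights strings and the names `page_table` and `translate` are hypothetical; only the field widths come from the slide.

```python
PAGE_BITS = 10                 # <p>: 1024 word pages
PAGE_SIZE = 1 << PAGE_BITS

# Hypothetical page table: logical page -> (physical page, access rights)
page_table = {0: (5, "r-x"), 1: (2, "rw-")}

def translate(logical: int) -> tuple[int, str]:
    """Translate a logical word address; the offset bits pass through unchanged."""
    page = logical >> PAGE_BITS
    offset = logical & (PAGE_SIZE - 1)
    if page not in page_table:
        raise LookupError(f"no translation for logical page {page}")
    physical_page, rights = page_table[page]
    return (physical_page << PAGE_BITS) | offset, rights

# Table size for the slide's example: 2**20 addresses, 10 + 4 bits per entry.
entries = 2**20 // PAGE_SIZE
table_size_bytes = entries * (10 + 4) // 8
print(entries, "entries,", table_size_bytes, "bytes")
```

With these assumed table contents, logical address 1027 (page 1, offset 3) maps to physical address 2051 with rights "rw-".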

Multi-level paging
[Figure: the logical address is split into <x> (1st level table index), <y> (2nd level table index) and <p> (page offset). The first level page table entry locates the second level table and carries a '2nd level table present' bit; the second level page table entry delivers the physical page (<m> bits) and the access rights. Physical page and page offset form the physical address.]

Multi-level paging example
• 4 byte words, 32 bit addresses (2 bits select byte), 1024 word / 4096 byte pages (<p> = 10+2 bits)
  – Second level table: 1024 entries (<y> = 10 bits)
    • Entry contains 20 bit physical page number (<m> = 20), leaves 12 bits for access rights if each entry takes one word
    • Each second level page table fits in one page
  – First level page table: 1024 entries (<x> = 10 bits)
    • Entry contains 20 bit physical page number of 2nd level table plus the 'table present bit': fits easily in one word
    • First level page table fits in one page
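The two-level walk of this example can be sketched as follows. Python dictionaries stand in for the tables in memory, a missing first level entry plays the role of a cleared 'table present' bit, and the table contents and names are hypothetical.

```python
X_BITS, Y_BITS, P_BITS = 10, 10, 12   # byte addresses: <p> = 10 + 2 bits

# Hypothetical tables.
# First level: <x> -> second level table (absent = 'table present' bit clear)
# Second level: <y> -> (20 bit physical page number, access rights)
second_level_0 = {3: (0x12345, "rw-")}
first_level = {0: second_level_0}

def translate(addr: int) -> tuple[int, str]:
    x = addr >> (Y_BITS + P_BITS)
    y = (addr >> P_BITS) & ((1 << Y_BITS) - 1)
    offset = addr & ((1 << P_BITS) - 1)
    table = first_level.get(x)
    if table is None:
        raise LookupError("second level table not present")
    if y not in table:
        raise LookupError("page not mapped")
    page, rights = table[y]
    return (page << P_BITS) | offset, rights

# Memory covered by one second level table: 1024 pages of 4096 bytes.
BYTES_PER_SECOND_LEVEL_TABLE = (1 << Y_BITS) << P_BITS

print(hex(translate(0x00003ABC)[0]))
```

Only the second level tables that are actually needed have to exist, which is where the saving over a single table with 1 million entries comes from.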

Multi-level paging (continued)
• This address translation method is very cheap
  – The example second level table handles 4 MegaByte
    • If code, data and stack fit in 8 MegaByte, we need 3 pages (12 KiloBytes) for translation
• Multi-level paging is not limited to 2 levels!
  – Motorola 68020 can go up to FIVE levels of tables
    • Each table entry (not just last) can specify access rights, can also give length limit for next table
Searching through 5 tables for each memory access is a bit slow

Speedup: translation lookaside buffer
[Figure: the logical page number (<n> bits) is compared ('=' compare) with the tags stored in the buffer. On a 'hit!' the matching entry delivers the physical page (<m> bits) and the access rights; the <p> bit page offset passes through to the physical address.]
• This 'Content Addressable Memory' lookaside buffer can reach 98% hits with ‘only’ 32 entries
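A rough software model of such a lookaside buffer. A Python dict stands in for the content addressable memory (all tags are "compared at once"); the class name, the first-in-first-out eviction and the page table contents are assumptions made for this sketch, not something the slide specifies.

```python
PAGE_BITS = 10

class LookasideBuffer:
    def __init__(self, page_table, capacity=32):
        self.page_table = page_table   # slow path: the full page table
        self.capacity = capacity
        self.entries = {}              # tag (logical page) -> physical page
        self.hits = 0
        self.misses = 0

    def translate(self, logical: int) -> int:
        page = logical >> PAGE_BITS
        offset = logical & ((1 << PAGE_BITS) - 1)
        if page in self.entries:
            self.hits += 1
        else:
            self.misses += 1
            if len(self.entries) >= self.capacity:
                # Assumed policy: drop the oldest entry (dicts keep insertion order)
                self.entries.pop(next(iter(self.entries)))
            self.entries[page] = self.page_table[page]
        return (self.entries[page] << PAGE_BITS) | offset

tlb = LookasideBuffer({0: 7, 1: 9})
first = tlb.translate(5)    # miss: page 0 is fetched from the page table
second = tlb.translate(6)   # hit: page 0 is already in the buffer
```

Because consecutive accesses tend to fall in the same few pages, most translations after the first one are served from the buffer.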

A 'set associative' lookaside buffer
[Figure: logical address with fields <y>, <x> and <p>. A simple RAM holds the 'tag', the physical page (<m> bits) and the access rights; the stored 'tag' is compared to produce 'hit!', and physical page plus <p> form the physical address. Callout: cheap, simple RAM.]

The problem with set associative buffers
• A ‘tag clash’ makes the lookaside buffer worthless
  – Two or more different pages used in short loop
  – With same <y> bits but different <z> (tag) bits
Example (<z> and <y>: 4 bit each, <x>: 8 bit):
                                  <z>  <y>  <x>
  ‘WaitHere’ at address 35E6h:     3    5    E6
  ‘DataPort’ at address 5537h:     5    5    37
Same line in table, but different translation tags
  WaitHere: JNB DataPort.1, WaitHere
TWO misses per loop!
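The clash can be reproduced with a small simulation of a 1-way (direct mapped) buffer. The field split follows the example on this slide; the function names and the choice of ten loop iterations are made up for the sketch.

```python
def split(addr: int) -> tuple[int, int, int]:
    """Split a 16 bit address into <z> (tag), <y> (table line) and <x> (offset)."""
    return addr >> 12, (addr >> 8) & 0xF, addr & 0xFF

def count_misses(accesses) -> int:
    lines = {}              # <y> -> tag currently stored in that table line
    misses = 0
    for addr in accesses:
        z, y, _ = split(addr)
        if lines.get(y) != z:
            misses += 1
            lines[y] = z    # the new translation overwrites the old one
    return misses

# Each loop pass fetches the instruction at 35E6h and reads the data at 5537h.
loop = [0x35E6, 0x5537] * 10
print(count_misses(loop), "misses in", len(loop), "accesses")
```

Both addresses select line 5 with different tags, so each access evicts the translation the other one needs and every access misses.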

N-way set associative lookaside buffers
• Reduce (but do not solve) tag clashes
[Figure: logical address with fields <y>, <x> and <p>. Tag tables and page tables (with access rights, a.r.) numbered 1 and 2 are read in parallel; the 'tag' comparisons feed the hit logic, which produces 'hit!' and drives the set selection mux that picks the physical page (<m> bits) and the access rights. Callout: same hit-rate as ‘Content Addressable’.]

Lookaside buffer replacement strategy
• With filled buffer, new translations replace old
  – With 1-way set associative: <y> bits fix choice!
• Best choice: remove one which will not be used
  – Difficult, but ‘Least Recently Used’ may be the same
  – LRU requires administration: small choice sets only
  – Used for N-way set associative lookaside buffers
• Another strategy: remove one at random
  – Works well with large choice sets (CAM buffers!)
  – Small probability of removing the wrong entry
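A sketch of LRU replacement inside an N-way set associative buffer, using the 4/4/8 bit address split from the tag clash example. The function name and the third address 75AAh are hypothetical; a Python list per set is an assumed stand-in for the LRU administration.

```python
def count_misses(accesses, ways: int = 2) -> int:
    sets = {}    # <y> -> list of tags, least recently used first
    misses = 0
    for addr in accesses:
        z, y = addr >> 12, (addr >> 8) & 0xF
        tags = sets.setdefault(y, [])
        if z in tags:
            tags.remove(z)       # hit: re-appended below as most recent
        else:
            misses += 1
            if len(tags) == ways:
                tags.pop(0)      # evict the least recently used tag
        tags.append(z)
    return misses

two_pages = [0x35E6, 0x5537] * 10
three_pages = [0x35E6, 0x5537, 0x75AA] * 5   # 75AAh: hypothetical third page

print(count_misses(two_pages, ways=1))    # every access misses
print(count_misses(two_pages, ways=2))    # only the first two accesses miss
print(count_misses(three_pages, ways=2))  # three clashing pages thrash again
```

The third case illustrates why N-way buffers reduce tag clashes without solving them: as soon as more than N pages share a table line, LRU evicts exactly the entry that is needed next.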

Segmented memory address translation
[Figure: the logical address consists of a segment part and an offset. The segment part selects an entry in the tables of segment bases, segment limits and access rights. The offset is compared with the limit ('<' or '>', depending on whether the segment is a 'stack') and gives 'error!' when out of bounds; base and offset form the physical address.]
• Segment table is in main memory!
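A minimal sketch of base and limit translation. The segment table contents, the rights strings and the exception types are assumptions for illustration, and the reversed limit check for stack segments is left out.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    base: int      # physical start address of the segment
    limit: int     # highest valid offset inside the segment
    rights: str    # allowed access types, e.g. "rw"

# Hypothetical segment table
segment_table = {
    1: Segment(base=0x4000, limit=0x0FFF, rights="rx"),
    2: Segment(base=0x9000, limit=0x01FF, rights="rw"),
}

def translate(segment: int, offset: int, access: str) -> int:
    entry = segment_table[segment]
    if offset > entry.limit:
        raise MemoryError("offset beyond segment limit")
    if access not in entry.rights:
        raise PermissionError(f"access '{access}' not allowed")
    return entry.base + offset

print(hex(translate(2, 0x10, "w")))
```

The boundary check and the rights check need only the one table entry for the segment, which is what makes protection simple in this model.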

Segmented translation speedup
• Processor uses only a few segments at once
  – Place currently used segment info in on-chip registers
  – Software decides which segments are loaded: no replacement strategy needed in hardware!
• Example: Intel 80386 uses 6 current segments
  – Code, stack and ‘default data’
  – Up to 3 ‘extra data’ segments referenced explicitly

Virtual memory (1)
• The logically addressable memory size can exceed the physical memory size
  – Common situation with multiple linear memory spaces
• No problem if the actually used amount of memory fits in physical memory
  – Rely on address translation to 'pack' the memory

Virtual memory (2)
• Memory in use > physical memory: problem
  – Hold part of used memory in physical memory
  – Store remainder somewhere else, for instance on a hard disk
• Keep this invisible to processor: 'virtual memory'
  – Hardware stops invalid memory access
  – Starts routine to move data into physical memory
  – Then re-tries the failed memory access, which may be in the middle of an instruction!

The 'program locality principle'
[Figure: three probability (p) plots of the next access relative to the last one. Instructions: last instruction address (x), peak at (x+1). Stack: last stack access (y), peaks at (y-1: PUSH) and (y+1: POP), with the mean stack frame size marked. Data: last data access (z), peak at (z+1: arrays, strings).]
• Consecutive accesses are generally not far apart
  – The 'working set' contains the active memory areas
  – Run at full speed if these are kept in real memory!

Virtual memory hardware support bits
These work for pages as well as segments
• Present bit: in memory if set, otherwise on disk
  – Processor aborts access if this bit is reset
• Accessed bit: set on each read or write access
  – Detect activity for determining the working set
• Written bit: set on each write access
  – No need to write back to disk if unchanged

The ‘working set - clock’ algorithm (1)
[Figure: pages/segments arranged in a circle, each with its Accessed (A) bit; some are present (P=1), others are not (P=0). An access to an entry with P=0 ("I need you!") triggers a 'swap in'; an entry with P=1 is selected for 'swap out'.]

The ‘working set - clock’ algorithm (2)
• Swap out writes only if Written bit set
• Swap in sets Accessed and Present, resets W
• This algorithm is often used (works very well)
  – Working set pages/segments set A bit a lot: they are not swapped out!
  – Fair swap out decisions, even under high system load
  – Will always find something to swap out (robust)
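The slides do not spell out how the clock hand moves, so the sketch below uses the usual textbook clock (second chance) scheme as an assumption: the hand clears the A bit of every present page it passes and stops at the first present page whose A bit is already clear. All names are hypothetical.

```python
class Page:
    def __init__(self, name: str):
        self.name = name
        self.present = True     # P bit
        self.accessed = True    # A bit
        self.written = False    # W bit

def choose_victim(pages, hand: int) -> tuple[int, int]:
    """Return (victim index, new hand position).

    Assumes at least one page is present; after one full sweep every
    present page has A == 0, so the search always terminates.
    """
    while True:
        page = pages[hand]
        nxt = (hand + 1) % len(pages)
        if page.present:
            if not page.accessed:
                return hand, nxt
            page.accessed = False    # give the page a second chance
        hand = nxt

def swap_out(page: Page, write_to_disk) -> None:
    if page.written:                 # unchanged pages need no write-back
        write_to_disk(page)
    page.present = False
    page.written = False

def swap_in(page: Page) -> None:
    page.present = True
    page.accessed = True
    page.written = False

pages = [Page(n) for n in "abc"]
pages[1].accessed = False            # 'b' has not been used recently
victim, hand = choose_victim(pages, 0)
```

Pages in the working set keep getting their A bit set again between sweeps, so the hand passes over them and they stay in memory.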

The fragmentation problem
[Figure, left: 16 K segmented memory from ‘0’ to ‘16’, filled with segments of different sizes and free gaps between them. Memory is fragmented outside segments: external fragmentation. 6 K free, but 4.5 K does not fit!!]
[Figure, right: 16 K paged memory from ‘0’ to ‘16’ with 1 K pages. Memory is fragmented inside pages: internal fragmentation. Unusable space inside pages!!]

Pages versus segments
• Fixed-size pages ease swapping to/from disk
• Segments provide more complete protection
• Intel 80386 uses segmenting AND paging
  – Protection based upon the segments (done first)
  – Virtual memory based upon paging (done last)
  – Two translation steps needed
• The P, A and W bits are offered in hardware, managing virtual memory is done in software!

Intel 80286 example: ‘segment selector’
Selector layout (16 bits): segment number (13 bits) | local/global (1 bit) | 'RPL' (2 bits)
• Global table with 8192 shared segments
• Task-local table with 8192 private segments
• 'Requested Privilege Level' allows lowering the protection level of a segment (towards PL 3)
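Decoding a selector is a matter of shifting and masking. The sketch assumes the x86 bit order, which matches the field order on the slide: segment number in bits 15..3, the local/global bit in bit 2 (1 = task-local table) and the RPL in bits 1..0. The function name is made up.

```python
def decode_selector(selector: int) -> tuple[int, bool, int]:
    """Return (segment number, uses task-local table, requested privilege level)."""
    index = (selector >> 3) & 0x1FFF     # 13 bits: 8192 segments per table
    local = bool((selector >> 2) & 1)    # 1 = task-local table, 0 = global table
    rpl = selector & 0b11                # 2 bits: PL 0 to PL 3
    return index, local, rpl

print(decode_selector(0x002F))   # segment 5, task-local table, RPL 3
print(decode_selector(0x0008))   # segment 1, global table, RPL 0
```

The 13 bit segment number is what limits each table to 8192 segments.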

Intel 80286 memory segment descriptor
Descriptor layout (64 bits): ‘base’ (24 bits, location) | ‘limit’ (16 bits, size) | Present and Accessed bits (virtual memory; no Written bit!) | 'PL' (2 bits) | Type & access rights
Segment types:
  – CODE: readable, 'conforming' (for libraries)
  – DATA: writable, stack (reverses limit checking)
  – TASK STATE (registers, 4 stack pointers, active segs.)
  – LOCAL TABLE (only in global segment table)

Intel 80286 calls and jumps
• Within same segment only needs offset
• Other segment at same PL needs offset & selector
• To higher protected code (lower PL) uses 'call gate'
  – These are stored in segment tables (‘pseudo-segment’)
  – CALL instruction points to this ‘pseudo-segment’ but the offset in instruction is overruled by call gate
  – Data copied automatically between stacks
Call gate layout: code segment selector | offset (16 bits) | stack copy block size | Present (1 bit) | 'PL' (2 bits)

Intel 80286 traps and interrupts
• Use 256 entry 'interrupt descriptor table'
  – Which contains ‘trap gates’ and ‘interrupt gates’
  – These are call gates without stack data copying
  – An interrupt gate disables interrupts automatically
Gate layout: code segment selector | offset (16 bits) | Present (1 bit) | 'PL' (2 bits)

Intel 80286 I/O protection
• Global 'I/O Privilege Level' indicates the highest PL value at which ANY I/O is allowed
  – Higher PL level code traps on IN & OUT instructions
• Each task has a bitmap in the task state segment
  – Each bit corresponds with an I/O port
  – Accessing I/O port with bit at 0 generates trap
  – Size of bitmap variable, undefined ports always trap

Intel 80286 multitasking support
• 'Task state' segments store task information
  – Special register points to active task state segment
• Task switch with JUMP through a 'task gate' (PL 0 only: kernel!)
Task gate layout: task state segment selector | Present (1 bit) | 'PL' (2 bits)
  1) Save register set in active task state segment
  2) Get address of new task state and declare it active
  3) Load register set from this segment, including PC
  4) Restart program execution for the new task

The old-fashioned way: ‘windowing’
[Figure: the address space from '0' to '-1' contains a window. A window selection register drives a mux that maps one of 'N' windows (0, 1, ..., N-2, N-1) into that window. Callout: ‘Expanded Memory’.]
• Selection register is normally an output port
  – Window selection is part of memory management
  – Should be managed by operating system!

Built-in windowing
• Windowing logic can be built inside memory chips
  – Standard stuff for all kinds of (Flash) ROMs
  – Can also save a lot of address pins!
[Figure: a Read Only Memory 'core' whose ROM address is formed from a page register and the address input. A read uses the address input; a write loads the page register; the core delivers the data.]

‘Memory mapper’ address extension
[Figure: the 16 bit CPU address is split into 4 bits and 12 bits. The 4 bits select one of 16 entries of 12 bits each; the selected entry plus the 12 bit remainder form a 24 bit memory address.]
  – The 74LS610 provides 16 windows of 4096 bytes, each of these can select from 4096 of these windows in physical memory (total 16 million bytes!)
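The mapping in the figure can be sketched in a few lines. This models only the address arithmetic, not the 74LS610's actual programming interface; `map_registers`, `set_window` and `map_address` are hypothetical names.

```python
map_registers = [0] * 16     # 16 entries of 12 bits each

def set_window(window: int, physical_block: int) -> None:
    """Point one of the 16 CPU windows at a 4096 byte block of physical memory."""
    map_registers[window] = physical_block & 0xFFF

def map_address(cpu_address: int) -> int:
    window = (cpu_address >> 12) & 0xF    # upper 4 bits select the entry
    offset = cpu_address & 0xFFF          # lower 12 bits pass through
    return (map_registers[window] << 12) | offset

set_window(0xA, 0x123)
print(hex(map_address(0xA456)))
```

With 12 bits per entry there are 4096 possible blocks of 4096 bytes, which gives the 16 million byte total on the slide.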