CS 194-24 Advanced Operating Systems Structures and Implementation
Lecture 4: OS Structure (Con’t), Modern Architecture
February 6th, 2013
Prof. John Kubiatowicz
http://inst.eecs.berkeley.edu/~cs194-24
Goals for Today
• OS Structure (Con’t): The Linux Kernel
• Modern Computer Architecture
Interactive is important! Ask Questions!
Note: Some slides and/or pictures in the following are adapted from slides © 2013
2/6/13 Kubiatowicz CS 194-24 ©UCB Fall 2013 Lec 4.2
Recall: OS Resources – at the center of it all!
• What do modern OSs do?
  – Why all of these pieces running together?
  – Is this complexity necessary?
• Control of Resources
  – Access/No Access/Partial Access
    » Check every access to see if it is allowed
  – Resource Multiplexing
    » When multiple valid requests occur at same time – how to multiplex access?
    » What fraction of resource can requester get?
  – Performance Isolation
    » Can requests from one entity prevent requests from another?
• What or Who is a requester?
  – Process? User? Public Key?
  – Think of this as a “Principal”
[Figure: Independent Requesters passing through Access Control and Multiplexing]
Recall: Microkernel Structure
[Figure ©Wikipedia: Monolithic Kernel vs. Microkernel]
• Moves as much as possible from the kernel into “user” space
  – Small core OS running at kernel level
  – OS Services built from many independent user-level processes
  – Communication between modules with message passing
• Benefits:
  – Easier to extend a microkernel
  – Easier to port OS to new architectures
  – More reliable (less code is running in kernel mode)
  – Fault Isolation (parts of kernel protected from other parts)
  – More secure
• Detriments:
  – Performance overhead severe for naïve implementation
Recall: Modules-based Structure
• Most modern operating systems implement modules
  – Uses object-oriented approach
  – Each core component is separate
  – Each talks to the others over known interfaces
  – Each is loadable as needed within the kernel
• Overall, similar to layers but more flexible
Recall: ExoKernel
• Provide extremely thin layer to present hardware resources directly to users
  – As little abstraction as possible
  – Only Protection and Multiplexing of resources
• On top of Exokernel is the LibraryOS which provides much of the traditional functionality of OS in Library
• Low-level abstraction layer
Concurrency
• “Thread” of execution
  – Independent Fetch/Decode/Execute loop
  – Operating in some Address space
• Uniprogramming: one thread at a time
  – MS/DOS, early Macintosh, Batch processing
  – Easier for operating system builder
  – Get rid of concurrency by defining it away
  – Does this make sense for personal computers?
• Multiprogramming: more than one thread at a time
  – Multics, UNIX/Linux, OS/2, Windows NT/2000/XP, Mac OS X
  – Often called “multitasking”, but multitasking has other meanings (talk about this later)
• ManyCore Multiprogramming, right?
The Basic Problem of Concurrency
• The basic problem of concurrency involves resources:
  – Hardware: single CPU, single DRAM, single I/O devices
  – Multiprogramming API: users think they have exclusive access to shared resources
• OS has to coordinate all activity
  – Multiple users, I/O interrupts, …
  – How can it keep all these things straight?
• Basic Idea: Use Virtual Machine abstraction
  – Decompose hard problem into simpler ones
  – Abstract the notion of an executing program
  – Then, worry about multiplexing these abstract machines
• Dijkstra did this for the “THE system”
  – Few thousand lines vs 1 million lines in OS/360 (1K bugs)
Recall (61C): What happens during execution?
• Execution sequence:
  – Fetch Instruction at PC
  – Decode
  – Execute (possibly using registers)
  – Write results to registers/mem
  – PC = Next Instruction(PC)
  – Repeat
[Figure: processor state (registers R0 … R31, F0 … F30, PC) running a Fetch/Exec loop over memory; memory spans Addr 0 to Addr 2^32-1 and holds Inst0 … Inst237 followed by Data0, Data1, …; the PC steps through the instructions]
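The execution sequence above can be sketched as a tiny interpreter. This is only an illustration: the four-register machine, the tuple instruction encoding, and the opcode names (`li`, `add`, `jnz`, `halt`) are hypothetical and do not correspond to MIPS, x86, or anything else in the lecture.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCPU:
    """Hypothetical machine state: four integer registers plus a PC."""
    regs: list = field(default_factory=lambda: [0] * 4)
    pc: int = 0
    halted: bool = False


def step(cpu, memory):
    """Run one fetch/decode/execute iteration."""
    op, a, b = memory[cpu.pc]            # fetch the instruction at PC
    next_pc = cpu.pc + 1                 # default: fall through
    if op == "li":                       # decode, then execute
        cpu.regs[a] = b                  # write result to a register
    elif op == "add":
        cpu.regs[a] = cpu.regs[a] + cpu.regs[b]
    elif op == "jnz":                    # jump to b when regs[a] != 0
        if cpu.regs[a] != 0:
            next_pc = b
    elif op == "halt":
        cpu.halted = True
    else:
        raise ValueError(f"unknown opcode {op!r}")
    cpu.pc = next_pc                     # PC = Next Instruction(PC)


def run(cpu, memory):
    """Repeat until the program halts."""
    while not cpu.halted:
        step(cpu, memory)
    return cpu


# Adds 3 + 2 + 1 into r0, using r1 as a down-counter.
PROGRAM = [
    ("li", 0, 0),     # r0 = 0
    ("li", 1, 3),     # r1 = 3
    ("li", 2, -1),    # r2 = -1
    ("add", 0, 1),    # r0 += r1
    ("add", 1, 2),    # r1 -= 1
    ("jnz", 1, 3),    # loop back while r1 != 0
    ("halt", 0, 0),
]
```

Calling `run(ToyCPU(), PROGRAM)` leaves the sum in `regs[0]`. Everything the loop touches (registers and PC) is exactly the state a thread switch has to preserve.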
How can we give the illusion of multiple processors?
[Figure: virtual CPU1, CPU2, CPU3 on top of Shared Memory; along the Time axis the single processor runs CPU1, CPU2, CPU3, CPU1, CPU2]
• Assume a single processor. How do we provide the illusion of multiple processors?
  – Multiplex in time!
• Each virtual “CPU” needs a structure to hold:
  – Program Counter (PC), Stack Pointer (SP)
  – Registers (Integer, Floating point, others…?)
• How do we switch from one CPU to the next?
  – Save PC, SP, and registers in current state block
  – Load PC, SP, and registers from new state block
• What triggers switch?
  – Timer, voluntary yield, I/O, other things
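A minimal sketch of the save/load switch described above, assuming a made-up machine with two registers and a single `inc` instruction. The state-block layout, the `quantum` parameter (standing in for the timer), and all function names are hypothetical.

```python
def make_state_block(program):
    """Saved state of one virtual CPU: PC, SP, and registers."""
    return {"pc": 0, "sp": 0, "regs": [0, 0], "program": program, "done": False}


class PhysicalCPU:
    """The single real processor that the virtual CPUs take turns on."""

    def __init__(self):
        self.pc = 0
        self.sp = 0
        self.regs = [0, 0]

    def load(self, block):
        # Load PC, SP, and registers from the new state block.
        self.pc, self.sp = block["pc"], block["sp"]
        self.regs = list(block["regs"])

    def save(self, block):
        # Save PC, SP, and registers into the current state block.
        block["pc"], block["sp"] = self.pc, self.sp
        block["regs"] = list(self.regs)


def run_quantum(cpu, program, quantum):
    """Execute at most `quantum` instructions; return True when finished."""
    executed = 0
    while executed < quantum and cpu.pc < len(program):
        op, reg = program[cpu.pc]
        if op == "inc":
            cpu.regs[reg] += 1
        cpu.pc += 1
        executed += 1
    return cpu.pc >= len(program)


def multiplex(blocks, quantum=2):
    """Round-robin the virtual CPUs; return the order in which they ran."""
    cpu = PhysicalCPU()
    trace = []
    while not all(b["done"] for b in blocks):
        for index, block in enumerate(blocks):
            if block["done"]:
                continue
            cpu.load(block)
            block["done"] = run_quantum(cpu, block["program"], quantum)
            cpu.save(block)
            trace.append(index)
    return trace
```

With one three-instruction program and one five-instruction program, `multiplex` interleaves them two instructions at a time, and each state block ends up holding only its own register values even though both ran on the same `PhysicalCPU`.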
Properties of this simple multiprogramming technique
• All virtual CPUs share same non-CPU resources
  – I/O devices the same
  – Memory the same
• Consequence of sharing:
  – Each thread can access the data of every other thread (good for sharing, bad for protection)
  – Threads can share instructions (good for sharing, bad for protection)
  – Can threads overwrite OS functions?
• This (unprotected) model common in:
  – Embedded applications
  – Windows 3.1/Macintosh (switch only with yield)
  – Windows 95–ME? (switch with both yield and timer)
What needs to be saved in Modern X86?
[Figures: 64-bit Register Set; Traditional 32-bit subset; EFLAGS Register]
Modern Technique: SMT/Hyperthreading
• Hardware technique
  – Exploit natural properties of superscalar processors to provide illusion of multiple processors
  – Higher utilization of processor resources
• Can schedule each thread as if it were a separate CPU
  – However, not linear speedup!
  – If you have a multiprocessor, should schedule each processor first
• Original technique called “Simultaneous Multithreading”
  – See http://www.cs.washington.edu/research/smt/
  – Alpha, SPARC, Pentium 4 (“Hyperthreading”), Power 5
Chip-scale features of SandyBridge
• Significant pieces:
  – Four OOO cores with Hyperthreads
    » New Advanced Vector eXtensions (256-bit FP)
    » AES instructions
    » Instructions to help with Galois-Field mult
    » 4-ops/cycle
  – Integrated GPU
  – System Agent (Memory and Fast I/O)
  – Shared L3 cache divided in 4 banks
  – On-chip Ring bus network
    » Both coherent and non-coherent transactions
    » High-BW access to L3 Cache
• Integrated I/O
  – Integrated memory controller (IMC)
    » Two independent channels of DDR3 DRAM
  – High-speed PCI-Express (for Graphics cards)
  – DMI Connection to SouthBridge (PCH)
How to protect threads from one another?
• Need three important things:
  1. Protection of memory
     » Every task does not have access to all memory
  2. Protection of I/O devices
     » Every task does not have access to every device
  3. Protection of Access to Processor: Preemptive switching from task to task
     » Use of timer
     » Must not be possible to disable timer from user code
Recall: Program’s Address Space
• Address space: the set of accessible addresses + state associated with them
  – For a 32-bit processor there are 2^32 = 4 billion addresses
• What happens when you read or write to an address?
  – Perhaps Nothing
  – Perhaps acts like regular memory
  – Perhaps ignores writes
  – Perhaps causes I/O operation
    » (Memory-mapped I/O)
  – Perhaps causes exception (fault)
[Figure: Program Address Space]
Providing Illusion of Separate Address Space: Load new Translation Map on Switch
[Figure: Prog 1 Virtual Address Space 1 (Code, Data, Heap, Stack) and Prog 2 Virtual Address Space 2 (Code, Data, Heap, Stack) are mapped through Translation Map 1 and Translation Map 2 onto one Physical Address Space holding Data 2, Stack 1, Heap 1, Code 1, Stack 2, Data 1, Heap 2, Code 2, OS code, OS data, OS heap & Stacks]
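One way to picture "load new translation map on switch" is a per-program dictionary from virtual page number to physical page number. The page size, the two example maps, and the `MMU` class below are assumptions made for illustration; real hardware uses page tables or segments rather than Python dictionaries.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class TranslationFault(Exception):
    """Raised when a virtual address has no mapping in the current map."""


def translate(translation_map, vaddr):
    """Map a virtual address to a physical address through one map."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in translation_map:
        raise TranslationFault(hex(vaddr))
    return translation_map[vpn] * PAGE_SIZE + offset


# Hypothetical maps: virtual page number -> physical page number.
MAP_1 = {0: 7, 1: 3}   # Prog 1
MAP_2 = {0: 2, 1: 9}   # Prog 2


class MMU:
    """Holds whichever translation map belongs to the running program."""

    def __init__(self):
        self.current_map = None

    def switch_to(self, translation_map):
        # The context switch installs the new program's map.
        self.current_map = translation_map

    def access(self, vaddr):
        return translate(self.current_map, vaddr)
```

The same virtual address lands on different physical pages depending on which map is loaded, and an unmapped page faults instead of reaching another program's memory.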
X86 Memory model with segmentation
The Six x86 Segment Registers
• CS - Code Segment
• SS - Stack Segment
  – “Stack segments are data segments which must be read/write segments. Loading the SS register with a segment selector for a nonwritable data segment generates a general-protection exception (#GP)”
• DS - Data Segment
• ES/FS/GS - Extra (usually data) segment registers
  – FS and GS used for thread-local storage/by glibc
• The “hidden part” is like a cache so that segment descriptor info doesn’t have to be looked up each time.
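The "hidden part" behaviour and the SS rule quoted above can be sketched with a heavily simplified model. It assumes a descriptor has only a base, a limit, and a writable flag, treats selectors as plain dictionary keys, and raises one fault type for every violation. Real x86 descriptors and selectors carry more fields and distinguish more fault classes, so read the names below as hypothetical.

```python
from dataclasses import dataclass


class GeneralProtectionFault(Exception):
    """Stand-in for #GP; this sketch uses it for every violation."""


@dataclass(frozen=True)
class Descriptor:
    base: int        # linear address where the segment starts
    limit: int       # highest valid offset within the segment
    writable: bool


class SegmentRegister:
    def __init__(self, name):
        self.name = name
        self.selector = None   # visible part
        self.cached = None     # hidden part: copy of the descriptor


def load_segment(reg, selector, table):
    """Load a selector and cache its descriptor in the hidden part."""
    desc = table[selector]
    if reg.name == "SS" and not desc.writable:
        raise GeneralProtectionFault("SS needs a writable data segment")
    reg.selector = selector
    reg.cached = desc


def linear_address(reg, offset):
    """Translate using only the cached descriptor; no table lookup."""
    desc = reg.cached
    if offset > desc.limit:
        raise GeneralProtectionFault(f"offset {offset:#x} beyond limit")
    return desc.base + offset
```

Because `linear_address` reads only `reg.cached`, changing the table after a load has no effect until the register is reloaded, which is the caching behaviour the last bullet describes.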
UNIX Process
• Process: Operating system abstraction to represent what is needed to run a single program
  – Originally: a single, sequential stream of execution in its own address space
  – Modern Process: multiple threads in same address space!
• Two parts:
  – Sequential Program Execution Streams
    » Code executed as one or more sequential streams of execution (threads)
    » Each thread includes its own state of CPU registers
    » Threads either multiplexed in software (OS) or hardware (simultaneous multithreading/hyperthreading)
  – Protected Resources:
    » Main Memory State (contents of Address Space)
    » I/O state (i.e. file descriptors)
• This is a virtual machine abstraction
  – Some might say that the only point of an OS is to support a clean Process abstraction
How do we multiplex processes?
• The current state of process held in a process control block (PCB):
  – This is a “snapshot” of the execution and protection environment
  – Only one PCB active at a time
• Give out CPU time to different processes (Scheduling):
  – Only one process “running” at a time
  – Give more time to important processes
• Give pieces of resources to different processes (Protection):
  – Controlled access to non-CPU resources
  – Sample mechanisms:
    » Memory Mapping: Give each process their own address space
    » Kernel/User duality: Arbitrary multiplexing of I/O through system calls
[Figure: Process Control Block]
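As a rough illustration of scheduling over PCBs, the sketch below runs a weighted round-robin in which a process's `priority` is the number of time units it receives per turn. The `PCB` fields and the policy are assumptions chosen for brevity; they are not the Linux `task_struct` or its scheduler.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PCB:
    """Hypothetical process control block: a snapshot of one process."""
    pid: int
    priority: int                 # time units granted per turn (>= 1)
    remaining: int                # time units of work still to do
    state: str = "ready"
    registers: dict = field(default_factory=dict)      # saved CPU state
    address_space: dict = field(default_factory=dict)  # translation map
    cpu_time: int = 0


def schedule(pcbs):
    """Weighted round-robin; return the pid that ran in each time unit."""
    ready = deque(pcbs)
    timeline = []
    while ready:
        pcb = ready.popleft()
        pcb.state = "running"          # only one process runs at a time
        time_slice = min(pcb.priority, pcb.remaining)
        pcb.remaining -= time_slice
        pcb.cpu_time += time_slice
        timeline.extend([pcb.pid] * time_slice)
        if pcb.remaining > 0:
            pcb.state = "ready"        # preempted: back of the queue
            ready.append(pcb)
        else:
            pcb.state = "terminated"
    return timeline
```

A process with `priority=2` gets two consecutive time units per turn while a `priority=1` process gets one, which is one simple reading of "give more time to important processes".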
Modern “Lightweight” Process with Threads
• Thread: a sequential execution stream within process (Sometimes called a “Lightweight process”)
  – Process still contains a single Address Space
  – No protection between threads
• Multithreading: a single program made up of a number of different concurrent activities
  – Sometimes called multitasking, as in Ada…
• Why separate the concept of a thread from that of a process?
  – Discuss the “thread” part of a process (concurrency)
  – Separate from the “address space” (Protection)
  – Heavyweight Process = Process with one thread
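A small sketch of what "single address space, no protection between threads" means in practice: every thread below updates the same dictionary directly, and a lock is the only thing that keeps the updates consistent. The function name and parameters are illustrative.

```python
import threading


def run_threads(num_threads=4, increments=1000):
    """Start several threads that all update one shared counter."""
    shared = {"counter": 0}          # lives in the one address space
    lock = threading.Lock()

    def worker():
        for _ in range(increments):
            with lock:               # coordination is up to the threads
                shared["counter"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared["counter"]
```

Nothing stops a thread from writing `shared` without taking the lock. Separate processes would need an explicit mechanism such as shared memory or messages to see each other's data at all.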
Single and Multithreaded Processes
• Threads encapsulate concurrency: “Active” component
• Address spaces encapsulate protection: “Passive” part
  – Keeps buggy program from trashing the system
• Why have multiple threads per address space?
Preview: System-Level Control of x86
• Full support for Process Abstraction involves a lot of system-level state
  – This is state that can only be accessed in kernel mode!
  – We will be talking about a number of these pieces as we go through the term…
• There is a tradeoff between amount of system state and cost of switching from thread to thread!
Additional system state for I/O
• Platform Controller Hub
  – Used to be “SouthBridge,” but no “NorthBridge” now
  – Connected to processor with proprietary bus
    » Direct Media Interface
  – Code name “Cougar Point” for SandyBridge processors
• Types of I/O on PCH:
  – USB, Ethernet
  – Audio, BIOS support
  – More PCI Express (lower speed than on Processor)
  – SATA (for Disks)
[Figure: SandyBridge System Configuration]
Linux Structure
Layout of Linux Sources
• Layout of basic linux sources:
    kubitron@kubi(16)% ls
    arch/           drivers/   Kbuild       modules.builtin  samples/    usr/
    block/          firmware/  kernel/      modules.order    scripts/    virt/
    COPYING         fs/        lib/         Module.symvers   security/   vmlinux*
    CREDITS         include/   MAINTAINERS  net/             sound/      vmlinux.o
    crypto/         init/      Makefile     README           System.map
    Documentation/  ipc/       mm/          REPORTING-BUGS   tools/
    kubitron@kubi(17)%
• Specific Directories:
  – arch:          Architecture-specific source
  – block:         Block I/O layer
  – crypto:        Crypto API
  – Documentation: Kernel source Documentation
  – drivers:       Device drivers
  – firmware:      Device firmware needed to use certain drivers
  – fs:            The VFS and individual filesystems
  – include:       Kernel headers
  – init:          Kernel boot and initialization
  – ipc:           Interprocess Communication code
  – kernel:        Core subsystems, such as the scheduler
  – lib:           Helper routines
  – mm:            Memory management subsystem and the vm
  – net:           Networking subsystem
  – samples:       Sample, demonstrative code
  – scripts:       Scripts used to build the kernel
  – security:      Linux Security Module
  – sound:         Sound subsystem
  – usr:           Early user-space code (called initramfs)
  – tools:         Tools helpful for developing Linux
  – virt:          Virtualization infrastructure
System Calls
• Challenge: Interaction Despite Isolation
  – How to isolate processes and their resources…
    » While still permitting them to request help from the kernel
    » Letting them interact with resources while maintaining usage policies such as security, QoS, etc
  – Letting processes interact with one another in a controlled way
    » Through messages, shared memory, etc
• Enter the System Call interface
  – Layer between the hardware and user-space processes
  – Programming interface to the services provided by the OS
• Mostly accessed by programs via a high-level Application Program Interface (API) rather than directly
  – Get at system calls by linking with libraries in glibc
    » Call to printf() → printf() in the C library → write() system call
• Three most common APIs are:
  – Win32 API for Windows
  – POSIX API for POSIX-based systems (including virtually all versions of UNIX, Linux, and Mac OS X)
  – Java API for the Java virtual machine (JVM)
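The library-to-system-call layering can be approximated in Python, where a buffered file object plays the role of `printf()` and `os.write` is a thin wrapper over the `write()` system call. The helper names below are hypothetical, and the pipe exists only so the result can be read back.

```python
import os


def write_via_library(fd, text):
    """High-level path: a buffered stream, loosely analogous to printf()."""
    # os.dup keeps the caller's descriptor open after the stream closes.
    with os.fdopen(os.dup(fd), "w") as stream:
        stream.write(text)      # lands in a user-space buffer first
    # Leaving the with-block flushes the buffer, which is when the
    # underlying write() system call is actually issued.


def write_via_syscall(fd, data):
    """Low-level path: one call maps directly onto write()."""
    return os.write(fd, data)


def demo():
    """Send text down both paths through a pipe and read it back."""
    read_fd, write_fd = os.pipe()
    try:
        write_via_library(write_fd, "hello ")
        write_via_syscall(write_fd, b"world\n")
    finally:
        os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        return reader.read()
```

Both paths end in the same kernel entry point. The library path adds buffering and formatting in user space before crossing the boundary.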
Example of System Call usage
• System call sequence to copy the contents of one file to another file:
[Figure: system call sequence for a file copy]
• Many crossings of the User/Kernel boundary!
  – The cost of traversing this boundary can be high
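A sketch of such a copy using Python's thin wrappers over `open`, `read`, `write`, and `close`, counting each call as one user/kernel crossing. It assumes a POSIX system. The 4096-byte buffer, the file names, and the `demo_copy` helper are illustrative choices, and the count ignores whatever system calls the interpreter makes on its own.

```python
import os
import tempfile


def copy_file(src, dst, bufsize=4096):
    """Copy src to dst with raw wrappers; return how many calls were made."""
    calls = 0
    in_fd = os.open(src, os.O_RDONLY)
    calls += 1
    try:
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        calls += 1
        try:
            while True:
                chunk = os.read(in_fd, bufsize)
                calls += 1
                if not chunk:                 # empty read means end of file
                    break
                view = memoryview(chunk)
                while len(view) > 0:          # handle short writes
                    written = os.write(out_fd, view)
                    calls += 1
                    view = view[written:]
        finally:
            os.close(out_fd)
            calls += 1
    finally:
        os.close(in_fd)
        calls += 1
    return calls


def demo_copy(size=10000):
    """Copy a temporary file; return (contents_match, calls_made)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "in.dat")
        dst = os.path.join(tmp, "out.dat")
        payload = (bytes(range(256)) * (size // 256 + 1))[:size]
        with open(src, "wb") as handle:
            handle.write(payload)
        calls = copy_file(src, dst)
        with open(dst, "rb") as handle:
            return handle.read() == payload, calls
```

Even a 10,000-byte copy needs two opens, several reads and writes, and two closes. A larger `bufsize` cuts the number of crossings, which is one way to reduce the boundary cost the slide mentions.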
Example: Use strace to trace syscalls
• prompt% strace wc production.log
    execve("/usr/bin/wc", ["wc", "production.log"], [/* 52 vars */]) = 0
    brk(0) = 0x1987000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff24b8f7000
    access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY) = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=137151, ...}) = 0
    mmap(NULL, 137151, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ff24b8d5000
    close(3) = 0
    open("/lib64/libc.so.6", O_RDONLY) = 3
    read(3, "177 ELF2113 3 >