Operating Systems Practical Session 1: System Calls


A few administrative notes…
• Course homepage: http://www.cs.bgu.ac.il/~os172/Main
• Assignments:
  – Extending xv6 (a pedagogical OS)
  – Submission in pairs
  – Frontal checking:
    1. Assume the grader may ask anything
    2. Must register to exactly one checking session

Operating System
An operating system (OS) is system software that manages computer hardware and software resources and provides common services for computer programs.

Main Components for today
• Process – Describes an execution of a program and all of its requirements
• Kernel – The heart of the OS; manages processes and other resources
• System Calls – Means of communication between the process and the kernel

Process
• Kernel space: runs kernel code. Has: stack, heap, system-wide view, direct hardware access.
• User space: runs user code. Has: stack, heap.
• System calls sit between the two.

System Calls
• A system call is an interface between a user application and a service provided by the operating system (or kernel).
• These can be roughly grouped into five major categories:
  1. Process control (e.g. create/terminate process)
  2. File management (e.g. read, write)
  3. Device management (e.g. logically attach a device)
  4. Information maintenance (e.g. set time or date)
  5. Communications (e.g. send messages)

System Calls - motivation
• A process is not supposed to have direct access to the hardware or the kernel. It can’t access kernel memory or call kernel functions.
• This is strictly enforced (‘protected mode’) for good reasons. With unrestricted access, a process could:
  – Jeopardize other running processes.
  – Cause physical damage to devices.
  – Alter system behavior.
• The system call mechanism provides a safe way to request specific kernel operations.

Jumping from user space to kernel space
• A process running in user space cannot run code or access data structures in kernel space.
• On the x86 architecture, a process commonly jumps to kernel space by using interrupts.
• When jumping to kernel space, the kernel must store a “backup” of the process’s current execution state so that it can resume the execution later. This backup is referred to as a trapframe.

System Calls - interface
• Calls are usually made with C/C++ library functions:
  1. User application (user space): calls getpid()
  2. C library (user space): loads arguments, sets eax = _NR_getpid, switches to kernel mode (int 0x80)
  3. Kernel, system call entry (kernel space): calls Sys_Call_table[eax]
  4. Kernel: sys_getpid() runs and returns
  5. Kernel: syscall_exit, resume_userspace
  6. C library: returns to the user application
Remark: Invoking int 0x80 is common, although newer techniques for “faster” control transfer (SYSCALL/SYSRET) are provided by both AMD’s and Intel’s architectures.
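The library-wrapper step above can be bypassed on Linux with the generic syscall() function, which takes the call number directly. The following is a minimal sketch under that assumption (Linux with glibc); raw_getpid is a hypothetical helper name, not part of any library.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>  /* SYS_getpid */
#include <sys/types.h>    /* pid_t */
#include <unistd.h>       /* getpid(), syscall() */

/* Ask the kernel for our PID by call number, skipping the dedicated
 * getpid() wrapper. raw_getpid is a hypothetical name for this sketch. */
pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Both paths should end in the same kernel handler, so raw_getpid() and getpid() are expected to agree.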

XV6 CODE

System Calls - interface: syscall.h
    // System call numbers
    #define SYS_fork    1
    #define SYS_exit    2
    #define SYS_wait    3
    #define SYS_pipe    4
    #define SYS_read    5
    #define SYS_kill    6
    #define SYS_exec    7
    #define SYS_fstat   8
    #define SYS_chdir   9
    #define SYS_dup    10
    #define SYS_getpid 11
    #define SYS_sbrk   12
    #define SYS_sleep  13
    #define SYS_uptime 14
    #define SYS_open   15
    #define SYS_write  16
    #define SYS_mknod  17
    #define SYS_unlink 18
    #define SYS_link   19
    #define SYS_mkdir  20
    #define SYS_close  21

System Calls - interface: usys.S
    #include "syscall.h"
    #include "traps.h"

    #define SYSCALL(name) \
      .globl name; \
      name: \
        movl $SYS_ ## name, %eax; \
        int $T_SYSCALL; \
        ret

    SYSCALL(fork)
    SYSCALL(exit)
    SYSCALL(wait)
    SYSCALL(pipe)
    SYSCALL(read)
    SYSCALL(write)
    SYSCALL(close)
    SYSCALL(kill)
    SYSCALL(exec)
    SYSCALL(open)
    SYSCALL(mknod)
    SYSCALL(unlink)
    SYSCALL(fstat)
    SYSCALL(link)
    SYSCALL(mkdir)
    SYSCALL(chdir)
    SYSCALL(dup)
    SYSCALL(getpid)
    SYSCALL(sbrk)
    SYSCALL(sleep)

For example, SYSCALL(fork) expands to:
    .globl fork;
    fork:
      movl $SYS_fork, %eax;
      int $T_SYSCALL;
      ret

which, after the constants are substituted, becomes:
    .globl fork;
    fork:
      movl $1, %eax;
      int $64;
      ret

System Calls - interface: trapasm.S
    .globl alltraps
    alltraps:
      # Build trap frame.
      pushl %ds
      pushl %es
      pushl %fs
      pushl %gs
      pushal
      . . .
      # Call trap(tf), where tf=%esp
      pushl %esp
      call trap
      . . .

System Calls - interface: trap.c and syscall.c

trap.c:
    void
    trap(struct trapframe *tf)
    {
      if(tf->trapno == T_SYSCALL){
        if(proc->killed)
          exit();
        proc->tf = tf;
        syscall();
        if(proc->killed)
          exit();
        return;
      }
      . . .
    }

syscall.c:
    static int (*syscalls[])(void) = {
      [SYS_fork]  sys_fork,
      [SYS_exit]  sys_exit,
      . . .
      [SYS_close] sys_close,
    };

    void
    syscall(void)
    {
      int num;

      num = proc->tf->eax;
      if(num > 0 && num < NELEM(syscalls) && syscalls[num]) {
        proc->tf->eax = syscalls[num]();
      } else {
        cprintf("%d %s: unknown sys call %d\n",
                proc->pid, proc->name, num);
        proc->tf->eax = -1;
      }
    }
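The table lookup in syscall() can be tried out in user space. The sketch below mirrors the bounds check and the NULL check with a small function-pointer table; all demo_* names, the DEMO_* numbers, and the handlers' return values are invented for illustration and are not xv6 code.

```c
#include <stddef.h>

/* Hypothetical call numbers for this sketch (not the xv6 values). */
enum { DEMO_getpid = 1, DEMO_uptime = 2 };

#define NELEM(x) (sizeof(x) / sizeof((x)[0]))

/* Stand-in handlers that return fixed values. */
static int demo_getpid(void) { return 42; }
static int demo_uptime(void) { return 1000; }

/* Indexed by call number; slot 0 is left NULL. */
static int (*demo_syscalls[])(void) = {
    [DEMO_getpid] = demo_getpid,
    [DEMO_uptime] = demo_uptime,
};

/* Same validity test as syscall(): positive, inside the table, and
 * a non-NULL entry. Returns the handler's result, or -1 otherwise. */
int demo_dispatch(int num)
{
    if (num > 0 && (size_t)num < NELEM(demo_syscalls) && demo_syscalls[num])
        return demo_syscalls[num]();
    return -1;
}
```

In xv6 the number arrives in the saved eax of the trapframe and the result is written back to the same field; here a plain argument and return value play those roles.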

System Calls - interface: trapasm.S (return path)
      . . .
      addl $4, %esp

      # Return falls through to trapret...
    .globl trapret
    trapret:
      popal
      popl %gs
      popl %fs
      popl %es
      popl %ds
      addl $0x8, %esp  # trapno and errcode
      iret

System Calls – tips
• Kernel behavior can be enhanced by altering the system calls themselves. Imagine we wish to write a message (or add a log entry) whenever a specific user opens a file. We can replace the open system call with our own open function and load it into the kernel (this needs administrative rights). Now all “open” requests pass through our function.
• We can examine which system calls a program makes by invoking strace <arguments>.

Process Control
[Figure: kernel space tracking four processes: Proc 1 (running), Proc 2 (sleep), Proc 3 (ready), Proc 4 (ready)]

Process Control Block
    enum procstate { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };

    // Per-process state
    struct proc {
      uint sz;                     // Size of process memory (bytes)
      pde_t* pgdir;                // Page table
      char *kstack;                // Bottom of kernel stack for this process
      enum procstate state;        // Process state
      int pid;                     // Process ID
      struct proc *parent;         // Parent process
      struct trapframe *tf;        // Trap frame for current syscall
      struct context *context;     // swtch() here to run process
      void *chan;                  // If non-zero, sleeping on chan
      int killed;                  // If non-zero, have been killed
      struct file *ofile[NOFILE];  // Open files
      struct inode *cwd;           // Current directory
      char name[16];               // Process name (debugging)
    };

THE KILL SYSTEM CALL (XV6)

    /*** sysproc.c ***/
    int
    sys_kill(void)
    {
      int pid;

      if(argint(0, &pid) < 0)
        return -1;
      return kill(pid);
    }

    /*** proc.c ***/
    // Kill the process with the given pid.
    // Process won't exit until it returns
    // to user space (see trap in trap.c).
    int
    kill(int pid)
    {
      struct proc *p;

      acquire(&ptable.lock);
      for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){
        if(p->pid == pid){
          p->killed = 1;
          // Wake process from sleep if necessary.
          if(p->state == SLEEPING)
            p->state = RUNNABLE;
          release(&ptable.lock);
          return 0;
        }
      }
      release(&ptable.lock);
      return -1;
    }

    /*** syscall.c ***/
    static int (*syscalls[])(void) = {
      [SYS_chdir]  sys_chdir,
      [SYS_close]  sys_close,
      [SYS_dup]    sys_dup,
      [SYS_exec]   sys_exec,
      [SYS_exit]   sys_exit,
      [SYS_fork]   sys_fork,
      [SYS_fstat]  sys_fstat,
      [SYS_getpid] sys_getpid,
      [SYS_kill]   sys_kill,
      [SYS_link]   sys_link,
      [SYS_mkdir]  sys_mkdir,
      [SYS_mknod]  sys_mknod,
      [SYS_open]   sys_open,
      [SYS_pipe]   sys_pipe,
      [SYS_read]   sys_read,
      [SYS_sbrk]   sys_sbrk,
      [SYS_sleep]  sys_sleep,
      [SYS_unlink] sys_unlink,
      [SYS_wait]   sys_wait,
      [SYS_write]  sys_write,
      [SYS_uptime] sys_uptime,
    };

Fork
• pid_t fork(void);
• Fork is used to create a new process. It creates a duplicate of the original process (including all file descriptors, registers, instruction pointer, etc.).
• Once the call is finished, the process and its copy go their separate ways. Subsequent changes to one should not affect the other.
• The fork call returns a different value to the original process (parent) and its copy (child): in the child process this value is zero, and in the parent process it is the PID of the child process.
• When fork is invoked, the parent’s information should be copied to its child. However, this can be wasteful if the child will not need this information (see exec()…). To avoid such situations, Copy On Write (COW) is used for the data section.

Copy On Write (COW)
• How does Linux manage COW?
  1. fork(): copying is expensive, so the child process’s data structure (task_struct) will point to the parent’s pages. The shared pages change from RW to RO.
  2. A process tries to write information to a shared page: protection fault!
  3. Well, no other choice but to allocate a new RW copy of each required page.
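Whatever the kernel does with the pages internally, the visible result should be that a write in the child never reaches the parent. The sketch below checks exactly that; it assumes a POSIX system, and parent_copy_survives is a hypothetical name. It cannot observe the page copy itself, only its effect.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent's variable is unchanged after the child
 * overwrote its own copy, 0 if it changed, -1 if fork or wait failed.
 * parent_copy_survives is a hypothetical name for this sketch. */
int parent_copy_survives(void)
{
    volatile int value = 3472;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        value = 6794;                  /* write lands in the child's own page */
        _exit(value == 6794 ? 0 : 1);  /* report success through exit status */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return value == 3472;              /* parent still sees its own value */
}
```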

Process control
An example:
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int i = 3472;
        pid_t fork_id;

        printf("my process pid is %d\n", getpid());
        fork_id = fork();
        if (fork_id == 0) {
            i = 6794;
            printf("child pid %d, i=%d\n", getpid(), i);
        } else
            printf("parent pid %d, i=%d\n", getpid(), i);
        return 0;
    }

Output (the order of the last two lines may vary):
    my process pid is 8864
    child pid 8865, i=6794
    parent pid 8864, i=3472

Program flow:
    Parent: PID = 8864, i = 3472; fork() returns fork_id = 8865; i stays 3472
    Child:  PID = 8865; fork() returns fork_id = 0; i = 6794

Process control
• Exit
  – A process can terminate itself using the exit system call
  – The call to exit can be either explicit or implicit (for example, when main returns)
  – The exit system call receives a single integer argument that will be the exit status of the process

Process control - zombies
• When a process ends, the memory and resources associated with it are deallocated.
• However, the entry for that process is not removed from the process table.
• This allows the parent to collect the child’s exit status.
• As long as the parent has not collected this data, the child is called a “zombie”. Such a leak is usually not worrisome in itself; however, it is a good indicator of problems to come.

Process control
• Wait
• pid_t wait(int *status);
• pid_t waitpid(pid_t pid, int *status, int options);
• The wait system call is used for waiting on child processes whose state changed (the process terminated, for example).
• The process calling wait will suspend execution until one of its children (or a specific one) terminates.
• With waitpid, waiting can be done for a specific process, a group of processes, or any arbitrary child.
• Once the status of a zombie process is collected, that process is removed from the process table by the collecting process.
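The exit, zombie, and wait slides fit together in a few lines. The following sketch, assuming a POSIX system, forks a child that exits with a chosen status and then collects it with waitpid; reap_child_status is a hypothetical helper name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits with `code`, then reaps it.
 * Returns the collected exit status, or -1 on failure.
 * reap_child_status is a hypothetical helper for this sketch. */
int reap_child_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);               /* child: terminate immediately */

    int status;
    /* Once the child has terminated it remains a zombie until this
     * call collects its status. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

WIFEXITED and WEXITSTATUS decode the status word; only the low 8 bits of the exit code are reported, so codes should stay in the range 0 to 255.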

Process control
• exec*
• int execv(const char *path, char *const argv[]);
• int execvp(const char *file, char *const argv[]);
• exec…
• The exec() family of functions replaces the current process image with a new process image (text, data, bss, stack, etc.).
• Since no new process is created, the PID remains the same.
• Exec functions do not return to the calling process unless an error occurred (in which case -1 is returned and errno is set to indicate the error).
• The underlying system call is execve(…)
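The "returns only on error" rule is easy to observe by pointing execv at a program that does not exist. This is a minimal sketch assuming a POSIX system; try_missing_exec and the path are hypothetical, and the path is simply assumed to be absent.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Tries to exec a path that is assumed not to exist. exec returns only
 * on failure, so reaching the return statement means the call failed.
 * Returns the errno value left by execv (0 if it was somehow not -1).
 * try_missing_exec and the path are hypothetical, for this sketch only. */
int try_missing_exec(void)
{
    char *const argv[] = { "no-such-program", NULL };

    errno = 0;
    int rc = execv("/nonexistent/no-such-program", argv);
    /* Still running the old image: the process was not replaced. */
    return (rc == -1) ? errno : 0;
}
```

With a missing file the error is expected to be ENOENT; a successful exec would never come back to this function at all.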

Process control – simple shell
    #define…
    …
    int main(int argc, char **argv){
        …
        while(true){
            type_prompt();
            read_command(command, params);
            pid = fork();
            if (pid < 0){
                if (errno == EAGAIN)
                    printf("ERROR cannot allocate sufficient memory\n");
                continue;
            }
            if (pid > 0)
                wait(&status);
            else
                execvp(command, params);
        }
    }
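The fork/exec/wait core of the shell loop can be isolated into one function. The sketch below assumes a POSIX system; run_command is a hypothetical name, and the 127 status for a failed exec is a convention borrowed from common shells rather than something the slides specify. Unlike the loop above, the child exits explicitly if execvp fails, so it never falls back into the parent's code.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] with arguments argv and waits for it to finish.
 * Returns the child's exit status, 127 if the program could not be
 * executed, or -1 if fork or wait failed. run_command is hypothetical. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                /* reached only if execvp failed */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as run_command with { "sh", "-c", "exit 3", NULL } should report 3, provided sh is on the PATH.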

Other system calls
• File Management:
  – open
  – close
  – read
  – write
  – lseek
  – etc…
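The file management calls listed above can be chained into one round trip. This is a sketch under the assumptions of a POSIX system and a writable /tmp directory; roundtrip_file and the file name pattern are hypothetical.

```c
#include <fcntl.h>     /* open, O_* flags */
#include <stdio.h>     /* snprintf */
#include <string.h>    /* strlen */
#include <sys/types.h> /* ssize_t */
#include <unistd.h>    /* write, lseek, read, close, unlink, getpid */

/* Writes msg to a scratch file, seeks back to the start, and reads it
 * into out (capacity cap, always NUL-terminated on success).
 * Returns the number of bytes read, or -1 on error.
 * roundtrip_file and the /tmp path are assumptions of this sketch. */
long roundtrip_file(const char *msg, char *out, size_t cap)
{
    char path[64];
    snprintf(path, sizeof path, "/tmp/os_demo_%ld", (long)getpid());

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    unlink(path);      /* name is gone now; data lives until close */

    size_t len = strlen(msg);
    long n = -1;
    if (cap > 0 &&
        write(fd, msg, len) == (ssize_t)len &&
        lseek(fd, 0, SEEK_SET) == 0)
        n = (long)read(fd, out, cap - 1);
    if (n >= 0)
        out[n] = '\0';
    close(fd);
    return n;
}
```

The lseek call matters here: after write the file offset sits at the end of the data, so reading without rewinding would return zero bytes.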