Programs and Processes
Jeff Chase
Duke University


The Operating System
• An operating system:
  – Runs programs; sets up execution contexts for programs
  – Enables programs to interact with the outside world
  – Enforces isolation among programs
  – Mediates interactions among programs

[Diagram: layers, top to bottom: User Applications; Operating System(s); Substrate / Architecture]

Today
• What is a program?
  – A little bit of C on “classical OS”
• How does a program run?
• How are programs built?
• What does the computer look like to a program?

A simple C program

    int main() { }


What’s in a program?
• code
  – instructions (“text”)
  – procedures
• data
  – global variables (“static”)
  – constants (“immutable”)
• symbols (import/export)
  – names
  – interfaces
  – references

A simple module

    int val = 0;

    int p1(char *s) {
        return 1;
    }

    int p2() {
        char *s;
        int i;
        s = "hello\n";
        i = p1(s);
        return(i);
    }

[Diagram: a module keeps its state behind an API of procedures P1(), P2(), P3(), P4(). E.g., a library.]

Calling the module

    #include <stdio.h>

    extern int p1();    /* interface: signatures (prototypes) */
    extern int p2();

    int main() {
        int i;
        i = p2();
        printf("%d\n", i);
    }

[Diagram: a Program calls into the module’s state through P1(), P2(), P3(), P4().]

        .section __TEXT,__text,regular,pure_instructions
        .globl _p1
        .align 4, 0x90
    _p1:                            ## @p1
        .cfi_startproc
    ## BB#0:
        pushq %rbp
    Ltmp2:
        .cfi_def_cfa_offset 16
    Ltmp3:
        .cfi_offset %rbp, -16
        movq %rsp, %rbp
    Ltmp4:
        .cfi_def_cfa_register %rbp
        movl $1, %eax
        movq %rdi, -8(%rbp)
        popq %rbp
        ret
        .cfi_endproc

        .globl _p2
        .align 4, 0x90
    _p2:                            ## @p2
        .cfi_startproc
        ….
        ret
        .cfi_endproc

        .section __TEXT,__cstring,cstring_literals
    L_.str:                         ## @.str
        .asciz "hello\n"

        .comm _val,4,2              ## @val
    .subsections_via_symbols

Global data (“static”)

    int g;
    int g0 = 0;
    int g1 = 1;

        .globl _g0                  ## @g0
    .zerofill __DATA,__common,_g0,4,2
        .section __DATA,__data
        .globl _g1                  ## @g1
        .align 2
    _g1:
        .long 1                     ## 0x1
        .comm _g,4,2                ## @g

The Birth of a Program (C/Ux)

myprogram.c:

    int j;
    char* s = "hello\n";

    int p() {
        j = write(1, s, 6);
        return(j);
    }

[Diagram: build pipeline.
 myprogram.c (plus header files) → compiler → myprogram.s (assembly: “p: store this, store that, push, jsr _write, ret, etc.”)
 myprogram.s → assembler → myprogram.o (object file: data and code for p)
 myprogram.o (plus libraries and other object files or archives) → linker → myprogram (executable file: data and program)]

What’s in an Object File or Executable?

    int j = 327;
    char* s = "hello\n";
    char sbuf[512];

    int p() {
        int k = 0;
        j = write(1, s, 6);
        return(j);
    }

• header: “magic number” indicates type of file/image.
• section table: an array of (offset, len, startVA) sections
• text: program instructions (p)
• data: immutable data (constants) (“hello\n”)
• wdata: writable global/static data (j, s)
• symbol table (j, s, p, sbuf) and relocation records: used by linker; may be removed after final link step and strip. Also includes info for debugger.

But Java programs are interpreted

They run on an “abstract machine” (e.g., JVM) implemented in software.

[Figure: Java source compiled to “bytecode”. Source: http://www.media-art-online.org/java/help/how-it-works.html]

[Figure source: http://forensics.spreitzenbarth.de/2012/08/27/comparison-of-dalvik-and-java-bytecode/]

What’s the point? “Program” is an abstraction
• There are many different representations of programs, even of executable programs.
• Executable programs are compiled and packaged to run on an abstract machine.
• Details of the program depend on the platform: the machine and system software.
• Abstraction(s) is/are crucial in computer systems because they help accommodate rapid change.

Running a program

[Diagram: the Program’s sections (code (“text”), constants, initialized data) are mapped into segments of the Process’s virtual memory; a Thread runs the code.]

When a program launches, the OS creates an execution context (process) to run it, with a thread to run the program, and a virtual memory to store the running program’s code and data.

VAS example (32-bit)
• The program uses virtual memory through its process’ Virtual Address Space:
• An addressable array of bytes…
• Containing every instruction the process thread can execute…
• And every piece of data those instructions can read/write…
  – i.e., read/write == load/store on memory
• Partitioned into logical segments with distinct purpose and use.
• Every memory reference is interpreted in the context of the VAS.
  – Resolves to a location in machine memory

[Diagram: address space layout, from 0x7fffffff at the top down to 0x0: Reserved; Stack; Dynamic data (heap/BSS); Static data; Text (code)]

“Classic Linux Address Space”

[Figure source: http://duartes.org/gustavo/blog/category/linux]

    int P(int a){…}
    void C(int x){
        int y=P(x);
    }

How do C and P share information?
Via a shared, in-memory stack

    int P(int a){…}
    void C(int x){
        int y=P(x);
    }

What info is stored on the stack?
C’s registers, call arguments, RA, P’s local vars

Review of the stack
• Each stack frame contains a function’s
  – Local variables
  – Parameters
  – Return address
  – Saved values of calling function’s registers
• The stack enables recursion

Code

    0x8048347   void C () { A (0); }
    0x8048354   void B () { C (); }
    0x8048361   void A (int tmp){ if (tmp) B (); }
    0x804838c   int main () { A (1); return 0; }

Memory (0xfffffff at the top, 0x0 at the bottom). Stack frames, most recent call first; SP points at the most recent frame:

    A      tmp=0      RA=0x8048347
    C      const=0    RA=0x8048354
    B                 RA=0x8048361
    A      tmp=1      RA=0x804838c
    main   const1=1   const2=0

Code

    0x8048361   void A (int bnd){ if (bnd) A (bnd-1); }
    0x804838c   int main () { A (3); return 0; }

Memory (0xfffffff at the top, 0x0 at the bottom). Stack frames, most recent call first; SP points at the most recent frame:

    A      bnd=0      RA=0x8048361
    A      bnd=1      RA=0x8048361
    A      bnd=2      RA=0x8048361
    A      bnd=3      RA=0x804838c
    main   const1=3   const2=0

How can recursion go wrong?
Can overflow the stack …
Keep adding frame after frame …

Code

                void cap (char* b){
                    for (int i=0; b[i]!='\0'; i++)
                        b[i]+=32;
    0x8048361   }

                int main(char*arg) {
                    char wrd[4];
                    strcpy(wrd, arg);   /* copies arg into wrd with no length check */
                    cap (wrd);
                    return 0;
    0x804838c   }

Memory (0xfffffff at the top, 0x0 at the bottom). Stack frames, most recent call first; SP points at the most recent frame:

    cap    b=0x00234   RA=0x804838c
    main   wrd[3] wrd[2] wrd[1] wrd[0] (wrd[0] is at 0x00234)   const2=0

What can go wrong?
Can overflow wrd variable …
Overwrite cap’s RA

Assembler directives: quick peek

From x86 Assembly Language Reference Manual:

• The .align directive causes the next data generated to be aligned modulo integer bytes.
• The .ascii directive places the characters in string into the object module at the current location but does not terminate the string with a null byte (\0).
• The .comm directive allocates storage in the data section. The storage is referenced by the identifier name. Size is measured in bytes and must be a positive integer.
• The .globl directive declares each symbol in the list to be global. Each symbol is either defined externally or defined in the input file and accessible in other files.
• The .long directive generates a long integer (32-bit, two's complement value) for each expression into the current section. Each expression must be a 32-bit value and must evaluate to an integer value.

Basic hints on using Unix
• Find a properly installed Unix system: linux.cs.duke.edu, or MacOS with Xcode and its command line tools will do nicely.
• Learn a little about the Unix shell command language: e.g., look ahead to the shell lab, Lab #2. On MacOS open the standard Terminal utility.
• Learn some basic commands: cd, ls, cat, grep, more/less, pwd, rm, cp, mv, diff, and an editor of some kind (vi, emacs, …). Spend one hour.
• Learn basics of make. Look at the makefile. Run “make -i” to get it to tell you what it is doing. Understand what it is doing.
• Wikipedia is a good source for basics. Use the man command to learn about commands (1), syscalls (2), or C libraries (3). E.g.: type “man man”.
• Know how to run your programs under a debugger: gdb. If it crashes you can find out where. It’s easy to set breakpoints, print variables, etc.
• If your program doesn’t compile, deal with errors from the top down. Try “make >out 2>&1”. It puts all output, including error messages, in the file “out” to examine at leisure.
• Put source in a revision system like git or svn, but Do. Not. Share. It.


Running a program

Can a program launch multiple running instances on the same platform?

It depends. On some platforms (e.g., Android) an app is either active or it is not.

Abstraction
• Separate:
  – Interface from internals
  – Specification from implementation
• Abstraction is a double-edged sword.
  – “Don’t hide power.”
• More than an interface…

This course is (partly) about the use of abstraction(s) in complex software systems. We want abstractions that are simple, rich, efficient to implement, and long-lasting.

Interface and abstraction


Abstraction(s)
• A means to organize knowledge
  – Capture what is common and essential
  – Generalize and abstract away the details
  – Specialize as needed
  – Concept hierarchy
• A design pattern or element
  – Templates for building blocks
  – Instantiate as needed
• E.g.: class, subclass, and instance

Standards, wrappers, adapters

“Plug-ins”
“Plug-compatible”

Another layer of software can overcome superficial or syntactic differences if the fundamentals are right.

Virtualization?
