Run-time Organization
Lecture 23
3/17/2008
Prof. Hilfinger, CS 164

Status
• We have covered the front-end phases
  – Lexical analysis
  – Parsing
  – Semantic analysis
• Next are the back-end phases
  – Optimization
  – Code generation
• We’ll do code generation first...

Run-time environments
• Before discussing code generation, we need to understand what we are trying to generate
  – The term virtual machine refers to the compiler’s target
  – Can be just a bare hardware architecture (small embedded systems)
  – Can be an interpreter, as for Java, or an interpreter that does additional compilation, as in modern Java JITs
  – For now, we’ll stick to hardware + conventions for using it (“API”) + some runtime-support library
• There are a number of widely used standard techniques and conventions for structuring executable code

Outline
• Management of run-time resources
• Correspondence between static (compile-time) and dynamic (run-time) structures
• Storage organization

Run-time Resources
• Execution of a program is initially under the control of the operating system
• When a program is invoked:
  – The OS allocates space for the program
  – The code is loaded into part of the space
  – The OS jumps to the entry point (i.e., “main”)

Memory Layout (Example)
Memory:
  Low Address
    Code
    Other Space
  High Address

Notes
• These pictures are simplifications
  – E.g., not all memory need be contiguous
• Other Space = Data Space
• Compiler is responsible for:
  – Generating code
  – Orchestrating use of the data area

Code Generation Goals
• Two goals:
  – Correctness
  – Speed
    • Minimize instruction counts
    • Keep variables easy to access (e.g., static offsets)
    • Maximize use of registers (“top of the memory hierarchy”)
• Most complications in code generation come from trying to be fast as well as correct, because this requires attention to special cases.

Assumptions about Execution
1. Execution is sequential; control moves from one point in a program to another in a well-defined order
2. When a procedure is called, control eventually returns to the point immediately after the call
Do these assumptions always hold?
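Assumption 2 does not hold in every language feature. As one illustration that is not taken from the lecture, a Java exception makes control leave the callee without ever returning to the point after the call. The class and method names below are hypothetical and exist only for this sketch.

```java
public class NonLocalExit {
    // Hypothetical callee that never returns normally.
    static void callee() {
        throw new IllegalStateException("callee gave up");
    }

    // Records which points of the caller actually execute.
    static String run() {
        StringBuilder log = new StringBuilder();
        try {
            log.append("before-call;");
            callee();
            log.append("after-call;"); // skipped: control never returns here
        } catch (IllegalStateException e) {
            log.append("handler;");    // control resumes in the handler instead
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The point immediately after the call is never reached, so a run-time organization that supports exceptions needs some way to unwind activations other than an ordinary return.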

Activations and Lifetimes (Extents)
• An invocation of procedure P is an activation of P
• The lifetime of an activation of P is
  – All the steps to execute P
  – Including all the steps in procedures P calls
• The lifetime of a variable x is the portion of execution in which x is defined
• Lifetime is a dynamic (run-time) concept
• … As opposed to scope, which is a static concept

Activation Trees
• Assumption (2) requires that when P calls Q, then Q returns before P does
• Lifetimes of procedure activations are properly nested
• Activation lifetimes can be depicted as a tree

Example (from Java)
  class Main {
    int g() { return 1; }
    int f() { return g(); }
    void main() { g(); f(); }
  }
Activation tree:
  Main
    g
    f
      g

Example 2
  class Main {
    int g() { return 1; }
    int f(int x) {
      if (x == 0) { return g(); }
      else { return f(x - 1); }
    }
    void main() { f(2); }
  }
What is the activation tree for this example?
Activation tree:
  Main
    f
      f
        f
          g
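The tree for Example 2 can be checked mechanically by recording each activation as it starts, indented by its nesting depth. The following is a minimal sketch, assuming an instrumented copy of the example; the `ActivationTrace` class and its `enter`, `exit`, and `traceMain` helpers are hypothetical additions, not part of the lecture's code.

```java
import java.util.ArrayList;
import java.util.List;

public class ActivationTrace {
    private final List<String> lines = new ArrayList<>();
    private int depth = 0;

    // Record the start of an activation, indented by its depth in the tree.
    private void enter(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        lines.add(sb.append(name).toString());
        depth++;
    }

    // Record the end of an activation.
    private void exit() {
        depth--;
    }

    int g() {
        enter("g");
        try { return 1; } finally { exit(); }
    }

    int f(int x) {
        enter("f(" + x + ")");
        try {
            if (x == 0) { return g(); } else { return f(x - 1); }
        } finally {
            exit();
        }
    }

    // Stand-in for Main.main(), with the argument to f made a parameter.
    public List<String> traceMain(int arg) {
        enter("Main");
        try { f(arg); } finally { exit(); }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : new ActivationTrace().traceMain(2)) {
            System.out.println(line);
        }
    }
}
```

Calling `traceMain` with a different argument gives a tree of a different depth, which is one way to see that the activation tree depends on run-time behavior rather than on the program text alone.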

Notes
• The activation tree depends on run-time behavior
• The activation tree may be different for every program input
• Since activations are properly nested, a stack can track currently active procedures

Example
  class Main {
    int g() { return 1; }
    int f() { return g(); }
    void main() { g(); f(); }
  }
Stack of active procedures as execution proceeds (bottom of stack first):
• Main starts: Main
• Main calls g: Main, g
• g returns and Main calls f: Main, f
• f calls g: Main, f, g
Activation tree at this point:
  Main
    g
    f
      g
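The stack snapshots above can be reproduced with an explicit stack that is pushed on each call and popped on each return. This is a sketch under the assumption that each procedure is modelled as a `Runnable`; `CallStackDemo` and its `call` helper are hypothetical names introduced for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class CallStackDemo {
    private final Deque<String> active = new ArrayDeque<>();
    private final List<String> snapshots = new ArrayList<>();

    // Push the procedure, record the stack, run the body, then pop on return.
    private void call(String name, Runnable body) {
        active.addLast(name);
        snapshots.add(String.join(" ", active));
        body.run();
        active.removeLast();
    }

    // Mirrors main() { g(); f(); } where f() calls g().
    public List<String> run() {
        call("Main", () -> {
            call("g", () -> { });
            call("f", () -> call("g", () -> { }));
        });
        return snapshots;
    }

    public static void main(String[] args) {
        for (String snapshot : new CallStackDemo().run()) {
            System.out.println(snapshot);
        }
    }
}
```

Because activations are properly nested, the procedure popped on return is always the most recently pushed one, which is why a stack suffices.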

Revised Memory Layout
Memory:
  Low Address
    Code
    Stack
  High Address

Activation Records
• The information needed to manage one procedure activation is called an activation record (AR) or frame
• If procedure F calls G, then G’s activation record contains a mix of info about F and G.

What is in G’s AR when F calls G?
• F is “suspended” until G completes, at which point F resumes. G’s AR contains information needed to resume execution of F.
• G’s AR may also contain:
  – Space to save registers used by F or G
  – Space for G’s local variables
  – Temporary space for intermediate results, arguments and return values for functions G calls.
• Depending on architecture and compiler, registers may hold part of AR (at times).

What Information is Needed to Return from G?
• Return address
• Contents of (some) registers prior to call
• Information to establish address of AR for G’s caller:
  – This address is called the dynamic link
  – Often kept in a register, but this is sometimes not necessary.
• Various other machine status prior to calling G

Example 2, Revisited
  class Main {
    int g() { return 1; }
    int f(int x) {
      if (x == 0) { return g(); }
      else { return f(x - 1); (**) }
    }
    void main() { f(3); (*) }
  }
AR for Main: (no fields listed)
AR for f:
  space for f’s result
  argument x
  dynamic link
  return address

Stack After Two Calls to f
Stack (oldest AR first):
  Main:  ...
  f1:    f1’s result
         3
         dynamic link
         (*)
  f2:    f2’s result
         2
         dynamic link
         (**)
(*) and (**) denote return addresses
• Main has no argument or local variables and its result is never used; its AR is uninteresting
• Only one of many possible AR designs
• Would also work for C, Pascal, FORTRAN, etc.
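The picture above can be mimicked with an integer array standing in for the stack. This is only a sketch of one possible AR design, assuming four slots per frame in the order result, argument, dynamic link, return address; the class name, the slot constants, and the use of negative numbers as stand-ins for the return addresses (*) and (**) are all hypothetical.

```java
public class FrameLayoutSketch {
    // Assumed slot offsets within one AR for f.
    public static final int RESULT = 0;
    public static final int ARG_X = 1;
    public static final int DYNAMIC_LINK = 2;
    public static final int RETURN_ADDR = 3;
    public static final int FRAME_SIZE = 4;

    // Stand-ins for the return addresses marked (*) and (**).
    public static final int RET_STAR = -1;
    public static final int RET_STAR_STAR = -2;
    // Marks "the caller is Main", whose AR is not modelled here.
    public static final int NO_FRAME = -1;

    private final int[] stack = new int[64];
    private int fp = NO_FRAME; // base index of the current AR
    private int sp = 0;        // next free slot

    // Build a new AR for a call f(arg) that should return to returnAddr.
    public void pushFrame(int arg, int returnAddr) {
        int base = sp;
        stack[base + RESULT] = 0;        // space for f's result
        stack[base + ARG_X] = arg;
        stack[base + DYNAMIC_LINK] = fp; // remember the caller's AR
        stack[base + RETURN_ADDR] = returnAddr;
        fp = base;
        sp = base + FRAME_SIZE;
    }

    // Discard the current AR and report where execution should resume.
    public int popFrame() {
        int returnAddr = stack[fp + RETURN_ADDR];
        sp = fp;                         // release the frame's slots
        fp = stack[fp + DYNAMIC_LINK];   // restore the caller's AR
        return returnAddr;
    }

    public int slot(int index) { return stack[index]; }
    public int framePointer() { return fp; }
    public int stackPointer() { return sp; }

    public static void main(String[] args) {
        FrameLayoutSketch s = new FrameLayoutSketch();
        s.pushFrame(3, RET_STAR);        // Main calls f(3)
        s.pushFrame(2, RET_STAR_STAR);   // f(3) calls f(2)
        System.out.println("current AR starts at slot " + s.framePointer());
    }
}
```

Every access is at a fixed offset from the frame base, so a compiler that fixes this layout at compile time can emit those offsets directly.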

The Main Point
• There is nothing magic about this organization
  – Can rearrange order of frame elements
  – Can divide caller/callee responsibilities differently
  – An organization is better if it improves execution speed or simplifies code generation
• The compiler must determine, at compile-time, the layout of activation records and generate code that correctly accesses locations in the activation record
Thus, the AR layout and the code generator must be designed together!

Registers
• Real compilers hold as much of the frame as possible in registers
  – Especially the method result and arguments
• Registers also typically hold start of frame (frame pointer) and top of stack.

Globals
• All references to a global variable point to the same object
  – Don’t generally store a global in an activation record
• Globals are assigned a fixed address once
  – Variables with fixed address are “statically allocated”
• Depending on the language, there may be other statically allocated values

Memory Layout with Static Data
Memory:
  Low Address
    Code
    Static Data
    Stack
  High Address

Heap Storage
• A value that outlives the procedure that creates it cannot be kept in the AR:
    Bar foo() { return new Bar(); }
  The Bar value must survive deallocation of foo’s AR
• Language implementations with dynamically allocated data use a heap to store dynamic data
  – (confusingly, not the same as the heap used for priority queues!)
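A runnable version of the `foo` example makes the lifetime difference visible: the local variable disappears with `foo`'s activation, while the object it creates is still usable by the caller. This is a sketch; the `HeapEscape` wrapper class, the `value` field, and the local `temp` are assumptions added so the example compiles and prints something.

```java
public class HeapEscape {
    // Minimal stand-in for the Bar class named on the slide.
    public static class Bar {
        public final int value;

        public Bar(int value) {
            this.value = value;
        }
    }

    public static Bar foo() {
        int temp = 41;            // local: conceptually lives in foo's AR
        return new Bar(temp + 1); // object: must outlive foo's AR, so it is heap data
    }

    public static void main(String[] args) {
        Bar b = foo();            // foo's AR is gone, but the Bar is still reachable
        System.out.println(b.value);
    }
}
```

Conceptually the object lives on the heap; an optimizing implementation is free to place it elsewhere when it can prove the object does not escape, which is not the case here.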

Notes
• The code area contains object code
  – For most languages, fixed size and read only
• The static area contains data (not code) with fixed addresses (e.g., global data)
  – Fixed size, may be readable or writable
• The stack contains an AR for each currently active procedure
  – Each AR usually fixed size, contains locals
• Heap contains all other data
  – In C, heap is managed by malloc and free

Notes (Cont.)
• Both the heap and the stack grow
• Must take care that they don’t grow into each other
• Solution: start heap and stack at opposite ends of memory and let them grow towards each other
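The two-ended scheme can be sketched with two indices into one region: the heap limit moves up from the low end, the stack limit moves down from the high end, and an allocation fails when the gap between them is too small. The `TwoEndedMemory` class and its method names are hypothetical, and freeing heap blocks is deliberately left out.

```java
public class TwoEndedMemory {
    private int heapTop;  // first free address above the heap (grows upward)
    private int stackTop; // lowest address in use by the stack (grows downward)

    // Addresses are relative to the start of the region shared by heap and stack.
    public TwoEndedMemory(int size) {
        this.heapTop = 0;
        this.stackTop = size;
    }

    public int freeBytes() {
        return stackTop - heapTop;
    }

    // Allocate n bytes at the low end; returns the block's start address.
    public int heapAlloc(int n) {
        if (n < 0 || n > freeBytes()) {
            throw new IllegalStateException("heap would grow into the stack");
        }
        int addr = heapTop;
        heapTop += n;
        return addr;
    }

    // Reserve n bytes at the high end; returns the new lowest stack address.
    public int stackPush(int n) {
        if (n < 0 || n > freeBytes()) {
            throw new IllegalStateException("stack would grow into the heap");
        }
        stackTop -= n;
        return stackTop;
    }

    public static void main(String[] args) {
        TwoEndedMemory mem = new TwoEndedMemory(16);
        mem.heapAlloc(6);
        mem.stackPush(8);
        System.out.println("free bytes: " + mem.freeBytes());
    }
}
```

The benefit of this arrangement is that neither area needs a fixed limit in advance: either one may use the free space until the two meet.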

Memory Layout with Heap
Memory:
  Low Address
    Code
    Static Data
    Heap
    Stack
  High Address

Memory Layout with Heap (Alternative)
Memory:
  Low Address
    Code
    Static Data
    Stack
    Heap
  High Address

Data Layout
• Low-level details of machine architecture are important in laying out data for correct code and maximum performance
• Chief among these concerns is alignment

Alignment
• Many installed machines are (still) 32-bit
  – 8 bits in a byte
  – 4 bytes in a word
  – Machines are either byte or word addressable
• Data is word aligned if it begins at a word boundary
• Most machines have some alignment restrictions
  – Or performance penalties for poor alignment
• New machines use 64-bit or 32/64-bit hardware and APIs.

Alignment (Cont.)
• Example: A string “Hello” takes 5 characters (without a terminating \0)
• To word align the next datum, add 3 “padding” characters to the string
• The padding is not part of the string; it’s just unused memory
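The padding in the “Hello” example is the distance from the end of the string to the next word boundary. A minimal sketch of that arithmetic, assuming 4-byte words and one byte per character; `AlignmentSketch`, `alignUp`, and `padding` are hypothetical names.

```java
public class AlignmentSketch {
    public static final int WORD_BYTES = 4; // assumed word size, as on the 32-bit machines above

    // Smallest multiple of alignment that is >= offset (offset >= 0, alignment > 0).
    public static int alignUp(int offset, int alignment) {
        return (offset + alignment - 1) / alignment * alignment;
    }

    // Number of unused bytes needed after offset to reach the next boundary.
    public static int padding(int offset, int alignment) {
        return alignUp(offset, alignment) - offset;
    }

    public static void main(String[] args) {
        int length = "Hello".length(); // 5 characters, no terminator
        System.out.println("padding: " + padding(length, WORD_BYTES));
        System.out.println("next datum at offset: " + alignUp(length, WORD_BYTES));
    }
}
```

With a string that starts at a word boundary, 5 bytes of data plus 3 bytes of padding put the next datum at offset 8.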