Computer Systems: Programming languages
Jakub Yaghob
Naïve view of a compiler
- Source code in my favorite programming language → The compiler → Executable for my favorite operating system
- The compiler also produces error messages
Formal view of a compiler
- From the slides of the course Compiler Principles
  - Let's have an input language L_in generated by a grammar G_in
  - Let's have an output language L_out generated by a grammar G_out or accepted by an automaton A_out
  - The compiler is a mapping L_in → L_out, where for every w_in in L_in there exists a w_out in L_out. The mapping does not exist for w_in not in L_in
- Don't worry!
  - You have to take the Automata and Grammars (NTIN071) course (obligatory) and then the Compiler Principles (NSWI098) course (elective)
Naïve view of a grammar
- Formal description of a language
  - Rules
  - Lexical elements
- Example rule from the C grammar:
    iteration-statement:
        while ( expression ) statement
        do statement while ( expression ) ;
        for ( expression_opt ; expression_opt ; expression_opt ) statement
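To make the iteration-statement rule concrete, here is a minimal C++ sketch with one function per production. The function names are hypothetical and chosen only for this illustration; each one sums the integers 1..n.

```cpp
// Each function computes 1 + 2 + ... + n using a different iteration-statement.

// while ( expression ) statement
int sum_while(int n) {
    int sum = 0, i = 1;
    while (i <= n) { sum += i; ++i; }
    return sum;
}

// do statement while ( expression ) ;
// The body runs at least once, so the loop starts at 0 (adding 0 is harmless).
int sum_do_while(int n) {
    int sum = 0, i = 0;
    do { sum += i; ++i; } while (i <= n);
    return sum;
}

// for ( expression_opt ; expression_opt ; expression_opt ) statement
int sum_for(int n) {
    int sum = 0;
    int i;
    for (i = 1; i <= n; ++i) sum += i;
    return sum;
}
```

All three functions return the same value for the same n; only the grammar production used to express the loop differs.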
More practical view of a translation
- Source code (plus interface) → Preprocessor → .pp
- .pp → Compiler → .asm
- .asm → Assembler → .obj
- .obj (plus other objects and libraries) → Linker → Executable code
Memory organization
- Memory organization during procedural program execution
  - Basic regions: code, static data, stack
  - In more detail: code, constants, initialized static data, uninitialized static data, stack for thread 1 ... stack for thread n, heap
Linker/librarian/loader
- Library
  - A collection of compiled source modules and other resources
  - Static, dynamic
- Linking
  - "Gluing" the results of the different translations and libraries together into one executable for a given OS
  - Relocations
  - Position-independent code
- Loader
  - Part of the OS, loads the executable into memory
  - Relocation again
Linking
- A.C → CC → A.O: Code.A, Constants.A, Static data.A
- B.C → CC → B.O: Code.B, Constants.B, Static data.B
- PQ.LIB: Code.P, Constants.P, Static data.P, Code.Q, Constants.Q, Static data.Q
- A.O, B.O, PQ.LIB → Linker → APP.EXE
  - Code.A, Code.B, Code.Q
  - Constants.A, Constants.B, Constants.Q
  - Static data.A, Static data.B, Static data.Q
Run-time
- Static language support
  - Compiler
  - Library interface
    - Header files
- Dynamic language support
  - Run-time program environment
    - Storage organization
    - Memory content before execution
    - Constructors and destructors of global objects
  - Libraries
  - Calling convention
Function call – activation record (stack frame)
- Layout of an activation record
  - Return value
  - Actual parameters
  - Control link
  - Saved machine status
  - Local data
  - Temporaries
- Saved machine status
  - Return address to the code
  - Registers
- Control link
  - Activation record of the caller
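One way to observe that every active call owns a separate activation record is to record the address of a local variable at each recursion level. The following is a minimal sketch; the function name `descend` is hypothetical, and the exact addresses and their spacing depend on the compiler and platform.

```cpp
#include <cstdint>
#include <vector>

// Records the address of the local variable of every active call.
void descend(int depth, std::vector<std::uintptr_t>& addrs) {
    volatile int local = depth;  // local data in this call's activation record
    addrs.push_back(reinterpret_cast<std::uintptr_t>(&local));
    if (depth > 0) descend(depth - 1, addrs);
    local = local + 0;  // touch the local after the call so the frame stays live
}
```

Calling `descend(3, addrs)` leaves four addresses in `addrs`, one per simultaneously active call, and all of them differ.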
Calling convention
- Public name mangling
- Call/return sequence for functions and procedures
  - Housekeeping responsibility
- Parameter passing
  - Registers, stack
  - Order of passed parameters
- Return value
  - Registers, stack
- Registers role
  - Parameter passing, scratch, preserved
Public name mangling
- Real meaning of "mangle"
  - to press laundry with a mangle
  - to hack up, tear apart, smash, crush, severely damage, batter, bruise
  - figuratively: to spoil, disfigure, change beyond recognition, distort, garble
- Examples for: long f1(int i, const char *m, struct s *p)
    Mangled name               Compiler, target, language, convention
    _f1                        MSVC IA-32 C __cdecl
    @f1@12                     MSVC IA-32 C __fastcall
    _f1@12                     MSVC IA-32 C __stdcall
    ?f1@@YAJHPBDPAUs@@@Z       MSVC IA-32 C++
    _f1                        presumably GCC IA-32 C
    __Z2f1iPKcP1s              GCC IA-32 C++
    f1                         presumably MSVC IA-64 C
    ?f1@@YAJHPEBDPEAUs@@@Z     MSVC IA-64 C++
Call/return sequence
- Caller's activation record
  - Parameters, return value
  - Links, machine state
  - Local and temporary data
- Callee's activation record (FP, the frame pointer, points into it)
  - Parameters, return value
  - Links, machine state
  - Local and temporary data
- Setting up and tearing down the records is split between the caller's responsibility and the callee's responsibility
Parameter passing
- Call by value
  - Actual parameter is evaluated and the value is passed
  - Input parameters; the parameter is like a local variable
  - C
- Call by reference
  - The caller passes a pointer to the variable
  - Input/output parameters
  - & in C++
- Example
    BigV fnc(int v, int &rv);
    BigV r = fnc(a, b);
  - Caller's variables: a = 5 at address 1234, b = 8 at address 2345
  - Stack for the call: BigV (return value), rv = 2345, v = 5, RA (return address)
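The difference between the two passing modes can be sketched in C++ using the slide's own signature. The definition of `BigV` and the body of `fnc` are assumptions made only for this illustration.

```cpp
// Hypothetical "big" value type; the slide does not define BigV.
struct BigV { int data[4]; };

// v is passed by value (a copy); rv is passed by reference (an alias of the
// caller's variable, typically implemented by passing its address).
BigV fnc(int v, int& rv) {
    v += 1;    // changes only the local copy
    rv += 1;   // changes the caller's variable
    BigV result{};
    result.data[0] = v;
    return result;
}
```

With `int a = 5, b = 8; BigV r = fnc(a, b);` the variable `a` is still 5 after the call, `b` has become 9, and `r.data[0]` is 6.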
Variables
- Named memory holding a value
- Has a type
- Storage
  - Static data
    - Global variables in C
  - Stack
    - Local variables in C
  - Heap
    - Dynamic memory in C/C#
  - Dictionary
    - In Python
    - Not a storage, it is a dynamic structure
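The three storage kinds can be shown side by side in a short C++ sketch. All names below are hypothetical, and the comments describe the usual placement on mainstream implementations rather than a guarantee of the language standard.

```cpp
#include <memory>

int global_counter = 0;            // static data: exists for the whole run

int next_id() {
    static int last = 100;         // also static data, but visible only here
    return ++last;                 // the value survives between calls
}

int sum_on_stack(int a, int b) {
    int local = a + b;             // stack: lives until the function returns
    return local;
}

std::unique_ptr<int> make_on_heap(int value) {
    // heap: the object lives until its owner releases it
    return std::make_unique<int>(value);
}
```

Two consecutive calls to `next_id()` return 101 and then 102, because `last` is static data and keeps its value, whereas `local` in `sum_on_stack` is created anew for every call.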
Heap
- Storage for dynamic memory
- Allocate
  - Use all features from dynamic memory allocation
    - Bookkeeping of free blocks
    - Allocation algorithms
  - Extremely simple and fast incremental allocation
- Deallocate
  - Explicit action in some languages
    - C, C++
  - Automatic deallocation by garbage collection
    - Removes burden and errors
    - Works only with good knowledge of live objects and references
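The "extremely simple and fast incremental allocation" mentioned above can be sketched as a bump allocator: allocation only advances an offset into one contiguous buffer. The class `BumpArena`, its capacity, and its interface are hypothetical; this is an illustration under those assumptions, not how any particular runtime implements its heap.

```cpp
#include <cstddef>

// Hypothetical fixed-size arena with incremental ("bump") allocation.
class BumpArena {
public:
    // Returns `size` bytes at an offset that is a multiple of `align`
    // (assumed nonzero and at most alignof(std::max_align_t)),
    // or nullptr when the arena is full.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        std::size_t start = (used_ + align - 1) / align * align;  // round up
        if (start > kCapacity || size > kCapacity - start) return nullptr;
        used_ = start + size;      // the whole allocation is one addition
        return buffer_ + start;
    }

    // Individual blocks cannot be freed; everything is released at once.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    static constexpr std::size_t kCapacity = 1024;
    alignas(std::max_align_t) unsigned char buffer_[kCapacity];
    std::size_t used_ = 0;
};
```

The trade-off is visible in the interface: allocation is trivial, but there is no per-block deallocation, which is why this scheme is usually paired with a mechanism that reclaims or compacts memory in bulk.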
Garbage collection
- Automatic removal of unused memory blocks
- Advantages
  - No dangling pointers, no double free, no memory leaks, allows heap consolidation and fast allocation
- Disadvantages
  - Performance impact, even execution stalls, unpredictable behavior
- GC strategies
  - Tracing
    - Reachable objects from live objects
  - Reference counting
    - Problems with cycles, space and speed overhead
    - Advanced versions for languages with heavy use
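The cycle problem of reference counting can be reproduced with C++ `std::shared_ptr`, which is a reference-counted smart pointer rather than a garbage collector. The `Node` type and the helper functions are hypothetical and serve only this sketch.

```cpp
#include <memory>

// Hypothetical node type; `live` counts objects that are not yet destroyed.
struct Node {
    static int live;
    std::shared_ptr<Node> strong_next;  // owning link: increments the count
    std::weak_ptr<Node> weak_next;      // non-owning link: does not
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Builds a two-node cycle of owning links and returns how many nodes are
// still alive after the last outside reference is gone.
int alive_after_strong_cycle() {
    int before = Node::live;
    std::weak_ptr<Node> observer;
    {
        auto p = std::make_shared<Node>();
        auto q = std::make_shared<Node>();
        p->strong_next = q;
        q->strong_next = p;  // cycle: neither count can drop to zero
        observer = p;
    }
    int alive = Node::live - before;
    // Break the cycle by hand so that this demo does not leak itself.
    if (auto p = observer.lock()) p->strong_next.reset();
    return alive;
}

// Same shape, but the back link is weak, so both nodes are reclaimed.
int alive_after_weak_cycle() {
    int before = Node::live;
    {
        auto p = std::make_shared<Node>();
        auto q = std::make_shared<Node>();
        p->strong_next = q;
        q->weak_next = p;
    }
    return Node::live - before;
}
```

The first function reports 2 unreachable but still allocated nodes, the second reports 0. A tracing collector would reclaim the cycle in both cases, because neither node is reachable from live objects.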
Portability
- Source code portability
  - CPU architecture
    - Different type sizes
    - Fixed type sizes
      - C#, Java
  - Compiler
    - Different language "flavors"
      - C, C++: gcc, msvc, clang, …
    - Use only syntax and library from a language standard
  - OS
    - Different system/library calls
      - Linux, Windows
    - Sometimes easy
      - BSD sockets
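In C++ the type-size issue can be checked at compile time. The sketch below contrasts the built-in types, for which the standard only guarantees minimum ranges, with the fixed-width types from `<cstdint>`; the helper `long_bits` is a hypothetical name used for illustration.

```cpp
#include <climits>
#include <cstdint>

// Fixed-width types have the same size on every platform that provides them.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t is exactly 32 bits");
static_assert(sizeof(std::int64_t) * CHAR_BIT == 64, "int64_t is exactly 64 bits");

// For the built-in types only lower bounds and an ordering are guaranteed.
static_assert(sizeof(int) * CHAR_BIT >= 16, "int has at least 16 bits");
static_assert(sizeof(long) >= sizeof(int), "long is at least as wide as int");

// Width of long on the platform this is compiled for (commonly 32 or 64).
constexpr int long_bits() { return static_cast<int>(sizeof(long) * CHAR_BIT); }
```

Code that stores file formats or network data is usually safer with the fixed-width types, since `long_bits()` may differ between, for example, 64-bit Linux and 64-bit Windows.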
Portability by VM
- "Binary" portability
  - Old technique for ensuring portability of code among different HW
  - Used by many "modern" languages
    - Java, C#
- Compiler translates source code to an intermediate language
  - Abstract instructions
  - Java: bytecode, C#: CIL
- Native VM compiled for a given architecture
  - Java: JRE, C#: CLR
- VM interprets the intermediate language in a sandbox
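The idea of abstract instructions interpreted by a native VM can be sketched with a toy stack machine. The instruction set (`Push`, `Add`, `Mul`, `Halt`) and the function `run` are invented for this illustration and are unrelated to real Java bytecode or CIL.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical abstract instruction set of a tiny stack machine.
enum class Op { Push, Add, Mul, Halt };

struct Instr {
    Op op;
    int operand;  // used only by Push
};

// Interprets the program and returns the value on top of the stack at Halt.
int run(const std::vector<Instr>& program) {
    std::vector<int> stack;
    // Checked pop: the interpreter never lets a program read outside its stack.
    auto pop = [&stack]() {
        if (stack.empty()) throw std::runtime_error("stack underflow");
        int v = stack.back();
        stack.pop_back();
        return v;
    };
    for (std::size_t pc = 0; pc < program.size(); ++pc) {
        const Instr& in = program[pc];
        switch (in.op) {
            case Op::Push: stack.push_back(in.operand); break;
            case Op::Add: { int b = pop(); int a = pop(); stack.push_back(a + b); break; }
            case Op::Mul: { int b = pop(); int a = pop(); stack.push_back(a * b); break; }
            case Op::Halt: return pop();
        }
    }
    throw std::runtime_error("program ended without Halt");
}
```

The program `Push 2, Push 3, Add, Push 4, Mul, Halt` evaluates (2 + 3) * 4 and yields 20. The same instruction sequence runs unchanged wherever `run` has been compiled, which is the portability argument of the slide in miniature.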
Solving speed problems
- JIT
  - Just-in-Time
  - Translates intermediate code to native code on demand
- AOT
  - Ahead-of-Time
  - Translates the whole program in intermediate code to native code during installation