Languages and Compilers (SProg og Oversættere)
Bent Thomsen
Department of Computer Science
Aalborg University
With acknowledgement to Juan Carlos Guzmán and Elsa Gunter, whose slides this lecture is based on.

Some more Programming Language Design Issues
• A programming model (sometimes called the computer) is defined by the language semantics
  – More about this in the semantics course
• A second programming model is given by the underlying system
  – Hardware platform and operating system
• The language processing system must define the mapping between these two programming models (or computers); this mapping can be influenced in both directions
  – E.g. low-level features in high-level languages
    • Pointers, arrays, for-loops
  – Hardware support for fast procedure calls

Programming Language Implementation
• Develop layers of machines, each more primitive than the previous
• Translate between successive layers
• End at basic layer
• Ultimately hardware machine at bottom
• To design programming languages and compilers, we thus need to understand a bit about computers ;-)

Why So Many Computers?
• It is economically feasible to produce in hardware (or firmware) only relatively simple computers
• More complex or abstract computers are built in software
• There are exceptions
  – EDS machine to run Prolog (or rather WAM)
  – Alice Machine to run Hope

Machines
• Hardware computer: built out of wires, gates, circuit boards, etc.
  – An elaboration of the Von Neumann Machine
• Software-simulated computer: implemented in software, running on top of another computer

• Data
• Primitive Operations
• Sequence Control
• Data Access
• Storage Management
• Operating Environment

Basic Computer Architecture
(Figure: components of a basic computer)
• External files
• Main memory
• Cache memory
• CPU
  – Program counter
  – Interpreter
  – Data registers
  – Arithmetic/Logic Unit

Memory and data
• Memory
  – Registers
    • PC, data or address
  – Main memory (fixed length words, 32 or 64 bits)
  – Cache
  – External
    • Disc, CD-ROM, memory stick, tape drives
  – Order of magnitude in access speed
    • Nanoseconds vs milliseconds
• Built-in data types
  – integers, floating point, fixed length strings, fixed length bit strings

Hardware computer
• Operations
  – Arithmetic on primitive data
  – Tests (test for zero, positive or negative)
  – Primitive access and modification
  – Jumps (unconditional, return)
• Sequence control
  – Next instruction in PC (location counter)
  – Some instructions modify PC
• Data access
  – Reading and writing
  – Words from main memory, blocks from external storage
• Storage management
  – Wait for data or multi-programming
  – Paging
  – Cache (32 K usually gives 95% hit rate)

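Micro Program interpretation and execution
(Figure: the interpretation cycle)
• Fetch next instruction
• Decode instruction
  – Operation and operands
• Fetch designated operands
• Branch to designated operation
• Execute primitive operation (then return to fetching the next instruction)
• Execute halt (ends the cycle)
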
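The cycle on this slide can be sketched as a small software-simulated machine. This is only an illustration: the instruction names (LOAD, ADD, STORE, JUMP, HALT), the single accumulator register, and the function name `run` are hypothetical and not taken from the slides.

```python
def run(program, memory):
    """Simulate a fetch-decode-execute loop for a toy accumulator machine."""
    pc = 0    # program counter: index of the next instruction
    acc = 0   # a single data register (accumulator)
    while True:
        opcode, operand = program[pc]  # fetch the instruction and decode it
        pc += 1                        # default sequence control: next instruction
        if opcode == "LOAD":           # branch to the designated operation
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "JUMP":         # some instructions modify the PC
            pc = operand
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# X = X + 10, with X stored at address 0 and the constant 10 at address 1
memory = [5, 10]
program = [("LOAD", 0), ("ADD", 1), ("STORE", 0), ("HALT", None)]
result = run(program, memory)
```

After the run, address 0 holds the updated value of X.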
Virtual Computers
• How can we execute programs written for the high-level computer, given that all we have is the low-level computer?
  – Compilation
    • Translate instructions of the high-level computer into those of the low-level computer
  – Simulation (interpretation)
    • Create a virtual machine
  – Sometimes the simulation is done by hardware
    • This is called firmware

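A minimal sketch of the two routes, assuming a toy "high-level computer" whose programs are arithmetic expression trees with only + and *. All names here (`interpret`, `compile_expr`, `run_stack_machine`, the PUSH/ADD/MUL instructions) are hypothetical.

```python
# Expression tree for (2 + 3) * 4: a node is a number or (operator, left, right)
expr = ("*", ("+", 2, 3), 4)


def interpret(node):
    """Simulation: evaluate the high-level program directly."""
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    a, b = interpret(left), interpret(right)
    return a + b if op == "+" else a * b


def compile_expr(node):
    """Compilation: translate the tree into code for a lower-level stack machine."""
    if isinstance(node, (int, float)):
        return [("PUSH", node)]
    op, left, right = node
    return compile_expr(left) + compile_expr(right) + [("ADD" if op == "+" else "MUL", None)]


def run_stack_machine(code):
    """The low-level computer: executes PUSH, ADD and MUL on an operand stack."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if opcode == "ADD" else a * b)
    return stack.pop()


interpreted = interpret(expr)
compiled = run_stack_machine(compile_expr(expr))
```

Both routes give the same answer. The difference is when the translation work happens: once, before execution (compilation), or repeatedly, during execution (interpretation).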
A Six-Level Computer
(Figure: levels from top to bottom, with what implements each step down)
• Level 5: Application Level (Applications)
  – Compilers, Editors, Navigators
• Level 4: Assembly Language Level
  – Assembler, Linker, Loader
• Level 3: Operating System Machine Level
  – Operating System
• Level 2: Instruction Set Architecture Level
  – Microprogram or hardware
• Level 1: Microarchitecture Level
  – Hardware
• Level 0: Digital Logic Level
from Andrew S. Tanenbaum, Structured Computer Organization, 4th Edition, Prentice Hall, 1999.

A Web Application
(Figure: input and output pass through a stack of virtual computers, top to bottom)
• Web Application Computer (HTML web pages)
• Web Virtual Computer (Navigator implemented in C or Java)
• C Virtual Computer (standard C libraries)
• Operating System Virtual Computer (implemented in "machine instructions")
• Firmware Virtual Computer (implemented in microcode)
• Actual Hardware Computer

Back to programming language design
• Now we know a bit about hardware, firmware and virtual machines
• What is the impact on programming language design?
• We need to make decisions about primitive data, data objects and program structures (syntax design)
• But first we have to talk about binding time!

Binding
• Binding: an association between an attribute and its entity
• Binding Time: when does it happen?
• … and, when can it happen?

Binding of Data Objects and Variables
• Attributes of data objects and variables have different binding times
• If a binding is made before run time and remains fixed through execution, it is called static
• If the binding first occurs or can change during execution, it is called dynamic

Binding Time
Static
• Language definition time
• Language implementation time
• Program writing time
• Compile time
• Link time
• Load time
Dynamic
• Run time
  – At the start of execution (program)
  – On entry to a subprogram or block
  – When the expression is evaluated
  – When the data is accessed

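A small sketch of dynamic bindings, using Python because it binds both the type and the value of a name at run time. The names `describe`, `over_limit` and `LIMIT` are hypothetical.

```python
def describe(x):
    # The type of x is not fixed in the program text; it is bound on each call.
    return type(x).__name__


first = describe(10)      # x is bound to an int on this call
second = describe("ten")  # and to a str on this one

LIMIT = 100


def over_limit(n):
    # LIMIT is looked up when this expression is evaluated,
    # not when the function is defined.
    return n > LIMIT


before = over_limit(150)  # evaluated while LIMIT is 100
LIMIT = 200               # the binding changes during execution
after = over_limit(150)   # evaluated while LIMIT is 200
```

In a statically typed language such as Pascal, the type of `x` would instead be bound at compile time and the first function would need a declared parameter type.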
X = X + 10
• Set of types for variable X
• Type of variable X
• Set of possible values for variable X
• Value of variable X
• Scope of X
  – lexical or dynamic scope
• Representation of constant 10
  – Value (10)
  – Value representation (1010 in base 2)
    • big-endian vs. little-endian
  – Type (int)
  – Storage (4 bytes)
    • stack or global allocation
• Properties of the operator +
  – Overloaded or not

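Some of these attributes can be inspected directly. The sketch below assumes a 4-byte C-style int, requested explicitly through the standard `struct` module; the variable names are hypothetical.

```python
import struct

value = 10

# Value representation: the constant 10 written in base 2
bits = format(value, "b")

# Storage: size in bytes of a standard-size C int in the struct module
size = struct.calcsize("<i")

# Properties of the operator +: in Python it is overloaded,
# so its meaning is bound by the types of its operands.
int_sum = 10 + 10
str_sum = "10" + "10"
```

Here `bits` is the string "1010" and `size` is 4. The two uses of `+` give 20 and "1010" respectively, which shows why "overloaded or not" is a binding decision in its own right.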
Little- vs. Big-Endians
• Big-endian
  – A computer architecture in which, within a given multi-byte numeric representation, the most significant byte has the lowest address (the word is stored `big-end-first').
  – Motorola and Sun processors
• Little-endian
  – A computer architecture in which, within a given 16- or 32-bit word, bytes at lower addresses have lower significance (the word is stored `little-end-first').
  – Intel processors
from The Jargon Dictionary - http://info.astrian.net/jargon

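The two byte orders can be compared on a single 32-bit word. The example value 0x0A0B0C0D and the variable names are chosen for illustration only.

```python
import sys

word = 0x0A0B0C0D

# Big-endian: the most significant byte (0x0A) gets the lowest address
big = word.to_bytes(4, byteorder="big")

# Little-endian: the least significant byte (0x0D) gets the lowest address
little = word.to_bytes(4, byteorder="little")

# Reading big-endian bytes as if they were little-endian gives a different number
misread = int.from_bytes(big, byteorder="little")

# Byte order of the machine running this code: "little" or "big"
host_order = sys.byteorder
```

The byte sequences are 0A 0B 0C 0D and 0D 0C 0B 0A, and `misread` is 0x0D0C0B0A. This is why the byte order must be agreed on whenever binary data moves between machines.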
Static vs. Dynamic Scope
• Under static, sometimes called lexical, scope, sub1 will always reference the x defined in big
• Under dynamic scope, the x it references depends on the dynamic state of execution

procedure big;
  var x: integer;
  procedure sub1;
  begin {sub1}
    ... x ...
  end; {sub1}
  procedure sub2;
    var x: integer;
  begin {sub2}
    ... sub1; ...
  end; {sub2}
begin {big}
  ... sub1; sub2; ...
end; {big}

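Python is statically (lexically) scoped, so a direct transcription of the procedure above shows the static-scope behaviour. The string values are hypothetical placeholders for the elided bodies.

```python
def big():
    x = "x declared in big"

    def sub1():
        return x  # free variable: resolved from the program text, so it is big's x

    def sub2():
        x = "x declared in sub2"  # local to sub2 and invisible to sub1
        return sub1()

    return sub1(), sub2()


direct, via_sub2 = big()
```

Both calls return "x declared in big", regardless of who called `sub1`. Under dynamic scope the second call would have seen the x of `sub2` instead.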
Scope of Variable
• Range of the program that can reference that variable (i.e. access the corresponding data object by the variable's name)
• A variable is local to a program unit or block if it is declared there
• A variable is nonlocal to a program unit if it is visible there but not declared there

Static Scoping
• Scope computed at compile time, based on program text
• To resolve the name of a used variable, we must find the statement declaring that variable
• Subprograms and blocks generate a hierarchy of scopes
  – The subprogram or block that declares the current subprogram or contains the current block is its static parent
• General procedure to find the declaration:
  – First see if the variable is local; if yes, done
  – If it is non-local to the current subprogram or block, recursively search the static parent until the declaration is found
  – If no declaration is found this way, an undeclared variable error is detected

Example

program main;
  var x : integer;
  procedure sub1;
    var x : integer;
  begin { sub1 }
    … x …
  end; { sub1 }
begin { main }
  … x …
end; { main }

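The static-scoping search (local first, then each static parent in turn) can be sketched as a chain of scope tables, as a compiler might keep them. The class `Scope` and its methods are hypothetical, and the variable `y` is added to the example only to show a non-local lookup.

```python
class Scope:
    """Declarations of one subprogram or block, linked to its static parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.declarations = {}

    def declare(self, var, type_name):
        self.declarations[var] = type_name

    def lookup(self, var):
        scope = self
        while scope is not None:
            if var in scope.declarations:  # found locally in this scope
                return scope.name, scope.declarations[var]
            scope = scope.parent           # otherwise search the static parent
        raise NameError(f"undeclared variable: {var}")


main = Scope("main")
main.declare("x", "integer")
main.declare("y", "integer")  # not in the slide example; added for illustration

sub1 = Scope("sub1", parent=main)
sub1.declare("x", "integer")

x_in_sub1 = sub1.lookup("x")  # sub1's own x hides the x of main
y_in_sub1 = sub1.lookup("y")  # non-local: found in the static parent
x_in_main = main.lookup("x")
```

Looking up a name that no enclosing scope declares raises `NameError`, which corresponds to the undeclared variable error.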
Dynamic Scope
• Now generally thought to have been a mistake
• Main example of use: original versions of LISP
  – Common LISP uses static scope
  – Perl allows variables to be declared to have dynamic scope
• Determined by the calling sequence of program units, not static layout
• Name bound to corresponding variable most recently declared among still active subprograms and blocks

Example

program main;
  var x : integer;
  procedure sub1;
  begin { sub1 }
    … x …
  end; { sub1 }
  procedure sub2;
    var x : integer;
  begin { sub2 }
    … call sub1 …
  end; { sub2 }
begin { main }
  … call sub2 …
end; { main }

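Python itself is statically scoped, so dynamic scope has to be simulated. The sketch below keeps an explicit stack of active subprograms and resolves a name by searching from the most recent activation backwards. The names `call_stack` and `lookup` are hypothetical, and the direct call to `sub1` from `main` is added for contrast.

```python
call_stack = []  # active subprograms, most recently entered last


def lookup(name):
    """Find the most recently declared variable among still-active subprograms."""
    for _unit, variables in reversed(call_stack):
        if name in variables:
            return variables[name]
    raise NameError(f"undeclared variable: {name}")


def sub1():
    call_stack.append(("sub1", {}))  # sub1 declares no x of its own
    try:
        return lookup("x")
    finally:
        call_stack.pop()


def sub2():
    call_stack.append(("sub2", {"x": "x declared in sub2"}))
    try:
        return sub1()
    finally:
        call_stack.pop()


def main():
    call_stack.append(("main", {"x": "x declared in main"}))
    try:
        return sub1(), sub2()  # direct call, then call via sub2
    finally:
        call_stack.pop()


direct, via_sub2 = main()
```

The same reference to x in `sub1` resolves to the x of main when called directly and to the x of sub2 when called through `sub2`, because the binding follows the calling sequence rather than the program text.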
Binding Times summary
• Language definition time:
  – language syntax and semantics, scope discipline
• Language implementation time:
  – interpreter versus compiler,
  – aspects left flexible in definition,
  – set of available libraries
• Compile time:
  – some initial data layout, internal data structures
• Link time (load time):
  – binding of values to identifiers across program modules
• Run time (execution time):
  – actual values assigned to non-constant identifiers
The programming language designer and compiler implementer have to make decisions about binding times

Syntax Design Criteria
• Readability
  – syntactic differences reflect semantic differences
  – verbose, redundant
• Writeability
  – concise
• Ease of translation
  – simple language
  – simple semantics
• Lack of ambiguity
  – dangling else
  – Fortran's A(I, J)
• Ease of verifiability
  – simple semantics

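The dangling else arises in a statement such as "if a then if b then s1 else s2", where the else could belong to either if. Python's mandatory indentation rules out the ambiguity, so the two possible readings can be written as two separate functions. The function names and return values are hypothetical.

```python
def else_binds_to_inner(a, b):
    # Reading 1: the else belongs to the nearest (inner) if
    if a:
        if b:
            return "s1"
        else:
            return "s2"
    return None


def else_binds_to_outer(a, b):
    # Reading 2: the else belongs to the outer if
    if a:
        if b:
            return "s1"
    else:
        return "s2"
    return None


inner_reading = else_binds_to_inner(True, False)
outer_reading = else_binds_to_outer(True, False)
```

For a true and b false, the first reading executes s2 and the second executes nothing, so a language definition has to pick one reading or design the syntax so that only one is possible.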
Lexical Elements
• Character set
• Identifiers
• Operators
• Keywords
• Noise words
• Elementary data
  – numbers
    • integers
    • floating point
  – strings
  – symbols
• Delimiters
• Comments
• Blank space
• Layout
  – Free- and fixed-field formats

Some nitty gritty decisions
• Primitive data
  – Integers, floating points, bit strings
  – Machine dependent or independent (standards like IEEE)
  – Boxed or unboxed
• Character set
  – ASCII, EBCDIC, UNICODE
• Identifiers
  – Length, special start symbol (#, $, ...), type encoded in start letter
• Operator symbols
  – Infix, prefix, postfix, precedence
• Comments
  – REM, /* … */, //, !, …
• Blanks
• Delimiters and brackets
• Reserved words or Keywords (noise words)

Syntactic Elements
• Definitions
• Declarations
• Expressions
• Statements

• Separate subprogram definitions (Module system)
• Separate data definitions
• Nested subprogram definitions
• Separate interface definitions

Overall Program Structure
• Subprograms
  – shallow definitions
    • C
  – nested definitions
    • Pascal
• Data (OO)
  – shallow definitions
    • C++, Java, Smalltalk
• Separate Interface
  – C, Fortran
  – ML, Ada
• Mixed data and programs
  – C
  – Basic
• Others
  – Cobol
    • Data description separated from executable statements
    • Data and procedure division

Some advice from an expert
• Programming languages are for people
• Design for yourself and your friends
• Give the programmer as much control as possible
• Aim for brevity
• Admit what hacking is