Inductive Definitions
COS 510
David Walker


Inductive Definitions
Inductive definitions play a central role in the study of programming languages. They specify the following aspects of a language:
• Concrete syntax (via CFGs)
• Abstract syntax (via CFGs/ML datatypes)
• Static semantics (via typing rules)
• Dynamic semantics (via evaluation rules)

Reading
• Read Pierce’s text:
  – Chapter 2 (skim definitions; understand 2.4)
    • we will use sets, relations, functions, sequences
    • you should know basics such as mathematical induction, reflexivity, transitivity, symmetry, total and partial orders, domains and ranges of functions, etc.
  – Chapter 3

Inductive Definitions
• An inductive definition consists of:
  – One or more judgments (i.e., assertions)
  – A set of rules for deriving these judgments
• For example:
  – Judgment is “n nat”
  – Rules:
    • zero nat
    • if n nat, then succ(n) nat

Inference Rule Notation
Inference rules are normally written as:
$$\frac{J_1 \quad \cdots \quad J_n}{J}$$
where $J$ and $J_1, \ldots, J_n$ are judgments. (For axioms, $n = 0$.)

An example
For example, the rules for deriving n nat are usually written:
$$\frac{}{\text{zero}\ \text{nat}} \qquad \frac{n\ \text{nat}}{\text{succ}(n)\ \text{nat}}$$

Derivation of Judgments
• A judgment J is derivable iff either
  – there is an axiom $\frac{}{J}$
  – or there is a rule $\frac{J_1 \quad \cdots \quad J_n}{J}$ such that $J_1, \ldots, J_n$ are derivable

Derivation of Judgments
• We may determine whether a judgment is derivable by working backwards.
• For example, the judgment succ(succ(zero)) nat is derivable as follows:
$$\dfrac{\dfrac{\dfrac{}{\text{zero}\ \text{nat}}\ (\text{zero})}{\text{succ}(\text{zero})\ \text{nat}}\ (\text{succ})}{\text{succ}(\text{succ}(\text{zero}))\ \text{nat}}\ (\text{succ})$$
• The tree above is a derivation (i.e., a proof). Writing the names of the rules used at each step is optional.
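Working backwards can be sketched in a few lines of Python. This is only an illustration under an assumed encoding that is not part of the slides: the term zero is the string "zero" and succ(t) is the tuple ("succ", t). The hypothetical function derive_nat returns the rule names of a derivation, read from the axiom down to the conclusion, or None when no rule applies.

```python
# Assumed encoding (not from the slides): "zero" or ("succ", t).
def derive_nat(t):
    """Return the rule names deriving `t nat`, axiom first, or None."""
    if t == "zero":
        return ["zero"]                      # axiom: zero nat
    if isinstance(t, tuple) and len(t) == 2 and t[0] == "succ":
        premise = derive_nat(t[1])           # work backwards to the premise
        return None if premise is None else premise + ["succ"]
    return None                              # no rule concludes `t nat`


two = ("succ", ("succ", "zero"))
print(derive_nat(two))                       # rule names for succ(succ(zero)) nat
print(derive_nat(("succ", "oops")))          # None: the premise is not derivable
```

The recursion follows the shape of the rules: one branch per rule, and one recursive call per premise.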

Binary Trees
• Here is a set of rules defining the judgment t tree, stating that t is a binary tree:
$$\frac{}{\text{empty}\ \text{tree}} \qquad \frac{t_1\ \text{tree} \quad t_2\ \text{tree}}{\text{node}(t_1, t_2)\ \text{tree}}$$
• Prove that the following is a valid judgment: node(empty, node(empty, empty)) tree

Rule Induction
• By definition, every derivable judgment
  – is the consequence of some rule...
  – whose premises are derivable
• That is, the rules are an exhaustive description of the derivable judgments
• Just like an ML datatype definition is an exhaustive description of all the objects in the type being defined

Rule Induction
• To show that every derivable judgment has a property P, it is enough to show that
  – for every rule $\frac{J_1 \quad \cdots \quad J_n}{J}$, if $J_1, \ldots, J_n$ have the property P, then $J$ has property P
• This is the principle of rule induction.

Example: Natural Numbers
• Consider the rules for n nat
• We can prove that the property P holds of every n such that n nat by rule induction:
  – Show that P holds of zero;
  – Assuming that P holds of n, show that P holds of succ(n).
• This is just ordinary mathematical induction!

Example: Binary Tree
• Similarly, we can prove that every binary tree t has a property P by showing that
  – empty has property P;
  – if $t_1$ has property P and $t_2$ has property P, then node($t_1$, $t_2$) has property P.
• This might be called tree induction.

Example: The Height of a Tree
• Consider the following equations:
  – $\text{hgt}(\text{empty}) = 0$
  – $\text{hgt}(\text{node}(t_1, t_2)) = 1 + \max(\text{hgt}(t_1), \text{hgt}(t_2))$
• Claim: for every binary tree t there exists a unique integer n such that hgt(t) = n.
• That is, the above equations define a function.

Example: The Height of a Tree
• We will prove the claim by rule induction:
  – If t is derivable by the axiom $\frac{}{\text{empty}\ \text{tree}}$, then n = 0 is determined by the first equation: hgt(empty) = 0
  – Is it unique? Yes.

Example: The Height of a Tree
• If t is derivable by the rule
$$\frac{t_1\ \text{tree} \quad t_2\ \text{tree}}{\text{node}(t_1, t_2)\ \text{tree}}$$
  then we may assume that:
  – there exists a unique $n_1$ such that $\text{hgt}(t_1) = n_1$;
  – there exists a unique $n_2$ such that $\text{hgt}(t_2) = n_2$.
• Hence, there exists a unique n, namely $1 + \max(n_1, n_2)$, such that hgt(t) = n.

Example: The Height of a Tree
This is awfully pedantic, but it is useful to see the details at least once.
• It is not obvious a priori that a tree has a well-defined height!
• Rule induction justified the existence of the function hgt. It is “obvious” from the equations that there is at most one n such that hgt(t) = n. The proof shows that there exists at least one.
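The two height equations translate directly into a recursive function, one branch per rule of the t tree judgment. The sketch below assumes an encoding that is not in the slides: empty is the string "empty" and node(t1, t2) is the tuple ("node", t1, t2). The helper is_tree is a hypothetical checker for the judgment itself.

```python
# Assumed encoding (not from the slides): "empty" or ("node", t1, t2).
def is_tree(t):
    """Check the judgment `t tree` by working backwards through the rules."""
    if t == "empty":
        return True                                   # axiom: empty tree
    return (isinstance(t, tuple) and len(t) == 3 and t[0] == "node"
            and is_tree(t[1]) and is_tree(t[2]))      # both premises must hold


def hgt(t):
    """Height of a tree, following the two defining equations."""
    if t == "empty":
        return 0                                      # hgt(empty) = 0
    _, t1, t2 = t
    return 1 + max(hgt(t1), hgt(t2))                  # hgt(node(t1, t2))


t = ("node", "empty", ("node", "empty", "empty"))
print(is_tree(t), hgt(t))
```

The function terminates on every tree for the same reason the proof goes through: each recursive call is on a premise of the rule that built the tree.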

Inductive Definitions in PL
• In this course, we will be looking at inductive definitions that determine
  – abstract syntax
  – static semantics (typing)
  – dynamic semantics (evaluation)
  – other properties of programs and programming languages

Inductive Definitions First up: Syntax


Abstract vs Concrete Syntax
• the concrete syntax of a program is a string of characters:
  – ‘(’ ‘3’ ‘+’ ‘2’ ‘)’ ‘*’ ‘7’
• the abstract syntax of a program is a tree representing the computationally relevant portion of the program:

          *
         / \
        +   7
       / \
      3   2

Abstract vs Concrete Syntax
• the concrete syntax of a program contains many elements necessary for parsing:
  – parentheses
  – delimiters for comments
  – rules for precedence of operators
• the abstract syntax of a program is much simpler; it does not contain these elements
  – precedence is given directly by the tree structure

Abstract vs Concrete Syntax
• in this class, we work with abstract syntax
  – we want to define what programs mean
  – we will work with the simple ASTs
• nevertheless, we need a notation for writing down abstract syntax trees
  – when we write (3 + 2) * 7, you should visualize the tree:

          *
         / \
        +   7
       / \
      3   2
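To make the distinction concrete, here is a small Python sketch of the tree for (3 + 2) * 7. The encoding is an assumption made for illustration: a leaf is a Python int and an interior node is a tuple (operator, left, right). The hypothetical functions show and evaluate both recurse over the tree, and neither needs parentheses or precedence rules, because grouping is already fixed by the tree structure.

```python
# Assumed encoding (not from the slides): an int leaf or (op, left, right).
ast = ("*", ("+", 3, 2), 7)          # the tree for (3 + 2) * 7


def show(t):
    """Print a tree back as fully parenthesised concrete syntax."""
    if isinstance(t, int):
        return str(t)
    op, left, right = t
    return "(" + show(left) + " " + op + " " + show(right) + ")"


def evaluate(t):
    """Compute the number a tree denotes; only + and * are handled."""
    if isinstance(t, int):
        return t
    op, left, right = t
    l, r = evaluate(left), evaluate(right)
    return l + r if op == "+" else l * r


print(show(ast), "=", evaluate(ast))
```

Swapping the nesting, as in ("+", 3, ("*", 2, 7)), gives a different tree and a different value, even though the same tokens appear in the concrete syntax.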

Arithmetic Expressions, Informally
• Informally, an arithmetic expression e is
  – a boolean value
  – an if statement (if e1 then e2 else e3)
  – the number zero
  – the successor of a number
  – the predecessor of a number
  – a test for zero (iszero e)

Arithmetic Expressions, Formally
• An arithmetic expression e is a boolean, an if statement, a zero, a successor, a predecessor or a zero test:
$$\frac{}{\text{true}\ \text{exp}} \qquad \frac{}{\text{false}\ \text{exp}} \qquad \frac{}{\text{zero}\ \text{exp}}$$
$$\frac{e_1\ \text{exp} \quad e_2\ \text{exp} \quad e_3\ \text{exp}}{\text{if } e_1 \text{ then } e_2 \text{ else } e_3\ \text{exp}}$$
$$\frac{e\ \text{exp}}{\text{succ } e\ \text{exp}} \qquad \frac{e\ \text{exp}}{\text{pred } e\ \text{exp}} \qquad \frac{e\ \text{exp}}{\text{iszero } e\ \text{exp}}$$

BNF
• Defining every bit of syntax by inductive definitions can be lengthy and tedious
• Syntactic definitions are an especially simple form of inductive definition:
  – context insensitive
  – unary predicates
• There is a very convenient abbreviation: BNF

Arithmetic Expressions, in BNF
e ::= true | false | if e then e else e | 0 | succ e | pred e | iszero e
• the “e” on the left: pick a new letter (Greek symbol/word) to represent any object in the set of objects being defined
• “|” separates alternatives (7 alternatives implies 7 inductive rules)
• each “e” on the right: a subterm/subobject is any “e” object

An alternative definition
b ::= true | false
e ::= b | if e then e else e | 0 | succ e | pred e | iszero e
• corresponds to two inductively defined judgments:
  1. b bool
  2. e exp
• the key rule is an inclusion of booleans in expressions:
$$\frac{b\ \text{bool}}{b\ \text{exp}}$$

Metavariables
b ::= true | false
e ::= b | if e then e else e | 0 | succ e | pred e | iszero e
• b and e are called metavariables
• they stand for classes of objects, programs, and other things
• they must not be confused with program variables
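Since the slides compare inductive definitions to ML datatypes, it may help to see the two-judgment grammar as a checker. The Python sketch below is an assumption-laden illustration, not the course's representation: constants are the strings "true", "false" and "zero" (standing for 0), and compound terms are tuples tagged with the constructor name. The hypothetical functions is_bool and is_exp correspond to the judgments b bool and e exp.

```python
# Assumed encoding (not from the slides):
#   "true" | "false" | "zero" | ("succ", e) | ("pred", e) | ("iszero", e)
#   | ("if", e1, e2, e3)
def is_bool(b):
    """The judgment `b bool`: two axioms."""
    return b in ("true", "false")


def is_exp(e):
    """The judgment `e exp`: one branch per BNF alternative."""
    if is_bool(e) or e == "zero":           # inclusion rule and the axiom for 0
        return True
    if isinstance(e, tuple) and len(e) == 2 and e[0] in ("succ", "pred", "iszero"):
        return is_exp(e[1])
    if isinstance(e, tuple) and len(e) == 4 and e[0] == "if":
        return all(is_exp(sub) for sub in e[1:])
    return False                            # nothing else is an expression


print(is_exp(("if", "true", "zero", ("succ", "zero"))))
print(is_exp(("succ",)))
```

Note that e and b in the grammar are metavariables, while e and b in the code are ordinary Python parameters ranging over encoded terms.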

2 Functions defined over Terms
constants(true) = {true}
constants(false) = {false}
constants(0) = {0}
constants(succ e) = constants(pred e) = constants(iszero e) = constants e
$\text{constants}(\text{if } e_1 \text{ then } e_2 \text{ else } e_3) = \bigcup_{i=1}^{3} \text{constants}\ e_i$

size(true) = 1
size(false) = 1
size(0) = 1
size(succ e) = size(pred e) = size(iszero e) = size e + 1
$\text{size}(\text{if } e_1 \text{ then } e_2 \text{ else } e_3) = \sum_{i=1}^{3} (\text{size}\ e_i) + 1$
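Both functions are defined by one equation per syntactic form, so they can be written as structural recursion. The sketch below assumes an encoding that is not in the slides: the strings "true", "false" and "zero" (for 0) are the constants, and compound terms are tuples such as ("succ", e) and ("if", e1, e2, e3).

```python
# Assumed encoding (not from the slides): strings for constants, tagged tuples otherwise.
def constants(e):
    """Set of distinct constants occurring in e."""
    if e in ("true", "false", "zero"):
        return {e}
    if e[0] in ("succ", "pred", "iszero"):
        return constants(e[1])
    _, e1, e2, e3 = e                        # the `if` form
    return constants(e1) | constants(e2) | constants(e3)


def size(e):
    """Number of nodes in the abstract syntax tree of e."""
    if e in ("true", "false", "zero"):
        return 1
    if e[0] in ("succ", "pred", "iszero"):
        return size(e[1]) + 1
    _, e1, e2, e3 = e                        # the `if` form
    return size(e1) + size(e2) + size(e3) + 1


e = ("if", ("iszero", "zero"), "zero", ("succ", "zero"))
print(constants(e), size(e))
```

The union in constants and the sum in size are the only places where the two definitions differ in shape, which is where the lemma relating them gets its content.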

A Lemma
• The number of distinct constants in any expression e is no greater than the size of e:
  | constants e | ≤ size e
• How to prove it?
  – By rule induction on the rules for “e exp”
  – More commonly called induction on the structure of e
  – a form of “structural induction”

Structural Induction
• Suppose P is a predicate on expressions.
  – structural induction:
    • for each expression e, we assume P(e’) holds for each subexpression e’ of e and go on to prove P(e)
    • result: we know P(e) for all expressions e
  – you’ll use this idea every single week in the rest of the course.

Back to the Lemma
• The number of distinct constants in any expression e is no greater than the size of e:
  | constants e | ≤ size e
• Proof: By induction on the structure of e. [always state method first]
  case e is 0, true, false: ...
  case e is succ e’, pred e’, iszero e’: ...
  case e is (if e1 then e2 else e3): ...
  [separate cases (1 case per rule)]

The Lemma
• Lemma: | constants e | ≤ size e
• Proof (2-column proof: calculation on the left, justification on the right): ...
  case e is 0, true, false:
    | constants e | = |{e}|     (by def of constants)
                    = 1         (simple calculation)
                    = size e    (by def of size)

A Lemma
• Lemma: | constants e | ≤ size e
  ...
  case e is pred e’:
    | constants e | = |constants e’|    (def of constants)
                    ≤ size e’           (IH)
                    < size e            (by def of size)

A Lemma
• Lemma: | constants e | ≤ size e
  ...
  case e is (if e1 then e2 else e3):
$$\begin{aligned}
|\text{constants}\ e| &= \Big|\bigcup_{i=1}^{3} \text{constants}\ e_i\Big| && \text{(def of constants)} \\
&\le \sum_{i=1}^{3} |\text{constants}\ e_i| && \text{(property of sets)} \\
&\le \sum_{i=1}^{3} \text{size}\ e_i && \text{(IH on each } e_i\text{)} \\
&< \text{size}\ e && \text{(def of size)}
\end{aligned}$$

A Lemma
• Lemma: | constants e | ≤ size e
  ...
  other cases are similar. [this had better be true]
  QED [use Latin to show off]
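A proof by structural induction covers all expressions; a program can only spot-check finitely many. Still, a quick exhaustive check over small terms is a useful sanity test before writing the proof. The sketch below repeats the two functions under the same assumed tuple encoding (strings "true", "false", "zero" for constants; tagged tuples otherwise) and enumerates every expression up to a given nesting depth with the hypothetical helper exprs.

```python
from itertools import product


# Assumed encoding (not from the slides): strings for constants, tagged tuples otherwise.
def constants(e):
    if e in ("true", "false", "zero"):
        return {e}
    if e[0] in ("succ", "pred", "iszero"):
        return constants(e[1])
    return constants(e[1]) | constants(e[2]) | constants(e[3])


def size(e):
    if e in ("true", "false", "zero"):
        return 1
    if e[0] in ("succ", "pred", "iszero"):
        return size(e[1]) + 1
    return size(e[1]) + size(e[2]) + size(e[3]) + 1


def exprs(depth):
    """All expressions whose nesting depth is at most `depth`."""
    base = ["true", "false", "zero"]
    if depth == 0:
        return base
    smaller = exprs(depth - 1)
    unary = [(op, e) for op in ("succ", "pred", "iszero") for e in smaller]
    ifs = [("if", a, b, c) for a, b, c in product(smaller, repeat=3)]
    return base + unary + ifs


# Collect every small expression that would contradict the lemma.
violations = [e for e in exprs(2) if len(constants(e)) > size(e)]
print("violations found:", len(violations))
```

An empty list of violations is evidence, not a proof: the induction argument is what extends the claim to expressions of every depth.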

A Lemma
• In reality, this lemma is so simple that you might not bother to write down all the details
  – “By induction on the structure of e.” is a sufficient statement
• BUT, when you omit the details of a proof, you had better be sure it is trivial!
  – when in doubt, present the details.
• NEVER hand-wave through a proof
  – it is better to admit you don’t know than to fake it
  – if you cannot do part of the proof for homework, explicitly state the part of the proof that fails (if I had lemma X here, then...)

What is a proof?
• A proof is an easily-checked justification of a judgment (i.e., a theorem)
  – different people have different ideas about what “easily-checked” means
  – the more formal a proof, the more “easily-checked”
  – in this class, we have a pretty high bar
• If there is one thing you’ll learn in this class, it is how to write a proof!

Inductive Definitions Next up: Evaluation


Evaluation
• There are many different ways to formalize the evaluation of expressions
• In this course we will use different sorts of operational semantics
  – direct expression of how an interpreter works
  – can be implemented in ML directly
  – easy to prove things about
  – scales up to complete languages easily

Values
• A value is an object that has been completely evaluated
• The values in our language of arithmetic expressions are
  v ::= true | false | zero | succ v
• These values are a subset of the expressions
• By calling “succ v” a value, we’re treating “succ v” like a piece of data; “succ v” is not function application
  – “succ zero” is a value that represents 1
  – “succ (succ zero)” is the value that represents 2
  – we are counting in unary
• Remember, there is an inductive definition behind all this
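The grammar for v is itself an inductive definition, so membership can be checked by recursion. The Python sketch below assumes an encoding that is not in the slides (strings "true", "false", "zero"; tuples such as ("succ", v)). It follows the grammar exactly as written, so ("succ", "true") also counts as a value here. The hypothetical helper to_int reads a unary numeral back as a Python integer.

```python
# Assumed encoding (not from the slides): strings for constants, tagged tuples otherwise.
def is_value(e):
    """Membership in v ::= true | false | zero | succ v."""
    if e in ("true", "false", "zero"):
        return True
    return (isinstance(e, tuple) and len(e) == 2
            and e[0] == "succ" and is_value(e[1]))


def to_int(v):
    """Read a unary numeral built only from zero and succ."""
    n = 0
    while v != "zero":
        if not (isinstance(v, tuple) and len(v) == 2 and v[0] == "succ"):
            raise ValueError("not a numeral")
        v = v[1]        # peel one succ
        n += 1
    return n


two = ("succ", ("succ", "zero"))
print(is_value(two), to_int(two))
print(is_value(("pred", "zero")))
```

Treating succ as a data constructor is what lets to_int count the layers directly; no evaluation happens inside a value.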

Defining evaluation
• single-step evaluation judgment: e → e’
• in English, we say “expression e evaluates to e’ in a single step”
• evaluation rules for booleans (rules like these do the “real work”):
$$\frac{}{\text{if true then } e_2 \text{ else } e_3 \to e_2} \qquad \frac{}{\text{if false then } e_2 \text{ else } e_3 \to e_3}$$
• what if the first position in the “if” is not true or false? We need a “search” rule:
$$\frac{e_1 \to e_1'}{\text{if } e_1 \text{ then } e_2 \text{ else } e_3 \to \text{if } e_1' \text{ then } e_2 \text{ else } e_3}$$

Defining evaluation
• single-step evaluation judgment: e → e’
• evaluation rules for numbers:
$$\frac{e \to e'}{\text{succ } e \to \text{succ } e'} \qquad \frac{e \to e'}{\text{pred } e \to \text{pred } e'} \qquad \frac{e \to e'}{\text{iszero } e \to \text{iszero } e'}$$
$$\frac{}{\text{pred (succ } v) \to v} \qquad \frac{}{\text{iszero (succ } v) \to \text{false}} \qquad \frac{}{\text{iszero (zero)} \to \text{true}}$$

Defining evaluation
• single-step evaluation judgment: e → e’
• other evaluation rules:
  – there are none!
• Consider the term iszero true
  – We call such terms stuck
  – They aren’t values, but no rule applies
  – They are nonsensical programs
  – An interpreter for our language will either raise an exception when it tries to evaluate a stuck program or maybe do something random or even crash!
  – It is a bad scene.
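The single-step rules can be read as a function that returns the next expression, or nothing when no rule applies. The Python sketch below is one possible reading under an assumed encoding that is not in the slides (strings "true", "false", "zero"; tagged tuples for compound terms). The names step and is_stuck are hypothetical. It implements only the rules listed on the slides, so a term such as pred zero has no step in this sketch.

```python
# Assumed encoding (not from the slides): strings for constants, tagged tuples otherwise.
def is_value(e):
    return e in ("true", "false", "zero") or (
        isinstance(e, tuple) and e[0] == "succ" and is_value(e[1]))


def step(e):
    """Return e' such that e -> e', or None if no rule applies."""
    if not isinstance(e, tuple):
        return None                               # constants do not step
    op = e[0]
    if op == "if":
        _, e1, e2, e3 = e
        if e1 == "true":
            return e2                             # if true then e2 else e3 -> e2
        if e1 == "false":
            return e3                             # if false then e2 else e3 -> e3
        e1p = step(e1)                            # search rule for the condition
        return None if e1p is None else ("if", e1p, e2, e3)
    arg = e[1]
    is_succ_value = (isinstance(arg, tuple) and arg[0] == "succ"
                     and is_value(arg[1]))
    if op == "pred" and is_succ_value:
        return arg[1]                             # pred (succ v) -> v
    if op == "iszero" and arg == "zero":
        return "true"                             # iszero zero -> true
    if op == "iszero" and is_succ_value:
        return "false"                            # iszero (succ v) -> false
    argp = step(arg)                              # search rules for succ/pred/iszero
    return None if argp is None else (op, argp)


def is_stuck(e):
    """Stuck: not a value, yet no rule applies."""
    return not is_value(e) and step(e) is None


print(step(("if", "true", "zero", "false")))
print(is_stuck(("iszero", "true")))
```

Returning None for a stuck term is a design choice of this sketch; an interpreter could equally raise an exception at that point.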

Defining evaluation
• Multistep evaluation: e →* e’
• In English: “e evaluates to e’ in some number of steps (possibly 0)”:
$$\frac{}{e \to^* e}\ (\text{reflexivity}) \qquad \frac{e \to e'' \quad e'' \to^* e'}{e \to^* e'}\ (\text{transitivity})$$
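As a worked illustration (the example term is chosen here, not taken from the slides), consider evaluating "if iszero zero then zero else succ zero". Each line is one single step, and the multi-step judgment is assembled by using transitivity once per step and reflexivity at the end.

```latex
\begin{align*}
& \text{if iszero zero then zero else succ zero} \\
\to{} & \text{if true then zero else succ zero}
  && \text{(search rule, since iszero zero} \to \text{true)} \\
\to{} & \text{zero}
  && \text{(rule for if true)}
\end{align*}
% Assembling the multi-step judgment, from the bottom up:
%   zero ->* zero                                   (reflexivity)
%   if true then zero else succ zero ->* zero       (transitivity)
%   if iszero zero then zero else succ zero ->* zero (transitivity)
```

So a derivation of e →* e’ contains exactly one use of transitivity per single step taken.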

Single-step Induction
• We have defined the evaluation rules inductively, so we get a proof principle:
  – Given a property P of the single-step rules
  – For each rule $\frac{e_1 \to e_1' \quad \cdots \quad e_k \to e_k'}{e \to e'}$
  – we get to assume $P(e_i \to e_i')$ for i = 1..k and must prove the conclusion $P(e \to e')$
  – Result: we know $P(e \to e')$ for all valid judgments with the form e → e’
  – called induction on the structure of the operational semantics

Multi-step Induction
  – Given a property P of the multi-step rules
  – For each rule $\frac{e_1 \to^* e_1' \quad \cdots \quad e_k \to^* e_k'}{e \to^* e'}$
  – we get to assume $P(e_i \to^* e_i')$ for i = 1..k and must prove the conclusion $P(e \to^* e')$

Multi-step Induction
  – In other words, given a property P of the multi-step rules
  – we must prove:
    • $P(e \to^* e)$
    • $P(e \to^* e')$ when $\frac{e \to e'' \quad e'' \to^* e'}{e \to^* e'}$, and we get to assume $P(e'' \to^* e')$ and (of course) any properties we have proven already of the single-step relation $e \to e''$
  – this means, to prove things about multi-step rules, we normally first need to prove a lemma about the single-step rules

A Theorem
• Remember the function size(e) from earlier
• Theorem: if e →* e’ then size(e’) ≤ size(e)
• Proof: By induction on the structure of the multi-step operational rules.
  – consider the transitivity rule:
$$\frac{e \to e'' \quad e'' \to^* e'}{e \to^* e'}$$
  – ... we are going to need a similar property of the single step evaluation function

A Lemma
• Lemma: if e → e’ then size(e’) ≤ size(e)
• Proof: By induction on the structure of the single-step operational rules.
  – one case for each rule, for example:
  – case: $\frac{e \to e'}{\text{succ } e \to \text{succ } e'}$
  – case: $\text{pred (succ } v) \to v$
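For concreteness, the two cases named above can be written out in the two-column style used earlier. This is a sketch of how those cases might go, using only the definition of size and the induction hypothesis (IH) for the premise.

```latex
% Case: the search rule for succ, with premise e -> e'
\begin{align*}
\mathrm{size}(\mathrm{succ}\ e')
  &= \mathrm{size}(e') + 1        && \text{(def of size)} \\
  &\le \mathrm{size}(e) + 1       && \text{(IH on the premise } e \to e'\text{)} \\
  &= \mathrm{size}(\mathrm{succ}\ e) && \text{(def of size)}
\end{align*}

% Case: the axiom pred (succ v) -> v (no premises, so no IH is needed)
\begin{align*}
\mathrm{size}(v)
  &< \mathrm{size}(v) + 2                         && \text{(arithmetic)} \\
  &= \mathrm{size}(\mathrm{succ}\ v) + 1          && \text{(def of size)} \\
  &= \mathrm{size}(\mathrm{pred}\ (\mathrm{succ}\ v)) && \text{(def of size)}
\end{align*}
```

The search-rule case is where the induction hypothesis is used; the axiom case is a direct calculation.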

A Lemma
• Once we have proven the lemma, we can then prove the theorem
  – Theorem: if e →* e’ then size(e’) ≤ size(e)
  – When writing out a proof, always write lemmas in order to make it clear there is no circularity in the proof!
• The consequence of our theorem: evaluation always terminates
  – strictly speaking, termination needs the strict form of the lemma, size(e’) < size(e) for every single step; the same case analysis establishes it, since every rule removes at least one node
  – our properties are starting to get more useful!
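The theorem can be observed on a concrete run. The Python sketch below repeats a single-step function and the size function under an assumed encoding that is not in the slides (strings "true", "false", "zero"; tagged tuples otherwise), then records the size of every expression along an evaluation. The helper trace and the sample program are hypothetical illustrations; a run like this checks one example and does not replace the proof.

```python
# Assumed encoding (not from the slides): strings for constants, tagged tuples otherwise.
def is_value(e):
    return e in ("true", "false", "zero") or (
        isinstance(e, tuple) and e[0] == "succ" and is_value(e[1]))


def step(e):
    """Return e' such that e -> e', or None if no rule applies."""
    if not isinstance(e, tuple):
        return None
    op = e[0]
    if op == "if":
        _, e1, e2, e3 = e
        if e1 == "true":
            return e2
        if e1 == "false":
            return e3
        e1p = step(e1)
        return None if e1p is None else ("if", e1p, e2, e3)
    arg = e[1]
    succ_val = isinstance(arg, tuple) and arg[0] == "succ" and is_value(arg[1])
    if op == "pred" and succ_val:
        return arg[1]
    if op == "iszero" and arg == "zero":
        return "true"
    if op == "iszero" and succ_val:
        return "false"
    argp = step(arg)
    return None if argp is None else (op, argp)


def size(e):
    if not isinstance(e, tuple):
        return 1
    return 1 + sum(size(sub) for sub in e[1:])   # one node plus its subterms


def trace(e):
    """Follow single steps from e until no rule applies."""
    seq = [e]
    while True:
        nxt = step(seq[-1])
        if nxt is None:
            return seq
        seq.append(nxt)


prog = ("if", ("iszero", ("pred", ("succ", "zero"))), ("succ", "zero"), "zero")
sizes = [size(e) for e in trace(prog)]
print(sizes)    # sizes along the run, which should never increase
```

Because the size drops at each step and cannot go below 1, the loop in trace must stop, which is the termination argument in executable form.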

Summary
• Everything in this class will be defined using inductive rules
• These rules give rise to inductive proofs
• How to succeed in this class:
  – Dave: how do we prove X?
  – Student: by induction on the structure of Y. (that’s the only tricky part)