Type Inference II
David Walker
COS 441

Type Inference
Goal:
  - Given an unannotated program, find its type or report that it does not type check
Overview:
  - generate type constraints (equations) from unannotated programs
  - solve equations

Constraint Generation
Typing rules for annotated programs: G |-- e : t
  - Note: given G and e, there is at most one type t
Typing rules for unannotated programs: G |-- u ==> e : t, q
  - Note: given G and u, there may be many possible types for u; the many possibilities are represented using type schemes
  - u has a type t if q has a solution
  - Remember: fun f (x) = f x

Rule comparison
Annotated rule (implicit equality; easy to implement because t can’t contain variables):
    G |-- e1 : bool    G |-- e2 : t    G |-- e3 : t
    -----------------------------------------------
    G |-- if e1 then e2 else e3 : t
Unannotated rule (explicit equality; harder to implement because t2, t3 can contain variables that may be further constrained elsewhere):
    G |-- u1 ==> e1 : t1, q1    G |-- u2 ==> e2 : t2, q2    G |-- u3 ==> e3 : t3, q3
    --------------------------------------------------------------------------------
    G |-- if u1 then u2 else u3 ==> if e1 then e2 else e3 : a, q1 U q2 U q3 U {t1 = bool, a = t2, a = t3}
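The unannotated rule for if can be read as a recipe for collecting equations. The following is a minimal Python sketch of that reading, not the course's implementation. The tuple encoding of terms, the function name generate, and the fresh-variable names a0, a1, ... are all assumptions made for illustration, and the annotated term e is omitted for brevity.

```python
import itertools

def generate(G, u, fresh):
    """Return (t, q) for untyped term u in context G (annotated term e omitted)."""
    kind = u[0]
    if kind == "int":                      # ("int", 3)
        return "int", []
    if kind == "bool":                     # ("bool", True)
        return "bool", []
    if kind == "var":                      # ("var", "x"): look the type up in G
        return G[u[1]], []
    if kind == "if":                       # ("if", u1, u2, u3)
        t1, q1 = generate(G, u[1], fresh)
        t2, q2 = generate(G, u[2], fresh)
        t3, q3 = generate(G, u[3], fresh)
        a = next(fresh)                    # result type is a new type variable
        return a, q1 + q2 + q3 + [(t1, "bool"), (a, t2), (a, t3)]
    raise ValueError(f"unknown term: {u!r}")

# Hypothetical context: x and y have the (still unknown) types tx and ty.
fresh = (f"a{i}" for i in itertools.count())
G = {"x": "tx", "y": "ty"}
t, q = generate(G, ("if", ("bool", True), ("var", "x"), ("var", "y")), fresh)
print(t)   # a0
print(q)   # [('bool', 'bool'), ('a0', 'tx'), ('a0', 'ty')]
```

The equations a0 = tx and a0 = ty are only recorded here, not checked, which is why tx and ty may still be constrained by code elsewhere in the program.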

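Non-local Constraints
Does this type check?
    fun f (x) = fun g (y) = (if true then x else y, ...)
It depends on the elided part:
    fun f (x) = fun g (y) = (if true then x else y, x + y)
    fun f (x) = fun g (y) = (if true then x else y, x + (if y then 3 else 4))
  - The first type checks with x and y both int. The second does not: the conditional forces x and y to have the same type, but x + ... needs x to be int while if y ... needs y to be bool.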

Non-local Constraints
But remember, it was easy to check when types were declared in advance:
    fun f (x: int) = fun g (y: int) = (if true then x else y, x + y)
    fun f (x: int) = fun g (y: bool) = (if true then x else y, x + (if y then 3 else 4))

Solving Constraints
A solution to a system of type constraints is a substitution S
  - S |= q iff applying S makes the left- and right-hand sides of each equation in q equal
A solution S is better than T if it is “more general”
  - intuition: S makes fewer or less-defined substitutions (leaves more variables alone)
  - T <= S if and only if T = U o S for some U
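To make S |= q concrete, here is a small Python sketch under an assumed encoding: "int" and "bool" are base types, any other string is a type variable, and ("->", s1, s2) stands for the function type s1 -> s2. The names apply and solves are illustrative and do not come from the slides.

```python
BASE = {"int", "bool"}   # assumption: every other string is a type variable

def apply(S, t):
    """Apply substitution S (a dict from variables to types) to the type t."""
    if isinstance(t, tuple):               # ("->", dom, cod)
        return ("->", apply(S, t[1]), apply(S, t[2]))
    if t in BASE:
        return t
    return S.get(t, t)                     # unmapped variables stay as they are

def solves(S, q):
    """S |= q: both sides of every equation are equal after applying S."""
    return all(apply(S, left) == apply(S, right) for left, right in q)

# q = {a = b -> int, b = bool}
q = [("a", ("->", "b", "int")), ("b", "bool")]
S = {"a": ("->", "bool", "int"), "b": "bool"}
print(solves(S, q))                # True
print(solves({"b": "bool"}, q))    # False: a is left unconstrained
```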

Most General Solutions
S is the principal (most general) solution of a constraint q if
  - S |= q (it is a solution)
  - if T |= q then T <= S (it is the most general one)
Lemma: If q has a solution, then it has a most general one
We care about principal solutions since they will give us the most general types for terms
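The ordering T <= S can be checked on a small case by exhibiting the witness U with T = U o S. The sketch below assumes the same illustrative encoding (strings other than "int" and "bool" are variables, ("->", s1, s2) is a function type) and a hypothetical helper compose. For q = {a = b}, the solution S = [b/a] is more general than T = [int/a, int/b].

```python
BASE = {"int", "bool"}   # assumption: every other string is a type variable

def apply(S, t):
    """Apply substitution S (dict: variable -> type) to the type t."""
    if isinstance(t, tuple):
        return ("->", apply(S, t[1]), apply(S, t[2]))
    return t if t in BASE else S.get(t, t)

def compose(U, S):
    """U o S: apply S first, then U."""
    result = {v: apply(U, t) for v, t in S.items()}
    for v, t in U.items():
        result.setdefault(v, t)            # variables that S leaves alone
    return result

S = {"a": "b"}                             # principal solution of {a = b}
T = {"a": "int", "b": "int"}               # another, less general solution
U = {"b": "int"}                           # witness that T <= S
print(compose(U, S) == T)                  # True
```

No U gives S = U o T, since T has already committed a and b to int; this is the sense in which S leaves more variables alone.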

Principal solutions give rise to the most general reconstruction of typing information for a term:
  - fun f(x: a): a = x is a most general reconstruction
  - fun f(x: int): int = x is not

Unification
An algorithm that provides the principal solution to a set of constraints (if one exists)
  - If a solution exists, the one unification returns is principal

Unification
Unification systematically simplifies a set of constraints, yielding a substitution
  - during simplification, we maintain (S, q)
      - S is the solution so far
      - q are the constraints left to simplify
  - Starting state of unification process: (I, q)
      - the identity substitution I is most general
  - Final state of unification process: (S, { })

Unification Machine
We can specify unification as a transition system:
  - (S, q) -> (S’, q’)
Base types & simple variables:
    ----------------------------
    (S, {int=int} U q) -> (S, q)
    ------------------------------
    (S, {bool=bool} U q) -> (S, q)
    ------------------------
    (S, {a=a} U q) -> (S, q)

Unification Machine
Functions:
    -------------------------------------------------------------------
    (S, {s11 -> s12 = s21 -> s22} U q) -> (S, {s11 = s21, s12 = s22} U q)
Variable definitions:
    ------------------------------------- (a not in FV(s))
    (S, {a=s} U q) -> ([s/a] o S, [s/a]q)
    ------------------------------------- (a not in FV(s))
    (S, {s=a} U q) -> ([s/a] o S, [s/a]q)
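The transition rules translate almost line by line into a loop over the remaining constraints. What follows is a sketch in Python under assumed conventions, not the course's code: "int" and "bool" are base types, any other string is a type variable, ("->", s1, s2) is a function type, a substitution is a dict, and unify returns None when no rule applies.

```python
BASE = {"int", "bool"}   # assumption: every other string is a type variable

def is_var(t):
    return isinstance(t, str) and t not in BASE

def free_vars(t):
    if isinstance(t, tuple):
        return free_vars(t[1]) | free_vars(t[2])
    return set() if t in BASE else {t}

def subst(s, a, t):
    """[s/a]t: replace the variable a by the type s inside t."""
    if isinstance(t, tuple):
        return ("->", subst(s, a, t[1]), subst(s, a, t[2]))
    return s if t == a else t

def unify(q):
    """Run the machine from (I, q); return S for a final state, None if stuck."""
    S, q = {}, list(q)                     # {} plays the role of I
    while q:
        l, r = q.pop()                     # pick any remaining equation
        if not isinstance(l, tuple) and l == r:
            continue                       # int=int, bool=bool, a=a
        if isinstance(l, tuple) and isinstance(r, tuple):
            q += [(l[1], r[1]), (l[2], r[2])]   # function rule
            continue
        if is_var(r) and not is_var(l):
            l, r = r, l                    # symmetric rule s = a
        if is_var(l) and l not in free_vars(r):
            S = {v: subst(r, l, t) for v, t in S.items()}   # [s/a] o S
            S[l] = r
            q = [(subst(r, l, x), subst(r, l, y)) for x, y in q]   # [s/a]q
            continue
        return None                        # no rule applies to this equation
    return S

# {a -> b = int -> c, c = bool}
q = [(("->", "a", "b"), ("->", "int", "c")), ("c", "bool")]
print(unify(q))                            # {'c': 'bool', 'b': 'bool', 'a': 'int'}
print(unify([("a", ("->", "a", "a"))]))    # None
```

The order in which equations are picked is a choice of this sketch; the transition system itself leaves it open.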

Occurs Check
What is the solution to {a = a -> a}?
  - There is none!
  - The “occurs check” detects this situation: it is the side condition (a not in FV(s))
    ------------------------------------- (a not in FV(s))
    (S, {s=a} U q) -> ([s/a] o S, [s/a]q)
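One way to see why {a = a -> a} has no solution is to ignore the side condition and keep substituting: the type only grows and a never disappears. The Python sketch below illustrates this under the assumed encoding used for illustration here (strings other than "int" and "bool" are variables, ("->", s1, s2) is a function type); occurs, subst and size are hypothetical helper names.

```python
BASE = {"int", "bool"}   # assumption: every other string is a type variable

def occurs(a, t):
    """True if the variable a appears in the type t, i.e. a is in FV(t)."""
    if isinstance(t, tuple):
        return occurs(a, t[1]) or occurs(a, t[2])
    return t == a

def subst(s, a, t):
    """[s/a]t: replace the variable a by the type s inside t."""
    if isinstance(t, tuple):
        return ("->", subst(s, a, t[1]), subst(s, a, t[2]))
    return s if t == a else t

def size(t):
    """Number of nodes in the type t."""
    return 1 + size(t[1]) + size(t[2]) if isinstance(t, tuple) else 1

s = ("->", "a", "a")
print(occurs("a", s))        # True: the side condition a not in FV(s) fails

# Substituting anyway never eliminates a; the type just doubles each time.
t, sizes = "a", []
for _ in range(4):
    t = subst(s, "a", t)
    sizes.append(size(t))
print(sizes)                 # [3, 7, 15, 31]
print(occurs("a", t))        # True
```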

Irreducible States
Recall: final states have the form (S, { })
Stuck states (S, q) are such that every equation in q has the form:
  - int = bool
  - s1 -> s2 = s (s is neither a function type nor a variable)
  - a = s (s contains a and is not a itself)
  - or is symmetric to one of the above
Stuck states arise when constraints are unsolvable
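The three stuck forms can be written as a predicate on a single equation. This is a hedged Python sketch using an assumed encoding (strings other than "int" and "bool" are variables, ("->", s1, s2) is a function type); stuck_equation and stuck_state are illustrative names, not part of the slides.

```python
BASE = {"int", "bool"}   # assumption: every other string is a type variable

def is_var(t):
    return isinstance(t, str) and t not in BASE

def free_vars(t):
    if isinstance(t, tuple):
        return free_vars(t[1]) | free_vars(t[2])
    return set() if t in BASE else {t}

def stuck_equation(l, r):
    """True if no transition rule applies to the equation l = r."""
    if is_var(r) and not is_var(l):
        l, r = r, l                        # handle the symmetric forms
    if is_var(l):
        return l != r and l in free_vars(r)    # a = s where s contains a
    if isinstance(l, tuple) and isinstance(r, tuple):
        return False                       # the function rule applies
    return l != r                          # base type clash, or arrow vs base

def stuck_state(q):
    """A non-final state in which every equation is stuck."""
    return bool(q) and all(stuck_equation(l, r) for l, r in q)

print(stuck_equation("int", "bool"))                     # True
print(stuck_equation(("->", "int", "int"), "bool"))      # True
print(stuck_equation(("->", "a", "a"), "a"))             # True
print(stuck_equation("a", "int"))                        # False
print(stuck_state([("int", "bool"), ("a", "int")]))      # False: a = int can step
```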

Termination
We want unification to terminate (to give us a type reconstruction algorithm)
In other words, we want to show that there is no infinite sequence of states
  - (S1, q1) -> (S2, q2) -> ...

Termination
We associate an ordering with constraints
  - q < q’ if and only if
      - q contains fewer variables than q’, or
      - q contains the same number of variables as q’ but fewer type constructors (i.e., fewer occurrences of int, bool, or “->”)
  - This is a lexicographic ordering
  - There is no infinite decreasing sequence of constraints
To prove termination, we must demonstrate that every step of the algorithm reduces the size of q according to this ordering
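The ordering can be tried out by computing the pair (number of variables, number of type constructors) for a constraint set; Python tuples already compare lexicographically. This is an illustrative sketch under an assumed encoding (strings other than "int" and "bool" are variables, ("->", s1, s2) is a function type), and it reads "number of variables" as the number of distinct variables, which is an interpretation rather than something the slides state.

```python
BASE = {"int", "bool"}   # assumption: every other string is a type variable

def variables(t):
    if isinstance(t, tuple):
        return variables(t[1]) | variables(t[2])
    return set() if t in BASE else {t}

def constructors(t):
    """Count occurrences of int, bool and -> in the type t."""
    if isinstance(t, tuple):
        return 1 + constructors(t[1]) + constructors(t[2])
    return 1 if t in BASE else 0

def measure(q):
    """(distinct variables in q, constructor occurrences in q)."""
    vs, n = set(), 0
    for l, r in q:
        vs |= variables(l) | variables(r)
        n += constructors(l) + constructors(r)
    return (len(vs), n)

before = [(("->", "a", "int"), ("->", "bool", "b"))]   # {a -> int = bool -> b}
after_fun = [("a", "bool"), ("int", "b")]              # function rule applied
after_elim = [("int", "b")]                            # then a = bool eliminated
print(measure(before), measure(after_fun), measure(after_elim))   # (2, 4) (2, 2) (1, 1)
print(measure(after_fun) < measure(before))            # True
print(measure(after_elim) < measure(after_fun))        # True
```

Under this reading, a step that discards {a = a} while a still occurs elsewhere in q leaves both components unchanged; adding the number of equations as a third component is one way to cover that case, though that refinement is an assumption beyond the slides.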

Termination
Lemma: Every step reduces the size of q
  - Proof: By cases (i.e., induction) on the definition of the reduction relation.
    ----------------------------
    (S, {int=int} U q) -> (S, q)
    ------------------------------
    (S, {bool=bool} U q) -> (S, q)
    ------------------------
    (S, {a=a} U q) -> (S, q)
    -------------------------------------------------------------------
    (S, {s11 -> s12 = s21 -> s22} U q) -> (S, {s11 = s21, s12 = s22} U q)
    ------------------------------------- (a not in FV(s))
    (S, {a=s} U q) -> ([s/a] o S, [s/a]q)

Correctness
  - we know the algorithm terminates
  - we want to prove that a series of steps
        (I, q1) -> (S2, q2) -> (S3, q3) -> ... -> (S, {})
    solves the initial constraints q1
  - we’ll do that by induction on the length of the sequence, but we’ll need to define the invariants that are preserved from step to step

Defining the invariants
A complete solution for (S, q) is a substitution T such that
  1. T <= S
  2. T |= q
  - Intuition: T extends S and solves q
A principal solution T for (S, q) is complete for (S, q) and
  3. for all T’ such that 1. and 2. hold, T’ <= T
  - Intuition: T is the most general solution (it’s the least restrictive)

Properties of Solutions
Lemma 1: Every final state (S, { }) has a complete solution.
  - It is S since:
      - S <= S
      - S |= { } (every substitution is a solution to the empty set of constraints)

Properties of Solutions
Lemma 2: No stuck state has a complete solution (or any solution at all)
  - it is impossible for a substitution to make the necessary equations equal:
      - int ≠ bool
      - int ≠ t1 -> t2
      - ...

Properties of Solutions
Lemma 3: If (S, q) -> (S’, q’) then
  1. T is complete for (S, q) iff T is complete for (S’, q’)
  2. T is principal for (S, q) iff T is principal for (S’, q’)
Proof: By induction (cases) on the definition of (S, q) -> (S’, q’)
  - in the forward direction, this is the preservation theorem for the unification machine!

Proof cases
    ----------------------------
    (S, {int=int} U q) -> (S, q)
If T is complete for (S, {int=int} U q) then (by definition):
  (1) T <= S
  (2) T |= {int=int} U q
To prove T is complete for (S, q), we show:
  - T <= S (by (1))
  - T |= q (by (2) and definition of |=)

Proof cases
    ------------------------------------- (a not in FV(s))
    (S, {a=s} U q) -> ([s/a] o S, [s/a]q)
If T is complete for (S, {a=s} U q) then (by definition):
  (1) T <= S
  (2) T |= {a=s} U q
To prove T is complete for ([s/a] o S, [s/a]q), we show:
  - T <= [s/a] o S
  - T |= [s/a]q (by (2) and definition of |=)
How? Maybe for homework!

Summary: Unification
By termination, (I, q) ->* (S, q’) where (S, q’) is irreducible. Moreover:
  - If q’ = { } then
      - S is principal for (S, {}) (by lemma 1)
      - S is principal for (I, q) (by lemma 3)
      - Since S is principal for (I, q), S is more general than any other solution T for q such that T <= I.
      - Since all T <= I, T <= S for all solutions T to q.
      - Hence, S is a principal solution for q.

Summary: Unification (cont.)
... Moreover:
  - If q’ is not { } (and (I, q) ->* (S, q’) where (S, q’) is irreducible) then
      - (S, q’) is stuck. Consequently, by lemma 2, (S, q’) has no complete solution.
      - By lemma 3, even (I, q) has no complete solution and therefore q has no solution at all.

Summary: Type Inference
Type inference algorithm.
  - Given a context G and an untyped term u:
      - Find e, t, q such that G |-- u ==> e : t, q
      - Find the principal solution S of q via unification
          - if no solution exists, there is no reconstruction
      - Apply S to e, i.e., our solution is S(e)
          - S(e) contains schematic type variables a, b, c, etc. that may be instantiated with any type
  - Since S is principal, S(e) characterizes all reconstructions.
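The whole pipeline (generate constraints, unify, apply the solution) fits in a short Python sketch for a tiny language with literals, variables, one-argument functions and if. Everything about the encoding is an assumption made for illustration: terms are tuples, "int" and "bool" are base types, any other string is a type variable, ("->", s1, s2) is a function type, and infer returns the type S(t) rather than the annotated term S(e).

```python
import itertools

BASE = {"int", "bool"}   # assumption: every other string is a type variable

def gen(G, u, fresh):
    """Constraint generation: return (t, q) for untyped term u in context G."""
    kind = u[0]
    if kind == "lit":                      # ("lit", 3) or ("lit", True)
        return ("bool" if isinstance(u[1], bool) else "int"), []
    if kind == "var":                      # ("var", "x")
        return G[u[1]], []
    if kind == "fun":                      # ("fun", "x", body)
        a = next(fresh)
        t, q = gen({**G, u[1]: a}, u[2], fresh)
        return ("->", a, t), q
    if kind == "if":                       # ("if", u1, u2, u3)
        (t1, q1), (t2, q2), (t3, q3) = (gen(G, x, fresh) for x in u[1:])
        a = next(fresh)
        return a, q1 + q2 + q3 + [(t1, "bool"), (a, t2), (a, t3)]
    raise ValueError(f"unknown term: {u!r}")

def is_var(t):
    return isinstance(t, str) and t not in BASE

def free_vars(t):
    if isinstance(t, tuple):
        return free_vars(t[1]) | free_vars(t[2])
    return set() if t in BASE else {t}

def subst(s, a, t):
    """[s/a]t: replace the variable a by the type s inside t."""
    if isinstance(t, tuple):
        return ("->", subst(s, a, t[1]), subst(s, a, t[2]))
    return s if t == a else t

def unify(q):
    """Return a principal solution of q as a dict, or None if q is unsolvable."""
    S, q = {}, list(q)
    while q:
        l, r = q.pop()
        if not isinstance(l, tuple) and l == r:
            continue                       # int=int, bool=bool, a=a
        if isinstance(l, tuple) and isinstance(r, tuple):
            q += [(l[1], r[1]), (l[2], r[2])]
            continue
        if is_var(r) and not is_var(l):
            l, r = r, l
        if is_var(l) and l not in free_vars(r):    # occurs check
            S = {v: subst(r, l, t) for v, t in S.items()}
            S[l] = r
            q = [(subst(r, l, x), subst(r, l, y)) for x, y in q]
            continue
        return None                        # stuck
    return S

def apply(S, t):
    if isinstance(t, tuple):
        return ("->", apply(S, t[1]), apply(S, t[2]))
    return S.get(t, t)

def infer(u):
    """Type of the closed term u with schematic variables, or None."""
    fresh = (f"a{i}" for i in itertools.count())
    t, q = gen({}, u, fresh)
    S = unify(q)
    return None if S is None else apply(S, t)

print(infer(("fun", "x", ("var", "x"))))                       # ('->', 'a0', 'a0')
print(infer(("fun", "x", ("if", ("lit", True), ("var", "x"), ("lit", 3)))))   # ('->', 'int', 'int')
print(infer(("if", ("lit", 3), ("lit", 1), ("lit", 2))))       # None
```

The first result keeps the schematic variable a0, so it stands for every instance t -> t, which is the sense in which a principal solution characterizes all reconstructions.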

End