Banana Algebra Syntactic Language Extension via an Algebra
Banana Algebra: Syntactic Language Extension via an Algebra of Languages and Transformations Jacob Andersen Claus Brabrand [ jacand@cs. au. dk ] [ brabrand@itu. dk ] Aarhus University IT University of Copenhagen Claus Brabrand, ITU BANANA ALGEBRA [1] ETAPS/LDTA, York March 28, 2009
Abstract We propose an algebra of languages and transformations as a means for extending languages syntactically. The algebra provides a layer of high-level abstractions built on top of languages (captured by CFGs) and transformations (captured by constructive catamorphisms). The algebra is self-contained in that any term of the algebra specifying a transformation can be reduced to a constant catamorphism, before the transformation is run. Thus, the algebra comes "for free" without sacrificing the strong safety and efficiency properties of constructive catamorphisms. The entire algebra as presented in the paper is implemented as the Banana Algebra Tool which may be used to syntactically extend languages in an incremental and modular fashion via algebraic composition of previously defined languages and transformations. We demonstrate and evaluate the tool via several kinds of extensions. Claus Brabrand, ITU BANANA ALGEBRA [2] ETAPS/LDTA, York March 28, 2009
Outline Introduction: "What is a Banana? " Bananas for Language Transformation Language Extension Pattern Banana Algebra Examples Implementation Related Work Conclusion Claus Brabrand, ITU BANANA ALGEBRA [3] ETAPS/LDTA, York March 28, 2009
What is a 'Banana' ? Datatype; "list": list = Num N | Cons N * list N Banana ("sum-of-list"): [Num n] [Cons n l] = = n n + [l] (aka. "Catamorphism" ) Separation of recursion and evaluation Implicit recursion on input structure bottom-up re-combination of intermediate results (| n. n , (n, l ). n+l |) Claus Brabrand, ITU BANANA ALGEBRA [4] ETAPS/LDTA, York March 28, 2009
Language Transformation Bananas (statically typed): (| LS -> LT [ ] c |) LS -> LT Source language: 'LS' list = Num N | Cons N * list Target language: 'LT' tree = Nil | Leaf N | Node N * tree Nonterminal-typing: ' ' [list -> tree] Reconstructors: 'c ' [Num n] = Leaf n [Cons n l] = Node n (Nil) [l] Claus Brabrand, ITU BANANA ALGEBRA [5] Type-check'able! ETAPS/LDTA, York March 28, 2009
"Growing Languages with Metamorphic Syntax Macros" [ Claus Brabrand | Michael Schwartzbach ] ( PEPM 2002 ) Banana Properties "The metafront System: Safe and Extensible Parsing and Transformation" [ Claus Brabrand | Michael Schwartzbach ] ( LDTA 2003, SCP J. 2007 ) Banana properties: Simple Safe Efficient (Expressive) (corresponds to: “simple recursion”) (syntactically safe + always terminate) (linear time in size of input + output) (…enough for interesting extensions) Banana Algebra “for free” (16 banana ops): Modular Incremental Simple Safe Efficient (Expressive) Claus Brabrand, ITU Statically reduce: Banana Algebra (term) Banana (term) BANANA ALGEBRA [6] ETAPS/LDTA, York March 28, 2009
Outline Introduction: "What is a Banana? " Bananas for Language Transformation Language Extension Pattern Banana Algebra Examples Implementation Related Work Conclusion Claus Brabrand, ITU BANANA ALGEBRA [7] ETAPS/LDTA, York March 28, 2009
Language Extension Pattern Numeral extension: 'LS' Exp : : : var Id lam Id * Exp app Exp * Exp zero succ Exp pred Exp Lambda-Calculus: 'LT' Exp : var Id : lam Id * Exp : app Exp * Exp Nonterminal typing: ' ' [Exp -> Exp] Reconstructors: 'c ' [var V] [lam V E] [app E 1 E 2] [zero] [succ E] [pred E] Claus Brabrand, ITU = = = var lam app V Catamorphism: V [E ] (| LS -> LT [ ] c |) [E 1] [E 2] z (var z) Using very simple s [E ] numeral encoding [E ] (lam z (var z)) BANANA ALGEBRA [8] ETAPS/LDTA, York March 28, 2009
Algebraic Solution ln+l l + ln l l l (| ln -> l [Exp -> Exp] [zero] = lam z (var z) [succ E] = lam s [E ] [pred E] = app [E ]. . . idx |) l Exp : var Id : lam Id * Exp : app Exp * Exp Claus Brabrand, ITU BANANA ALGEBRA ln Exp : zero : succ Exp : pred Exp [9] ETAPS/LDTA, York March 28, 2009
Banana Algebra Languages (L): Transformations (X): { CFG } (| L -> L [ ] c |) l x v LL L+L src ( X ) tgt ( X ) let v = L in L letx w = X in L w XL X+X X X idx ( L ) let v = L in X letx w = X in X Claus Brabrand, ITU BANANA ALGEBRA [ 10 ] ETAPS/LDTA, York March 28, 2009
Algebraic Laws Idempotency of '+': L L + L Commutativity of '+': L 1 + L 2 + L 1 Associativity of '+': L 1 + (L 2 + L 3) (L 1 + L 2) + L 3 Source-identity: Target-identity: L tgt(idx(L)) L src(idx(L)) … Claus Brabrand, ITU BANANA ALGEBRA [ 11 ] ETAPS/LDTA, York March 28, 2009
Outline Introduction: "What is a Banana? " Bananas for Language Transformation Language Extension Pattern Banana Algebra Examples Implementation Related Work Conclusion Claus Brabrand, ITU BANANA ALGEBRA [ 12 ] ETAPS/LDTA, York March 28, 2009
Example Revisited --- "ln 2 l. x" --- let l = "l. l" in let ln = "ln. l" in idx(l) + (| ln -> l [Exp Exp. zero = Exp. succ = Exp. pred = |) --- "l. l" --{ --- "ln. l" --- { Id = [a-z] [a-z 0 -9]* ; Exp. var : Id ; Exp. lam : "\" Id ". " Exp ; Exp. app : "(" Exp ")" ; -> Exp] 'z. z' ; 's. $1' ; '($1 z. z)' ; Exp. zero : "zero" ; Exp. succ : "succ" "(" Exp ")" ; Exp. pred : "pred" "(" Exp ")" ; } } Claus Brabrand, ITU BANANA ALGEBRA [ 13 ] ETAPS/LDTA, York March 28, 2009
Numerals + Booleans …with Nums & Bools? …with Nums + lb+l l ln l idx ln l Claus Brabrand, ITU + l+ln+lb l BANANA ALGEBRA l l [ 14 ] …with Bools + idx lb l l lb ETAPS/LDTA, York March 28, 2009
Java + Repeat 575 lines --- "java. l" --{ Java . . . "try" Stm "catch" . . . Name. id : Id ; } --- "repeat. l" --{ Stm. repeat : "repeat" Stm "until" "(" Exp ")" "; " ; --- "repeat 2 java. x" --let java = "java. l" in let repeat = "repeat. l" in idx(java) + (| repeat -> java [Exp -> Exp, Stm -> Stm] 7 lines ! Stm. repeat = 'do $1 while (!($2)); ' ; |) Claus Brabrand, ITU BANANA ALGEBRA [ 15 ] ETAPS/LDTA, York March 28, 2009 }
Concrete vs. Abstract Syntax Concrete syntax: Stm. repeat = 'do $1 while (!($2)); ' ; Exp (with explicit assoc. /prec. ): Exp. or. exp 1 Exp 1. and. exp 2 Exp 2. add. exp 3 Exp 7. neg. exp 8 Exp 8. par. var. num : : : Exp 1 "||" Exp ; Exp 1 ; Exp 2 "&&" Exp 1 ; Exp 2 ; Exp 3 "+" Exp 2 ; Exp 3 ; : : : "!" Exp 8 "(" Exp ")" Id Int. Const ; ; ; NB: Tool supports BOTH ! Claus Brabrand, ITU BANANA ALGEBRA Abstract syntax: Stm. repeat = Stm. do(<1>, Exp. exp 1( Exp 1. exp 2( Exp 2. exp 3( Exp 3. exp 4( Exp 4. exp 5( Exp 5. exp 6( Exp 6. exp 7( Exp 7. neg( Exp 8. par(<2>) ))))) ; (unambiguous: concrete abstract) [ 16 ] ETAPS/LDTA, York March 28, 2009
"FUN" Example The "FUN" Language: (at Aarhus University) used for Teaching Functional Programming Fun grammar transform Literals Basically The Lambda Calculus with…: numerals, booleans, arithmetic, boolean logic, local definitions, pairs, literals, lists, signs, comparisons, dynamic types, fixed-point combinators, … Literals→Nums Unsigned arithmetic + booleans + definitions + pairs Nums→λ + Bools→λ + Defs→λ + Pairs→λ Lambda Calculus Claus Brabrand, ITU BANANA ALGEBRA [ 17 ] ETAPS/LDTA, York March 28, 2009
"FUN" Example Fun Component re-use Fun grammar transform Fun + Fun. Signed Fun grammar transform + Fun. Signed GT Literals→Nums Signed arith→Nums Unsigned arithmetic + booleans + definitions + pairs Nums→λ + Bools→λ + Defs→λ + Pairs→λ Lambda Calculus Claus Brabrand, ITU BANANA ALGEBRA [ 18 ] ETAPS/LDTA, York March 28, 2009
"FUN" Example Fun + Fun. Signed + Fun. Compare + Fun. Typesafe Fun GT + Fun. Signed GT + Fun. Compare GT + Fun. Typesafe GT 245 x Banana Algebra ops 4 MB Banana ! Unsigned arithmetic + booleans + definitions + pairs Nums→λ + Bools→λ + Defs→λ + Pairs→λ Lambda Calculus Claus Brabrand, ITU BANANA ALGEBRA [ 19 ] ETAPS/LDTA, York March 28, 2009
"FUN" Usage Statistics Usage statistics (245 x operators) in "FUN": 58 x 51 x 28 x 23 x 17 x 14 x 10 x 9 x 8 x 4 x 4 x 2 x Claus Brabrand, ITU { …cfg… } Constant languages "file. l" Language inclusions L + L Language additions v Language variables (|L L [ ] c|) Constant transformations X + X Transformation additions "file. x" Transformation inclusions let-in Local definitions idx(L) Identity transformations X X Compositions L L Language restriction w Transformation variables src(X) Source extractions BANANA ALGEBRA [ 20 ] ETAPS/LDTA, York March 28, 2009
Other Examples Self-Application (The tool on itself!): [L 1 << L 2] [X 1 << X 2] = = '(L 1 L 2) + L 2' '(X 1 src(X 2)) + X 2' SQL embedding (in <bigwig>): Stm. select = 'factor (<2>) { if (<3>) return ( # + (<1>) ); }' My-Java (endless variations): ((java loops) + sql) o syntaxe_francais Claus Brabrand, ITU BANANA ALGEBRA [ 21 ] ETAPS/LDTA, York March 28, 2009
Implementation The 'Banana Algebra' Tool: (3, 600 lines of O'Caml) [ http: //www. itu. dk/people/brabrand/banana-algebra/ ] Uses (underlying technologies): 'dk. brics. grammar': for parsing, unparsing, and ambiguity analysis ! 'XSugar': for transformation: "concrete syntax abstract XML syntax" 'XSLT': for transformation: "XML XML" Claus Brabrand, ITU BANANA ALGEBRA [ 22 ] ETAPS/LDTA, York March 28, 2009
Outline Introduction: "What is a Banana? " Bananas for Language Transformation Language Extension Pattern Banana Algebra Examples Implementation Related Work Conclusion Claus Brabrand, ITU BANANA ALGEBRA [ 23 ] ETAPS/LDTA, York March 28, 2009
Related Work (I/III) Macro Systems: "Growing Languages with Metamorphic Syntax Macros" [ Claus Brabrand | Michael Schwartzbach ] ( PEPM 2002 ) "The metafront System: Safe and Extensible Parsing and Transformation" [ Claus Brabrand | Michael Schwartzbach ] ( LDTA 2003 , SCP J. 2007 ) Claus Brabrand, ITU BANANA ALGEBRA [ 24 ] ETAPS/LDTA, York March 28, 2009
Related Work (II/III) Attribute Grammars: Language transformation (and extension)… …via computation on AST's (using "inherited" or "synthesized" or … attributes) E. g. , Eli, Jast. Add, Silver, Both; … compared to bananas: More ambitious (expressivity) No termination guarantees (safety) Transformation "indirect" (simplicity) Rewrite Systems: Language transformation (and extension)… …via syntactic rewriting, using encodings…: S T gradually rewrite "S-syntax" to "T-syntax" E. g. , Elan, TXL, ASF+SDF, Stratego/XT, … Claus Brabrand, ITU BANANA ALGEBRA [ 25 ] ETAPS/LDTA, York March 28, 2009
Related Work (III/III) Functional Programming: Catas mimicked by "disciplined style" of fun. programming …aided by: Basically ye olde issue: GPL vs. DSL Traversal functions (auto-synthesized from datatypes) Combinator libraries "Shortcut fusion" (to eliminate ' ' at compile-time) Category Theory: A lot of this work can be viewed as Category Theory: Claus Brabrand, ITU BANANA ALGEBRA [ 26 ] ETAPS/LDTA, York March 28, 2009
Conclusion IF bananas are sufficiently: "Niche" (Expressive) THEN you get…: Banana Algebra “for free” (16 banana ops): Incremental Modular Simple Safe Efficient Claus Brabrand, ITU Statically reduce: Banana Algebra (term) Banana (term) BANANA ALGEBRA [ 27 ] ETAPS/LDTA, York March 28, 2009
BONUS SLIDES - Reduction Semantics If you want all the details: "Syntactic Language Extension via an Algebra of Languages and Transformations " [ Jacob Andersen | Claus Brabrand ] ( ITU Technical Report, Dec. 2008 ) Claus Brabrand, ITU BANANA ALGEBRA [ 28 ] ETAPS/LDTA, York March 28, 2009
Reduction Semantics Environments: ENVL = VARL EXPL environment of languages ENVX = VARX EXPX environment of transformations Reduction relations: ENVL ENVX EXPL ' L' ENVL ENVX EXPX ' X' Abbreviations: , |- L L l. . . as a short-hand for: ( , , L, l) ' L' , |- X X x. . . as a short-hand for: ( , , X, x) ' X' Claus Brabrand, ITU BANANA ALGEBRA [ 29 ] ETAPS/LDTA, York March 28, 2009
Semantics (L) [CONL] , l L l , , [RESL] , , , [ADDL] , Claus Brabrand, ITU wfl l [VARL] L L l L' L l' L L' L l' L + L' L l BANANA ALGEBRA [ 30 ] l , l l' v L (v) l' l ~l l' ETAPS/LDTA, York March 28, 2009
Semantics (L) [SRCL] [TGTL] [LETL] X X (| l. S -> l. T [ ] c |) , , Claus Brabrand, ITU src (X) L l. S , L , L tgt (X) L l. T l [v=l], L' L l' let v=L in L' L l' BANANA ALGEBRA [ 31 ] ETAPS/LDTA, York March 28, 2009
Semantics (X) [CONX] [VARX] , (| l. S -> l. T [ ] c |) , LS , (| LS -> LT [ ] c |) X (| l. S -> l. T [ ] c |) , L l S w X (w) , , [ADDX] , Claus Brabrand, ITU LT L l T , , [RESX] , X X x X' X x' X + X' X x BANANA ALGEBRA [ 32 ] x wfx X X x L L l X L X x x' x l x ~x x' ETAPS/LDTA, York March 28, 2009
Semantics (X) [COMPX] [IDXX] , , X X (| l. S -> l. T [ ] c |) X' X (| l. S' -> l. T' [ '] c' |) , X' X X (| l. S -> l. T' [ ' ] c' c |) , , [LETL] l. T l l S' L L l idx (L) X (| l -> l [id (l)] idc(l) |) , Claus Brabrand, ITU X X x , , [w=x] X' X x' letx w=X in X' X x' BANANA ALGEBRA [ 33 ] ETAPS/LDTA, York March 28, 2009
BONUS SLIDES - More Examples - Claus Brabrand, ITU BANANA ALGEBRA [ 34 ] ETAPS/LDTA, York March 28, 2009
Numeral & Boolean Extension Numeral Extension (catamorphism): [var V] [lam V E] [app E 1 E 2] [zero] [succ E] [pred E] = = = var lam app [V ] [E 1] [E 2] z (var z) s [E ] (lam z (var z)) Exp : : : var Id lam Id * Exp app Exp * Exp zero succ Exp pred Exp : var Id : lam Id * Exp : app Exp * Exp Boolean Extension (catamorphism): [var V] [lam V E] [app E 1 E 2] [true] [false] [if E 1 E 2 E 3] Claus Brabrand, ITU = = = var lam app [V ] [E 1] [E 2] a (lam b (var a)) a (lam b (var b)) (app [E 1] [E 2]) [E 3] BANANA ALGEBRA [ 35 ] ETAPS/LDTA, York Exp : : : var Id lam Id * Exp app Exp * Exp true false if Exp Exp : var Id : lam Id * Exp : app Exp * Exp March 28, 2009
Lambda with Booleans + lb+l l lb l l l (| lb -> l [Exp -> Exp] [true] = 'a. b. a' [false] = 'a. b. b' [if E 1 E 2 E 3] = '(([E 1] [E 2]) [E 3])' |) idx l Exp : var Id : lam Id * Exp : app Exp * Exp Claus Brabrand, ITU BANANA ALGEBRA lb Exp : true : false : if Exp Exp [ 36 ] ETAPS/LDTA, York March 28, 2009
Incremental Development --- "li. l" --- "l. l" --{ Id Exp. var Exp. lam Exp. app = : : : [a-z] [a-z 0 -9]* Id "\" Id ". " Exp "(" Exp ")" ; ; } --- "ln. l" --{ Exp. zero Exp. succ Exp. pred : : : "zero" "succ" Exp "pred" Exp ; ; ; } { --- "li 2 l. x" --let l = "l. l" in idx(l) + (| "li. l" -> l [Exp -> Exp] Exp. id : 'z. z' ; |) --- "ln 2 l. x" --let l = "l. l" in idx(l) + (| "ln. l" -> l [Exp -> Exp] Exp. zero : 'z. z' Exp. succ : 'x. $1' Exp. pred : '($1 z. z)' |) Exp. id : "id" ; } --- "ln 2 li. x" --- ; ; ; let l = "l. l" in idx(l) + (| ln -> l+"li. l" [Exp -> Exp] Exp. zero : 'id' ; Exp. succ : 'x. $1' ; Exp. pred : '($1 id)' ; |) --- "ln 2 l. x" --Claus Brabrand, ITU BANANA ALGEBRA [ 37 ] "li 2 l. x" York o "ln 2 li. x" March 28, 2009 ETAPS/LDTA,
Example cont'd Both statically reduce to same catamorphism: (| { Id = [a-z] [0 -9 a-z]* Exp. app Exp. lam Exp. pred Exp. succ Exp. var Exp. zero : : : ; "(" Exp ")" ; "" Id ". " Exp ; "pred" Exp ; "succ" Exp ; Id ; "zero" ; } { Id -> = [a-z] [0 -9 a-z]* ; Exp. app : "(" Exp ")" ; Exp. lam : "" Id ". " Exp ; Exp. var : Id ; } [Exp -> Exp, Id->Id] Exp. app Exp. lam Exp. pred Exp. succ Exp. var Exp. zero : : : Exp. app($1, $2) ; Exp. lam($1, $2) ; Exp. app($1, Exp. lam(Id("z"), Exp. var(Id("z")))) Exp. lam(Id("x"), $1) ; Exp. var($1) ; Exp. lam(Id("z"), Exp. var(Id("z"))) ; ; |) Claus Brabrand, ITU BANANA ALGEBRA [ 38 ] ETAPS/LDTA, York March 28, 2009
Usage Scenarios Programmers: May extend existing languages (~ syntax macros) Developers: May embed DSLs into host languages (SQL in Java) Developers (and teachers): May incrementally specify multi-layered languages Compiler writers: May rely on tool and implement only a small core (and then specify the rest externally as extensions) Claus Brabrand, ITU BANANA ALGEBRA [ 40 ] ETAPS/LDTA, York March 28, 2009
BONUS SLIDES - Parsing & Error Reporting - Claus Brabrand, ITU BANANA ALGEBRA [ 41 ] ETAPS/LDTA, York March 28, 2009
Parsing (XSugar): Variant of Earley's algorithm: O ( | |3 ) Can parse any context-free grammar Closed under union of languages Support for production priority Tool easily adapts to other parsing algorithms Claus Brabrand, ITU BANANA ALGEBRA [ 42 ] ETAPS/LDTA, York March 28, 2009
Ambiguity: parsing unparsing . . . L Parsing: ASTL / ~L . L Unparsing: BANANA ALGEBRA . Canonical whitespace Grammar ambiguity Claus Brabrand, ITU ASTL / ~L [ 43 ] ETAPS/LDTA, York March 28, 2009
Ambiguity Analysis: "Analyzing Ambiguity of Context-Free Grammars" [ Claus Brabrand | Robert Giegerich | Anders Møller ] ( CIAA 2007 ) " ) on: Using implementation ( "dk. brics. grammar [ by Anders Møller ] Source language; Target language; and/or …all intermediate languages (somewhat expensive) (Note: Ambiguity analysis comes with XSugar tool) Claus Brabrand, ITU BANANA ALGEBRA [ 44 ] ETAPS/LDTA, York March 28, 2009
Error Reporting Error reporting: Prototype Static parse-error (O'Caml-lex): *** In ln 2 l. x (4, 4)-(4, 7): Parse error at "Exp" Static transformation error (XSugar): *** Parse error at character 6 (line 1, column 7) in /tmp/shape 84 e 645. txt Could be improved (is actually a parse-error in a cata reconstructor) Dynamic parse-error (XSugar): *** Parse error at character 23 (line 1, column 24) in /dev/stdin Dynamic transformation error: impossible : -) Claus Brabrand, ITU BANANA ALGEBRA [ 45 ] ETAPS/LDTA, York March 28, 2009
- Slides: 44