PROSE: Inductive Program Synthesis for the Mass Markets

![PBE Timeline: FlashFill, FlashExtract, FlashRelate](https://slidetodoc.com/presentation_image/2cb88dad6ad80ca276192c3945868754/image-8.jpg)

![PBE Timeline: FlashFill, FlashExtract, FlashRelate](https://slidetodoc.com/presentation_image/2cb88dad6ad80ca276192c3945868754/image-10.jpg)

![Ambiguity resolution, Option 1: machine-learned robustness-based ranking [Singh & Gulwani 15]](https://slidetodoc.com/presentation_image/2cb88dad6ad80ca276192c3945868754/image-32.jpg)

- Slides: 42
PROSE: Inductive Program Synthesis for the Mass Markets. Alex Polozov, Microsoft PROSE team. polozov@cs.washington.edu, prose-contact@microsoft.com, https://microsoft.github.io/prose (Jan 20, 2017, UC Berkeley)
PROgram Synthesis using Examples: Sumit Gulwani, Prateek Jain, Ranvijay Kumar, Mark Plesko, Alex Polozov, Mohammad Raza, Vu Le, Danny Simmons, Daniel Perelman, Abhishek Udupa
Hackathon
Outline: ► Programming by Examples; PROSE Framework (Backpropagation: technical insights); Mass-Market Deployment (Challenges & lessons); Next Generation of Synthesis (Predictive, interactive, debuggable)
Motivation: 99% of spreadsheet users do not know programming. Data scientists spend 80% of their time extracting & cleaning data.
Flash Fill
PBE Architecture (diagram labels): Refined intent; Debugging; Program Synthesizer; Translator; Intended program in Python/C#/C++/…
PBE Timeline: FlashFill, 2010-2012 [POPL 11] (text transformations); FlashExtract, 2012-2014 [PLDI 14] (text extraction); FlashRelate, 2012-2015 [PLDI 15] (table transformations); …
“Project FlashMeta”
PBE Timeline: FlashFill, 2010-2012 [POPL 11] (text transformations); FlashExtract, 2012-2014 [PLDI 14] (text extraction); FlashRelate, 2012-2015 [PLDI 15] (table transformations); …; FlashMeta (PBE framework), 2014-2015 [OOPSLA 15]; PROSE SDK, 2015-present
Outline: Programming by Examples; ► PROSE Framework (Backpropagation: technical insights); Mass-Market Deployment (Challenges & lessons); Next Generation of Synthesis (Predictive, interactive, debuggable)
PROSE, a meta-synthesizer framework (diagram labels): DSL Definition; Synthesis Strategies; PROSE App Synthesizer; I/O Specification; Input; Programs; Output
Key Insights
Backpropagation in one slide [Polozov & Gulwani 15]: examples for Substring(s, P1, P2), such as “Seattle, WA”, are propagated into examples for P1 and examples for P2.
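
The step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the PROSE SDK API: the function name `witness_substring`, the half-open position convention, and the input string are assumptions made for the example. Given an input `s` and the desired output of `Substring(s, P1, P2)`, it lists every position pair that would produce that output, which then serve as examples for P1 and P2.

```python
from typing import List, Tuple


def witness_substring(s: str, out: str) -> List[Tuple[int, int]]:
    """Return every (start, end) pair such that s[start:end] == out."""
    spans = []
    start = s.find(out)
    while start != -1:
        spans.append((start, start + len(out)))
        # Continue one character later so overlapping matches are found too.
        start = s.find(out, start + 1)
    return spans


# Hypothetical input; only the output "Seattle, WA" appears on the slide.
spans = witness_substring("Pike Place, Seattle, WA 98101", "Seattle, WA")
p1_examples = [start for start, _ in spans]  # candidate values for P1
p2_examples = [end for _, end in spans]      # candidate values for P2
```

Each sub-problem (find a program for P1, find a program for P2) is smaller than the original one, which is the point of propagating examples downward through the operator.
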
Backpropagation, a.k.a. Inverse Semantics (animation build over ten slides). Running example, Input → Output: 100 → 76-100; 51 → 51-75. The build decomposes each output into candidate pieces: 7 + 6-100, 76 + -100, …, 76-10 + 0 for the first output and 5 + 1-75, 51 + -75, …, 51-7 + 5 for the second. The decomposition steps are labeled “Backpropagation” and “Conditional backpropagation”. Ex. BK: breaking numbers is unlikely.
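
The bucketing build can be read as inverting a concatenation, although the extracted slide text does not name the operator, so that reading is an assumption. Under it, a hedged Python sketch of the two labeled steps looks as follows; all function names are hypothetical, and the digit-boundary filter is only one possible encoding of "breaking numbers is unlikely".

```python
from typing import List, Optional, Tuple


def witness_concat(out: str) -> List[Tuple[str, str]]:
    """Backpropagation: every split of `out` into a non-empty prefix and suffix."""
    return [(out[:i], out[i:]) for i in range(1, len(out))]


def witness_concat_right(out: str, left: str) -> Optional[str]:
    """Conditional backpropagation: with the left operand fixed,
    the right operand is fully determined (or impossible)."""
    return out[len(left):] if out.startswith(left) else None


def breaks_number(left: str, right: str) -> bool:
    """Assumed background-knowledge filter: the split falls between two digits."""
    return left[-1].isdigit() and right[0].isdigit()


splits = witness_concat("76-100")
plausible = [(l, r) for l, r in splits if not breaks_number(l, r)]
```

Here `splits` contains all five decompositions of `76-100`, while `plausible` keeps only those that leave `76` and `100` intact.
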
Backpropagation, a.k.a. Inverse Semantics
Performance & Number of Examples
Outline: Programming by Examples; PROSE Framework (Backpropagation: technical insights); ► Mass-Market Deployment (Challenges & lessons); Next Generation of Synthesis (Predictive, interactive, debuggable)
Ambiguity Resolution
Example: outputs of six candidate programs on each input.

| Input | P1 | P2 | P3 | P4 | P5 | P6 |
|---|---|---|---|---|---|---|
| Isaac Asimov | IA | IA | IA | IA | IA | IA |
| Kyokutei Bakin | KB | KB | IB | KA | KB | KB |
| Howard Roger Garis | HR | HG | IG | HA | HRoger G | HG |
| Enid Blyton | EB | EB | IB | EA | EB | EB |
| Edwy S. Brooks | ES | EB | IB | EA | ES. B | EB |
| Barbara Cartland | BC | BC | IC | BA | BC | BC |
| Margaret Atwood | MA | MA | IA | MA | MA | MA |
| Iain M. Banks | IM | IB | IB | IA | IM. B | IB |
| John Smith III | JS | JI | II | JA | JS | JS |

Programs: P1 = Concat(First capital letter, second capital letter); P2 = Concat(First capital letter, first letter of last word); P3 = Concat(“I”, first letter of last word); P4 = Concat(first capital letter, “A”); P5 = Concat(First capital letter, SubStr(second capital letter, … last word)); P6 = Concat(First capital letter, SubStr(last … followed by lowercase word, +1))
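
The ambiguity on this slide is easy to reproduce. The sketch below re-implements the four fully legible programs (P1 to P4) as plain Python functions; the helper names and the regular expression for "capital letter" are assumptions made for the illustration, not the DSL used in the talk.

```python
import re


def capitals(s: str) -> list:
    """All uppercase ASCII letters of s, in order."""
    return re.findall(r"[A-Z]", s)


def p1(s: str) -> str:
    """Concat(first capital letter, second capital letter)."""
    caps = capitals(s)
    return caps[0] + caps[1]


def p2(s: str) -> str:
    """Concat(first capital letter, first letter of last word)."""
    return capitals(s)[0] + s.split()[-1][0]


def p3(s: str) -> str:
    """Concat("I", first letter of last word)."""
    return "I" + s.split()[-1][0]


def p4(s: str) -> str:
    """Concat(first capital letter, "A")."""
    return capitals(s)[0] + "A"


programs = [p1, p2, p3, p4]
on_example = [p("Isaac Asimov") for p in programs]        # all agree
on_other = [p("Howard Roger Garis") for p in programs]    # all differ
```

All four programs satisfy the single example `Isaac Asimov → IA`, yet they disagree as soon as a name has a middle word or a different initial.
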
Anecdotes: • Flash Fill was not accepted to Excel until it solved the most common scenarios from one example (Adam Smith → Adam; Alice Williams → Alic) • Some users still do not know you can give two!
Ambiguity resolution. Option 1: machine-learned robustness-based ranking [Singh & Gulwani 15] • Idioms/patterns from test data can influence search & ranking • E.g.: bucketing (100 → 76-100; 51 → 51-75; test input 86). Option 2: interactive clarification • Pick an input or a subset of inputs to use for disambiguation
Distinguishing Inputs: the same candidate programs P1 to P6 and output table as on the Example slide.
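
One simple way to pick a distinguishing input, sketched here as an assumption rather than as the strategy used in PROSE, is to choose the input on which the candidate programs produce the most distinct outputs. The rows below are copied from the output table; the function names are hypothetical.

```python
from typing import Dict, List


def most_distinguishing(table: Dict[str, List[str]]) -> str:
    """Input whose row contains the largest number of distinct outputs.
    Ties go to the row listed first."""
    return max(table, key=lambda inp: len(set(table[inp])))


def partition(outputs: List[str]) -> Dict[str, List[str]]:
    """Group program names P1..Pn by the output they produce."""
    groups: Dict[str, List[str]] = {}
    for i, out in enumerate(outputs, start=1):
        groups.setdefault(out, []).append(f"P{i}")
    return groups


# Outputs of P1..P6, taken from the slide's table.
table = {
    "Kyokutei Bakin": ["KB", "KB", "IB", "KA", "KB", "KB"],
    "Howard Roger Garis": ["HR", "HG", "IG", "HA", "HRoger G", "HG"],
    "Edwy S. Brooks": ["ES", "EB", "IB", "EA", "ES. B", "EB"],
    "John Smith III": ["JS", "JI", "II", "JA", "JS", "JS"],
}

question = most_distinguishing(table)
answers = partition(table[question])
```

Asking the user for the output on `question` eliminates every group of programs except the one that matches the answer.
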
Ambiguity resolution: Summary
Development: Should I process the string “25-06-11” with regexes? Treat it as a numeric computation? A date? (* Once you learn the skill…)
Noise. Input → Output: 2/3/2011 → Thu; 1/11/2017 → Wed; 10/4/2016 → thu. 1. It is easier to prevent a mistake in a spec than to fix it. 2. How did you know it was a Thursday in the first place?
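
A spec like this can be sanity-checked mechanically before synthesis. The sketch below assumes the inputs are month/day/year dates and that the intended output is a three-letter English weekday abbreviation; both are guesses about the user's intent, which is exactly the difficulty the slide points at.

```python
from datetime import datetime

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_abbrev(date_text: str) -> str:
    """Weekday of a month/day/year date (assumed format)."""
    parsed = datetime.strptime(date_text, "%m/%d/%Y")
    return DAY_NAMES[parsed.weekday()]  # weekday(): Monday == 0


examples = [("2/3/2011", "Thu"), ("1/11/2017", "Wed"), ("10/4/2016", "thu")]

# Keep every example whose stated output disagrees with the computed weekday.
suspicious = [
    (given_in, given_out, weekday_abbrev(given_in))
    for given_in, given_out in examples
    if weekday_abbrev(given_in) != given_out
]
```

Under these assumptions the first two pairs are consistent and only the third is flagged, both for its lowercase spelling and for naming the wrong day.
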
Outline: Programming by Examples; PROSE Framework (Backpropagation: technical insights); Mass-Market Deployment (Challenges & lessons); ► Next Generation of Synthesis (Predictive, interactive, debuggable)
Predictive Program Synthesis
Example: Text splitting • Any number of arbitrary delimiter strings • A string may be used as a delimiter in some places but not in others • A delimiter may be empty
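
To see why these three cases are hard, it helps to look at a naive baseline. The sketch below is not the talk's algorithm; it is an assumed frequency heuristic that proposes as delimiters the non-alphanumeric characters occurring equally often on every line.

```python
from collections import Counter
from typing import List


def candidate_delimiters(lines: List[str]) -> List[str]:
    """Non-alphanumeric characters whose count is identical on every line."""
    if not lines:
        return []
    counts = [Counter(ch for ch in line if not ch.isalnum()) for line in lines]
    # Characters that appear at least once on every line.
    common = set(counts[0])
    for c in counts[1:]:
        common &= set(c)
    # Keep those that appear the same number of times everywhere.
    return sorted(ch for ch in common if len({c[ch] for c in counts}) == 1)


found = candidate_delimiters(["a,b;c", "d e,f;g"])
```

The heuristic only finds single characters, treats every occurrence of a character the same way, and can never propose an empty delimiter, so it fails on each of the cases listed on the slide.
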
Interactive Program Synthesis
Interactive Program Synthesis (diagram labels): User; Refined intent; Program Synthesis Framework; Best program; (Debugging); Hypothesizer; Interactive questions; Test inputs; Translator; Deployable code in Python/R/C#/…
Summary: • Decomposition of PBE into a meta-algorithm & backprop functions • PROSE: modular and accessible for industrial software development • Deductive reasoning ensures real-time response on wrangling tasks • Key challenges of industrial PBE: ambiguity resolution and debugging • Interactive clarification is the most effective disambiguation model; it should be a first-class citizen in the synthesis frameworks • Come try for yourself tomorrow! https://microsoft.github.io/prose Thank you!