PROSE Inductive Program Synthesis for the Mass Markets

PROSE: Inductive Program Synthesis for the Mass Markets
Alex Polozov, Microsoft PROSE team
polozov@cs.washington.edu, prose-contact@microsoft.com
https://microsoft.github.io/prose
Jan 20, 2017, UC Berkeley

PROgram Synthesis using Examples
Sumit Gulwani, Prateek Jain, Ranvijay Kumar, Mark Plesko, Alex Polozov, Mohammad Raza, Vu Le, Danny Simmons, Daniel Perelman, Abhishek Udupa

Hackathon

Outline
► Programming by Examples
PROSE Framework
• Backpropagation: technical insights
Mass-Market Deployment
• Challenges & lessons
Next Generation of Synthesis
• Predictive, interactive, debuggable

Motivation
• 99% of spreadsheet users do not know programming
• Data scientists spend 80% of their time extracting & cleaning data

Flash Fill

PBE Architecture
[Diagram labels: Refined intent; Debugging; Program Synthesizer; Translator; Intended program in Python/C#/C++/…]

PBE Timeline
• FlashFill (text transformations), 2010-2012 [POPL 11]
• FlashExtract (text extraction), 2012-2014 [PLDI 14]
• FlashRelate (table transformations), 2012-2015 [PLDI 15]
• …

“Project FlashMeta”

PBE Timeline
• FlashFill (text transformations), 2010-2012 [POPL 11]
• FlashExtract (text extraction), 2012-2014 [PLDI 14]
• FlashRelate (table transformations), 2012-2015 [PLDI 15]
• …
• FlashMeta (PBE framework), 2014-2015 [OOPSLA 15]
• PROSE SDK, 2015-present

Outline
Programming by Examples
► PROSE Framework
• Backpropagation: technical insights
Mass-Market Deployment
• Challenges & lessons
Next Generation of Synthesis
• Predictive, interactive, debuggable

PROSE
[Diagram labels: Input; I/O Specification; DSL Definition; Synthesis Strategies; Meta-synthesizer framework; Synthesizer; Programs; App; Output]

Key Insights

Backpropagation in one slide
Examples for Substring(s, P1, P2): “Seattle, WA” → examples for P1, examples for P2
[Polozov & Gulwani 15]
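The step on this slide can be sketched in a few lines of Python. This is a minimal illustration and not the PROSE API: the function name `substring_witnesses` and the input string are hypothetical, and P1 and P2 are assumed to be plain start and end indices. Given an input and the desired output of Substring, the sketch enumerates every position pair that explains the output, which gives the examples for P1 and P2.

```python
from typing import List, Tuple


def substring_witnesses(s: str, out: str) -> List[Tuple[int, int]]:
    """Return every (start, end) such that s[start:end] == out."""
    pairs = []
    start = s.find(out)
    while start != -1:
        pairs.append((start, start + len(out)))
        start = s.find(out, start + 1)  # also catches overlapping matches
    return pairs


# Hypothetical input; the slide only shows the output "Seattle, WA".
s = "Ship to: Seattle, WA 98101"
out = "Seattle, WA"

pairs = substring_witnesses(s, out)
p1_examples = sorted({start for start, _ in pairs})  # examples for P1
p2_examples = sorted({end for _, end in pairs})      # examples for P2
print(pairs, p1_examples, p2_examples)
```

When the output occurs several times in the input, each occurrence contributes its own candidate pair, so the specs for P1 and P2 become sets of alternatives rather than single values.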

Backpropagation, a.k.a. Inverse Semantics
Input   Output
100     76-100
51      51-75
Backpropagation: {100 → 7, 51 → 5}, {100 → 76, 51 → 51}, …, {100 → 76-10, 51 → 51-7}
Conditional backpropagation: {100 → 6-100, 51 → 1-75}, {100 → -100, 51 → -75}, …, {100 → 0, 51 → 5}
Ex. BK: breaking numbers is unlikely
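The splits shown on these slides can be reproduced with a short Python sketch. It assumes the operator being inverted concatenates two sub-results, which the slide text does not name explicitly. All function names are hypothetical, and the digit-based filter is only one possible reading of "breaking numbers is unlikely".

```python
from typing import List, Optional, Tuple


def concat_witnesses(out: str) -> List[Tuple[str, str]]:
    """Every way to write `out` as prefix + suffix, both non-empty."""
    return [(out[:i], out[i:]) for i in range(1, len(out))]


def suffix_given_prefix(out: str, prefix: str) -> Optional[str]:
    """Conditional step: once the prefix is fixed, the suffix is determined."""
    return out[len(prefix):] if out.startswith(prefix) else None


def breaks_number(prefix: str, suffix: str) -> bool:
    """True when the split point falls between two digits."""
    return prefix[-1].isdigit() and suffix[0].isdigit()


out = "76-100"  # output for input 100 in the bucketing example
splits = concat_witnesses(out)
likely = [pair for pair in splits if not breaks_number(*pair)]
print(splits)
print(likely)
```

Filtering on `breaks_number` leaves only the splits around the hyphen, which shows how background knowledge can prune candidates before the sub-problems are solved.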

Backpropagation, a.k.a. Inverse Semantics

Performance & Number of Examples

Outline
Programming by Examples
PROSE Framework
• Backpropagation: technical insights
► Mass-Market Deployment
• Challenges & lessons
Next Generation of Synthesis
• Predictive, interactive, debuggable

Ambiguity Resolution

Example
P1 = Concat(First capital letter, second capital letter)
P2 = Concat(First capital letter, first letter of last word)
P3 = Concat(“I”, first letter of last word)
P4 = Concat(first capital letter, “A”)
P5 = Concat(First capital letter, SubStr(second capital letter, …))
P6 = Concat(First capital letter, SubStr(last …))
[P5/P6 fragments: “… followed by lowercase word, +1))”, “… last word))”]

Input               P1  P2  P3  P4  P5        P6
Isaac Asimov        IA  IA  IA  IA  IA        IA
Kyokutei Bakin      KB  KB  IB  KA  KB        KB
Howard Roger Garis  HR  HG  IG  HA  HRoger G  HG
Enid Blyton         EB  EB  IB  EA  EB        EB
Edwy S. Brooks      ES  EB  IB  EA  ES. B     EB
Barbara Cartland    BC  BC  IC  BA  BC        BC
Margaret Atwood     MA  MA  IA  MA  MA        MA
Iain M. Banks       IM  IB  IB  IA  IM. B     IB
John Smith III      JS  JI  II  JA  JS        JS

Anecdotes
• Flash Fill was not accepted to Excel until it solved the most common scenarios from one example
  Adam Smith → Adam
  Alice Williams → Alic
• Some users still do not know you can give two!

Ambiguity resolution
Option 1: machine-learned robustness-based ranking [Singh & Gulwani 15]
• Idioms/patterns from test data can influence search & ranking
• E.g., bucketing: 100 → 76-100, 51 → 51-75, test input 86
Option 2: interactive clarification
• Pick an input or a subset of inputs to use for disambiguation

Distinguishing Inputs
P1 = Concat(First capital letter, second capital letter)
P2 = Concat(First capital letter, first letter of last word)
P3 = Concat(“I”, first letter of last word)
P4 = Concat(first capital letter, “A”)
P5 = Concat(First capital letter, SubStr(second capital letter, …))
P6 = Concat(First capital letter, SubStr(last …))
[P5/P6 fragments: “… followed by lowercase word, +1))”, “… last word))”]

Input               P1  P2  P3  P4  P5        P6
Isaac Asimov        IA  IA  IA  IA  IA        IA
Kyokutei Bakin      KB  KB  IB  KA  KB        KB
Howard Roger Garis  HR  HG  IG  HA  HRoger G  HG
Enid Blyton         EB  EB  IB  EA  EB        EB
Edwy S. Brooks      ES  EB  IB  EA  ES. B     EB
Barbara Cartland    BC  BC  IC  BA  BC        BC
Margaret Atwood     MA  MA  IA  MA  MA        MA
Iain M. Banks       IM  IB  IB  IA  IM. B     IB
John Smith III      JS  JI  II  JA  JS        JS
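A rough Python sketch of how a distinguishing input might be chosen follows. It implements only P1 to P4, because the P5 and P6 definitions are incomplete in the slide text. The scoring rule (ask about the input on which the candidates produce the most distinct outputs) is an assumption made for illustration and is not claimed to be the heuristic used in PROSE.

```python
import re
from typing import Callable, List


def capitals(s: str) -> List[str]:
    return re.findall(r"[A-Z]", s)


def p1(s: str) -> str:  # first capital letter + second capital letter
    c = capitals(s)
    return c[0] + c[1]


def p2(s: str) -> str:  # first capital letter + first letter of last word
    return capitals(s)[0] + s.split()[-1][0]


def p3(s: str) -> str:  # constant "I" + first letter of last word
    return "I" + s.split()[-1][0]


def p4(s: str) -> str:  # first capital letter + constant "A"
    return capitals(s)[0] + "A"


def distinct_outputs(programs: List[Callable[[str], str]], s: str) -> int:
    """Number of different answers the candidate programs give on input s."""
    return len({p(s) for p in programs})


def best_question(programs: List[Callable[[str], str]], inputs: List[str]) -> str:
    """Pick the input on which the candidates disagree the most."""
    return max(inputs, key=lambda s: distinct_outputs(programs, s))


programs = [p1, p2, p3, p4]
inputs = ["Isaac Asimov", "Kyokutei Bakin", "Howard Roger Garis",
          "Enid Blyton", "Margaret Atwood", "John Smith III"]
print(best_question(programs, inputs))
```

All four candidates agree on "Isaac Asimov", so that input cannot separate them, while an input such as "Howard Roger Garis" gives four different answers and a single user response rules out three candidates.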

Ambiguity resolution – Summary

Development
Should I process the string “25-06-11” with regexes? Treat it as a numeric computation? A date?
* Once you learn the skill…

Noise
Input       Output
2/3/2011    Thu
1/11/2017   Wed
10/4/2016   thu
1. It is easier to prevent a mistake in a spec than to fix it.
2. How did you know it was a Thursday in the first place?
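The noisy example in this spec can be spotted mechanically. The following Python sketch assumes the inputs are M/D/YYYY dates and the intended output is the three-letter English weekday abbreviation; neither assumption is stated on the slide, and the helper name `weekday_abbrev` is hypothetical.

```python
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_abbrev(date_text: str) -> str:
    """Weekday abbreviation for a date written as M/D/YYYY (assumed format)."""
    parsed = datetime.strptime(date_text, "%m/%d/%Y")
    return WEEKDAYS[parsed.weekday()]  # weekday() returns 0 for Monday


# The three examples from the slide, exactly as given.
examples = [("2/3/2011", "Thu"), ("1/11/2017", "Wed"), ("10/4/2016", "thu")]

suspicious = [
    (date_text, given, weekday_abbrev(date_text))
    for date_text, given in examples
    if given != weekday_abbrev(date_text)
]
print(suspicious)
```

Under these assumptions only the third example is flagged: October 4, 2016 was a Tuesday, so "thu" is wrong in both value and capitalization.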

Outline
Programming by Examples
PROSE Framework
• Backpropagation: technical insights
Mass-Market Deployment
• Challenges & lessons
► Next Generation of Synthesis
• Predictive, interactive, debuggable

Predictive Program Synthesis

Example: Text splitting
• Any number of arbitrary delimiter strings
• A string may be used as a delimiter in some places but not in others
• A delimiter may be empty

Interactive Program Synthesis

Interactive Program Synthesis
[Diagram labels: User; Refined intent; (Debugging); Program Synthesis Framework; Best program; Translator; Deployable code in Python/R/C#/…; Hypothesizer; Interactive questions; Test inputs]

Summary
• Decomposition of PBE into a meta-algorithm & backprop functions
• PROSE: Modular and accessible for industrial software development
• Deductive reasoning ensures real-time response on wrangling tasks
• Key challenges of industrial PBE: ambiguity resolution and debugging
• Interactive clarification is the most effective disambiguation model
  • Should be a first-class citizen in the synthesis frameworks
• Come try for yourself tomorrow! https://microsoft.github.io/prose
Thank you!