Syntax Matters
Tom Henzinger (IST Austria)
with Barbara Jobstmann (Verimag), Maria Mateescu (EPFL), Verena Wolf (Saarbruecken)

Science Experiment Theory

Mathematics
Semantics: the number 14
Syntax: 14 (decimal), 1110 (binary), XIV (Roman numerals)

Syntax Matters
1. Expressiveness: 0
2. Succinctness: IIII
3. Operations: +1, addition, multiplication
   14 x 34 = 476 (partial products 42 and 56)

Syntax is More than Notation
1. Expressiveness: 0
2. Succinctness: IIII
3. Operations: +1, addition, multiplication
   (* 14 34) = 14 x 34 = 476
Two languages are equivalent if there is a linear translation from each to the other (e.g. prefix and infix, binary and decimal).

Computer Science
Semantics (infinite): 0, 1, 2, 3, 4, ... -> 1, 1, 2, 6, 24, ...
Syntax (finite): (defun f (n) (if (<= n 1) 1 (* n (f (- n 1)))))
recursively enumerable
Composition: (defun h (n) (f (g n)))

Computer Science
Syntax: x := 0; y := 0; while true do x := x + 1; y := y + 1 end
Semantics: (0, 0) -> (1, 0) -> (1, 1) -> (2, 1) -> (2, 2) -> (3, 2) -> ...

Computer Science
Syntax: (inc x)* and (inc y)*
Semantics of (inc x)*: (0, 0) -> (1, 0) -> (2, 0) -> ...
Semantics of (inc y)*: (0, 0) -> (0, 1) -> (0, 2) -> ...
Composition: (inc x)* || (inc y)*
Semantics of the composition: the grid of states (0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1), (0, 2), (1, 2), (2, 2), ...

Markovian Population Models

Markovian Population Models State: ( 8 , 6 )

Markovian Population Models State: ( 9 , 7 , 5 ) Transition: State: ( 8 , 6 )

Markovian Population Models
- discrete state
- location unaware
- stochastic transition
- continuous time

Markovian Population Models
Deterministic: (9, 7, 5) -> (8, 6, 6)
Discrete time: from (9, 7, 5), stay with probability 0.6 or move to (8, 6, 6) with probability 0.4

  step   Pr (9, 7, 5)   Pr (8, 6, 6)
  0      1              0
  1      0.6            0.4
  2      0.36           0.64
  3      0.216          0.784
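The discrete-time numbers on this slide come from repeatedly multiplying the distribution by a transition matrix. A minimal Python sketch, assuming that (8, 6, 6) is absorbing in this two-state picture (the slide shows no edge leaving it, so this is an assumption); the names `P` and `propagate` are illustrative only:

```python
# Two-state DTMC: index 0 = (9, 7, 5), index 1 = (8, 6, 6).
# Assumption: (8, 6, 6) is absorbing in this small example.
P = [
    [0.6, 0.4],  # from (9, 7, 5): stay with 0.6, move with 0.4
    [0.0, 1.0],  # from (8, 6, 6): stay (assumed)
]

def propagate(p, P):
    """One step of distribution propagation: returns the row vector p * P."""
    n = len(P)
    return [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]

p = [1.0, 0.0]      # all probability mass on (9, 7, 5) at step 0
history = [p]
for _ in range(3):
    p = propagate(p, P)
    history.append(p)

for step, dist in enumerate(history):
    print(step, [round(v, 3) for v in dist])
```

Propagating the whole distribution, rather than sampling single runs, is the style of computation the rest of the talk builds on.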

Markovian Population Models
Continuous time: (9, 7, 5) -> (8, 6, 6) with rate 0.5 (exit rate 0.5, expected residence time 2)

  time   Pr (9, 7, 5)   Pr (8, 6, 6)
  0      1              0
  1      0.6            0.4
  1.8    0.4            0.6
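The continuous-time probabilities on this slide are consistent with an exponentially distributed residence time: with exit rate 0.5, the probability of still being in (9, 7, 5) at time t is exp(-0.5 t), which the slide rounds to one decimal. A small sketch, assuming (8, 6, 6) is never left in this picture; `stay_probability` is a hypothetical helper name:

```python
import math

def stay_probability(exit_rate, t):
    """Probability that an exponential residence time with the given rate exceeds t."""
    return math.exp(-exit_rate * t)

for t in (0.0, 1.0, 1.8):
    stay = stay_probability(0.5, t)
    # Probability of (9, 7, 5) and of (8, 6, 6) at time t
    print(t, round(stay, 3), round(1.0 - stay, 3))

# The expected residence time is the reciprocal of the exit rate.
expected_residence = 1.0 / 0.5
```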

Markovian Population Models
Nondeterministic: (9, 7, 5), (8, 6, 6)
Discrete time (MDP): a choice in (9, 7, 5) between two probability distributions, with probabilities 0.6 / 0.4 and 0.3 / 0.7
Continuous time: transitions out of (9, 7, 5) with rates 0.5 and 1.5 (exit rate 2, expected residence time 0.5)

Markovian Population Models
[Figure: fragment of a CTMC over the states (9, 7, 3), (9, 7, 4), (9, 7, 5), (8, 6, 4), (8, 6, 5), (8, 6, 6), (7, 5, 5), (7, 5, 6), (7, 5, 7), with edges labeled 0.2 and 0.1]

Markovian Population Models
Syntax: set of transition classes (finite object), with rate constants 0.2 and 0.1
Semantics: CTMC (infinite object); for example, state (9, 7, 5) has outgoing rates 12.6 and 0.5, and state (8, 6, 6) has outgoing rates 9.6 and 0.6

Syntax: Transition Class Model (TCM)
Dimension n: state $(x_1, \dots, x_n) \in S$, state space $S = \mathbb{N}^n$
Finite set of transition classes: each transition class consists of
1. guard $G \subseteq S$
2. injective update function $u : G \to S$
3. rate function $\alpha : G \to \mathbb{R}^+$
Example (n = 3):
  $G_1 : x_1 \ge 1 \wedge x_2 \ge 1$,  $u_1(x_1, x_2, x_3) = (x_1 - 1, x_2 - 1, x_3 + 1)$,  $\alpha_1(x_1, x_2, x_3) = 0.2 \cdot x_1 \cdot x_2$
  $G_2 : x_3 \ge 1$,  $u_2(x_1, x_2, x_3) = (x_1, x_2, x_3 - 1)$,  $\alpha_2(x_1, x_2, x_3) = 0.1 \cdot x_3$
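A transition class maps directly onto a small data structure: a guard predicate, an update function and a rate function. The following Python sketch encodes the n = 3 example from this slide; the names `TransitionClass`, `CLASSES` and `successors` are illustrative and not taken from any tool mentioned in the talk.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = Tuple[int, ...]

@dataclass(frozen=True)
class TransitionClass:
    guard: Callable[[State], bool]     # G: is the class enabled in state x?
    update: Callable[[State], State]   # u: successor state
    rate: Callable[[State], float]     # rate function

# The two classes of the n = 3 example.
CLASSES: List[TransitionClass] = [
    TransitionClass(
        guard=lambda x: x[0] >= 1 and x[1] >= 1,
        update=lambda x: (x[0] - 1, x[1] - 1, x[2] + 1),
        rate=lambda x: 0.2 * x[0] * x[1],
    ),
    TransitionClass(
        guard=lambda x: x[2] >= 1,
        update=lambda x: (x[0], x[1], x[2] - 1),
        rate=lambda x: 0.1 * x[2],
    ),
]

def successors(x: State, classes=CLASSES):
    """All (successor state, rate) pairs of x, computed on the fly."""
    return [(c.update(x), c.rate(x)) for c in classes if c.guard(x)]

print(successors((9, 7, 5)))
```

The finite list `CLASSES` is the syntax; the infinite CTMC is never built, only the successors of the states that are actually visited.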

Semantics: Continuous-Time Markov Chain (CTMC)
For all times $t \in \mathbb{R}^+$, a random variable $X(t) \in S$.
Syntax -> Semantics: TCM -> CTMC
For each transition class $(G_i, u_i, \alpha_i)$, all states $x \in G_i$, all times $t \in \mathbb{R}^+$, and $\Delta \to 0$:
  $\Pr( X(t + \Delta) = u_i(x) \mid X(t) = x ) = \alpha_i(x) \cdot \Delta$.
In addition, $\Pr( X(0) = x_0 ) = 1$ for some given initial state $x_0 \in S$.

The Chemical Master Equation
Under certain technical conditions that hold for TCMs, the distribution $p_t$ of $X(t)$ is the unique solution of the ODE system
  $\frac{d p_t(x)}{dt} = \sum_{i : x \in H_i} \alpha_i(u_i^{-1}(x)) \cdot p_t(u_i^{-1}(x)) \; - \; \sum_{i : x \in G_i} \alpha_i(x) \cdot p_t(x)$
  (first sum: inflow; second sum: outflow)
where $p_t(x) = \Pr( X(t) = x )$ and $H_i = \{ x \in S \mid u_i^{-1}(x) \text{ is defined} \}$.
[Figure: probability flows into $x$ from $x^1 = u_1^{-1}(x)$ and $x^2 = u_2^{-1}(x)$ at rates $\alpha_1(x^1)$ and $\alpha_2(x^2)$, and out of $x$ to $u_1(x)$ and $u_2(x)$ at rates $\alpha_1(x)$ and $\alpha_2(x)$]

If S is finite, then
  $p_t = p_0 \cdot e^{Qt} = p_0 \cdot \sum_{k=0}^{\infty} (Qt)^k / k!$
where Q is the generator matrix with
  $Q(x, x') = \alpha_i(x)$ if $u_i(x) = x'$,
  $Q(x, x') = -\sum_{i : x \in G_i} \alpha_i(x)$ if $x = x'$,
  $Q(x, x') = 0$ otherwise.
Numerically unstable!
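The inflow/outflow structure of the master equation can be evaluated directly from the transition classes, without building Q. The sketch below computes the right-hand side for a sparse distribution; it assumes the n = 3 example classes from the TCM slide, written here as plain (guard, update, rate) tuples, and `cme_rhs` is a hypothetical name.

```python
# Example transition classes as (guard, update, rate) triples (n = 3 example).
CLASSES = [
    (lambda x: x[0] >= 1 and x[1] >= 1,
     lambda x: (x[0] - 1, x[1] - 1, x[2] + 1),
     lambda x: 0.2 * x[0] * x[1]),
    (lambda x: x[2] >= 1,
     lambda x: (x[0], x[1], x[2] - 1),
     lambda x: 0.1 * x[2]),
]

def cme_rhs(p, classes):
    """Time derivative of a sparse distribution p (dict: state -> probability).

    Every enabled transition x -> u(x) moves probability at speed rate(x) * p(x):
    it is outflow for x and inflow for u(x).
    """
    dp = {}
    for x, px in p.items():
        for guard, update, rate in classes:
            if guard(x):
                flow = rate(x) * px
                dp[x] = dp.get(x, 0.0) - flow      # outflow term
                y = update(x)
                dp[y] = dp.get(y, 0.0) + flow      # inflow term
    return dp

dp = cme_rhs({(9, 7, 5): 1.0}, CLASSES)
for state, value in sorted(dp.items()):
    print(state, round(value, 3))
```

The derivatives sum to zero, which reflects that the equation only moves probability mass between states.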

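Special Kinds of CTMC
Poisson process: n = 1, chain $0 \xrightarrow{\lambda} 1 \xrightarrow{\lambda} 2 \xrightarrow{\lambda} 3 \cdots$ with generator
  $Q = \begin{pmatrix} -\lambda & \lambda & 0 & 0 & \cdots \\ 0 & -\lambda & \lambda & 0 & \cdots \\ 0 & 0 & -\lambda & \lambda & \cdots \\ \vdots & & & & \ddots \end{pmatrix}$
Birth process: n = 1, chain $0 \xrightarrow{\lambda_0} 1 \xrightarrow{\lambda_1} 2 \xrightarrow{\lambda_2} 3 \xrightarrow{\lambda_3} \cdots$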

Solving the Chemical Master Equation
• Computing distributions versus individual parameters (e.g. mean)
  - "molecular noise" is important for small populations
  - entire distributions are of interest:
    + distribution of time until protein concentration reaches effective signaling level
    + distribution of time for DNA replication initiation
    + distribution of concentration of transcription factors
  Deterministic ODE model inadequate
• Computing transient distributions versus steady state
  We want to compute $p_t$ for certain times $t \in [0, T]$ up to a given horizon T.
  Let $\pi = \lim_{t \to \infty} p_t$. If $\sum_{x \in S} \pi(x) = 1$, then $\pi$ is called the stationary distribution and can be computed as the solution of the linear equation system
    $0 = \pi \cdot Q$, $\sum_{x \in S} \pi(x) = 1$.
• Propagating distributions versus states (Monte-Carlo simulation)

Toggle Switch: Four Simulation Runs

Gillespie Simulation
- simulation algorithms are easy to understand and implement
- numerical handling of large (or infinite) matrices is not feasible
- one trajectory is considered as the outcome of one "in-silico" experiment
- too expensive for computing event probabilities or distributions:
  + doubling the precision requires four times as many simulation runs
  + millions of runs for a precision of 10^-5 [CMSB 09]
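A minimal sketch of Gillespie-style stochastic simulation over transition classes, to make the "one trajectory per experiment" point concrete. It assumes the n = 3 example classes from the TCM slide, written as (guard, update, rate) tuples; the function name, the time horizon and the seed are arbitrary choices for illustration.

```python
import random

CLASSES = [
    (lambda x: x[0] >= 1 and x[1] >= 1,
     lambda x: (x[0] - 1, x[1] - 1, x[2] + 1),
     lambda x: 0.2 * x[0] * x[1]),
    (lambda x: x[2] >= 1,
     lambda x: (x[0], x[1], x[2] - 1),
     lambda x: 0.1 * x[2]),
]

def gillespie(x0, classes, t_end, rng):
    """Simulate one trajectory; returns a list of (jump time, state) pairs."""
    t, x = 0.0, tuple(x0)
    path = [(t, x)]
    while True:
        enabled = [(u, a(x)) for g, u, a in classes if g(x)]
        total = sum(r for _, r in enabled)
        if total <= 0.0:
            break                        # no enabled transition: absorbing state
        t += rng.expovariate(total)      # exponential residence time
        if t > t_end:
            break
        pick = rng.random() * total      # choose a class proportionally to its rate
        acc = 0.0
        for u, r in enabled:
            acc += r
            if pick < acc:
                break
        x = u(x)
        path.append((t, x))
    return path

path = gillespie((9, 7, 5), CLASSES, t_end=1.0, rng=random.Random(0))
print(len(path), path[-1])
```

Estimating a probability from N such runs has a statistical error that shrinks roughly like 1/sqrt(N), which is why halving the error costs about four times as many runs.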

Solving the Chemical Master Equation
• Computing distributions versus individual parameters (e.g. mean)
• Computing transient distributions versus steady state
• Propagating distributions versus states (Monte-Carlo simulation)
Deterministic ODE model inadequate
Gillespie simulation inadequate
Numerical algorithms! [Burrage, Munsky, Zhang, ...]

Toggle Switch: Fast Adaptive Uniformization
[Figure: four panels showing the distribution at t = 5000, t = 15000, t = 30000, t = 50000]

Decomposition of CTMC
Uniformization: CTMC X = DTMC Y + Poisson process N ("clock")
[Figure: example with rates 4 and 1 in X, probabilities 0.8 and 0.2 in Y, and clock rate 5 in N]
$\lambda = \sup_{x \in S} \lambda_x < \infty$, where $\lambda_x = \sum_{x' \neq x} Q(x, x')$ is the exit rate of x
Poisson process N: $n_t(k) = \Pr( N(t) = k ) = e^{-\lambda t} \cdot (\lambda t)^k / k!$
Probability matrix of Y: $P = \lambda^{-1} \cdot Q + I$
$p_t(x) = \Pr( Y(N(t)) = x ) = \sum_{k=0}^{\infty} \Pr( Y(k) = x ) \cdot \Pr( N(t) = k )$
$p_t = \sum_{k=0}^{\infty} p_0 \cdot P^k \cdot n_t(k)$
Can be approximated to any desired accuracy!
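The series for the transient distribution can be evaluated term by term and cut off once the accumulated Poisson weight is close to 1. The sketch below does this for a small finite generator matrix in plain Python; the two-state chain (rates 4 and 1) and the names `uniformization` and `eps` are assumptions made for illustration, not the model from the talk.

```python
import math

def uniformization(Q, p0, t, eps=1e-10):
    """Approximate p0 * exp(Q t) for a finite generator Q (list of rows).

    Uses the clock rate lam = maximal exit rate, P = Q / lam + I, and stops
    once the Poisson weights sum to at least 1 - eps. The result is an
    underapproximation: its entries sum to slightly less than 1.
    """
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n))
    if lam == 0:
        return list(p0)                  # no transitions at all
    P = [[Q[i][j] / lam + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    weight = math.exp(-lam * t)          # Pr(N(t) = 0); underflows for very large lam * t
    v = list(p0)                         # distribution of Y after k steps
    result = [weight * vi for vi in v]
    acc = weight
    k = 0
    while acc < 1.0 - eps:
        k += 1
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= lam * t / k            # Pr(N(t) = k) from Pr(N(t) = k - 1)
        acc += weight
        result = [r + weight * vj for r, vj in zip(result, v)]
    return result

# Hypothetical two-state chain: 0 -> 1 with rate 4, 1 -> 0 with rate 1.
Q = [[-4.0, 4.0],
     [1.0, -1.0]]
p = uniformization(Q, [1.0, 0.0], t=0.3)
print([round(v, 6) for v in p])
```

For this chain the exact value of the first entry is 0.2 + 0.8 exp(-5t), which gives a simple check on the truncated sum.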

Decomposition of CTMC
Uniformization: CTMC X = DTMC Y + Poisson process N ("clock")
- clock rate is faster than all transition rates
- too expensive for stiff systems (i.e., systems with very different transition rates)
Adaptive Uniformization (van Moorsel & Sanders): CTMC = DTMC + Birth process
- state-dependent clock rate
- too expensive for large/infinite state spaces, because at all times t > 0, every state has a positive probability

Fast Adaptive Uniformization [Didier, Henzinger, Mateescu & Wolf]
Transient distributions can be efficiently approximated by combining adaptive uniformization with two ideas from verification:
1. on-the-fly state space exploration (Executability)
2. abstraction of insignificant (low-probability) states (Sliding Window Abstraction)

Sliding Window Abstraction for DTMC
[Figure sequence for t = 0, t = 1, t = 2: at each step the states are split into high-probability and low-probability states; only the window of significant-probability states is kept, and the lost probability mass contributes to the error]
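One step of a sliding-window style propagation can be sketched as follows: push the current sparse distribution through the DTMC, drop every state whose probability falls below a threshold, and record the dropped mass as error. The threshold `delta`, the helper names and the lazy random walk used as the DTMC are all assumptions for illustration.

```python
def step_with_window(dist, successors, delta):
    """Propagate a sparse distribution one step, then truncate it.

    dist: dict state -> probability; successors(x): list of (state, probability).
    Returns (kept distribution, probability mass lost in this step).
    """
    new = {}
    for x, p in dist.items():
        for y, q in successors(x):
            new[y] = new.get(y, 0.0) + p * q
    kept = {x: p for x, p in new.items() if p >= delta}
    lost = sum(p for p in new.values() if p < delta)
    return kept, lost

def lazy_walk(x):
    """Hypothetical DTMC on the naturals: stay or move right, each with 0.5."""
    return [(x, 0.5), (x + 1, 0.5)]

dist, total_lost = {0: 1.0}, 0.0
for _ in range(5):
    dist, lost = step_with_window(dist, lazy_walk, delta=0.05)
    total_lost += lost

print(sorted(dist.items()), total_lost)
```

The kept mass plus the accumulated lost mass always adds up to 1, so the lost mass is a bound on how much the truncated distribution underapproximates the true one.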

Sliding Windows in Adaptive Uniformization
CTMC = DTMC + Birth process (state-dependent clock)
DTMC + Poisson process
2 independent invocations of sliding windows!
[Figure: birth process over the states 0, 1, 2, 3, 4 with rates $\lambda_0, \lambda_1, \lambda_2, \lambda_3, \lambda_4$, annotated with $\sup_k \lambda_k$]

Three Sources of Errors: All Monotonically Refinable Underapproximations
$\Pr( X(t) = x ) = \sum_{k=0}^{\infty} \Pr( Y(k) = x ) \cdot \Pr( B(t) = k )$
1. truncation point k = R
2. sliding-window abstraction of Y
3. sliding-window abstraction of B

Fast Adaptive Uniformization GOOD if probability mass forms a (moving) cloud. BAD if probability mass spreads.

Bacteriophage Model
Desired precision: 3 x 10^-6
Standard uniformization: timeout
Fast adaptive uniformization: 55 min runtime
Gillespie simulation (confidence 0.95): 67 h runtime (3 x 10^8 runs)

Why Syntax Matters

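Epidemic Process: Stoichiometric Equations
n = 2: $x_1$ ... number of healthy individuals, $x_2$ ... number of infected individuals
  $\emptyset \xrightarrow{a} x_1$
  $x_1 \xrightarrow{b} \emptyset$
  $\emptyset \xrightarrow{c} x_2$
  $x_2 \xrightarrow{d} \emptyset$
  $x_1 + x_2 \xrightarrow{e} 2 x_2$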

Epidemic Process: Rate Matrix
  $Q( (x_1, x_2), (x_1 + 1, x_2) ) = a$
  $Q( (x_1, x_2), (x_1 - 1, x_2) ) = b x_1$ if $x_1 > 0$
  $Q( (x_1, x_2), (x_1, x_2 + 1) ) = c$
  $Q( (x_1, x_2), (x_1, x_2 - 1) ) = d x_2$ if $x_2 > 0$
  $Q( (x_1, x_2), (x_1 - 1, x_2 + 1) ) = e x_1 x_2$ if $x_1, x_2 > 0$
All other nondiagonal entries are 0, and each diagonal entry is the negative sum of the other row entries.

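Epidemic Process: Stochastic Petri Net
[Figure: Petri net with places $x_1$ and $x_2$ and five transitions with rates $a$, $b \cdot m(x_1)$, $c$, $d \cdot m(x_2)$ and $e \cdot m(x_1) \cdot m(x_2)$; the last transition has an arc of weight 2 into $x_2$]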

Epidemic Process: Stochastic Process Algebra
Originally: one process per molecule
Bio-PEPA [Hillston et al.]:
  x1 = a>> + b x1 << + e x1 x2 <<
  x2 = c>> + d x2 << + e x1 x2 <<
  x1 <e x1 x2> x2

Epidemic Process: Guarded Commands
  [] true                 - a ->        x1 := x1 + 1
  [] x1 > 0               - b x1 ->     x1 := x1 - 1
  [] true                 - c ->        x2 := x2 + 1
  [] x2 > 0               - d x2 ->     x2 := x2 - 1
  [] x1 > 0 and x2 > 0    - e x1 x2 ->  x1 := x1 - 1; x2 := x2 + 1
Reactive Modules, PRISM [Kwiatkowska et al.]
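Guarded commands are directly executable: for any state, the enabled commands yield the outgoing rates and hence the successor distribution. The sketch below encodes the five epidemic commands as (guard, rate, update) triples; the function names and the numeric values chosen for a, b, c, d, e are hypothetical.

```python
def epidemic_commands(a, b, c, d, e):
    """The five guarded commands as (guard, rate, update) triples over (x1, x2)."""
    return [
        (lambda s: True,                  lambda s: a,               lambda s: (s[0] + 1, s[1])),
        (lambda s: s[0] > 0,              lambda s: b * s[0],        lambda s: (s[0] - 1, s[1])),
        (lambda s: True,                  lambda s: c,               lambda s: (s[0], s[1] + 1)),
        (lambda s: s[1] > 0,              lambda s: d * s[1],        lambda s: (s[0], s[1] - 1)),
        (lambda s: s[0] > 0 and s[1] > 0, lambda s: e * s[0] * s[1], lambda s: (s[0] - 1, s[1] + 1)),
    ]

def successor_rates(state, commands):
    """Nondiagonal entries of the row of Q that belongs to `state`."""
    out = {}
    for guard, rate, update in commands:
        if guard(state):
            nxt = update(state)
            out[nxt] = out.get(nxt, 0.0) + rate(state)
    return out

def successor_distribution(state, commands):
    """Jump probabilities out of `state`, together with its exit rate."""
    rates = successor_rates(state, commands)
    total = sum(rates.values())
    return {s: r / total for s, r in rates.items()}, total

commands = epidemic_commands(a=1.0, b=2.0, c=3.0, d=4.0, e=5.0)  # made-up constants
rates = successor_rates((2, 3), commands)
jump_probs, exit_rate = successor_distribution((2, 3), commands)
print(rates, exit_rate)
```

Only the row of the rate matrix for the current state is ever computed, which is what makes simulation and on-the-fly exploration cheap for this kind of syntax.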

Properties of Languages: Composition
Petri Nets: components need modification
Rate Matrices: Kronecker product (stochastic automata)
Process Algebras, Stoichiometric Equations, Guarded Commands

Properties of Languages: Expressiveness and Succinctness
Rate Matrices: not succinct
Process Algebras: most have only constant rate functions
Stoichiometric Equations: insufficient rate functions
  Bistable toggle switch (a genetic regulatory network): $\alpha_1(x_1, x_2) = a / (b + x_2^2)$ [2 species, 4 reactions]
Petri Nets, Guarded Commands

Properties of Languages: Executability
Petri Nets, Process Algebras: compiled into matrices
Rate Matrices: expensive (global)
Stoichiometric Equations, Guarded Commands:
- easy computation of successor distributions
- support efficient simulation and on-the-fly reachability analysis
Propagating States vs. Distributions = Testing vs. Verification

Properties of Languages: Encapsulation, Dynamic Reconfiguration
Petri Nets, Rate Matrices, Stoichiometric Equations
Process Algebras, Guarded Commands
... and the winner is: Guarded Commands!

Conclusions
1. Syntax matters (composition, expressiveness, succinctness, executability, encapsulation, dynamic reconfiguration)
2. Ideas from verification (on-the-fly, abstraction, data structures) make distribution propagation possible
3. Main limitation: large populations (not high dimensionality)
   Hybrid approach (discrete stochastic + continuous deterministic) works well.