Attractor Detection and Control of Boolean Networks
Tatsuya Akutsu
Bioinformatics Center, Institute for Chemical Research, Kyoto University

Contents
- Boolean Network
- Attractor Detection
  - Definition and Algorithms
- Control of Boolean Network
  - Definition and DP algorithm
- Integer Programming-based Approach
- PBN and its Control
- Conclusion

Acknowledgment
- Takeyuki Tamura, Morihiro Hayashida [Kyoto U.]
- Masaki Yamamoto [Kwansei Gakuin U.]
- Wai-Ki Ching, Shuqin Zhang, Xi Chen [U. Hong Kong]
- Michael Ng [Hong Kong Baptist U.]
- Avraham A. Melkman [Ben-Gurion University of the Negev]

Boolean Network

Boolean Network
- Mathematical model of genetic networks
- node ⇔ gene
  - State of node: 1 (active) / 0 (inactive)
- Regulation rules
  - Boolean function (AND, OR, NOT, …)
  - Edge from y to x ⇔ y directly controls x
- Synchronized update
  - Almost the same as digital circuits (with clocks)
[Kauffman, The Origins of Order, 1993]

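Example of Boolean Network
- Nodes: A, B, C
- Regulation rules: A' = B, B' = A and C, C' = not A

State Transition Table (INPUT: state at time t, OUTPUT: state at time t+1)

| A B C | A' B' C' |
|-------|----------|
| 0 0 0 | 0 0 1 |
| 0 0 1 | 0 0 1 |
| 0 1 0 | 1 0 1 |
| 0 1 1 | 1 0 1 |
| 1 0 0 | 0 0 0 |
| 1 0 1 | 0 1 0 |
| 1 1 0 | 1 0 0 |
| 1 1 1 | 1 1 0 |

Example of state transition: 111 ⇒ 110 ⇒ 100 ⇒ 000 ⇒ 001 ⇒ …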
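The synchronous update can be sketched in a few lines of Python. This is a minimal illustration of the three rules A' = B, B' = A and C, C' = not A; the function names `step` and `trajectory` are chosen here for illustration and do not come from the slides.

```python
from itertools import product

def step(state):
    """Apply one synchronous update to a state (A, B, C)."""
    a, b, c = state
    # A' = B, B' = A and C, C' = not A
    return (b, a & c, 1 - a)

def trajectory(state, steps):
    """Return the list of states visited in `steps` updates, start included."""
    states = [state]
    for _ in range(steps):
        state = step(state)
        states.append(state)
    return states

# Print the full state transition table, then one trajectory from 111.
for s in product((0, 1), repeat=3):
    print(s, "->", step(s))
print(trajectory((1, 1, 1), 5))
```

All nodes read the old state before any node is overwritten, which is what "synchronized update" means here.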

Why Boolean Networks?
- Criticism that BN is too simplified
  - Unless simplified, difficult for theoretical analysis, inference, and control (though complex models can be used for simulation)
  - Maybe useful for qualitative analyses
- One of the simplest non-linear models
  - Negative results on BN suggest negative results on more general (non-linear) models
- Almost the same as digital circuits
  - Theories and techniques in computer science can be utilized

Our Focus: Time Complexity
- Many problems for BN are NP-hard
  - NP-hard means that there is no polynomial time algorithm (unless P=NP)
  - It will take O(2^n) time or more if we use naïve methods
- But we want to do much better than O(2^n)
  - Because we can then solve the cases of
    - n=300 for O(1.1^n)
    - n=600 for O(1.05^n)
- Important for coping with large-scale networks
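A quick calculation shows why the base of the exponent matters so much. The sketch below converts base^n into an equivalent power of two, that is, the network size a naïve O(2^n) method could handle with the same number of steps. The helper name `equivalent_naive_size` is made up for this illustration, and constant factors are ignored.

```python
import math

def equivalent_naive_size(base, n):
    """Return m such that 2**m == base**n."""
    return n * math.log2(base)

# Compare the two cases mentioned on the slide.
for base, n in [(1.1, 300), (1.05, 600)]:
    m = equivalent_naive_size(base, n)
    print(f"{base}^{n} is about 2^{m:.1f}")
```

Both cases cost roughly as much as a naïve search over a network with only 41 to 42 nodes.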

Attractor Detection

Attractor (1)
- Steady state
- Different attractors ⇔ Different cell types
- Example (rules: A' = B, B' = A and C, C' = not A)
  - 011 ⇒ 101 ⇒ 010 ⇒ 101 ⇒ …
  - 111 ⇒ 110 ⇒ 100 ⇒ 000 ⇒ 001 ⇒ 001 ⇒ …

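Attractor (2)
State transition diagram of the example (rules: A' = B, B' = A and C, C' = not A):
- 111 → 110 → 100 → 000 → 001 → 001 (singleton attractor 001)
- 011 → 101 → 010 → 101 (periodic attractor {101, 010} with period 2)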
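For a network this small, all attractors can be found by following every one of the 2^n states until a state repeats. The sketch below does exactly that for the rules A' = B, B' = A and C, C' = not A. It is the naïve exponential method, not one of the improved algorithms discussed later, and the function names are illustrative.

```python
from itertools import product

def step(state):
    """One synchronous update: A' = B, B' = A and C, C' = not A."""
    a, b, c = state
    return (b, a & c, 1 - a)

def find_attractors(step, n):
    """Return the set of attractors of an n-node network.

    Each attractor is a tuple of states in cycle order, rotated so that
    it starts at its smallest state.
    """
    attractors = set()
    for start in product((0, 1), repeat=n):
        seen = {}  # state -> position on the path (insertion ordered)
        state = start
        while state not in seen:
            seen[state] = len(seen)
            state = step(state)
        # Everything from the first repeated state onward is the cycle.
        path = list(seen)
        cycle = path[seen[state]:]
        k = cycle.index(min(cycle))
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors

print(find_attractors(step, 3))
```

The loop visits 2^n start states, so this approach stops being practical well before n reaches 50.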

N-K Model (Kauffman Network)
- N: Number of nodes (We use n instead of N)
- K: Indegree
  - Indegree = the number of input edges = the number of genes directly affecting node v
  - Each node has (maximum or average) indegree K
- Boolean function assigned to each node is randomly selected
[Figure: a node v with indegree = 2 and a node v with indegree = 3]
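A random N-K network can be generated as follows. This is a sketch under several assumptions that the slide does not fix: every node has exactly K distinct inputs, self-loops are allowed, and each truth table entry is 0 or 1 with equal probability. The names `random_nk_network` and `step` are illustrative.

```python
import random

def random_nk_network(n, k, seed=None):
    """Return (inputs, tables) for a random network with n nodes, indegree k.

    inputs[i] lists the k input nodes of node i;
    tables[i] is the truth table of node i with 2**k entries.
    """
    rng = random.Random(seed)
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """One synchronous update of the whole network."""
    new_state = []
    for ins, table in zip(inputs, tables):
        index = 0
        for j in ins:  # first input is the most significant bit
            index = (index << 1) | state[j]
        new_state.append(table[index])
    return tuple(new_state)

inputs, tables = random_nk_network(n=10, k=2, seed=0)
print(step((0,) * 10, inputs, tables))
```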

Distribution of Attractors in N-K Model
- Classical conjecture
  - The number of attractors is O(√n)
- Some results suggest that this conjecture may not be true
  - Superpolynomial growth (> n^γ for any γ) of the number of attractors (Samuelsson & Troein, PRL, 2003)
  - Superpolynomial growth of the average size of attractors (Drossel et al., PRL, 2005)
- No conclusive result is known

Singleton Attractor (or Point Attractor)
- Biological interpretation of attractors
  - Different attractors ⇔ Different cell types
- Point attractor
  - Attractor with period 1
  - Corresponding to a steady state
  - Definition: a global state v satisfying v(t+1) = v(t) (or, v = f(v), where f is one synchronous update of the network)
- Attractor Detection
  - Input: Boolean Network
  - Output: Point Attractor (if any)

Previous Works and Our Works
- Around O(2^n) time is enough since there are 2^n global states
  - Several heuristics, but no theoretical guarantee [Irons, Physica D, 2006], [Devloo et al., Bull. Math. Biol. 2003], …
- Detection of a singleton attractor is NP-hard [Akutsu et al., GIW 1998]
- We developed algorithms with average case theoretical bounds [Zhang et al., EURASIP JBSB 2007]
- We developed algorithms for singleton attractor detection
  - O(1.587^n) time algorithm for AND-OR BNs [Melkman, Tamura & Akutsu, 2010]
  - O(1.799^n) time algorithm for nested canalyzing BNs [Akutsu, Melkman, Tamura & Yamamoto, 2011]

Reduction from BN-ATTRACTOR to SAT
- Detection of Singleton Attractor with Max. Indegree K ⇒ (K+1)-SAT (Boolean SATisfiability problem)
[Figure: node v_i with input nodes v_j and v_k]
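One way to see why clauses of at most K+1 literals suffice: a state is a singleton attractor exactly when v_i = f_i(inputs of v_i) holds for every node, and each row of a node's truth table rules out one combination of its K inputs together with the wrong value of v_i. The sketch below builds these clauses and checks them by brute force on the three-node example (A' = B, B' = A and C, C' = not A). The clause representation and function names are assumptions made for this illustration, and a real implementation would hand the clauses to a SAT solver instead of enumerating assignments.

```python
from itertools import product

def attractor_clauses(inputs, tables):
    """Build CNF clauses satisfied exactly by the singleton attractors.

    A clause is a list of literals (variable index, required value);
    it is satisfied if at least one variable takes its required value.
    """
    clauses = []
    for i, (ins, table) in enumerate(zip(inputs, tables)):
        for row, bits in enumerate(product((0, 1), repeat=len(ins))):
            # Forbid: inputs equal `bits` while v_i differs from f_i(bits).
            clause = [(j, 1 - b) for j, b in zip(ins, bits)]
            clause.append((i, table[row]))
            clauses.append(clause)
    return clauses

def satisfies(assignment, clauses):
    return all(any(assignment[v] == val for v, val in c) for c in clauses)

# Nodes: A=0, B=1, C=2.  A' = B, B' = A and C, C' = not A.
inputs = [[1], [0, 2], [0]]
tables = [[0, 1], [0, 0, 0, 1], [1, 0]]
clauses = attractor_clauses(inputs, tables)
solutions = [s for s in product((0, 1), repeat=3) if satisfies(s, clauses)]
print(solutions)
```

Every clause has at most K+1 = 3 literals here, since the maximum indegree is 2.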

Basic Idea of Our Algorithms
- Assigning x=0 eliminates three nodes
- Assigning x=1 eliminates two nodes ⇒ need additional work using SAT
[Figure: network of OR nodes over x, y, z, w, showing which nodes are fixed after assigning x=0 and after assigning x=1]

Summary of Attractor Detection

Singleton Attractors

| Algorithms | K=2 | K=3 |
|---|---|---|
| Recursive (Ave. Time) | O(1.19^n) | O(1.27^n) |
| SAT based (detection) | O(1.323^n) | O(1.474^n) |
| Our algorithms (detection) | O((1.323-δ)^n) (δ=0.00004) | |

| Algorithms | AND/OR of literals (any K) | Canalyzing (any K) | AND/OR of literals (Planar, any K) |
|---|---|---|---|
| Recursive (Ave. Time) | N/A | N/A | N/A |
| SAT based (detection) | N/A | N/A | N/A |
| Our algorithms (detection) | O(1.587^n) | O(1.799^n) | O((1+ε)^n) |

Cyclic Attractors (Recursive, Average Case)

| | K=2 | K=3 | K=4 | K=5 |
|---|---|---|---|---|
| period=2 | O(1.57^n) | O(1.70^n) | O(1.78^n) | O(1.83^n) |
| period=3 | O(1.72^n) | O(1.86^n) | O(1.92^n) | O(1.95^n) |

Control of Boolean Network

Control Theory for Biological Systems
- One of the main targets of Systems Biology
  - Though control theory is well established for linear systems, biological systems have non-linear components
  - May lead to new drugs and treatment methods
- Introduction of 4 genes turns normal cells into induced pluripotent stem cells (iPS cells)
[Figure: Control turning a Cancer Cell into a Normal Cell]

Definition of BN-Control
- Input
  - Internal nodes: v_1, …, v_n; External nodes: u_1, …, u_m
  - Initial state: v^0; Desired state: v^M
  - BN
- Output
  - Sequence of states of external nodes: u(0), u(1), …, u(M) (leading to the desired state at time M)
  - v(0) = v^0, v(M) = v^M
[Akutsu et al., J. Theor. Biol. 2007]

BN-Control: Related Works
- Datta et al. defined a problem of control of PBN (Probabilistic Extension of BN) and proposed a dynamic programming based method [Machine Learning, 52:169-191, 2003]
  - They also proposed various extensions
  - But, their method must handle 2^n × 2^n matrices
- BN-Control (also PBN-Control) is NP-hard
- BN-Control can be solved in polynomial time if the network has a tree structure [Akutsu et al., JTB 2007]
- Practical approach based on Model Checking/SAT [Langmead & Jha, APBC 2008, JBCB 2009]
- Theoretical studies using Semi-Tensor Product [Cheng, 2009, 2010, …]

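Dynamic Programming for Control of BN
- BN version of the algorithm by Datta et al.
- DP table: D[b_1, …, b_n, t]
  - takes 1 if there is a control seq. leading from state [b_1, …, b_n] at time t to the target state at time M, and 0 otherwise
  - can be computed by
    - D[b_1, …, b_n, M] = 1 if [b_1, …, b_n] = v^M, and 0 otherwise
    - D[b_1, …, b_n, t-1] = 1 if there exists a control u such that D[c_1, …, c_n, t] = 1 for [c_1, …, c_n] = f([b_1, …, b_n], u), and 0 otherwise (f denotes one synchronous update)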

Illustration of DP Algorithm
[Figure: DP computation with entries D[1, 1, 1, 2] = 1, D[0, 0, 0, 2] = 0, D[0, 1, 1, 3] = 1 and control u_1=1, u_2=1]
- But, the size of DP table is exponential
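The backward computation of the DP table can be sketched as below. The two-node network with one external node used in the example (v1' = v2 and u, v2' = v1 or u) is hypothetical and chosen only to keep the table small; the function names are also illustrative. Controls are applied at times 0, …, M-1, which is enough to determine v(M).

```python
from itertools import product

def control_dp(f, n, m, target, M):
    """Return D, where D[t][state] is 1 iff some control sequence drives
    `state` at time t to `target` at time M, and 0 otherwise."""
    states = list(product((0, 1), repeat=n))
    controls = list(product((0, 1), repeat=m))
    D = [dict() for _ in range(M + 1)]
    for s in states:
        D[M][s] = int(s == target)
    for t in range(M - 1, -1, -1):  # fill the table backward in time
        for s in states:
            D[t][s] = int(any(D[t + 1][f(s, u)] for u in controls))
    return D

def extract_controls(f, D, m, start, M):
    """Read one control sequence off the table, or None if none exists."""
    if not D[0][start]:
        return None
    controls = list(product((0, 1), repeat=m))
    seq, s = [], start
    for t in range(M):
        u = next(u for u in controls if D[t + 1][f(s, u)])
        seq.append(u)
        s = f(s, u)
    return seq

def f(state, u):
    """Hypothetical example network: v1' = v2 and u, v2' = v1 or u."""
    v1, v2 = state
    return (v2 & u[0], v1 | u[0])

D = control_dp(f, n=2, m=1, target=(1, 1), M=2)
print(extract_controls(f, D, m=1, start=(0, 0), M=2))
```

The table has 2^n entries per time step, which is the exponential size noted on the slide.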

Integer Linear Programming-Based Approach

Integer Programming
- Linear Programming (LP)
  - Maximize (or minimize) a linear objective function under constraints of linear inequalities
- Integer Linear Programming (ILP)
  - LP + constraints that specified variables must take integer values
  - Several efficient solvers: CPLEX, Gurobi
  - Used for solving various NP-hard problems

ILP Representation of Boolean Functions
- Variables: either 0 or 1 (i.e., integer between 0 and 1)
- AND
- OR
- NOT
We applied this methodology to BN-control. [Akutsu et al., IEEE CDC 2009]
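One commonly used way to express AND, OR, and NOT with linear constraints over 0/1 variables is shown below. This is a standard textbook linearization given here as an assumption; it is not necessarily the encoding used in the cited work. The sketch checks by enumeration that, for every 0/1 input, the constraints admit exactly the correct output value.

```python
from itertools import product

def and_feasible(x1, x2, y):
    """y = x1 AND x2  as  y <= x1, y <= x2, y >= x1 + x2 - 1."""
    return y <= x1 and y <= x2 and y >= x1 + x2 - 1

def or_feasible(x1, x2, y):
    """y = x1 OR x2  as  y >= x1, y >= x2, y <= x1 + x2."""
    return y >= x1 and y >= x2 and y <= x1 + x2

def not_feasible(x, y):
    """y = NOT x  as  y = 1 - x."""
    return y == 1 - x

def check_encodings():
    """Return True if each constraint set accepts exactly the right y."""
    for x1, x2, y in product((0, 1), repeat=3):
        if and_feasible(x1, x2, y) != (y == (x1 & x2)):
            return False
        if or_feasible(x1, x2, y) != (y == (x1 | x2)):
            return False
    return all(not_feasible(x, y) == (y == 1 - x)
               for x, y in product((0, 1), repeat=2))

print(check_encodings())
```

With such constraints, one 0/1 variable per node and time step is enough to state the network dynamics to an ILP solver.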

Result on Attractor Detection
- Data: randomly generated BNs
  - with cases of indegree=2 and indegree=3
  - n: #nodes
- 3 GHz Xeon CPU + ILOG CPLEX
- Result: quite fast if indegree=2

Result on BN-Control
- Data: randomly generated BNs
  - with cases of indegree=2 and indegree=3
  - n: #internal nodes, m: #external nodes, M: #steps
- Result: fast if indegree=2, but not so fast if indegree=3

PBN and its Control

Probabilistic Boolean Network (PBN) [Shmulevich et al., 2002]
- Multiple control rules (boolean functions) for each node
- Control rule is selected randomly at each t according to a given probability distribution
  - Almost equivalent to Dynamic Bayesian Network
  - Pros: Capable of handling noise. Can be modeled as a Markov process.
  - Cons: Not scalable since it takes O(2^n) or more time for almost all problems on PBN
- Example (node A with input nodes B and C)
  - A(t+1) = B(t) AND C(t) with Prob. = 0.6
  - A(t+1) = B(t) OR (NOT C(t)) with Prob. = 0.4
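The random choice of a rule for node A can be sketched as follows, using the two rules and probabilities of the example. The names `RULES_A`, `next_a`, and `prob_a_is_one` are illustrative and not part of the original material.

```python
import random

# (probability, rule) pairs for node A, as in the example.
RULES_A = [
    (0.6, lambda b, c: b & c),        # A(t+1) = B(t) AND C(t)
    (0.4, lambda b, c: b | (1 - c)),  # A(t+1) = B(t) OR (NOT C(t))
]

def next_a(b, c, rng=random):
    """Sample A(t+1) by first picking one rule at random."""
    rules = [rule for _, rule in RULES_A]
    weights = [p for p, _ in RULES_A]
    rule = rng.choices(rules, weights=weights)[0]
    return rule(b, c)

def prob_a_is_one(b, c):
    """Exact probability that A(t+1) = 1 given B(t) = b and C(t) = c."""
    return sum(p for p, rule in RULES_A if rule(b, c) == 1)

for b in (0, 1):
    for c in (0, 1):
        print(b, c, prob_a_is_one(b, c))
```

Doing the same for every node and multiplying the probabilities gives one row of the 2^n × 2^n transition matrix of the underlying Markov process.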

Example of PBN
[Figure: a PBN and its state transition diagram (shown only for half of the nodes)]
- One of 4 (= 2×1×2) BNs is randomly selected at each time step

BN vs. PBN
- BN: 1 outgoing edge
- PBN: multiple outgoing edges (with probabilities)
[Figure: outgoing edges of a global state in a BN and in a PBN; the PBN edges correspond to BN 1 to BN 4 and carry probabilities]

PBN-CONTROL: Model
- Probabilistic Boolean network (PBN, an extension of Boolean network)
- Global state at time t: v(t)
- Probabilistic regulation rule is given as a 2^n × 2^n matrix A
- A can be controlled by m boolean variables
- Cost functions
  - C_t(v, u): cost for applying control u for global state v at time t
  - C(v): cost for final global state v
[Datta et al., Machine Learning, 2003]

PBN-CONTROL: Problem and Algorithm
- Problem:
  - Given initial state v(0), control rule A(u(t)), target time M, and cost functions,
  - Find a first control action u(0) minimizing the expected total cost $E\left[\sum_{t=0}^{M-1} C_t(v(t), u(t)) + C(v(M))\right]$
- Can be solved by dynamic programming
[Datta et al., Machine Learning, 2003]
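A finite-horizon dynamic program for this kind of problem can be sketched as below. It is a generic backward-induction sketch, not a transcription of the method of Datta et al. The following conventions are assumptions made here: states and controls are numbered from 0, and `A[u][v][w]` is the probability of moving from state v to state w under control u. The two-state example and its costs are hypothetical.

```python
def pbn_control_dp(A, step_cost, final_cost, M):
    """Return (J, policy).

    J[t][v] is the minimum expected cost from state v at time t;
    policy[t][v] is a control index achieving it.
    """
    num_controls = len(A)
    num_states = len(final_cost)
    J = [[0.0] * num_states for _ in range(M + 1)]
    policy = [[0] * num_states for _ in range(M)]
    J[M] = list(final_cost)
    for t in range(M - 1, -1, -1):  # backward in time
        for v in range(num_states):
            best_u, best = None, float("inf")
            for u in range(num_controls):
                expected = step_cost(t, v, u) + sum(
                    A[u][v][w] * J[t + 1][w] for w in range(num_states))
                if expected < best:
                    best_u, best = u, expected
            J[t][v] = best
            policy[t][v] = best_u
    return J, policy

# Hypothetical example: control 0 leaves the state unchanged, control 1
# pushes toward state 1. Applying control 1 costs 1; ending in state 0 costs 5.
A = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.2, 0.8], [0.0, 1.0]],
]
J, policy = pbn_control_dp(A, lambda t, v, u: u, final_cost=[5, 0], M=1)
print(J[0], policy[0])
```

The inner sum runs over all 2^n states for each state and control, which is why this approach does not scale to large n.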

Hardness Results
- Control of BN is NP-complete [Akutsu et al., JTB 07]
- Integer linear programming (ILP)-based method for control of BN [Akutsu et al., IEEE CDC 09]
- Control of PBN is harder than NP (Σ_2^p-hard) [Chen et al., BIBM 2010]
  - Such techniques as ILP and SAT cannot be utilized
[Figure: complexity classes NP and PSPACE; control of BN (with ILP and SAT) is placed in NP, control of PBN between NP and PSPACE]

Conclusion

Conclusion
- Boolean network
  - A discrete model of a genetic network
  - Similar to digital circuits
  - Simple, Flexible for modifications/extensions
- Attractor Detection/Enumeration
  - NP-hard
  - Much better than naïve O(2^n) bound for several cases
- Control of Boolean Networks
  - NP-hard
  - Integer Linear Programming-based Approach
- Control of Probabilistic Boolean Networks
  - Σ_2^p-hard ⇒ SAT or IP cannot be utilized

Future Work
- Development of Non-trivial Algorithms for
  - Periodic Attractor Detection
    - In progress
  - Control of Boolean Network
    - Break O(2^n) bound!
  - Control of PBN
    - How to cope with Σ_2^p-hardness
- Development of Hybrid Model/Theory Combining Boolean and Linear Models

Thank you!