Constraint-Based Modeling of Metabolic Networks
Based on: “Genome-scale models of microbial cells: evaluating the consequences of constraints”, Price et al. (2004)
Tomer Shlomi, School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
January 2006

Outline
• Metabolism and metabolic networks
• Kinetic models vs. constraint-based modeling
• Flux Balance Analysis
• Exploring the solution space
• Altering phenotypic potential: gene knockouts

Cellular Metabolism
• The essence of life
• Catabolism and anabolism
• The metabolic core: production of energy; anaerobic and aerobic metabolism
• Probably the best understood of all cellular networks: metabolic, PPI, regulatory, signaling
• Tremendous importance in:
  - Medicine: antibiotics, metabolic disorders, liver disorders, heart disorders
  - Bioengineering: efficient production of biological products

Metabolites and Biochemical Reactions
• Metabolite: a substance that takes part in metabolism, e.g. glucose, oxygen
• Biochemical reaction: the process in which two or more molecules (reactants) interact, usually with the help of an enzyme, and produce a product
Example: $\text{Glucose} + \text{ATP} \xrightarrow{\text{Glucokinase}} \text{Glucose-6-Phosphate} + \text{ADP}$

Kinetic Models
• Dynamics of metabolic behavior over time
  - Metabolite concentrations
  - Enzyme activity rate: depends on enzyme concentrations and metabolite concentrations
  - Solved using a set of differential equations
• Impractical for large-scale networks
  - Requires specific enzyme rate data
  - Too complicated

Constraint Based Modeling
• Provides a steady-state description of metabolic behavior
  - A single, constant flux rate for each reaction
  - Ignores metabolite concentrations
  - Independent of enzyme activity rates
• Assumes a set of constraints on reaction fluxes
• Genome-scale models
Flux rate units: μmol / (mg · h)

Constraint Based Modeling
• Find a steady-state flux distribution through all biochemical reactions
• Under the constraints:
  - Mass balance: metabolite production and consumption rates are equal
  - Thermodynamic: irreversibility of reactions
  - Enzymatic capacity: bounds on enzyme rates
  - Availability of nutrients

Metabolic Networks
[Diagram: Genome Annotation, Biochemistry, Cell Physiology, Inferred Reactions → Network Reconstruction → Metabolic Network → Analytical Methods]

Mathematical Representation
• Stoichiometric matrix: network topology with stoichiometry of biochemical reactions
Example column for glucokinase ($\text{Glucose} + \text{ATP} \rightarrow \text{Glucose-6-Phosphate} + \text{ADP}$):
  Metabolite   Glucokinase
  Glucose      -1
  ATP          -1
  G-6-P        +1
  ADP          +1
Constraints and the solution space each one defines:
  Mass balance:    $S \cdot v = 0$     subspace of $\mathbb{R}^n$
  Thermodynamic:   $v_i > 0$           convex cone
  Capacity:        $v_i < v_{max}$     bounded convex cone

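The mass-balance constraint can be checked numerically. The following is a minimal sketch, assuming a hypothetical three-reaction linear pathway (uptake, conversion, secretion) that does not appear in the slides; `is_steady_state` is an illustrative helper name.

```python
import numpy as np

# Hypothetical toy pathway (not from the slides):
#   uptake:  -> A,   convert: A -> B,   secrete: B ->
# Rows are metabolites (A, B); columns are reactions.
S = np.array([
    [1, -1,  0],   # A: produced by uptake, consumed by convert
    [0,  1, -1],   # B: produced by convert, consumed by secrete
])

def is_steady_state(S, v, tol=1e-9):
    """Return True if the flux vector v satisfies mass balance S.v = 0."""
    return bool(np.all(np.abs(S @ np.asarray(v, dtype=float)) <= tol))

balanced = is_steady_state(S, [2.0, 2.0, 2.0])    # every metabolite is balanced
unbalanced = is_steady_state(S, [2.0, 1.0, 2.0])  # A accumulates, B is depleted
```

Any flux vector that passes this check lies in the null space of S; the thermodynamic and capacity constraints then restrict that subspace further.
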
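Growth Medium Constraints
• Exchange reactions enable the uptake of nutrients from the medium and the secretion of waste products
Bounds on the exchange fluxes:
  Metabolite   Lower bound   Upper bound
  Glucose      0             2.5
  Oxygen       0             Inf
  CO2          -Inf          0
[Figure: stoichiometric matrix columns for the exchange reactions G-Ex, O-Ex, CO2-Ex over the rows Glucose, Oxygen, CO2]
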
Determination of Likely Physiological States
• How to identify plausible physiological states?
• Optimization methods
  - Maximal biomass production rate
  - Minimal ATP production rate
  - Minimal nutrient uptake rate
• Exploring the solution space
  - Extreme pathways
  - Elementary modes

Outline: Optimization Methods
• Predicting the metabolic state of a wild-type strain
  - Flux Balance Analysis (FBA)
• Predicting the metabolic state after a gene knockout
  - Minimization Of Metabolic Adjustment (MOMA)
  - Regulatory On/Off Minimization (ROOM)

Biomass Production Optimization
• Metabolic demands of precursors and cofactors required for 1 g of biomass of E. coli
• Classes of macromolecules: amino acids, carbohydrates, ribonucleotides, deoxyribonucleotides, lipids, phospholipids, sterols, fatty acids
• These precursors are removed from the metabolic network in the corresponding ratios
• We define a growth reaction: $Z = 41.2570\,v_{ATP} - 3.547\,v_{NADH} + 18.225\,v_{NADPH} + \dots$

Biomass Composition Issues
• Varies across different organisms
• Depends on the growth medium
• Depends on the growth rate
• The optimum does not change much with changes in composition within a class of macromolecules
• The optimum does change if the relative composition of the major macromolecules changes

Flux Balance Analysis (FBA)
• Finds the flux distribution with maximal growth rate
• Successfully predicts:
  - Growth rates
  - Nutrient uptake rates
  - Byproduct secretion rates
• Solved using Linear Programming (LP):
  $\max v_{gro}$ (maximize growth)
  s.t. $S \cdot v = 0$ (mass balance constraints)
  $v_{min} \le v \le v_{max}$ (capacity constraints)
Fell et al. (1986), Varma and Palsson (1993)

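The FBA linear program can be sketched with a general-purpose LP solver. This is a minimal illustration, assuming a hypothetical four-reaction toy network and using SciPy's `linprog` with the HiGHS backend; the network, the bounds, and the helper name `fba` are assumptions made for the example, not part of the original slides.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from the slides); columns are reactions:
#   uptake: -> A,   r1: A -> B,   r2: A -> B,   growth: B ->
S = np.array([
    [1, -1, -1,  0],   # metabolite A
    [0,  1,  1, -1],   # metabolite B
], dtype=float)
bounds = [(0, 10), (0, 20), (0, 20), (0, 20)]  # (v_min, v_max) per reaction
GROWTH = 3                                      # column index of the growth reaction

def fba(S, bounds, objective_index):
    """Maximize one flux subject to S.v = 0 and v_min <= v <= v_max."""
    c = np.zeros(S.shape[1])
    c[objective_index] = -1.0   # linprog minimizes, so negate the objective
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x, -res.fun

v_opt, growth_rate = fba(S, bounds, GROWTH)
```

In this toy network the growth rate is limited by the uptake bound, while the split between `r1` and `r2` is left undetermined by the LP.
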
FBA Example (1)
FBA Example (2)
FBA Example (2)
Linear Programming Basics (1)
Linear Programming Basics (2)
Linear Programming Basics (3)
Linear Programming: Types of Solutions (1)
Linear Programming: Types of Solutions (2)

Linear Programming Algorithms
• Simplex
  - Used in practice
  - Does not guarantee polynomial running time
• Interior point
  - Worst-case running time is polynomial

Phenotype Predictions: Evolving Growth Rate
Exploring the Convex Solution Space

Alternative Optima
• The optimal FBA solution is not unique
[Figure panels: one solution; optimal solutions; near-optimal solutions]
• Basic solutions enumeration: MILP (Lee et al., 2000)
• Flux variability analysis (Mahadevan et al., 2003)
• Hit-and-run sampling (Almaas et al., 2004)
• Uniform random sampling (Wiback et al., 2004)

What Do Multiple Solutions Represent?
• Some of the solutions probably do not represent biologically meaningful metabolic behaviors, as there are missing constraints
• Previous studies tackled this problem by:
  - Incorporating additional constraints: regulatory constraints (Covert et al., 2004)
  - Looking for reactions for which new constraints may significantly reduce the solution space (Wiback et al., 2004)
[Figure: meaningful solutions shown as a subset of the FBA solution space]

Interpretations of Metabolic Space
• Effect of exogenous factors: the metabolic space corresponds to growth in a medium under various external conditions that are beyond the model’s scope, such as stress or temperature
• Heterogeneity within a population: the metabolic space represents heterogeneous metabolic behaviors of individuals within a cell population (Mahadevan et al., 2003; Price et al., 2004)
• Alternative evolutionary paths: the metabolic space represents different metabolic states attainable through different evolutionary paths (Mahadevan et al., 2003; Fong et al., 2004)
The three interpretations are not mutually exclusive.

Alternative Optima: Basic Solutions Enumeration
Lee et al., 2000
• Basic solutions: metabolic states with a minimal number of non-zero fluxes
• Any two solutions differ in at least one zero flux
• Uses Mixed Integer Linear Programming
• The optimization is formulated so that each new solution differs from the previously found ones
• Applicable only to small-scale models

Alternative Optima: Flux Variability Analysis
Mahadevan et al., 2003
• Find metabolic states with extreme values of fluxes
• Use linear programming to minimize and maximize the flux through each reaction while satisfying all constraints:
  $\max / \min\ v_i$ (minimize and maximize each flux)
  s.t. $S \cdot v = 0$ (mass balance constraints)
  $v_{min} \le v \le v_{max}$ (capacity constraints)
  $v_{gro} = v_{opt}$ (set maximal growth rate)

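Flux variability analysis amounts to two LPs per reaction once the growth rate is pinned to its optimum. The sketch below is illustrative only: it assumes a hypothetical toy network with two parallel routes from A to B, uses SciPy's `linprog`, and enforces the growth constraint as a lower bound with a tiny numerical slack rather than as an exact equality. The names `solve_lp` and `fva` are invented for the example.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from the slides); columns are reactions:
#   uptake: -> A,   r1: A -> B,   r2: A -> B,   growth: B ->
S = np.array([
    [1, -1, -1,  0],   # metabolite A
    [0,  1,  1, -1],   # metabolite B
], dtype=float)
bounds = [(0, 10), (0, 20), (0, 20), (0, 20)]
GROWTH = 3

def solve_lp(c, S, bounds):
    """Minimize c.v subject to S.v = 0 and the flux bounds."""
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res

def fva(S, bounds, objective_index, slack=1e-9):
    """Return (min, max) of every flux among the growth-optimal states."""
    n = S.shape[1]
    c = np.zeros(n)
    c[objective_index] = -1.0
    v_opt = -solve_lp(c, S, bounds).fun        # step 1: plain FBA optimum
    fixed = list(bounds)
    fixed[objective_index] = (v_opt - slack, bounds[objective_index][1])
    ranges = []
    for i in range(n):                          # step 2: min and max each flux
        e = np.zeros(n)
        e[i] = 1.0
        lo = solve_lp(e, S, fixed).fun
        hi = -solve_lp(-e, S, fixed).fun
        ranges.append((lo, hi))
    return ranges

flux_ranges = fva(S, bounds, GROWTH)
```

Here the uptake and growth fluxes come out fixed, while the two parallel routes each span a range, which is how FVA exposes alternative optima.
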
Alternative Optima: Hit and Run Sampling
Almaas et al., 2004
• Based on a random walk inside the solution space polytope
• Choose an arbitrary solution
• Iteratively make a step in a random direction
• Bounce off the walls of the polytope in random directions

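A random walk of this kind can be sketched in a few lines. Note the assumptions: the code below implements the common textbook hit-and-run variant, which draws the next point uniformly along the chord through the current point, rather than the bouncing scheme described on the slide; it also works on a generic polytope $\{x : Ax \le b\}$ (here a unit square) instead of a real flux space with equality constraints. The function name `hit_and_run` is illustrative.

```python
import numpy as np

def hit_and_run(A, b, x0, n_samples, rng):
    """Random walk inside the bounded polytope {x : A x <= b}, starting at x0."""
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_samples, x.size))
    for k in range(n_samples):
        d = rng.normal(size=x.size)
        d /= np.linalg.norm(d)                 # random unit direction
        Ad = A @ d
        slack = b - A @ x                      # distance to each wall (>= 0 inside)
        pos, neg = Ad > 1e-12, Ad < -1e-12
        t_max = np.min(slack[pos] / Ad[pos])   # furthest feasible step forward
        t_min = np.max(slack[neg] / Ad[neg])   # furthest feasible step backward
        x = x + rng.uniform(t_min, t_max) * d  # uniform point on the chord
        samples[k] = x
    return samples

# Unit square 0 <= x, y <= 1 written as A x <= b (an assumed example polytope).
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([1.0, 0.0, 1.0, 0.0])
samples = hit_and_run(A, b, x0=[0.5, 0.5], n_samples=1000,
                      rng=np.random.default_rng(0))
```

For a metabolic model, the walk would typically be run in the null space of S so that every sample also satisfies mass balance.
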
Alternative Optima: Uniform Random Sampling
Wiback et al., 2004
• The problem of uniformly sampling a high-dimensional polytope is NP-hard
• Find a tight parallelepiped that bounds the polytope
• Randomly sample solutions from the parallelepiped
• Can be used to estimate the volume of the polytope

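The sample-and-reject idea behind this approach can be illustrated as follows. This is a minimal sketch under simplifying assumptions: the bounding region is an axis-aligned box (a special case of a parallelepiped), the polytope is a hypothetical 2-D triangle rather than a flux space, and `sample_by_rejection` is an invented helper name.

```python
import numpy as np

def sample_by_rejection(A, b, lower, upper, n_draws, rng):
    """Draw uniformly from a bounding box and keep points with A x <= b.

    Returns the accepted points and a volume estimate for the polytope
    (box volume times the accepted fraction).
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    draws = rng.uniform(lower, upper, size=(n_draws, lower.size))
    inside = np.all(draws @ A.T <= b, axis=1)
    box_volume = float(np.prod(upper - lower))
    return draws[inside], box_volume * float(inside.mean())

# Assumed example polytope: the triangle x >= 0, y >= 0, x + y <= 1.
A = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]])
b = np.array([0.0, 0.0, 1.0])
points, volume = sample_by_rejection(A, b, lower=[0.0, 0.0], upper=[1.0, 1.0],
                                     n_draws=20000, rng=np.random.default_rng(0))
```

The accepted fraction shrinks quickly as the dimension grows, which is why a tight bounding shape matters.
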
Topological Methods
• Not biased by a statement of an objective
• Network-based pathways:
  - Extreme Pathways (Schilling et al., 1999)
  - Elementary Flux Modes (Schuster et al., 1999)
• Decomposing flux distribution into extreme pathways
• Extreme pathways defining phenotypic phase planes
• Uniform random sampling

Extreme Pathways and Elementary Flux Modes
• Unique set of vectors that spans the solution space
• Each consists of a minimal number of reactions
• Extreme pathways are systemically independent (convex basis vectors)

Extreme Pathways and Elementary Flux Modes
• Inherent redundancy in metabolic networks (Price et al., 2002)
• Robustness to gene deletion and changes in gene expression (Stelling et al., 2002)
• Enzyme subsets (correlated reaction sets) in yeast (Papin et al., 2002)
• Design strains (Carlson et al., 2002)
• Assign functions to genes (Forster et al., 2002)

Altering Phenotypic Potential: Gene Knockouts

Altering Phenotypic Potential: Gene Knockouts
• Minimization Of Metabolic Adjustment (MOMA) (Segre et al., 2002)
  - The flux distribution after a knockout is close to the wild-type’s state under the Euclidean norm
• Regulatory On/Off Minimization (ROOM) (Shlomi et al., 2005)
  - Minimize the number of Boolean flux changes from the wild-type’s state

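The MOMA idea (stay as close as possible to the wild-type flux vector in Euclidean distance) can be written as a small quadratic program. The sketch below is an illustration under stated assumptions, not the published implementation: the toy network, the wild-type vector `w`, and the helper name `moma` are hypothetical, and the QP is solved with SciPy's general-purpose SLSQP routine rather than a dedicated QP solver.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy network (not from the slides); columns are reactions:
#   uptake: -> A,   r1: A -> B,   r2: A -> B,   growth: B ->
S = np.array([
    [1, -1, -1,  0],   # metabolite A
    [0,  1,  1, -1],   # metabolite B
], dtype=float)
bounds = [(0, 10), (0, 20), (0, 20), (0, 20)]
w = np.array([10.0, 10.0, 0.0, 10.0])   # assumed wild-type fluxes (all flux via r1)

def moma(S, bounds, w, knockouts):
    """Minimize ||v - w||^2 subject to S.v = 0, flux bounds, and knockouts."""
    ko_bounds = [(0.0, 0.0) if j in knockouts else bnd
                 for j, bnd in enumerate(bounds)]
    res = minimize(
        fun=lambda v: float(np.sum((v - w) ** 2)),
        x0=np.zeros(len(w)),
        jac=lambda v: 2.0 * (v - w),
        bounds=ko_bounds,
        constraints=[{"type": "eq", "fun": lambda v: S @ v, "jac": lambda v: S}],
        method="SLSQP",
        options={"ftol": 1e-12, "maxiter": 200},
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x

v_moma = moma(S, bounds, w, knockouts={1})   # knock out reaction r1
```

In this example the knockout mutant could still reach the wild-type growth rate by rerouting everything through `r2`, but the distance objective settles on a lower growth rate, which matches the view of MOMA as a description of a sub-optimal state close to the wild-type.
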
Altering Phenotypic Potential
• Explaining gene dispensability (Papp et al., 2004)
  - Only 32% of yeast genes contribute to biomass production in rich media
  - Considered one arbitrary optimal growth solution
• OptKnock: identify gene deletions that generate a desired phenotype (Burgard et al., 2003)
• OptStrain: identify strains which can generate desired phenotypes by adding/deleting genes (Pharkya et al., 2004)

Modeling Gene Knockouts
• Gene knockout
• Enzyme knockout
• Reaction knockout

Cellular Adaptation to Genetic and Environmental Perturbations
• Transient changes in expression levels in hundreds of genes (Gasch 2000, Ideker 2001)
• Convergence to an expression steady-state close to the wild-type (Gasch 2000, Daran 2004, Braun 2004)
• Drop in growth rates followed by a gradual increase (Fong 2004)
[Figure: growth over time, on scales of minutes and generations]

Regulatory On/Off Minimization (ROOM)
• Predicts the metabolic steady-state following the adaptation to the knockout
• Assumes the organism adapts by minimizing the set of regulatory changes
[Diagram: Boolean regulatory change, Boolean flux change]
• Finds the flux distribution with a minimal number of Boolean flux changes

ROOM: Implementation
• Solved using Mixed Integer Linear Programming (MILP)
• Boolean variable $y_i$: $y_i = 1$ if flux $v_i$ changes from the wild-type flux $w_i$
  $\min \sum_i y_i$ (minimize changes)
  s.t. $v - y\,(v_{max} - w) \le w$ and $v - y\,(v_{min} - w) \ge w$ (distance constraints)
  $S \cdot v = 0$ (mass balance constraints)
  $v_j = 0,\ j \in G$ (knockout constraints)
• MILP is NP-hard
  - Relax Boolean constraints: solve using LP
  - Relax strict constraint of proximity to wild-type

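The MILP on this slide can be prototyped directly. The following is a sketch under several assumptions: it uses the strict form of the distance constraints as written on the slide (no tolerance range around the wild-type flux), a hypothetical four-reaction toy network with an assumed wild-type vector `w`, and SciPy's `milp` solver (available in SciPy 1.9 and later). The helper name `room` is illustrative.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Hypothetical toy network (not from the slides); columns are reactions:
#   uptake: -> A,   r1: A -> B,   r2: A -> B,   growth: B ->
S = np.array([
    [1, -1, -1,  0],   # metabolite A
    [0,  1,  1, -1],   # metabolite B
], dtype=float)
v_min = np.array([0.0, 0.0, 0.0, 0.0])
v_max = np.array([10.0, 20.0, 20.0, 20.0])
w = np.array([10.0, 10.0, 0.0, 10.0])   # assumed wild-type fluxes (all flux via r1)

def room(S, v_min, v_max, w, knockouts):
    """Minimize the number of fluxes that differ from w; variables are [v, y]."""
    m, n = S.shape
    I = np.eye(n)
    constraints = [
        # mass balance: S.v = 0
        LinearConstraint(np.hstack([S, np.zeros((m, n))]), np.zeros(m), np.zeros(m)),
        # v - y (v_max - w) <= w
        LinearConstraint(np.hstack([I, -np.diag(v_max - w)]), np.full(n, -np.inf), w),
        # v - y (v_min - w) >= w
        LinearConstraint(np.hstack([I, -np.diag(v_min - w)]), w, np.full(n, np.inf)),
    ]
    lb = np.concatenate([v_min, np.zeros(n)])
    ub = np.concatenate([v_max, np.ones(n)])
    for j in knockouts:                 # knockout constraints: v_j = 0
        lb[j] = ub[j] = 0.0
    res = milp(c=np.concatenate([np.zeros(n), np.ones(n)]),
               integrality=np.concatenate([np.zeros(n), np.ones(n)]),
               bounds=Bounds(lb, ub), constraints=constraints)
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n], np.round(res.x[n:]).astype(int)

v_room, changed = room(S, v_min, v_max, w, knockouts={1})   # knock out r1
```

In this toy case only the knocked-out reaction and its bypass change, and the growth flux stays at its wild-type value.
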
Example Network

ROOM’s Implicit Growth Rate Maximization
• ROOM implicitly attempts to maintain the maximal possible growth rate of the wild-type organism
• A change in growth requires numerous changes in fluxes
[Figure: metabolites M1, M2, ..., Mn feeding the growth reaction, which produces biomass]

Intracellular Flux Measurements
• Intracellular flux measurements in E. coli central carbon metabolism
• Obtained using NMR spectroscopy in 13C labeling experiments
• 5 knockouts: pyk, pgi, zwf, gnd, ppc in the glycolysis and pentose phosphate pathways
• Glucose-limited and ammonia-limited media
• FBA wild-type predictions above 90% accuracy
Emmerling, M. et al. (2002), Hua, Q. et al. (2003), Jiao, Z. et al. (2003), Peng et al. (2004)

Knockout Flux Predictions
• ROOM flux predictions are significantly more accurate than MOMA and FBA in 5 out of 9 experiments
• ROOM steady-state growth rate predictions are significantly more accurate than MOMA

ROOM vs. MOMA
• ROOM predicts the metabolic steady-state after adaptation
  - Provides accurate flux predictions
  - Preserves flux linearity
  - Finds alternative pathways
  - Predicts steady-state growth rates
• MOMA predicts transient metabolic states following the knockout
  - Provides more accurate transient growth rates

Additional Constraints
• Transcriptional regulatory constraints (Covert et al., 2002)
  - Boolean representation of the regulatory network
  - Used to predict growth and changes in expression levels, and to simulate time courses of batch cultures
• Energy balance analysis (Beard et al., 2002)
  - Loops are not feasible according to thermodynamic principles, resulting in a non-convex solution space

Additional Constraints: Slow Changes in the Environment
• Timescales of cellular processes are shorter than those of the surrounding environment
• Generate dynamic curves to simulate batch experiments (Varma et al., 1994)

• Thank you for listening
• Questions