Pathway Logic: Symbolic Analysis of Biological Signaling
Presented by Geoffrey

Introduction
• The tremendous growth of genomic sequence information, combined with technological advances in the analysis of gene expression, has revolutionized research in biology and biomedicine
• Investigation of signaling and metabolic pathways would benefit greatly from the use of predictive models
• Although these pathways are complex, fundamental concepts from contemporary research indicate that they are amenable to computational analysis. For example, most signaling pathways involve the hierarchical assembly, in space and time, of multiprotein complexes that regulate the flow of information via stimulation or inhibition

Introduction
• Various models have been proposed that incorporate quantitative information such as reaction rates and/or concentrations
• However, they are limited by the difficulty of obtaining the relevant parameters (e.g. Michaelis constants) and by the stochastic behavior of signaling molecules
• An alternative is to model the logic of signaling, e.g. using the π-calculus to represent and forward-simulate signaling pathways
• In this paper, we look at the development of logical models based on applying formal methods tools to mammalian signaling pathways

Levels of Abstraction
• Continuous Abstraction
  – Uses continuous mathematics such as differential equations, analyzed with sophisticated numerical computational packages
  – However, the complexity of biological processes limits the accuracy and effectiveness of such descriptions
• Discrete Abstraction
  – Natural processes are described by purely symbolic expressions
  – Applicable to less predictable phenomena such as biological signaling processes (as we shall see later on)

Pathway Logic
• Pathway Logic is an algebraic structure enabling the symbolic analysis of biological signaling pathways
• It uses rewriting theories to formalize the 'informal' models that biologists use to describe processes
• Advantages of using Pathway Logic:
  – Can include both facts and principles that relate and categorize data elements and processes
  – Allows data to be interpreted, combined and queried in the context of biological knowledge
  – Allows models at various levels of detail
  – Pathways can be generated dynamically using search and model checking
  – Models can be transformed to Petri nets for analysis and visualization
  – Roadmap views of dynamically generated pathways
• The Pathway Logic algebra can be written in the Maude executable specification language

Pathway Logic Example
• As an example, we will consider a major receptor-mediated pathway in mammalian cells, focusing on the Epidermal Growth Factor Receptor (EGFR)
[Figure: fragment of the mammalian EGFR system illustrating the activation of a downstream mitogenic signaling pathway involving the gene for the autocrine EGFR ligand TGFα]

Biological Sorts and Elements
• Types are declared in Pathway Logic (i.e. in Maude) with the keywords sorts and subsorts
• Constants and operators are declared with the keywords op and ops
    sorts Protein Chemical Thing .
    subsorts Protein Chemical < Thing .
    ops EGFR EGF Pdk1 PKCe : -> Protein .
    ops PIP3 Ca++ : -> Chemical .
• EGFR, EGF, Pdk1 and PKCe are operators that take no arguments and return sort Protein, i.e. they are constants of sort Protein; likewise PIP3 and Ca++ are constants of sort Chemical
• Thing is a sort that encompasses Protein and Chemical, loosely comparable to a superclass

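
As a loose analogy only, the sort hierarchy can be mimicked with a small class hierarchy. This is a Python sketch rather than Maude; the helper `of_sort` and the constant name `CA` are hypothetical names introduced for illustration.

```python
class Thing:
    """Top sort: anything that can appear in a cell."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"{type(self).__name__}({self.name!r})"


class Protein(Thing):
    """Plays the role of the sort Protein (a subsort of Thing)."""


class Chemical(Thing):
    """Plays the role of the sort Chemical (a subsort of Thing)."""


# Constants: zero-argument operators in Maude, plain instances here.
EGFR = Protein("EGFR")
PDK1 = Protein("Pdk1")
CA = Chemical("Ca++")


def of_sort(things, sort):
    """Return the names of all things that belong to the given sort."""
    return [t.name for t in things if isinstance(t, sort)]
```

Because every Protein is also a Thing, `of_sort([EGFR, CA], Thing)` returns both names, while `of_sort([EGFR, CA], Protein)` returns only `EGFR`.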
Protein Modification
• Pathway Logic provides a comprehensive algebra of protein modification. The example below shows a small part of its declaration and specification in Maude
    sorts Modification ModSet .
    subsort Modification < ModSet .
    ops GDP GTP act deact : -> Modification .
    op none : -> ModSet .
    op __ : ModSet ModSet -> ModSet [assoc comm id: none] .
    op [_-_] : Protein ModSet -> Protein [right id: none] .
• The operator __ (juxtaposition) combines modifications into a set: it is associative and commutative, and has 'none' as its identity element
• Sets of modifications are applied to proteins using the operator [_-_]; for example, [EGFR - act] represents the activated form of EGFR

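
The effect of the `assoc`, `comm` and `id: none` attributes can be illustrated with a Python sketch (not Maude). The names `mods`, `union`, `modify` and `NONE` are hypothetical. One assumption to note: a `frozenset` also collapses duplicate modifications, which the three Maude attributes by themselves do not.

```python
NONE = frozenset()  # plays the role of `none`, the identity element


def mods(*names):
    """Build a modification set; the order of the names is irrelevant."""
    return frozenset(names)


def union(a, b):
    """Model of juxtaposition `__` on modification sets."""
    return a | b


def modify(protein, modset):
    """Model of [_-_]; an empty set leaves the protein unchanged (right id: none)."""
    if not modset:
        return protein
    return (protein, modset)
```

For example, `modify("EGFR", mods("act"))` stands for `[EGFR - act]`, and `modify("EGFR", NONE)` is just `"EGFR"`.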
Protein Association
• Signaling proteins commonly associate to form functional complexes. This is represented in Maude by the following specification
    sort Complex .
    subsort Complex < Thing .
    op _:_ : Thing Thing -> Complex [comm] .
• Hence multi-protein complexes can be built from proteins and other things using the ":" operator
• An example is the inhibitory complex (IqGap1 : (Ecadherin : bCatenin))

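
The `comm` attribute means that the two sides of a complex may be written in either order. A minimal Python sketch of that idea follows (not Maude); the function `bind` is a hypothetical stand-in for `_:_` and stores each pair in a canonical order.

```python
def bind(a, b):
    """Model of `_:_`: an unordered pair, stored in a canonical order."""
    first, second = sorted((a, b), key=repr)
    return ("complex", first, second)


inner = bind("Ecadherin", "bCatenin")
inhibitory = bind("IqGap1", inner)
```

Swapping the arguments at any level gives the same value, but regrouping does not, since the operator is declared commutative and not associative: `bind(bind("IqGap1", "Ecadherin"), "bCatenin")` is a different complex from `inhibitory`.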
Protein Compartmentalization
• In eukaryotic cells (i.e. cells with a nucleus), proteins and other molecules exist in complex mixtures that are compartmentalized
• They are represented algebraically by the following declarations
    sorts Soup Enclosure MemType .
    subsort Thing < Soup .
    op empty : -> Soup .
    op __ : Soup Soup -> Soup [assoc comm id: empty] .
    ops CM NM : -> MemType .
    op {_|_{_}} : MemType Soup Soup -> Enclosure .
• In {_|_{_}}, the first argument is the type of membrane, the second is the contents of the membrane, and the third is the contents of the interior

Protein Compartmentalization
• An enclosure has a membrane part and an interior, each with its own constituent soup
• For example
    {CM | cm:Soup PIP3 [Pdk1 - act] {cyto:Soup PKCe}}
• This represents a cell containing the chemical PIP3 and the activated form of Pdk1 in the cell membrane, and PKCe in the interior (the cytoplasm)
• cm:Soup and cyto:Soup are variables of sort Soup that are declared on the fly

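
One way to picture an enclosure is as a record holding a membrane type and two multisets. The Python sketch below (not Maude) assumes that representation; the class `Enclosure`, the helper `where` and the string encoding of `[Pdk1 - act]` are hypothetical choices made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Enclosure:
    mem_type: str                                       # "CM" or "NM"
    membrane: Counter = field(default_factory=Counter)  # soup located in the membrane
    interior: Counter = field(default_factory=Counter)  # soup located inside


def where(cell, thing):
    """Report where a thing sits in the enclosure, or None if it is absent."""
    if cell.membrane[thing] > 0:
        return "membrane"
    if cell.interior[thing] > 0:
        return "interior"
    return None


# The cell from the slide, with the soup variables instantiated to empty.
cell = Enclosure("CM", Counter(["PIP3", "[Pdk1 - act]"]), Counter(["PKCe"]))
```

A `Counter` is used because a soup is a multiset: order is irrelevant, but the number of copies of each thing is kept.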
Analysis Techniques
• Given a formal symbolic model of the networks, several kinds of analysis can be carried out
  – Static Analysis
  – Forward and Backward Search
  – Explicit State Model Checking
  – Meta-analysis
• Static analysis examines the structure of the model, showing how its elements are related and organized, just by looking at the model itself
• It also provides a means to check for inconsistencies or ill-formed declarations and to look for missing information

The Dynamics of Pathway Logic – Biochemical Events
• To model biochemical events such as signaling processes, we use the dynamic part of a rewrite theory, i.e. its rewrite rules, to express the reactions
• The statement "Activated Erk1 is rapidly translocated to the nucleus where it is functionally sequestered and can regulate the activity of nuclear proteins including transcription factors" is expressed by the following rule in Maude
    rl [410.Erk1/2.to.nuc] :
      {CM | cm:Soup {cyto:Soup [Erk1 - act] {NM | nm:Soup {nuc:Soup}}}}
      =>
      {CM | cm:Soup {cyto:Soup {NM | nm:Soup {nuc:Soup [Erk1 - act]}}}} .
• A second rule then lets activated Erk1/2 activate Elk1
    rl [438.Erk.act.Elk] :
      [?Erk1/2 - act] Elk1 => [?Erk1/2 - act] [Elk1 - act] .

The Dynamics of Pathway Logic – Biochemical Events
• As another example, "In the presence of PIP3, activated Pdk1 recruits PKCe from the cytoplasm to the cell membrane and activates it"
    rl [757.PIP3.Pdk1.act.PKCe] :
      {CM | cm:Soup PIP3 [Pdk1 - act] {cyto:Soup PKCe}}
      =>
      {CM | cm:Soup PIP3 [Pdk1 - act] [PKCe - act] {cyto:Soup}}
      [metadata "cite = 21961415"] .
• The metadata is an optional tag that cites the justification for the rule, using the MEDLINE database Unique Identifier

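
To make the matching step concrete, rule 757 can be sketched as a function on two multisets. This is a Python illustration, not how Maude implements rewriting; the function name `rule_757` and the string encoding of modified proteins are assumptions. The variables `cm:Soup` and `cyto:Soup` correspond to whatever else is present in each multiset.

```python
from collections import Counter


def rule_757(membrane, cytoplasm):
    """Apply rule 757 once; return the new (membrane, cytoplasm) or None if it does not match."""
    required = Counter(["PIP3", "[Pdk1 - act]"])
    # The left-hand side needs PIP3 and activated Pdk1 in the membrane ...
    if any(membrane[thing] < n for thing, n in required.items()):
        return None
    # ... and at least one PKCe in the cytoplasm.
    if cytoplasm["PKCe"] < 1:
        return None
    new_membrane = membrane + Counter(["[PKCe - act]"])  # PKCe arrives, activated
    new_cytoplasm = cytoplasm - Counter(["PKCe"])        # PKCe leaves the cytoplasm
    return new_membrane, new_cytoplasm
```

Everything not mentioned by the rule (here, for instance, `DAG` in the membrane or `Erk1` in the cytoplasm) is carried over unchanged, which is the job of the soup variables in the Maude rule.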
Dynamic Analysis
• Given the cell declarations and the rewrite rules, we can analyze networks by executing the rules to obtain cell 'states'
• Maude offers two rewriting strategies: the default top-down strategy (command "rewrite") and a fair strategy (command "frewrite", short for fair rewrite). The fair strategy is often preferable because it ensures that no applicable rule is left out of the execution
• All possible outcomes can be found using the command "search"

Dynamic Analysis
• Consider the PKC network, with the initial dish q14
    op q14 : -> Dish .
    eq q14 = PD(Ca++ {CM | PIP2 [PI3Ka - act] [PLCb1 - act] [Pten - act] {Erk1 Pdk1 PKCa PKCe}}) .
• The rewrite and frewrite commands give
    rewrite q14 .
    result Dish: PD({CM | Ca++ DAG IP3 [PI3Ka - act] [PLCb1 - act] [Pten - act] {Erk1 Pdk1 [PKCa - act] [PKCe - act] {NM | empty {empty}}}})
    frewrite q14 .
    result Dish: PD({CM | Ca++ DAG IP3 [PI3Ka - act] [PLCb1 - act] [Pten - act] [Pdk1 - act] {Erk1 [PKCa - act] [PKCe - act] {NM | empty {empty}}}})

Dynamic Analysis
• Using the search command, where =>! means searching until termination, i.e. for states in which no further rule applies
    search q14 =>! D:Dish .
    Solution 3 (state 23)
    D:Dish --> PD({CM | Ca++ DAG IP3 [PI3Ka - act] [PLCb1 - act] [Pten - act] [Pdk1 - act] [PKCe - act] {[PKCa - act] {NM | empty {[Erk1 - act]}}}})
[Figure: path graph of the PKCe network]

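
The idea behind searching until termination can be sketched as a breadth-first exploration that collects every state in which no rule applies. The Python code below is an illustration, not Maude's search procedure. Its two rules are toy simplifications: `r2` loosely mirrors rule 757, while `r1` and the `location:thing` string encoding are invented for the example, and states are plain sets rather than multisets.

```python
from collections import deque

# (label, consumed things, produced things); hypothetical toy rules.
RULES = [
    ("r1", frozenset({"cm:PIP3", "cyto:Pdk1"}),
           frozenset({"cm:PIP3", "cm:Pdk1-act"})),
    ("r2", frozenset({"cm:PIP3", "cm:Pdk1-act", "cyto:PKCe"}),
           frozenset({"cm:PIP3", "cm:Pdk1-act", "cm:PKCe-act"})),
]


def successors(state):
    """Yield every state reachable from `state` by one rule application."""
    for _label, lhs, rhs in RULES:
        if lhs <= state:
            yield (state - lhs) | rhs


def terminal_states(start):
    """Explore all reachable states and return those where no rule applies."""
    seen, queue, finals = {start}, deque([start]), []
    while queue:
        state = queue.popleft()
        next_states = list(successors(state))
        if not next_states:
            finals.append(state)
        for nxt in next_states:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return finals


START = frozenset({"cm:PIP3", "cyto:Pdk1", "cyto:PKCe"})
```

Calling `terminal_states(START)` explores three states here and reports the single one in which both Pdk1 and PKCe are activated at the membrane.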
Model Checking
• One of the capabilities of Pathway Logic and the Maude system is model checking
• LTL (Linear Temporal Logic) formulas can be used to assert whether or not a state is reachable. This is useful in biomedicine, e.g. to find out whether targeting a specific enzyme will produce side effects
    subsort Dish < State .
    op prop1 : -> Prop .
    eq PD(out:Soup {CM | cm:Soup {cyto:Soup {NM | nm:Soup {nuc:Soup [cJun - act] [cFos - act]}}}}) |= prop1 = true .
    red q1 |= ~ <> prop1 .
• The formula ~ <> prop1 asserts that prop1 never holds; if it does hold in some reachable state, the model checker reports a counterexample, which is a pathway leading to that state

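
Checking a "never reachable" property amounts to looking for a path that refutes it. The Python sketch below illustrates this for plain reachability only; it is not an LTL model checker. The rule labels 410 and 438 are borrowed from the rules shown earlier, but the set-based encoding, the placement of Elk1 in the nucleus and the function `find_path` are simplifying assumptions.

```python
from collections import deque

# (label, consumed things, produced things); simplified, hypothetical encodings.
RULES = [
    ("410", frozenset({"cyto:Erk1-act"}), frozenset({"nuc:Erk1-act"})),
    ("438", frozenset({"nuc:Erk1-act", "nuc:Elk1"}),
            frozenset({"nuc:Erk1-act", "nuc:Elk1-act"})),
]


def find_path(start, prop):
    """Return the rule labels leading to a state where prop holds, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if prop(state):
            return path  # counterexample to "prop never holds"
        for label, lhs, rhs in RULES:
            if lhs <= state:
                nxt = (state - lhs) | rhs
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [label]))
    return None  # prop is unreachable from start


def elk1_active(state):
    """Property: activated Elk1 is present in the nucleus."""
    return "nuc:Elk1-act" in state
```

Starting from a state with activated Erk1 in the cytoplasm and Elk1 in the nucleus, `find_path` returns the sequence of rule labels that activates Elk1; without Elk1 in the start state it returns `None`.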
Meta Analysis
• A variety of metadata is associated with the executable model of a signaling network. This includes information justifying or qualifying a rule, as well as ordering information; it allows us to reason about the models themselves
• This form of analysis can be used to answer simple questions such as "What are the rule labels?" or "What constants of the sort Protein have been declared?"
• These questions query the structure and content of the model

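
Queries of this kind treat the model as data. The Python sketch below shows the flavour of such queries over a hand-written table; it does not use Maude's own reflective facilities. The table layout and the function names are hypothetical, while the labels and the citation identifier are the ones shown in the rules above.

```python
# A hypothetical, hand-written description of three rules from the model.
RULES = [
    {"label": "410.Erk1/2.to.nuc", "metadata": {}},
    {"label": "438.Erk.act.Elk", "metadata": {}},
    {"label": "757.PIP3.Pdk1.act.PKCe", "metadata": {"cite": "21961415"}},
]


def rule_labels(rules):
    """Answer 'What are the rule labels?'."""
    return [rule["label"] for rule in rules]


def rules_citing(rules, uid):
    """Return the labels of rules justified by the given identifier."""
    return [r["label"] for r in rules if r["metadata"].get("cite") == uid]


def uncited_rules(rules):
    """Return the labels of rules that carry no citation metadata."""
    return [r["label"] for r in rules if "cite" not in r["metadata"]]
```

A query such as `uncited_rules(RULES)` is an example of looking for missing information in the model rather than simulating it.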
Graphical Representation
• Ultimately, Maude is still a text-based tool, which may not be suitable for biologists to study and analyze pathways
• One way to view Pathway Logic models is the BioNet viewer, an applet written by the authors to display the pathways
• In the view, ovals represent the components (proteins, chemicals): blue ovals represent the initial state, while white ovals represent reachable states
• Rectangles represent the rewrite rules, and edges connect reactants and products to rules
• Besides viewing the pathways, some analyses can also be performed in the BioNet viewer
• Demo

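
The picture described above is a bipartite graph of components and rules. The Python sketch below builds such a graph from a rule table; it is only an illustration of the drawing convention, since the BioNet viewer's actual data model is not described here. The `thing@location` naming and the function `to_net` are assumptions.

```python
# label -> (reactants, products); a simplified encoding of rule 757.
RULES = {
    "757.PIP3.Pdk1.act.PKCe": (
        ["PIP3@cm", "Pdk1-act@cm", "PKCe@cyto"],
        ["PIP3@cm", "Pdk1-act@cm", "PKCe-act@cm"],
    ),
}


def to_net(rules, initial):
    """Return (oval colours, edges) for a bipartite component/rule graph."""
    ovals, edges = set(), []
    for label, (reactants, products) in rules.items():
        for reactant in reactants:
            ovals.add(reactant)
            edges.append((reactant, label))  # reactant -> rule rectangle
        for product in products:
            ovals.add(product)
            edges.append((label, product))   # rule rectangle -> product
    # Blue for components of the initial state, white for the others.
    colours = {o: ("blue" if o in initial else "white") for o in ovals}
    return colours, edges


COLOURS, EDGES = to_net(RULES, initial={"PIP3@cm", "Pdk1-act@cm", "PKCe@cyto"})
```

Here `PKCe@cyto` comes out blue because it is part of the initial state, and `PKCe-act@cm` comes out white because it only appears as a product.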
Conclusion
• A formal framework applying modern model checking and symbolic techniques has been proposed for modeling biological processes such as signaling networks
• It allows biologists to ask questions of a different nature than simple forward simulation, such as "If I use a drug to target a certain protein, will it also affect other proteins and chemicals, and how will they be affected?"
• The analysis is mainly qualitative and is not restricted to the gene level (unlike the gene regulatory networks shown previously)
• However, it has limitations in cases where protein and enzyme concentrations play an important part in analyzing the network, such as the oscillations of Mitogen Activated Protein Kinase (MAPK) pathways

Thank you