Next Steps in Literature Mining
Marti Hearst, UC Berkeley
ASIST 2003 Literature Mining Panel

Literature Mining Goals
• Discover new information …
  • … As opposed to discovering which statistical patterns characterize occurrence of known information.
• Method: Use large text collections to gather evidence to support (or refute) hypotheses
  • Extract facts
  • Make connections
  • Draw inferences

Outline
• Don’t Repeat History
• Use Time Machines
• More Ambitious Semantics

Don’t Repeat History
• Don’t show the obvious
  • e.g., Cheney is president
• Don’t show what you’ve already shown
  • Only show the most recent version of information
• Show which information is not present
  • Changes in the usual pattern
  • Something stops happening

Create “Time Machines”
• Do systematic analyses of how to find out what is now known based on what used to be known.
• Example: See if information that was found via microarray analysis could have been found in the literature before its invention.
• Reverse-engineer the method and use it to find new information.
• Different approach in a new paper by Lukose, Adar, and Chan of HP Labs
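The "time machine" idea can be sketched as a time-sliced evaluation: derive hypotheses only from papers published before a cutoff year, then check which of them show up in later papers. The sketch below is illustrative only. The toy corpus, the entity names, and the function names are hypothetical, and the hypothesis generator (linking two entities that share an intermediate entity) is a simple stand-in, not the method used in the talk or in the HP Labs paper.

```python
from itertools import combinations

# Hypothetical toy corpus: (publication year, entities mentioned together).
TOY_PAPERS = [
    (1995, {"A", "B"}),
    (1996, {"B", "C"}),
    (1997, {"C", "D"}),
    (2001, {"A", "C"}),
]

def cooccurring_pairs(papers):
    """Return every unordered entity pair that shares at least one paper."""
    pairs = set()
    for _, entities in papers:
        pairs.update(frozenset(p) for p in combinations(sorted(entities), 2))
    return pairs

def time_machine(papers, cutoff_year):
    """Propose links from pre-cutoff papers and check them against later ones."""
    before = [p for p in papers if p[0] < cutoff_year]
    after = [p for p in papers if p[0] >= cutoff_year]
    known = cooccurring_pairs(before)

    # Assumed heuristic: two known pairs that share exactly one entity
    # suggest a link between their two non-shared entities.
    candidates = set()
    for x, y in combinations(known, 2):
        if len(x & y) == 1:
            pair = x ^ y
            if pair not in known:
                candidates.add(pair)

    # A candidate counts as confirmed if it first co-occurs after the cutoff.
    confirmed = candidates & (cooccurring_pairs(after) - known)
    return candidates, confirmed

candidates, confirmed = time_machine(TOY_PAPERS, cutoff_year=2000)
```

With this toy data, the pre-2000 papers suggest two unseen links (A–C and B–D), and only A–C is confirmed by the 2001 paper. On a real collection, the ratio of confirmed to proposed links gives a rough measure of how well a discovery method would have worked in the past.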

More Ambitious Semantics
• Go beyond extracting entities
  • Find relations between entities
  • Convert clauses into propositions

• Mouse Bim proteins (isoforms EL, L, S) bind to human Bcl-2 (bacteriophage screening using cDNA expression library from T-lymphoma cell line KO52DA20).
• Human BimEL protein is 89% identical to mouse BimEL; human BimL is 85% identical to mouse BimL (hybridization of mouse bim cDNA to human fetal spleen and peripheral blood cDNA library).
• Bim mRNA is detected in B and T lymphoid cells (Northern blot analysis of mouse KO52DA20, WEHI 703, WEHI 707, WEHI 7.1, CH1, WEHI 231, WEHI 415, B6.23.16BW2 cell extracts).
• BimL protein interacts with Bcl-2 OR Bcl-XL OR Bcl-w proteins (immunoprecipitation (anti-Bcl-2 OR Bcl-XL OR Bcl-w) followed by Western blot (anti-EEtag) using extracts of human 293T cells co-transfected with EE-tagged BimL AND (bcl-2 OR bcl-XL OR bcl-w) plasmids).
• BimL deleted of the BH3 domain does not bind to Bcl-2 OR Bcl-XL OR Bcl-w proteins (under the experimental conditions mentioned above).
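One way to picture "converting clauses into propositions" is as a small record type that keeps the relation, its arguments, whether it is negated, and the supporting experiment. The sketch below hand-encodes three of the statements above. The class name, field names, and relation labels are assumptions made for illustration, not a schema from the talk, and no extraction step is shown.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Proposition:
    """A hand-built (subject, relation, object) fact with its evidence."""
    subject: str
    relation: str
    obj: str
    negated: bool = False
    evidence: str = ""

# Hand-encoded from the slide text; relation labels are hypothetical.
FACTS = [
    Proposition("mouse Bim", "binds", "human Bcl-2",
                evidence="bacteriophage screening"),
    Proposition("BimL", "interacts_with", "Bcl-2",
                evidence="immunoprecipitation followed by Western blot"),
    Proposition("BimL lacking BH3 domain", "binds", "Bcl-2",
                negated=True,
                evidence="same experimental conditions"),
]

def about(facts, entity):
    """Return the facts whose subject or object mentions the entity string."""
    return [f for f in facts if entity in f.subject or entity in f.obj]

bcl2_facts = about(FACTS, "Bcl-2")
negative_facts = [f for f in bcl2_facts if f.negated]
```

Keeping negation and evidence alongside each relation lets later steps tell a positive binding result from a negative one, a distinction that plain co-occurrence counts would lose.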

Apoptosis Network
[Diagram of the apoptosis network. Node labels: Survival Factors Signaling; Death Receptors Signaling; Genotoxic Stress; Loss of Attachment; Cell Cycle stress, etc.; ER Stress; Initiator Caspases (8, 10); P53 pathway; BH3 only; Ca++ Signaling; NFkB; Bcl-2 like; Bax, Bak; Smac; Caspase 12; IAPs; Mitochondria; Cytochrome c; Apaf1; AIF; Caspase 9; Effector Caspases (3, 6, 7); Apoptosis]
Slide courtesy Ting Zhang

More Ambitious Semantics
• Go beyond extracting entities
  • Find relations between entities
  • Convert clauses into propositions
• Go beyond combining via co-occurrence
  • Draw inferences between the facts and relations
  • Incorporate domain knowledge

Our Approach
• Assign Semantics using Statistics
• Hierarchical Lexical Ontologies to generalize
• Redundancy in the data
• Build up Layers of Representation
  • Syntactic and Semantic
• Use these in a feedback loop

Goal: Convert Text to Generalized Semantics
HDAC inhibitors induce differentiation of cultured murine erythroleukemia cells.
[<enzyme> <inhibitor>] <cause> <change-to> [<cell-prep-type> <organism-type> [<cancer-type> <cells>]]
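As a rough illustration of this goal, the sketch below maps each content word of the example sentence to a semantic category. The lexicon is a hypothetical hand-written table that stands in for the statistics and hierarchical lexical ontologies named in the approach. It yields a flat label sequence only, and does not recover the nested bracketing shown in the template.

```python
# Hypothetical toy lexicon: word -> semantic category. A real system would
# derive these labels from an ontology rather than a hand-written table.
LEXICON = {
    "hdac": "enzyme",
    "inhibitors": "inhibitor",
    "induce": "cause",
    "differentiation": "change-to",
    "cultured": "cell-prep-type",
    "murine": "organism-type",
    "erythroleukemia": "cancer-type",
    "cells": "cells",
}

def generalize(sentence):
    """Replace each known word with its category label; skip unknown words."""
    tokens = sentence.lower().rstrip(".").split()
    return [f"<{LEXICON[t]}>" for t in tokens if t in LEXICON]

labels = generalize(
    "HDAC inhibitors induce differentiation of cultured murine "
    "erythroleukemia cells."
)
print(" ".join(labels))
```

Because the output is a category sequence, not a word sequence, a different sentence with the same shape (another enzyme inhibitor causing a change in another cell type) would map to the same pattern.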

Thank You!
More information: http://biotext.berkeley.edu
