Next Steps in Literature Mining
Marti Hearst, UC Berkeley
ASIST 2003 Literature Mining Panel

Literature Mining Goals
- Discover new information, as opposed to discovering which statistical patterns characterize occurrence of known information.
- Method: use large text collections to gather evidence to support (or refute) hypotheses
  - Extract facts
  - Make connections
  - Draw inferences

Outline
- Don’t Repeat History
- Use Time Machines
- More Ambitious Semantics

Don’t Repeat History
- Don’t show the obvious (e.g., Cheney is president)
- Don’t show what you’ve already shown
- Only show the most recent version of information
- Show which information is not present
  - Changes in the usual pattern
  - Something stops happening

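The first three filtering rules above can be sketched as follows. This is a minimal illustration, not a method from the talk: the `Fact` record, the `novel_latest` function, and the idea of passing "obvious" and "already shown" facts as sets of triples are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """Hypothetical record for one extracted statement."""
    subject: str
    relation: str
    obj: str
    year: int


def novel_latest(facts, already_shown, obvious):
    """Keep the newest fact per (subject, relation), then drop
    anything that is obvious or has already been shown."""
    latest = {}
    for fact in facts:
        key = (fact.subject, fact.relation)
        # Only the most recent version of each statement survives.
        if key not in latest or fact.year > latest[key].year:
            latest[key] = fact

    result = []
    for fact in latest.values():
        triple = (fact.subject, fact.relation, fact.obj)
        if triple in obvious or triple in already_shown:
            continue
        result.append(fact)
    return sorted(result, key=lambda f: (f.subject, f.relation))
```

Note that when the newest version of a statement has already been shown, the older version is not resurfaced in its place.
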
Create “Time Machines”
- Do systematic analyses of how to find out what is now known based on what used to be known.
- Example: See if information that was found via microarray analysis could have been found in the literature before its invention.
- Reverse-engineer the method and use it to find new information.
- Different approach in a new paper by Lukose, Adar, and Chan of HP Labs

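One way to picture a "time machine" evaluation is to hide everything published after a cutoff year, generate candidate connections from the older literature only, and check how many later-known connections are recovered. The sketch below assumes each document is reduced to a (year, set of entity names) pair and uses a simple shared-intermediate heuristic as a stand-in hypothesis generator. Both choices, and the names `candidate_pairs` and `time_machine_recall`, are illustrative assumptions, not the approach described in the talk.

```python
from itertools import combinations


def candidate_pairs(docs):
    """Stand-in hypothesis generator: propose pairs of entities that
    share a co-mentioned intermediate but never co-occur directly."""
    direct = set()
    for entities in docs:
        for a, b in combinations(sorted(entities), 2):
            direct.add((a, b))

    neighbors = {}
    for a, b in direct:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    candidates = set()
    for linked in neighbors.values():
        for a, b in combinations(sorted(linked), 2):
            if (a, b) not in direct:
                candidates.add((a, b))
    return candidates


def time_machine_recall(dated_docs, cutoff_year, later_known):
    """Fraction of later-known pairs recoverable from documents
    published before cutoff_year, plus the recovered pairs."""
    past = [entities for year, entities in dated_docs if year < cutoff_year]
    candidates = candidate_pairs(past)
    later = {tuple(sorted(pair)) for pair in later_known}
    found = candidates & later
    recall = len(found) / len(later) if later else 0.0
    return recall, found
```

A method that scores well on such a retrospective test could then be run on the full, current collection to propose connections that are not yet known.
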
More Ambitious Semantics
- Go beyond extracting entities
  - Find relations between entities
  - Convert clauses into propositions

- Mouse Bim proteins (isoforms EL, L, S) bind to human Bcl-2 (bacteriophage screening using cDNA expression library from T-lymphoma cell line KO52DA20).
- Human BimEL protein is 89% identical to mouse BimEL; human BimL is 85% identical to mouse BimL (hybridization of mouse bim cDNA to human fetal spleen and peripheral blood cDNA library).
- Bim mRNA is detected in B and T lymphoid cells (Northern blot analysis of mouse KO52DA20, WEHI 703, WEHI 707, WEHI 7.1, CH1, WEHI 231, WEHI 415, B6.23.16BW2 cell extracts).
- BimL protein interacts with Bcl-2 OR Bcl-XL OR Bcl-w proteins (immunoprecipitation (anti-Bcl-2 OR Bcl-XL OR Bcl-w) followed by Western blot (anti-EE tag) using extracts of human 293T cells co-transfected with EE-tagged BimL AND (bcl-2 OR bcl-XL OR bcl-w) plasmids).
- BimL deleted of the BH3 domain does not bind to Bcl-2 OR Bcl-XL OR Bcl-w proteins (under the experimental conditions mentioned above).

Apoptosis Network
[Diagram. Node labels: Survival Factors Signaling; Death Receptors Signaling; Genotoxic Stress; Loss of Attachment; Cell Cycle Stress, etc.; ER Stress; Initiator Caspases (8, 10); p53 Pathway; BH3-only; Ca++ Signaling; NFκB; Bcl-2-like; Bax, Bak; Smac; Caspase 12; IAPs; Mitochondria; Cytochrome c; Apaf-1; AIF; Caspase 9; Effector Caspases (3, 6, 7); Apoptosis.]
Slide courtesy Ting Zhang

More Ambitious Semantics
- Go beyond extracting entities
  - Find relations between entities
  - Convert clauses into propositions
- Go beyond combining via co-occurrence
  - Draw inferences between the facts and relations
  - Incorporate domain knowledge

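To make the contrast with co-occurrence concrete, the sketch below stores extracted clauses as typed propositions and chains them with a small rule table that stands in for domain knowledge. The `Proposition` type, the `RULES` table, the relation names, and the placeholder entities are all hypothetical and chosen only for illustration; real rules would have to come from domain experts or curated resources.

```python
from typing import NamedTuple


class Proposition(NamedTuple):
    """A clause reduced to a (subject, relation, object) triple."""
    subject: str
    relation: str
    obj: str


# Hypothetical domain rules: a pair of chained relations maps to the
# relation that may be inferred between the two outer entities.
RULES = {
    ("binds", "inhibits"): "may_modulate",
    ("inhibits", "inhibits"): "may_promote",
}


def infer(props):
    """Return propositions implied by chaining two known ones."""
    known = set(props)
    inferred = set()
    for first in props:
        for second in props:
            rule_key = (first.relation, second.relation)
            if first.obj == second.subject and rule_key in RULES:
                new = Proposition(first.subject, RULES[rule_key], second.obj)
                if new not in known:
                    inferred.add(new)
    return inferred
```

Unlike a co-occurrence count, the inferred proposition carries a relation type and a direction, and it can be traced back to the two statements and the rule that produced it.
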
Our Approach
- Assign Semantics using Statistics
  - Hierarchical Lexical Ontologies to generalize
  - Redundancy in the data
- Build up Layers of Representation
  - Syntactic and Semantic
  - Use these in a feedback loop

Goal: Convert Text to Generalized Semantics
- HDAC inhibitors induce differentiation of cultured murine erythroleukemia cells.
- [<enzyme> <inhibitor>] <cause> <change-to> [<cell-prep-type> <organism-type> [<cancer-type> <cells>]]

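The mapping in the example above can be imitated with a toy lookup. This sketch only reproduces the flat sequence of semantic types, not the bracketing. It assumes a hand-written lexicon (`TOY_LEXICON`, hypothetical) whose word-to-type alignment is read off the slide, in place of the statistics and hierarchical lexical ontologies the talk refers to.

```python
import re

# Hypothetical hand-written lexicon. A real system would derive these
# types from a lexical hierarchy and corpus statistics instead.
TOY_LEXICON = {
    "hdac": "enzyme",
    "inhibitors": "inhibitor",
    "induce": "cause",
    "differentiation": "change-to",
    "cultured": "cell-prep-type",
    "murine": "organism-type",
    "erythroleukemia": "cancer-type",
    "cells": "cells",
}


def generalize(sentence):
    """Replace each known word with its semantic type tag and
    skip words that are missing from the lexicon."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    return [
        f"<{TOY_LEXICON[token.lower()]}>"
        for token in tokens
        if token.lower() in TOY_LEXICON
    ]


sentence = ("HDAC inhibitors induce differentiation of cultured "
            "murine erythroleukemia cells.")
print(" ".join(generalize(sentence)))
```

Recovering the nested brackets would additionally require syntactic structure, which is where the layered syntactic and semantic representations come in.
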
Thank You!
More information: http://biotext.berkeley.edu