The Claim Framework
Catherine Blake
School of Library and Information Science
University of Illinois at Urbana-Champaign
clblake@illinois.edu
Motivation
• Relentless increase in electronically available text
  – Life Sciences
    • 17 millionth entry added in April 2007
    • 5,200 journals indexed
    • 12,000 new articles each week!
  – Chemistry: more than 110,000 articles in 1 year alone
• Consequences:
  – Hundreds of thousands of relevant articles
    Shift from Retrieval to Synthesis
  – Implicit connections between literature go unnoticed
The Claim Framework
• Scientists use a shared sublanguage to express claims made in an empirical study
  The Claim Framework captures the key characteristics of the claim sublanguage
• Text mining can be used to populate the Claim Framework automatically
  An automated system will identify all and only the claims that have been identified
Claim Definition
• “To assert in the face of possible contradiction”
• Example sentence reporting a claim
  – “This study showed that Tamoxifen reduces the breast cancer risk”
• Explicit Claim in the Claim Framework
  – Tamoxifen (agent) – reduces (change) – [breast cancer risk] (object)
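As a rough illustration of how the explicit claim above could be held in code, here is a minimal sketch. The facet names (agent, change, object) come from the slide; the class name `ExplicitClaim`, the field name `object_`, and the `render` method are hypothetical and are not taken from the Claim Framework implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExplicitClaim:
    """Hypothetical container for the three facets shown on the slide."""

    agent: str    # the entity that brings about the change
    change: str   # the change expressed by the claim
    object_: str  # the entity affected (trailing underscore avoids the builtin name)

    def render(self) -> str:
        # Mirror the slide's notation: facet value followed by its facet label.
        return (
            f"{self.agent} (agent) - {self.change} (change) - "
            f"[{self.object_}] (object)"
        )


# The example sentence from the slide, filled in by hand (not by a text miner).
claim = ExplicitClaim(agent="Tamoxifen", change="reduces", object_="breast cancer risk")
print(claim.render())
```

Freezing the dataclass makes claims hashable, which is convenient if claims from many articles are later collected in a set or used as dictionary keys.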
Distribution of Claim Categories

Category       Total   (%)     Pilot   (%)     Main    (%)
Explicit       2489    77.11   332     83.42   2157    76.63
Implicit       87      2.70    3       0.75    84      2.98
Observation    298     9.23    24      6.03    274     9.73
Correlation    174     5.39    12      3.02    162     5.75
Comparison     165     5.11    27      6.85    138     4.9
Total          3228    100     398     100     2830    100
Inter-Annotator Agreement

Information Facet     Kappa   Agreement
Agent                 0.71    substantial
Object                0.77    substantial
Change                0.57    moderate
Change + ChangeDir    0.88    almost perfect
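The slide does not say which kappa statistic was used. Assuming two annotators and Cohen's kappa, agreement is computed as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The sketch below implements that formula and maps a value to the Landis and Koch labels, which match the terms on the slide (moderate, substantial, almost perfect). The function names and the toy labels are made up for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        # Kappa is undefined here; this sketch treats it as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Landis and Koch (1977) interpretation bands."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Toy data (invented): does each of four sentences contain an agent?
annotator_1 = ["yes", "yes", "no", "no"]
annotator_2 = ["yes", "no", "no", "no"]
toy_kappa = cohen_kappa(annotator_1, annotator_2)

# Kappa values reported on the slide, mapped to their labels.
slide_labels = {k: agreement_label(k) for k in (0.71, 0.77, 0.57, 0.88)}
```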
Location of Claims

Section         Sentences     Total        % section   % claim
                with claim    sentences
Abstract        98            309          31.72       7.84
Introduction    357           979          36.47       28.56
Method          6             1121         0.54        0.48
Result          293           1829         16.02       23.44
Discussion      539           1406         38.34       43.12
Total           1250          5535         22.58       100.00
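The two percentage columns on this slide answer different questions: "% section" is the share of a section's sentences that contain a claim, while "% claim" is the share of all claim sentences that fall in that section. A small sketch of the arithmetic, using the counts from the slide and taking the reported total of 1250 claim sentences as the denominator; the variable and function names are illustrative only.

```python
# (sentences with a claim, total sentences) per section, as reported on the slide.
SECTIONS = {
    "Abstract": (98, 309),
    "Introduction": (357, 979),
    "Method": (6, 1121),
    "Result": (293, 1829),
    "Discussion": (539, 1406),
}
TOTAL_CLAIM_SENTENCES = 1250  # total row of the slide, taken as given


def pct(part, whole):
    """Percentage rounded to two decimals, as on the slide."""
    return round(100 * part / whole, 2)


# % section: claim density within each section.
pct_section = {name: pct(c, n) for name, (c, n) in SECTIONS.items()}
# % claim: where the claim sentences are located.
pct_claim = {name: pct(c, TOTAL_CLAIM_SENTENCES) for name, (c, _) in SECTIONS.items()}

for name in SECTIONS:
    print(f"{name:<13} {pct_section[name]:>6.2f} {pct_claim[name]:>6.2f}")
```

Read this way, the Discussion has both the highest claim density and the largest share of claims, while the Method section contributes almost none.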
Interested?
• Send me an email: clblake@illinois.edu
• To see more details on the Claim Framework and an automated approach to populate explicit claims:
  – Blake, C. (2010). Beyond genes, proteins, and abstracts: Identifying scientific claims from full-text biomedical articles. Journal of Biomedical Informatics, 43(2), 173-189.