Graphs of Consistent Concepts: Data mining in a medical domain
(Pawel Matykiewicz, Wlodzislaw Duch, John Pestian)

The story(1)
• Hospital workflow:
  – Chest X-Ray order (Electronic Medical Record)
  – Chest X-Ray (High Quality JPEG)
  – Dictation (“CLINICAL HISTORY: 9-month-29-day-old male with a history of cough. Rule out pneumonia. PROCEDURE COMMENTS: None. COMPARISON: XX/XX/XX. FINDINGS: There is mild hyperinflation of the lungs with increased peribronchial markings most consistent with viral versus reactive airway disease. Hazy increased density is seen in the right middle lobe, left lower lobe which could represent subsegmental atelectasis. Hazy increased density is also noted at the lingula with partial effacement of the left heart contour which could represent atelectasis versus early pneumonia. No pleural effusion is noted. The cardiothymic silhouette is within normal limits. Soft tissues and bony structures are unchanged. IMPRESSION: Findings most consistent with viral versus reactive airway disease. Patchy atelectasis is associated. Lingular early infiltrate cannot be excluded.”)
  – Billing (ICD9CM 590.8 = INFECTIONS OF KIDNEY: OTHER PYELONEPHRITIS OR PYONEPHROSIS, NOT SPECIFIED AS ACUTE OR CHRONIC)

Creating a novel tool(2)
• Recognition memory
• Semantic memory
• Episodic memory
• Full text annotation

UMLS (3)
• UMLS = Unified Medical Language System
• UMLS contains:
  – 1,195,781 unique English concepts (CUI)
  – 2,873,310 unique English phrases (SUI)
  – 3,283,983 unique English, normalized words (WUI)
  – 88 different ontologies (e.g. ICD9CM = 15871 CUIs)
  – 36,627,948 relations
  – 11,495,405 co-occurrence relations

Example(4)
• Concept description:
  – ENG|zygopleurage zygospora|C1473040|L5302079|S6018172|
  – C1533582|ENG|P|L5432111|PF|S6215413|Y|A7881881|2532798015|412807000||SNOMEDCT|PT|412807000|Serum inhibin measurement|4|N||
• Relation description:
  – C0000039|A6841046|CODE|RO|C0364349|A0683492|CODE|has_component|R39728053||LNC||Y|N||
[Slide callouts mark the identifier fields in the rows above: SUI, WUI, CUI]
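A minimal sketch of how the second concept row above could be split into named fields. The column names assume the row follows the MRCONSO.RRF layout of the UMLS Metathesaurus; the slide does not name the file, and `parse_concept_row` is a hypothetical helper, not part of any UMLS tool.

```python
# Column names assumed from the UMLS MRCONSO.RRF layout (an assumption; the
# slide shows only the raw row, not the file it came from).
MRCONSO_COLUMNS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]

# The concept row exactly as it appears on the slide (extraction spaces removed).
EXAMPLE_ROW = (
    "C1533582|ENG|P|L5432111|PF|S6215413|Y|A7881881|2532798015|412807000|"
    "|SNOMEDCT|PT|412807000|Serum inhibin measurement|4|N||"
)


def parse_concept_row(line):
    """Split one pipe-delimited row into a dict keyed by column name."""
    fields = line.rstrip("\n").split("|")
    # Every field is terminated by "|", so splitting leaves one empty trailer.
    if fields and fields[-1] == "":
        fields = fields[:-1]
    if len(fields) != len(MRCONSO_COLUMNS):
        raise ValueError(
            f"expected {len(MRCONSO_COLUMNS)} fields, got {len(fields)}"
        )
    return dict(zip(MRCONSO_COLUMNS, fields))


if __name__ == "__main__":
    row = parse_concept_row(EXAMPLE_ROW)
    # Concept (CUI), phrase (SUI), source vocabulary and the phrase text.
    print(row["CUI"], row["SUI"], row["SAB"], row["STR"])
```

Under this assumed layout, the CUI and SUI fields correspond to the concept and phrase identifiers counted on the UMLS slide.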

Sense Disambiguation(5)
• Word Sense Disambiguation:
  – “cold” (word):
    • "I am taking aspirin for my cold"
    • "Let's go inside, I'm cold"
• Phrase Sense Disambiguation:
  – “cold” (WUI):
    • cold temperature (CUI)
    • Common Cold (CUI)
    • Cold Therapy (CUI)
    • Chronic Obstructive Airway Disease (CUI)
    • Cold Sensation (CUI)
    • Cold brand of chlorpheniramine-phenylpropanolamine (CUI)
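The ambiguity listed above can be pictured as a one-to-many lookup from a normalized word to candidate concepts. The sketch below is only an illustration of that data structure: the table is hand-built from the six concept names on the slide, real CUI identifiers are deliberately left out, and `candidate_concepts` and `is_ambiguous` are hypothetical helpers rather than UMLS functions.

```python
# Hand-built toy index: normalized word -> candidate concept names.
# Names are copied from the slide; no real CUI identifiers are used.
WORD_TO_CONCEPTS = {
    "cold": [
        "cold temperature",
        "Common Cold",
        "Cold Therapy",
        "Chronic Obstructive Airway Disease",
        "Cold Sensation",
        "Cold brand of chlorpheniramine-phenylpropanolamine",
    ],
}


def candidate_concepts(word):
    """Return every concept name the (lower-cased) word may refer to."""
    return list(WORD_TO_CONCEPTS.get(word.lower(), []))


def is_ambiguous(word):
    """A word needs disambiguation when it maps to more than one concept."""
    return len(candidate_concepts(word)) > 1


if __name__ == "__main__":
    for name in candidate_concepts("Cold"):
        print(name)
    print("ambiguous:", is_ambiguous("cold"))
```

Choosing among the candidates is the actual disambiguation step, which this sketch does not attempt.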

Concept Mapping(6)
• Tough way:
• Easy way:

Graphs of consistent concepts(7)
• JJ(X) => ( NN(Y) => C(XY) )
• X versus Y Z => ( C(YZ) => C(XZ) )
• X is associated => ( C(X) => P(X) = 1 )
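One possible reading of the "X versus Y Z" rule, sketched in code: if the phrase "Y Z" is a known concept, then "X Z" is proposed as a concept too. This interpretation is an assumption, as are the function name `expand_versus` and the one-entry toy concept set; the example phrase comes from the dictation on the first slide.

```python
def expand_versus(phrase, is_concept):
    """Apply 'X versus Y Z => ( C(YZ) => C(XZ) )' to a phrase.

    X is taken to be the single token before 'versus', Y the single token
    after it, and Z the remaining tokens (an assumed reading of the rule).
    """
    tokens = phrase.split()
    derived = []
    for i, token in enumerate(tokens):
        # Need one token before 'versus' (X) and at least two after (Y and Z).
        if token.lower() != "versus" or i == 0 or i + 2 >= len(tokens):
            continue
        x = tokens[i - 1]
        y = tokens[i + 1]
        z = " ".join(tokens[i + 2:])
        if is_concept(f"{y} {z}"):
            derived.append(f"{x} {z}")
    return derived


if __name__ == "__main__":
    # Toy stand-in for a real concept lookup (hypothetical, one entry only).
    known = {"reactive airway disease"}
    print(expand_versus("viral versus reactive airway disease",
                        lambda p: p.lower() in known))
```

With the toy set, "reactive airway disease" is recognized, so the sketch proposes "viral airway disease"; "atelectasis versus early pneumonia" yields nothing because "early pneumonia" is not in the set.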

Graphs of consistent concepts(8)

Graphs of consistent concepts(8)

Graphs of consistent concepts(9)

Graphs of consistent concepts(9)

Summary(10)
• Data set:
  – 30 training documents (6 ICD9CM codes, 137 CUIs)
  – 30 testing documents (6 ICD9CM codes, 301 CUIs)

              before learning   after learning
    training  66%               99%
    testing   53%               66%

  – 30 training documents (6 ICD9CM codes, 301 CUIs)
  – 30 testing documents (6 ICD9CM codes, 137 CUIs)

              before learning   after learning
    testing   53%               80%
    training  66%               85%

• To do:
  – Construction finding
  – Concept discovery
  – State discovery