Almaden Research Center
© 2003 IBM Corporation

Structured + Unstructured: Why Bother?
§ Better information finding
  – Query text and relational data together
  – Query metadata and unstructured data together
  – Bring structure to unstructured data
  – Enterprise search of web sites, email, …
§ Better analysis
  – Leverage “semantics” from unstructured context
  – Derive further dimensions from unstructured data
  – Add precision to search
  – Compliance, call center performance, …
§ NOT transactional apps
  – Unstructured => uncertain

The Structure-Precision Plane
[Figure: a plane with DATA on one axis (Structured to Unstructured) and QUERIES on the other (Precise to Imprecise).]
§ Relational databases with SQL queries: precise queries, structured data
§ Information retrieval systems with free text search: imprecise queries, unstructured data
§ Highlighted region: Text Analytics (uncertain annotations)

The Structure-Precision Plane
[Figure: the same plane, with DATA on one axis (Structured to Unstructured) and QUERIES on the other (Precise to Imprecise).]
§ Relational databases with SQL queries: precise queries, structured data
§ Information retrieval systems with free text search: imprecise queries, unstructured data
§ Highlighted region: Query Imprecision
  – Interpret keyword queries (uncertainty in user intent)

Integrated Search
Imprecise query with multiple possible interpretations over data from multiple sources

Keyword search: Paper 295 contact phone

§ Traditional interpretation
  – Return documents that contain the keywords “paper”, “295”, “contact” and “phone”
§ True user intent could be:
  – Return paper #295 contact name from pubs db and find the contact’s phone number from emails

Paper   Contact   Email
295     Beineke   beineke@stanford.edu
413     Kossman   kossman@inf.ethz.edu
321     Miller    miller@cs.toronto.edu
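
The two interpretations of the keyword query on this slide can be contrasted in a minimal Python sketch. The pubs rows come from the slide; the email bodies, the phone numbers, the phone-number pattern, and the function names `keyword_search` and `integrated_search` are all hypothetical, invented only for illustration. This is not the system described in the talk, just one way to picture the difference.

```python
import re

# Rows taken from the slide's pubs table.
PUBS = [
    {"paper": 295, "contact": "Beineke", "email": "beineke@stanford.edu"},
    {"paper": 413, "contact": "Kossman", "email": "kossman@inf.ethz.edu"},
    {"paper": 321, "contact": "Miller", "email": "miller@cs.toronto.edu"},
]

# Hypothetical email bodies and phone numbers (not from the slides).
EMAILS = [
    "From Beineke: please call me at 555-0100 about the camera-ready copy.",
    "From Miller: my new phone is 555-0199.",
    "Reminder: paper 295 contact phone list is due Friday.",
]

# Assumed phone format for this sketch: three digits, a dash, four digits.
PHONE_RE = re.compile(r"\b\d{3}-\d{4}\b")


def keyword_search(docs, keywords):
    """Traditional interpretation: documents that contain every keyword."""
    return [d for d in docs if all(k.lower() in d.lower() for k in keywords)]


def integrated_search(paper_id):
    """Alternative interpretation: look up the contact for the paper in the
    pubs table, then pull phone numbers from emails that mention that contact."""
    contacts = [row["contact"] for row in PUBS if row["paper"] == paper_id]
    results = []
    for name in contacts:
        for body in EMAILS:
            if name.lower() in body.lower():
                results.extend((name, phone) for phone in PHONE_RE.findall(body))
    return results
```

In this toy data, the keyword interpretation only surfaces the reminder email that happens to contain all four words, while the integrated interpretation joins the structured lookup with the unstructured emails to reach the contact's phone number.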

Business Intelligence in CRM
Text-enabling the data-warehouse to answer aggregate queries such as:
  What is the number of angry calls by Dealer and Model of Car?
Precise query over annotated inherently uncertain data

Structured Attribute
  Model: Malibu

Call notes (unstructured):
SPOKE WITH MIKE IN SVC AT ACME CHEVY. HE ADVISED THAT THEY HAD ADDED SPRINGS TO REAR OF VEHICLE, NOW HAS A CALL INTO DPSM BILL HARROLD TO REVIEW WITH HIM BEEFING UP THE FRONT SUSPENSION. STATES HE CANNOT TELL IF CUST IS OVERLOADING VEH AS THEY DO NOT HAVE SCALES TO WEIGH ………………… ……, CUST YELLING AND SCREAMING. WHEN ADVISED THAT DPSM IS WAITING ON INFORMATION FROM DLRSHP TO MAKE DECISION ON REPAIRS. CUST STATES HE TOOK VEH INTO DLR 3 DAYS AGO AND DLR TEST DROVE VEHICLE WITH CUST AND AGREED THAT VEHICLE WAS DANGEROUS TO DRIVE. CUST ALSO ANGRY THAT HE HAS CALLED SVC MGR, Jack Green AT ACME 2 DLRSHP AND NO ONE WILL RETURN HIS CALLS. CUST REQUESTED LOANER VEHICLE UNTIL HIS VEHICLE IS REPAIRED. DENIED LOANER, WHICH ALSO SEVERLY UPSET CUST, CUST STATES HE HAS BEEN COMPLAINING ABOUT THIS SINCE VEH WAS NEW AND HIS USE OF VEHICLE IS LIMITED AND CUST FEELS
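
The aggregate query on this slide, the number of angry calls by dealer and model, can be sketched in Python as a precise group-by over an imprecise annotation. Everything here is an assumption made for illustration: the cue-word annotator `is_angry`, the `CALLS` records (including the dealer values and the second dealer and model), and the function name `angry_calls_by_dealer_and_model`. A real text-analytics annotator would be far richer and would typically attach a confidence to each annotation.

```python
from collections import Counter

# Hypothetical cue words standing in for a real sentiment annotator.
ANGRY_CUES = ("ANGRY", "YELLING", "SCREAMING", "UPSET")


def is_angry(note):
    """Uncertain annotation: flag a note if any cue word appears in it."""
    upper = note.upper()
    return any(cue in upper for cue in ANGRY_CUES)


# Hypothetical call records: structured attributes plus a free-text note.
# The note fragments echo the slide; the record layout is invented.
CALLS = [
    {"dealer": "ACME CHEVY", "model": "Malibu",
     "note": "CUST YELLING AND SCREAMING."},
    {"dealer": "ACME CHEVY", "model": "Malibu",
     "note": "CUST ALSO ANGRY THAT NO ONE WILL RETURN HIS CALLS."},
    {"dealer": "ACME CHEVY", "model": "Malibu",
     "note": "CUST REQUESTED LOANER VEHICLE."},
    {"dealer": "OTHER DEALER", "model": "Impala",
     "note": "CUST ASKED ABOUT SERVICE HOURS."},
]


def angry_calls_by_dealer_and_model(calls):
    """Precise aggregate over annotated data: count flagged calls per
    (dealer, model) pair."""
    counts = Counter()
    for call in calls:
        if is_angry(call["note"]):
            counts[(call["dealer"], call["model"])] += 1
    return dict(counts)
```

The group-by itself is exact; the uncertainty sits entirely in the annotation step, which is why the slide describes this as a precise query over inherently uncertain data.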

Information Intensive Solutions
[Figure: three stacks labeled Traditional View, Today’s View, and Emerging View, each with Application at the top. Layer labels shown in the figure: Database Management; Federated Access; Federated “Semantic” Query System; Annotate; Crawl and Index; Storage Management.]