Next generation web search and question-answering technology
Gary Geunbae Lee, Dept. of CSE, POSTECH & DiQuest.com
March, Oct, 2001
DiQuest.com: Intelligent Dialog Interface Solution for friendly User Interactions in Internet WEB Environment
Contents
• Commercial e-solutions: search, QA, CRM
• Natural Language Processing Technology
• Information Retrieval Technology
• Intelligent QA solutions
• Conclusions
Conventional search engines
Directory based
• Yahoo: everything
• AOL search: web + AOL contents
• Directhit: click monitoring to rank popular sites at the top
• Looksmart: human-compiled web site directory
Search based
• Altavista: you know
• Excite: you know
• Lycos: from search directory service
• Fastsearch: first to index 0.2 billion web pages
• Inktomi: highly scalable indexing system
• Google: link analysis (high precision)
Current trend: directory + search integration
Recent NL search and QA systems
Internet search with natural language and intelligence
• Askjeeves: horizontal question-answering
• Northernlight: natural language and phrasal search (clustering)
• Empas: Korean natural language search (?)
• Lexiquest: lexipacks, an ontology/dictionary for a specific domain (context search)
• Oingo: meaning-oriented search (big ontology)
Natural language question answering
• Neuromedia (Nativeminds): chatter bot (Eliza technology)
• Easyask: database question answering
• Brightware: web and email question answering (FAQ finding), recommendation
• Inquizit technology: natural language semantic analysis (concept engine)
• YY-software: automatic email answering
• Answerlogic: WordNet-based question answering
• Answers.com: FAQ finding
Interaction with customers for e-business
• Internet users: over 130 million, up to 350 million by 2003 (eMarketer)
• Internet commerce: $1.3 trillion by 2003 (Forrester Research)
• From e-commerce to e-business
[Figure: e-business sophistication over time, rising toward intelligent CRM. Labels: customer history, communications, purchase likelihood, transactions, staffing requirements, prior information history, corporate policy about service etc., contents.]
Customer interaction channel
CRM architecture – 3 different views
Integration of data warehousing & data mining, web call center, automatic sales and marketing (web-enabled)
Operational
• Sales Force Automation
• Marketing Automation
• Field Service Automation
• Customer Service/Support
Analytical
• Data Warehouse
• Data Mart
• Marketing Automation
• Data Marketing
Collaborative
• Voice (IVR, CTI, ACD)
• e-Mail
• Fax/Direct Mail
• Web Site
Source: META Group, June 1999
Worldwide CRM market
Year 2003 CRM market
• Application license: $8.3 billion
• Implementation: $5.2 billion
• SW maintenance: $3.2 billion
Call Center solutions: integration of media
[Figure: a call center (inbound calls, outbound calls) evolving into a contact center. Channel labels: Telephone, Fax, Kiosk, Direct Mail, Sales Force Automation, WWW / Email.]
Natural Language Processing Technology
NLP technology: Eliza scripting
Rule format
<heading-0>              "Rule Heading"
a: 0.2                   the rule activation level
p: 35 *what*keyword*     the pattern priority and word pattern
r: robot's reply
Example rules
<work-0>
a: 0.5
p: 60 Wh *your*job*
r: I'm a full time Verbot
<leasure-2>
a: 0.4
p: 30 What time * your * job over.
r: I don't get any time off, I always have to be here available for you.
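The rule format above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Verbot engine: the `Rule` class, `compile_pattern` and `respond` are hypothetical names, the choice that the highest-priority matching rule wins is an assumption, and the activation level is stored but not used because the slide does not say how it is applied.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    name: str          # rule heading, e.g. "work-0"
    activation: float  # "a:" value; stored only, its semantics are not given on the slide
    priority: int      # "p:" value
    pattern: str       # word pattern with "*" wildcards
    reply: str         # "r:" value


def compile_pattern(pattern: str):
    # Each "*" becomes ".*"; literal fragments are escaped and matched case-insensitively.
    fragments = [re.escape(part.strip()) for part in pattern.split("*")]
    return re.compile(".*".join(fragments), re.IGNORECASE)


RULES: List[Rule] = [
    Rule("work-0", 0.5, 60, "Wh *your*job*", "I'm a full time Verbot"),
    Rule("leasure-2", 0.4, 30, "What time * your * job over.",
         "I don't get any time off, I always have to be here available for you."),
]


def respond(text: str, rules: List[Rule] = RULES) -> Optional[str]:
    # Assumption: among all matching rules, the one with the highest priority answers.
    matching = [r for r in rules if compile_pattern(r.pattern).search(text)]
    if not matching:
        return None
    return max(matching, key=lambda r: r.priority).reply


if __name__ == "__main__":
    print(respond("What is your job?"))
```

A real scripting engine would also need a default reply for unmatched input and some use of the activation level; both are left out here.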
POS tagging (with morpheme analysis)
Input: 포항공대 이근배 교수님께서 신을 신고 신고하러 가신다. (Professor Geunbae Lee of POSTECH puts on his shoes and goes to file a report.)
Output (문장시작 = sentence start, 문장끝 = sentence end; the flags 등 and 미 appear to mark dictionary words and unknown words respectively):
[ 0,  0 ( 0,  0)] 등 1.000000e+00 (1.000000e+00) s<문장시작>([)
[ 1, 10 ( 1,  1)] 미 8.288423e-11 (6.102822e-13) MPO<포항공대>(포항공대)
[11, 11 ( 2,  2)] 등 8.736421e-02 (2.559207e-20) s<#>(#)
[12, 18 ( 3,  3)] 미 9.236515e-08 (7.008548e-24) MPN<이근배>(이근배)
[19, 19 ( 4,  4)] 등 8.736421e-02 (2.939022e-31) s<#>(#)
[20, 23 ( 5,  5)] 등 4.469725e+00 (1.564634e-25) MC<교수>(교수)
[24, 26 ( 6,  6)] 등 1.373613e+02 (1.504397e-25) -<님>(님)
[27, 30 ( 7,  7)] 등 1.307859e+01 (1.831031e-25) jC<이>(께서)
[31, 31 ( 8,  8)] 등 8.736421e-02 (7.678394e-33) s<#>(#)
[32, 34 ( 9,  9)] 등 3.250709e+00 (3.667919e-27) MC<신>(신)
[35, 37 (10, 10)] 등 1.264760e+01 (3.865534e-27) jC<을>(을)
[38, 38 (11, 11)] 등 8.736421e-02 (1.621005e-34) s<#>(#)
[39, 41 (12, 12)] 등 5.807344e+00 (1.021970e-28) DR<신>(신)
[42, 43 (13, 13)] 등 3.936314e+01 (1.918250e-28) eCC<고>(고)
[44, 44 (14, 14)] 등 8.736421e-02 (8.044147e-36) s<#>(#)
[45, 49 (15, 15)] 등 8.588220e-04 (1.297090e-33) MC<신고>(신고)
[50, 51 (16, 16)] 등 2.626376e+01 (1.404345e-33) y<하>(하)
[52, 56 (17, 19)] 등 1.445488e+03 (1.043073e-31) eCC<러>(러)
[52, 56 (17, 19)] 등 1.445488e+03 (1.043073e-31) s<#>(#)
[52, 56 (17, 19)] 등 1.445488e+03 (1.043073e-31) DI<가>(가)
[57, 58 (20, 20)] 등 4.657808e+01 (1.348953e-31) eGS<시>(시)
[59, 61 (21, 21)] 등 1.841659e+01 (4.754894e-31) eGE<는다>(ㄴ다)
[62, 64 (22, 22)] 등 1.250000e-07 (1.365400e-38) s.<.>(.)
[65, 65 (23, 23)] 등 2.500000e-05 (1.638481e-49) s<문장끝>(])
NLP technology: POSTAG example
Features
• Statistics + rule combination
• Tight coupling with morpheme analysis
• Morpheme graph representation
• Pattern dictionary concepts for unknown words
• 100,000-entry morpheme dictionary
• 1,500-entry morpheme pattern dictionary
[Figure: POSTAG architecture. Components: Input sentence, Morph. Analyzer, Morpheme graph, POS tagger, Error corrector, Error-corrected morpheme graph, Parser/application. Resources: Morph adjacency table, Morpheme dic, Morpheme pattern dic, POS bigram, Syllable trigram, Error correction rules.]
Unknown word guessing
Morpheme analysis with unknown word guessing
• Morpheme pattern dic: syllable constraints for each part of speech in Korean
• Lexical probabilities for unknown words: syllable tri-gram equations
[Figure: Input sentence, Morph. Analyzer, Filter (filtering info.), Filtered morpheme graph, POS tagger. Resources: Morph adjacency table, Morpheme dic, Morpheme pattern dic, Pattern dic for unknown words.]
Syntactic parsing: POSPAR example
Korean CCG (the slash directions, set operators and arrows of the rules were lost in extraction; the rule text is kept as extracted)
• Functional application: X/(Args {Y}) Y X/Args; Y X(Args {Y}) XArgs
• Composition: X/(XArgsX) Y/(YArgsY) X/(X(ArgsX ArgsY)); YArgsY X(ArgsX {Y}) X/(X(ArgsX ArgsY))
• Coordination: X CONJ X X
Syntactic dic. and syntactic pattern dic.
• Variable category: $v, $vp
• Featured category
  - v: D, H, I, E
  - vp: 었, 었었, 고있, 어있, 겠, 더, 시
  - s: 평서 (declarative), 의문 (interrogative), 명령 (imperative), 청유 (propositive), 약속 (promissive), 문장 (sentence)
  - np: j이, j를, j에게
[Figure: POSPAR architecture. Components: Morpheme graph, Syntactic Analyzer, Parse tree, Semantic Analyzer. Resources: Syntactic category trigram, Syntax dic, Syntax pattern dic.]
Semantic analysis: example
자연어처리를 전공한 교수가 가르치는 과목은? (What is the course name that a professor whose major is NLP teaches?)
Semantic result, scope [0, 17]:
[ques,
 [contra,
  term(<quant, bare, sing>, X7,
   [and, [course, X7],
    [teach, EV3,
     term(<def, bare, sing>, X6,
      [and, [professor, X6],
       [major, EV1, X6, term(<quant, bare, sing>, X1, [NLP, X1])]]),
     X7, _:p[j에];0F]])]]
Semantic analysis: system overview
[Figure: Input sentence, Morphological Analyzer, POS Tagger, K-CCG Parser, Syntactic trees, Semantic Analyzer, QLF structures, then Slot-Filler Generator, Topic/Subject Extractor and other semantic-based applications. Resources: Semantic dictionaries (base/dom/pat/user/rel), thesauri.]
NLP technology: Korean WordNet
Map Korean words to another existing thesaurus (WordNet)
• Using a bilingual dictionary
• Automatic mapping tools using WSD techniques
[Figure: a Korean word (kw_i_j) maps to English words (ew1 ... ewm), which map to WordNet synsets (ws1, ws2 ... wsk ... wsn).]
NLP technology: Korean WordNet
Multiple heuristics for WSD
• Maximum similarity
• Prior probability
• Sense ordering
• IS-A relation
• Word match
• Cooccurrence
Combining heuristics with machine learning techniques
• Decision tree
• Logistic regression
Information Retrieval Technology
Several gaps in the search task
[Figure: the chain from info need to verbal form to query to search engine (over the web) to results, with a gap at each step.]
Gaps
• Mis-conception
• Mis-translation
• Mis-formulation
• Polysemy/synonymy
Approaches shown
• Interactive QA (Askjeeves)
• Queries in context (domain) (Autonomy, Verity)
• NLP query (Easyask, Lexiquest)
• Query refinement
• Clustering (Northernlight)
Web search vs classical IR
Classical IR
• Fixed document corpus
• Document relevancy is the goal
• Contexts (domain) and individual users (preferences) ignored
Web search
• Public web: static + dynamic (generated from RDB)
• High quality ranking is the goal (meet the user need given poor queries and the heterogeneity of the web)
• Various needs such as informational, navigational, transactional
Search engine techniques
First generation
• TF/IDF from standard IR
• Use only page data (text data)
• HTML parsing for weighting
Second generation
• Use off-page and web-specific data
• Such as link (connectivity) analysis, click-through data (relevance feedback), anchor-text data
Third generation
• Answer the need behind the query
• Semantic analysis, context determination, dynamic corpus from RDB, validity (authority), cross-lingual/cross-media, question-answering, specific enterprise site search, etc.
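As a concrete illustration of the first-generation approach, TF/IDF ranking over page text alone can be sketched as below. This is a minimal sketch under stated assumptions: whitespace tokenization, raw term frequency and `log(N / df)` as the IDF variant are illustrative choices, and `tf_idf_rank` is a hypothetical helper name, not something taken from the slides.

```python
import math
from collections import Counter
from typing import List


def tf_idf_rank(docs: List[str], query: str) -> List[int]:
    """Return indices of documents with a positive score, best first."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scored = []
    for idx, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = sum(
            tf[term] * math.log(n_docs / df[term])
            for term in query.lower().split()
            if term in df
        )
        scored.append((score, idx))

    # Higher score first; ties broken by document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for score, idx in scored if score > 0]


if __name__ == "__main__":
    pages = [
        "link analysis for web search",
        "question answering over faq lists",
        "web search with question answering and question types",
    ]
    print(tf_idf_rank(pages, "question answering"))
```

Second-generation engines would add off-page signals such as link analysis or click-through data to this score; none of that is modeled here.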
Constructing high-quality corpus: Web ROBOT
Web-page filtering (target web documents, first try)
• URL domain filtering
• File type filtering
• URL name filtering
• Filtering constraints set through various user-input options
File collection & management (filtered target URLs)
• Manages each site independently
• Saves files by URL hierarchy
• Makes log files
• Keeps file info (date, size, link) per collected file and site
Smart updating of the same web document
• Saves new web pages
• Overwrites updated pages
Output: WEBtagger, Result File Manager, Result File Pool
High quality corpus: WEB preprocessing
[Figure: preprocessing pipeline. Components: Web document input, nROBOT (Web Crawler), HTML Refiner, Sentence Extractor, Word Spacing Corrector, Result File Manager, Result File Pool.]
Rules and resources shown
• Tag Corrector & Parser
• Garbage String Filter
• Regexp Patterns Rule
• Heuristic Rule
• Abbreviation Dic
• Symbol-Delimiter DB
• C4.5 Rule
• Entry Dic, Pattern Dic
• Postag Noun Trie Dic.
Output XML document forms
• Form A: POSTAG
• Form B: SAA
• Form C: TTS System
• Form D: POSNIR
High quality corpus: Automatic indexing
Indexing architecture
• Based on general morpheme tagging
• Term extraction
  - Nominals: single terms
  - Compound noun generation: using automatically learned rules, filtering through precision (preventing overgeneration)
  - Compound noun segmentation: based on mutual information
• Term weighting for document ranking based on TF, IDF measures
[Figure components: Documents, Morpheme Analysis, POS Tagging, Term Extraction, index DB.]
Compound nouns in indexing
POSNIR features
• Noun extraction
  ex) 철수는 회의에서 그 사건을 보고할지도 모른다. (Chulsoo may report the accident at the meeting): bogo (O)
  ex) 지도를 보고 길을 찾는다. (See a map and find a road): bogo (X)
• Compound noun segmentation
  Compound noun patterns plus statistical collocation (mutual information)
  ex) 대학생선교회 (undergraduate missionary): 대학생/선교회 (O), 대학 (university)/생선 (fish)/교회 (church) (X)
• Compound noun indexing (phrasal indexing)
  Using automatically acquired extraction rules
  Broad coverage of compound noun pattern recognition
  ex) 증기로 움직이는 기관차 (locomotive operating by steam): 증기 (steam)/기관차 (locomotive)
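The compound noun segmentation step above can be sketched as follows. This is an assumption-laden toy, not the POSNIR implementation: the lexicon, the corpus counts and the function names are all hypothetical, and scoring a segmentation by its weakest adjacent-pair pointwise mutual information is just one simple way to use a collocation statistic.

```python
import math
from typing import Dict, List, Set, Tuple


def segmentations(word: str, lexicon: Set[str]) -> List[List[str]]:
    """Enumerate every split of `word` into lexicon entries."""
    if not word:
        return [[]]
    results = []
    for i in range(1, len(word) + 1):
        head = word[:i]
        if head in lexicon:
            for rest in segmentations(word[i:], lexicon):
                results.append([head] + rest)
    return results


def pmi(a: str, b: str, unigram: Dict[str, int],
        bigram: Dict[Tuple[str, str], int], total: int) -> float:
    """Pointwise mutual information of two adjacent nouns."""
    joint = bigram.get((a, b), 0)
    if joint == 0:
        return float("-inf")  # never seen together
    return math.log((joint * total) / (unigram[a] * unigram[b]))


def best_segmentation(word, lexicon, unigram, bigram, total) -> List[str]:
    candidates = [s for s in segmentations(word, lexicon) if len(s) > 1]
    if not candidates:
        return [word]

    # Score a candidate by its weakest link between neighbouring segments.
    def score(seg: List[str]) -> float:
        return min(pmi(a, b, unigram, bigram, total) for a, b in zip(seg, seg[1:]))

    return max(candidates, key=score)


# Toy, made-up counts for illustration only.
LEXICON = {"대학", "대학생", "생선", "선교회", "교회"}
UNIGRAM = {"대학": 500, "대학생": 300, "생선": 200, "선교회": 50, "교회": 400}
BIGRAM = {("대학생", "선교회"): 20}
TOTAL = 100_000

if __name__ == "__main__":
    print(best_segmentation("대학생선교회", LEXICON, UNIGRAM, BIGRAM, TOTAL))
```

With these toy counts the split 대학/생선/교회 is rejected because its segments never co-occur, which mirrors the (O)/(X) example on the slide.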
Dealing with user queries: NL query
[Figure: NL query processing. Components: NL Query, NLP Engine (Morpheme Analysis, Tagged Sequence, Query Term Extraction and Boolean Formulation), DB Search over the DB, Boolean Operation and Ranking, Search Result.]
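The flow in this figure, from a natural-language query through term extraction and Boolean formulation to a ranked search result, can be sketched as below. This is a simplified sketch under loud assumptions: a small English stopword list stands in for morpheme analysis and POS tagging, the fallback from AND to a ranked OR is an illustrative design choice, and all names (`extract_terms`, `build_index`, `search`) are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Stand-in for morphological analysis: drop function words, keep content terms.
STOPWORDS = {"what", "who", "where", "is", "are", "the", "a", "an", "of", "to", "in"}


def extract_terms(query: str) -> List[str]:
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Inverted index: term -> set of document ids."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> List[int]:
    terms = extract_terms(query)
    if not terms:
        return []
    postings = [index.get(t, set()) for t in terms]

    # Boolean AND of all query terms.
    hits = set.intersection(*postings)
    if hits:
        return sorted(hits)

    # Fallback: Boolean OR, ranked by the number of matched terms.
    counts: Dict[int, int] = defaultdict(int)
    for posting in postings:
        for doc_id in posting:
            counts[doc_id] += 1
    return sorted(counts, key=lambda d: (-counts[d], d))


if __name__ == "__main__":
    docs = {
        1: "price of windows me",
        2: "samsung group chairman announces plan",
        3: "windows installation guide",
    }
    print(search(build_index(docs), "What is the price of Windows ME?"))
```

For Korean input the stopword filter would be replaced by a morphological analyzer and POS tagger that keep nominal terms, as the figure indicates.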
Humans extract meaning at many linguistic levels, but current web search only counts words. Is it enough?
• Part of words: morpheme
• Word order
• Word lexicals
• Text structure or document structure
• Clue words/cue phrases
• Pronunciation/prosody
• World knowledge
NLP helps high-precision web search
Information retrieval dilemma
• Hard to ask the right questions
• Too much information
• Irrelevant information
• No information (phrase mismatch)
NLP tools that help avoid the information dilemma
• Context of words: collocations
• Syntax cues: how a word is used
• Concept mapping with clustering
• Interactivity through clarifying dialog
Other Related Technology
[Figure: technology map. Labels, in extraction order: Contents Auto-Builder, Text Preprocessor, Wrapper Induction, Machine Learning, Domain Ontology Mgmt Tool, Intelligent Web Robot, Similar Text Clustering, K-Wordnet Auto-Builder, XML & KM, Information Extraction, Korean NLP Core Engine, Document Categorization, POS-Tagging, Syntactic Analysis, Text Summarizer, Semantic-Discourse Analysis, Text Categorizer, Answer Suggester, Multi-Lingual IR Engine, IR Application, Comp-Noun Analyzer, Q/A Application, DBQ/A Solution, FAQ Finder Solution, Shopping Aid Agent Solution, Fuzzy-SQL Generator, NL-Query Analyzer.]
Intelligent QA solutions
The third generation search engine
Natural language question-answering: answer-providing for dialog questions
• Answer sentence extraction (DiQuest d-Answer)
  - Pre-defined question types
  - Semantic-level processing of NL query
• Answer finding from FAQ (DiQuest e-Answer)
  - Systematic construction of FAQ
  - Finding semantically same questions from FAQ list
  - Email/Web call center applications
• Answer finding from R-Database (DiQuest db-Answer)
  - Finding answers from R-DB attributes
  - SQL conversion from natural language query
• Companies: Neuromedia, Answerfriend, Answers.com, Brightware, Answerlogic, Easyask, etc.
Easy Interface with Natural Dialogues
DiQuest Q/A: total dialog information retrieval solutions
• Easy and accurate information retrieval using natural language dialog
• Retrieval from any information source, including internet/intranet web documents, FAQ knowledge, and databases
[Figure: DiQuest Q/A Solution at the center, surrounded by DiQuest d-Answer, DiQuest db-Answer, DiQuest e-Answer, and other NLP applications.]
Why Dialog Web Interface?
Customer side
• Efficiency: no need for web surfing
• Accuracy: exact description of the search
• Convenience: using everyday dialog sentences
• Customer satisfaction guaranteed!
Company side
• Easy to catch customers' needs in natural language queries (not easy with keyword-only queries)
• Customer-oriented Web content management
• Customer-oriented FAQ K/B construction and maintenance
• Personal profile management for each customer (CRM)
Spectrum of Products
[Figure: product matrix. Product levels: Service, Package, Library, Component. Technology areas: Language processing, Document processing, Information retrieval, CRM/KM/E-commerce.]
Labels shown
• Email/web call center, Intranet question answering, Shopping mall retrieval, Wireless question answering
• Vertical IR, Answer sentence extraction, FAQ finding, NL-SQL conversion
• Question type processing, Structure indexing, Answer indexing, Complex term indexing
• Morphology, Syntax, Semantics, Dialogs
Branded Products
d-Answer
• Properties: vertical retrieval; high speed indexing; answer sentence extraction; optimized retrieval
• Performance: 0.1 million doc. answer sentence extraction (about 1 sec response); 1 million doc. vertical IR (about 0.3 sec); platform: Linux, Solaris, HPUX
• Applications: document search for KM/internet/portals; answer finding for KM/intranet; high precision search for wireless applications
• Competitors: Verity, Askjeeves
e-Answer
• Properties: answer finding from FAQ knowledge base; real time FAQ construction/indexing; possible fusion with d-Answer/db-Answer
• Performance: over 10,000 FAQ doc. (about 0.3 sec response); more than 1000 simultaneous accesses; platform: Linux, Solaris
• Applications: email call center; web call center; automatic FAQ knowledge base construction; CRM analysis
• Competitors: Brightware, Egains
db-Answer
• Properties: SQL feature computation; automatic vocabulary construction; optimized for a given RDB schema
• Performance: 100% retrieval accuracy; over 100,000 records (0.3 sec response); platform: Linux, Solaris
• Applications: product search for e-commerce (B2B, B2C, B2G); employee portal/business portal; intranet/KM DB search
• Competitors: Easyask, ELF/Microsoft
DiQuest d-Answer: Vertical IR agent with answer extraction
High precision, optimizable IR engine
• Horizontal IR limitations: focuses on high speed indexing and sacrifices high precision
• Why vertical IR?
  - User intention analysis using language processing
  - Optimization possible for a specific domain/portal
Intelligent IR engine for answer sentence extraction
• Limitations of conventional natural language IR (e.g. Askjeeves)
  - Only provides documents that possibly include the query terms
  - It is the USER who needs to find the exact information in the documents
• Why a Q/A system?
  - Provides direct answers (information) rather than thousands of documents
  - Moves toward the true meaning of information retrieval (next generation IR)
DiQuest d-Answer
Merits
• Spectrum of solutions from high precision IR to an intelligent question-answering system with natural language dialog queries
• Web site question answering engine: extracts sentences that contain possible answers, as well as documents, for users' questions
Question examples
• "삼성그룹 회장은?" (Who is the chairman of Samsung group?)
• "야후코리아의 홈페이지 주소와 김경희 팀장의 이메일은?" (What are Yahoo Korea's homepage address and team leader Kim Kyung-hee's email?)
• "야후코리아의 사장은 누구인가" (Who is the president of Yahoo Korea?)
• "윈도우 미의 가격?" (What is the price of Windows ME?)
• "물건 반납에 관한 것을 상담하려면 어디에 전화해야 하나요?" (Where should I call to ask about returning a product?)
• "화공과는 어디에 있나요" (Where is the CE dept.?)
  Answers: "R관" (Building R), "공학관 7층" (7th floor of the Engineering Building)
[Figure: a question goes through query analysis in DiQuest d-Answer, which returns the result file and answer.]
DiQuest d-Answer Preview: Answer Suggestions
DiQuest SiteQ – Natural Language Answer Extraction System Architecture
DiQuest e-Answer
Intelligent answering agent with FAQ
Finding semantically same questions from the FAQ knowledge base
• Exact pin-pointing of users' question intentions
• Structural analysis of sentences for finding same-meaning questions
• Highly precise retrieval using specialized analysis of the question and answer parts in the FAQ KB
• Conventional keyword IR techniques cannot retrieve semantically same questions!
FAQ finding engine
• FAQ: frequently asked question knowledge base (question/answer pairs)
• 80% of user questions can be processed using well constructed FAQ lists
• Automatically finds optimized answers from FAQ lists
• Reduces email/phone calls through automatic FAQ finding (customer satisfaction increased)
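The idea of retrieving a stored FAQ question that means the same as the user's question, even when the wording differs, can be illustrated with a small sketch. It is not the e-Answer method, which the slide describes as structural analysis of sentences: here a hand-written concept map (`CONCEPTS`, hypothetical) stands in for semantic normalization, similarity is plain cosine over word counts, and the FAQ entries and threshold are toy values chosen for illustration.

```python
import math
import re
from collections import Counter
from typing import List, Optional, Tuple

# Hypothetical concept map: different surface words, same underlying concept.
CONCEPTS = {"return": "refund", "purchase": "order", "shipping": "delivery"}


def normalize(text: str) -> Counter:
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(CONCEPTS.get(t, t) for t in tokens)


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_faq(question: str, faq: List[Tuple[str, str]],
             threshold: float = 0.3) -> Optional[Tuple[str, str]]:
    """Return the (question, answer) pair closest to `question`, or None."""
    query_vec = normalize(question)
    scored = [(cosine(query_vec, normalize(q)), (q, a)) for q, a in faq]
    score, pair = max(scored, key=lambda item: item[0])
    return pair if score >= threshold else None


# Toy FAQ knowledge base (question/answer pairs), made up for this sketch.
FAQ = [
    ("How do I get a refund for my order?", "Fill in the refund form on the order page."),
    ("How long does delivery take?", "Delivery usually takes a few days."),
]

if __name__ == "__main__":
    print(best_faq("how can I return a purchase", FAQ))
```

Without the concept map, "return a purchase" and "refund for my order" share only function words, which is the weakness of keyword IR that the slide points out.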
DiQuest e-Answer Preview (1): Answer Suggestion
DiQuest e-Answer Preview (2): e-Answer combined with d-Answer Suggestion
DiQuest FAQ Finder System Architecture
DiQuest DB-Answer
Intelligent RDB search engine
Translates users' natural language questions into standard SQL for relational database computing
• Recursive natural language query (automatic query refinement)
• Fusion solutions with e-Answer and d-Answer
• Integrated search of product description texts with the product database
• Integrated search of web documents with highly variable data in a structured database
Database search engine
• Information retrieval in the relational database (using SQL computation)
• Automatic term indexing by analyzing the running database
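A very small sketch can show what translating a natural language question into SQL involves. It is not the DB-Answer engine: the `products` table, its columns, the word-to-column map and the toy row are all hypothetical, and the translation is a single template. Entity names are read from the running database, loosely echoing the "automatic term indexing" point above.

```python
import re
import sqlite3
from typing import List, Optional, Tuple

# Hypothetical vocabulary: question words mapped to columns of a toy schema.
COLUMN_WORDS = {"price": "price", "cost": "price", "maker": "maker", "manufacturer": "maker"}


def question_to_sql(question: str, known_names: List[str]) -> Optional[Tuple[str, tuple]]:
    text = question.lower()
    column = next((col for word, col in COLUMN_WORDS.items()
                   if re.search(rf"\b{word}\b", text)), None)
    name = next((n for n in known_names if n.lower() in text), None)
    if column is None or name is None:
        return None
    # `column` comes from the whitelist above; the value is bound as a parameter.
    return f"SELECT {column} FROM products WHERE name = ?", (name,)


def answer(conn: sqlite3.Connection, question: str):
    # Index entity names by reading the running database.
    names = [row[0] for row in conn.execute("SELECT name FROM products")]
    translated = question_to_sql(question, names)
    if translated is None:
        return None
    sql, params = translated
    return conn.execute(sql, params).fetchall()


def make_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price REAL, maker TEXT)")
    # Toy row; the price is a placeholder, not real data.
    conn.execute("INSERT INTO products VALUES ('Windows ME', 100.0, 'Microsoft')")
    return conn


if __name__ == "__main__":
    print(answer(make_demo_db(), "What is the price of Windows ME?"))
```

Recursive queries that refine a previous result would additionally need discourse state carried between questions, which this sketch does not attempt.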
DiQuest DB-Answer Preview (1): SQL generation after natural language query analysis
DiQuest DB-Answer Preview (2): maintaining discourse over previous results
DiQuest DBQ – Natural Language SQL Interface System Architecture
Web Total Q/A System Architecture
E-commerce applications
SAA (Shopping Aid Agent): web mining back-end solution
• Web robots: category-specific web crawling (removes duplicates)
• Categorizer: categorizes web documents into pre-defined domain classes
• Extractor: web information extraction to build the R-DB
  - Extraction using mDTD (modified Document Type Definition)
  - Sequential mDTD learning to generate new mDTD rules
• Natural language query to the automatically constructed RDB
• Applications: comparison-based shopping, automatic job search, continued education
• Related company: Whizbanglabs.com (from CMU)
E-commerce SAA: basic idea
[Figure: Document Type Definitions for SGML documents motivate the approach. Training documents (structured HTML) go through analysis & encoding and DTD learning to produce an mDTD, which drives extraction from web documents (structured and semi-structured).]
E-commerce SAA: architecture
[Figure components:
• Web Robot: seed URLs, HTML gathering, HTML documents
• Categorizer: kNN bi-categorizing, domain documents
• Sequential mDTD Learner: seed mDTDs, mDTD parsing, example extraction, sequential learning, learned mDTD
• Extractor: structured/semi-structured documents, mDTD parsing with the learned mDTD, extraction, slot filling, DB building, domain DB for AV]
Conclusions
NLP QA/vertical search applications
• Internet/intranet vertical retrieval
• eCRM/web-based CRM (automated call center)
• Comparison-based e-shopping mall/meta mall
• WAP-enabled PDA/cell phone retrieval
• KMS embedded solutions
• Voice-enabled retrieval/voice portal retrieval
Future perspectives
Long term future
• Apple's bow-tied man -- new millennium dream
• SF films -- "angel" in the movie "Disclosure"
• HAL in "2001: A Space Odyssey" (forever dream?)
Short term future
• General Magic's Portico system (http://www.genmagic.com/portico_home.shtml)
• Microsoft Persona project -- Peedy (http://msdn.microsoft.com/workshop/cframe.htm#/workshop/imedia/agent/default.asp)
• DiQuest.com -- total QA solution (www.diquest.com demo)