Effective Pattern Discovery for Text Mining. PRESENTER: CHUANG, KAI-TING. AUTHORS: NING ZHONG, YUEFENG LI, AND SHENG-TANG WU. 2012, TKDE. Intelligent Database Systems Lab
Outline • Motivation • Objectives • Methodology • Experiments • Conclusions • Comments
Motivation • Many data mining techniques have been proposed for mining useful patterns in text documents. • How to effectively use and update discovered patterns is still an open research issue. • Frequent itemset, e.g.: {Milk, Bread, Diaper}, {Milk}
Objectives • This paper presents an effective pattern discovery technique that includes two processes, pattern deploying and pattern evolving.
Pattern-based (phrase-based) • Low frequency. • Misinterpretation.
Methodology-Framework • Pattern Taxonomy Model • Pattern Deploying Method • Inner Pattern Evolution
Methodology-Frequent and Closed Patterns
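The idea named on this slide can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes a document is split into paragraphs, each treated as a set of terms, that a pattern is frequent when it occurs in at least `min_sup` paragraphs, and that a pattern is closed when no proper superset has the same support. The function names, the brute-force enumeration, and the toy paragraphs are all hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set

Pattern = FrozenSet[str]


def frequent_patterns(paragraphs: List[Set[str]], min_sup: int) -> Dict[Pattern, int]:
    """Map each termset found in at least `min_sup` paragraphs to its support.

    Brute force: enumerates every subset of every paragraph, so it is only
    suitable for tiny examples.
    """
    counts: Dict[Pattern, int] = {}
    for para in paragraphs:
        terms = sorted(para)
        for size in range(1, len(terms) + 1):
            for combo in combinations(terms, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {p: s for p, s in counts.items() if s >= min_sup}


def closed_patterns(frequent: Dict[Pattern, int]) -> Dict[Pattern, int]:
    """Keep only patterns that have no proper superset with equal support."""
    return {
        p: s
        for p, s in frequent.items()
        if not any(p < q and s == frequent[q] for q in frequent)
    }


# Hypothetical toy document: four paragraphs, each a set of terms.
paragraphs = [
    {"text", "mining"},
    {"text", "mining", "pattern"},
    {"text", "mining", "pattern"},
    {"pattern"},
]

frequent = frequent_patterns(paragraphs, min_sup=2)
closed = closed_patterns(frequent)
```

In this toy input, {text} is frequent but not closed, because {text, mining} occurs in exactly the same paragraphs. Keeping only closed patterns removes that kind of redundancy.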
Methodology-Pattern Taxonomy Model
Methodology-PDM
Methodology-D-Pattern Mining Algorithm
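The slide body did not survive extraction, so the following is only a simplified sketch of what "deploying" patterns onto terms can look like, not the algorithm as presented. It assumes each positive document comes with a list of already-discovered patterns (sets of terms), and that a term's weight in one document is the share of pattern slots it occupies, summed over documents. The names `deploy_patterns` and `score` and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def deploy_patterns(docs_patterns: List[List[Set[str]]]) -> Dict[str, float]:
    """Turn per-document patterns into term weights.

    Assumed weighting: within one document a term earns 1/total_slots for
    every pattern containing it, where total_slots is the summed length of
    that document's patterns. Each document therefore contributes 1.0 in total.
    """
    weights: Dict[str, float] = defaultdict(float)
    for patterns in docs_patterns:
        total_slots = sum(len(p) for p in patterns)
        if total_slots == 0:
            continue  # document with no discovered patterns
        for pattern in patterns:
            for term in pattern:
                weights[term] += 1.0 / total_slots
    return dict(weights)


def score(document_terms: Set[str], weights: Dict[str, float]) -> float:
    """Rank a new document by summing the weights of the terms it contains."""
    return sum(w for term, w in weights.items() if term in document_terms)


# Hypothetical patterns discovered in two positive training documents.
docs_patterns = [
    [{"text", "mining"}, {"pattern"}],
    [{"pattern", "mining"}, {"pattern", "text", "mining"}],
]

weights = deploy_patterns(docs_patterns)
relevance = score({"text", "retrieval"}, weights)
```

Mapping patterns down to weighted terms is one way to sidestep the low-frequency problem of long patterns: a rare pattern still passes its evidence on to the terms it contains.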
Methodology-IPEvolving
Methodology-Shuffling
The list of methods used for evaluation
Baseline Models-Concept-Based Models • CBM • CBM Pattern Matching
Baseline Models-Term-Based Methods • Rocchio • Prob • TF-IDF • BM25 • SVM
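The formulas for these baselines were lost with the slide images. As an illustration of what a term-based baseline computes, here is the common textbook form of TF-IDF (raw term frequency times the natural log of inverse document frequency). This is an assumed variant and may differ from the exact weighting shown on the slide; the function name and toy documents are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Weight each term in each document as tf * ln(N / df).

    tf is the raw count of the term in the document, N is the number of
    documents, and df is the number of documents containing the term.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        term_freq = Counter(doc)
        weighted.append(
            {t: term_freq[t] * math.log(n_docs / doc_freq[t]) for t in term_freq}
        )
    return weighted


# Hypothetical tokenized documents.
docs = [
    ["text", "mining", "text"],
    ["pattern", "mining"],
    ["pattern", "discovery"],
]

tfidf_weights = tf_idf(docs)
```

Here "text" outweighs "mining" in the first document because it occurs twice there and appears in no other document.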
Comparison of all methods on the first 50 topics
Experiment
Conclusions • The experimental results show that the proposed model outperforms not only other pure data mining-based methods and the concept-based model, but also state-of-the-art term-based models, such as BM25 and SVM-based models.
Comments • Advantages – The approach is helpful. • Applications – Text mining.