Mining Dependent Patterns Hansheng Lei (Univ. of Texas Rio Grande Valley); Yamin Hu, Wenjian Luo (Univ. of Science and Tech. of China); Cheng Chang Pan (Nova Southeastern University)
Outline • Motivation • Association Rule Mining • Dependent Patterns • Experimental Results • Conclusion ICDIS – Int. Conf. on Data Intelligence and Security
Motivation • Mining Survey Data
Association Rule Mining • Proposed by Agrawal et al. in 1993. • Applied in market basket analysis to find how items purchased by customers are related. Beer → Diaper [sup = 5%, conf = 100%]
AR Model • $I = \{i_1, i_2, \ldots, i_m\}$: a set of items. • Transaction $t$: a set of items, with $t \subseteq I$. • Transaction database $T$: a set of transactions, $T = \{t_1, t_2, \ldots, t_n\}$.
Association rules • An association rule is an implication of the form $X \rightarrow Y$, where $X, Y \subset I$ and $X \cap Y = \emptyset$. • An itemset is a set of items. • E.g., X = {milk, bread, cereal} is an itemset.
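Support and Confidence • The support count of an itemset $X$ in a data set $T$, written $\mathrm{count}(X)$, is the number of transactions in $T$ that contain $X$. Assume $T$ has $n$ transactions. • Then, for a rule $X \rightarrow Y$: $\mathrm{support} = \frac{\mathrm{count}(X \cup Y)}{n}$ and $\mathrm{confidence} = \frac{\mathrm{count}(X \cup Y)}{\mathrm{count}(X)}$.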
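The standard definitions can be checked on a toy data set. The sketch below is only an illustration: the function names (`support_count`, `support`, `confidence`) and the four sample transactions are made up for this example and are not taken from the talk. It assumes the usual definitions, where the support of a rule X → Y is the fraction of transactions containing both X and Y, and the confidence is that count divided by the support count of X.

```python
def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def support(x, y, transactions):
    """Fraction of all transactions that contain both X and Y."""
    return support_count(set(x) | set(y), transactions) / len(transactions)


def confidence(x, y, transactions):
    """Among the transactions containing X, the fraction that also contain Y."""
    x_count = support_count(x, transactions)
    if x_count == 0:
        raise ValueError("X never occurs, so confidence is undefined")
    return support_count(set(x) | set(y), transactions) / x_count


# Hypothetical toy transaction database (not from the slides).
T = [
    frozenset({"beer", "diaper", "milk"}),
    frozenset({"beer", "diaper"}),
    frozenset({"milk", "bread"}),
    frozenset({"bread", "cereal"}),
]

print(support({"beer"}, {"diaper"}, T))     # 2 of 4 transactions
print(confidence({"beer"}, {"diaper"}, T))  # 2 of the 2 transactions with beer
```

On this toy data the rule Beer → Diaper has support 0.5 and confidence 1.0, since both transactions that contain beer also contain diaper.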
Problems with AR mining (a) Generates a huge number of rules (b) Does not support other relations, such as negative implication, correlation, and dependence (c) Applies one universal support and confidence threshold to all items
Dependent Patterns
DP Properties • Downward closure • Individual support thresholds for each item • Right dependence measure
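Downward closure is what makes level-wise mining practical: if an itemset fails the threshold, none of its supersets can pass it, so they never need to be counted. The sketch below illustrates the property only in its generic form, using a plain support count and a single hypothetical threshold `min_count`. It is not the authors' DP algorithm, whose dependence measure and per-item thresholds are not spelled out in these slides, and the function name and sample data are made up for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Level-wise search that prunes candidates using downward closure."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})

    # Level 1: single items that meet the threshold.
    current = {
        frozenset([i])
        for i in items
        if sum(1 for t in transactions if i in t) >= min_count
    }
    result = set(current)
    k = 2
    while current:
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step (downward closure): every (k-1)-subset must be frequent.
        candidates = {
            c
            for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count step: keep only candidates that meet the threshold.
        current = {
            c
            for c in candidates
            if sum(1 for t in transactions if c <= t) >= min_count
        }
        result |= current
        k += 1
    return result


# Hypothetical toy transaction database (not from the slides).
T = [
    {"beer", "diaper", "milk"},
    {"beer", "diaper"},
    {"milk", "bread"},
    {"bread", "cereal"},
]
result = frequent_itemsets(T, min_count=2)
```

With `min_count=2`, the item `cereal` is dropped at level 1, so no itemset containing it is ever generated or counted at higher levels.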
Related Work • m-Pattern (not a good measure for dependence)
Experiments • Three algorithms to compare: DP, Apriori, and m-Pattern
Pattern Distribution
Support vs. mining level
Overlapping
Scalability
Conclusion • Proof of Concept • Obvious advantages • More scalability testing