Association Rules
Olson Yanhong Li

Fuzzy Association Rules
• Association rule mining provides information to assess significant correlations in large databases
• IF X THEN Y
• SUPPORT: degree to which relationship appears in data
• CONFIDENCE: probability that if X, then Y
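
The two measures can be illustrated on ordinary (non-fuzzy) transactions. The sketch below is only an illustration: the transactions and the function names `support` and `confidence` are made up for this example and do not come from the slides.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical toy transactions, invented for illustration only.
transactions: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]


def support(itemset: Iterable[str], data: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= t for t in data) / len(data)


def confidence(x: Iterable[str], y: Iterable[str], data: List[FrozenSet[str]]) -> float:
    """Estimate of P(Y | X): support(X and Y) divided by support(X).

    Assumes X occurs at least once, otherwise the division fails.
    """
    return support(frozenset(x) | frozenset(y), data) / support(x, data)


# Rule: IF bread THEN milk
rule_support = support({"bread", "milk"}, transactions)          # 2 of 4 transactions
rule_confidence = confidence({"bread"}, {"milk"}, transactions)  # 2 of the 3 bread transactions
print(rule_support, rule_confidence)
```

Support looks at the whole database, while confidence only looks at the transactions where X already holds, which is why a rule can have high confidence and low support.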

Association Rule Algorithms
• Apriori
  • Agrawal et al., 1993; Agrawal & Srikant, 1994
    – Find correlations among transactions, binary values
• Weighted association rules
  • Cai et al., 1998; Lu et al., 2001
• Cardinal data
  • Srikant & Agrawal, 1996
    – Partitions attribute domain, combines adjacent partitions until binary
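
The slides cite Apriori without showing it. The following is a textbook-style sketch of the level-wise idea on binary transactions, not the cited authors' exact procedure; the function name `apriori` and the toy data are assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    data = [frozenset(t) for t in transactions]
    n = len(data)

    def support(itemset):
        return sum(itemset <= t for t in data) / n

    # Level 1: frequent single items.
    items = {item for t in data for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical toy transactions, invented for illustration only.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent_sets = apriori(toy, min_support=0.5)
for itemset, s in sorted(frequent_sets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

The prune step is what keeps the search tractable: an itemset is only counted if all of its smaller subsets already passed the support threshold.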

Fuzzy Association Rules
• Most based on Apriori algorithm
• Treat all attributes as uniform
• Can increase number of rules by decreasing minimum support, decreasing minimum confidence
  – Generates many uninteresting rules
  – Software takes a lot longer

Gyenesei (2000)
• Studied weighted quantitative association rules in fuzzy domain
  – With & without normalization
  – NONNORMALIZED
    • Used product operator to define combined weight and fuzzy value
    • If weight small, support level small, tends to have data overflow
  – NORMALIZED
    • Used geometric mean of item weights as combined weight
    • Support then very small

Algorithm
• Get membership functions, minimum support, minimum confidence
• Assign weight to each fuzzy membership for each attribute (categorical)
• Calculate support for each fuzzy region
• If support > minimum, OK
• If confidence > minimum, OK
• If both OK, generate rules

Demo Model: Loan App
Case  Age  Income    Risk
1     20   52623   -38954
2     26   23047   -23636
3     46   56810    45669
4     31   38388    -7968
5     28   80019   -35125
6     21   74561   -47592
7     46   65341    58119
8     25   46504   -30022
9     38   65735    30571
10    27   26047       -6
Credit / Result: Red 0, Green 1, Amber 1, Green 1, Green 1, Red 1

Fuzzified Age
Figure 2: The membership functions of attribute Age (vertical axis: membership value, 0 to 1.2; horizontal axis: Age, ticks at 0, 25, 35, 40, 50, 100; regions: Young, Middle, Old)

Fuzzify Age
Case  Age  Young  Middle  Old
1     20   1.0    0.0     0.0
2     26   0.9    0.1     0.0
3     46   0.0    0.4     0.6
4     31   0.4    0.6     0.0
5     28   0.7    0.3     0.0
6     21   1.0    0.0     0.0
7     46   0.0    0.4     0.6
8     25   1.0    0.0     0.0
9     38   0.0    1.0     0.0
10    27   0.8    0.2     0.0
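
The fuzzification step can be sketched as three membership functions. The slides do not state the formulas, so the piecewise-linear shapes and the breakpoints 25, 35, 40 and 50 below are assumptions inferred from the figure's axis ticks and from the membership values in the table; the function names are made up for this example.

```python
def young(age: float) -> float:
    """Assumed shape: full membership up to 25, falling linearly to 0 at 35."""
    if age <= 25:
        return 1.0
    if age >= 35:
        return 0.0
    return (35 - age) / 10


def middle(age: float) -> float:
    """Assumed shape: rises from 25 to 35, flat until 40, falls to 0 at 50."""
    if age <= 25 or age >= 50:
        return 0.0
    if age < 35:
        return (age - 25) / 10
    if age <= 40:
        return 1.0
    return (50 - age) / 10


def old(age: float) -> float:
    """Assumed shape: zero up to 40, rising linearly to full membership at 50."""
    if age <= 40:
        return 0.0
    if age >= 50:
        return 1.0
    return (age - 40) / 10


# Ages of the ten demo cases, in case order.
ages = [20, 26, 46, 31, 28, 21, 46, 25, 38, 27]
fuzzified = [(a, round(young(a), 3), round(middle(a), 3), round(old(a), 3)) for a in ages]
for row in fuzzified:
    print(row)
```

Under these assumed shapes the three memberships of any age sum to 1, so each case is split between at most two adjacent regions.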

Calculate Support for Each Pair of Fuzzy Categories
• Membership value
  – Identify weights for each attribute
  – Identify highest fuzzy membership category for each case
• Membership value = minimum weight associated with highest fuzzy membership category
• Support
  – Average membership value for all cases
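
One possible reading of this calculation is sketched below: for each case, take the smaller of the two weighted memberships, then average over all cases. This is a simplification that leaves out the "highest category" selection step, and it may differ from the presenters' exact procedure. The function name `pair_support`, the membership lists, and the weights are all hypothetical.

```python
from typing import Sequence


def pair_support(
    mu_a: Sequence[float],
    mu_b: Sequence[float],
    weight_a: float = 1.0,
    weight_b: float = 1.0,
) -> float:
    """Average over cases of min(weighted membership in A, weighted membership in B)."""
    if len(mu_a) != len(mu_b) or not mu_a:
        raise ValueError("membership lists must be non-empty and of equal length")
    values = [min(weight_a * a, weight_b * b) for a, b in zip(mu_a, mu_b)]
    return sum(values) / len(values)


# HYPOTHETICAL memberships of four cases in two fuzzy regions A and B.
mu_region_a = [1.0, 0.5, 0.0, 0.8]
mu_region_b = [0.6, 0.9, 0.4, 0.2]

unweighted = pair_support(mu_region_a, mu_region_b)
weighted = pair_support(mu_region_a, mu_region_b, weight_a=1.0, weight_b=0.5)
print(unweighted, weighted)
```

Because the minimum is taken per case, a case only adds to the pair's support to the extent that it belongs to both regions at once.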

Support
• If support for pair of categories is above minimum support, retain
• Identifies all pairs of fuzzy categories with sufficiently strong relationship

Pairs: minsup 0.25
Pair       Support
R11 R22    0.235
R11 R31    0.207
R11 R41    0.212
R11 R42    0.131
R11 R51    0.230
R22 R31    0.237
R22 R41    0.419
R22 R42    0.184
R22 R51    0.449
R31 R41    0.266
R31 R42    0.096
R31 R51    0.264
R41 R51    0.560
R42 R51    0.174
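
Applying the minimum support of 0.25 to the pair supports listed on this slide can be sketched as a simple filter. The support values are copied from the slide; the variable names are made up for this example.

```python
MIN_SUPPORT = 0.25

# Pair supports as listed on the slide.
pair_support = {
    ("R11", "R22"): 0.235, ("R11", "R31"): 0.207, ("R11", "R41"): 0.212,
    ("R11", "R42"): 0.131, ("R11", "R51"): 0.230, ("R22", "R31"): 0.237,
    ("R22", "R41"): 0.419, ("R22", "R42"): 0.184, ("R22", "R51"): 0.449,
    ("R31", "R41"): 0.266, ("R31", "R42"): 0.096, ("R31", "R51"): 0.264,
    ("R41", "R51"): 0.560, ("R42", "R51"): 0.174,
}

# Keep only pairs whose support exceeds the threshold, strongest first.
retained = sorted(
    (pair for pair, s in pair_support.items() if s > MIN_SUPPORT),
    key=lambda pair: pair_support[pair],
    reverse=True,
)
for pair in retained:
    print(pair, pair_support[pair])
```

Five of the fourteen pairs pass the threshold, and only those go forward to the confidence test.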

Confidence
• Identify direction
• For those training set cases involving the pair of attributes, what proportion came out as predicted?

Confidence Values: Pairs
Minimum confidence 0.9
IF     THEN   Confidence
R22    R41    0.855
R41    R22    0.727
R22    R51    0.916
R51    R22    0.697
R31    R41    0.831
R41    R31    0.462
R31    R51    0.825
R51    R31    0.410
R41    R51    0.972
R51    R41    0.870
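
The final rule-generation step can be sketched by applying the minimum confidence of 0.9 to the directed values on this slide. The confidence values are copied from the slide; reading the first label of each entry as the IF part and the second as the THEN part is an assumption, and the variable names are made up for this example.

```python
MIN_CONFIDENCE = 0.9

# Directed confidence values as listed on the slide: (IF region, THEN region).
rule_confidence = {
    ("R22", "R41"): 0.855, ("R41", "R22"): 0.727,
    ("R22", "R51"): 0.916, ("R51", "R22"): 0.697,
    ("R31", "R41"): 0.831, ("R41", "R31"): 0.462,
    ("R31", "R51"): 0.825, ("R51", "R31"): 0.410,
    ("R41", "R51"): 0.972, ("R51", "R41"): 0.870,
}

# Keep the directions whose confidence exceeds the threshold, strongest first.
rules = sorted(
    ((x, y, c) for (x, y), c in rule_confidence.items() if c > MIN_CONFIDENCE),
    key=lambda rule: rule[2],
    reverse=True,
)
for x, y, c in rules:
    print(f"IF {x} THEN {y} (confidence {c:.3f})")
```

The same pair can pass in one direction and fail in the other (R41 to R51 passes, R51 to R41 does not), which is why the direction has to be identified before testing confidence.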

Rules vs. Support

Rules vs. Confidence

Higher order combinations
• Try triplets
  – If ambitious, sets of 4, and beyond
• Problem:
  – Computational complexity explodes
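
The growth can be made concrete by counting how many candidate sets of each size exist for a given number of fuzzy regions. The sketch below gives an upper bound only, since it also counts sets that combine two regions of the same attribute; the region counts 6 and 30 and the function name `candidate_counts` are illustrative assumptions.

```python
from math import comb
from typing import Dict


def candidate_counts(n_regions: int, max_size: int) -> Dict[int, int]:
    """Number of distinct region sets of each size from 2 up to max_size."""
    return {k: comb(n_regions, k) for k in range(2, max_size + 1)}


counts_small = candidate_counts(6, 4)    # a handful of fuzzy regions
counts_large = candidate_counts(30, 4)   # a larger, hypothetical model
print(counts_small)
print(counts_large)
```

With 6 regions there are at most 20 triplets to evaluate; with 30 regions there are 4060 triplets and 27405 sets of four, each needing a pass over the data unless pruning removes it first.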

Research
• The higher the minimum support, the fewer rules you get
• The higher the minimum confidence, the fewer rules you get
• Weights can yield more rules
• Greatest accuracy seemed to be at intermediate levels of support
  – Higher levels of confidence