732 A 02 Data Mining Clustering and Association
- Slides: 18
732 A 02 Data Mining Clustering and Association Analysis • Constrained frequent itemset mining • Jose M. Peña, jospe@ida.liu.se
Constraints
- A constraint C(.) is
  - Monotone: if C(A) then C(B) for all A ⊆ B.
    - E.g. A' ⊆ A (the itemset A must contain a fixed itemset A').
  - Antimonotone: if C(A) then C(B) for all B ⊆ A.
    - Or, if not C(B) then not C(A) for all B ⊆ A.
    - E.g. support ≥ min_support.
- The apriori property applies to any antimonotone constraint.
Constraints
- sum(S.Price) ≥ v is monotone (positive prices).
- min(S.Price) ≤ v is monotone.
- range(S.Price) ≥ 15 is monotone.
  - Itemset ab satisfies C, since range(ab) = 40 - 0 = 40.
  - So does every superset of ab.

| Item | Price |
|------|-------|
| a    | 40    |
| b    | 0     |
| c    | -20   |
| d    | 10    |
| e    | -30   |
| f    | 30    |
| g    | 20    |
| h    | -10   |
Constraints
- sum(S.Price) ≤ v is antimonotone (positive prices).
- sum(S.Price) ≥ v is not antimonotone.
- range(S.Price) ≤ 15 is antimonotone.
  - Itemset ab violates C, since range(ab) = 40 - 0 = 40.
  - So does every superset of ab.

| Item | Price |
|------|-------|
| a    | 40    |
| b    | 0     |
| c    | -20   |
| d    | 10    |
| e    | -30   |
| f    | 30    |
| g    | 20    |
| h    | -10   |
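
The two range examples can be checked by brute force over the price table. The following is a minimal sketch, not part of the original slides; the helper names `price_range` and `supersets` are made up for illustration.

```python
from itertools import combinations

# Price table from the slides.
PRICE = {"a": 40, "b": 0, "c": -20, "d": 10, "e": -30, "f": 30, "g": 20, "h": -10}

def price_range(itemset):
    """Range (max minus min) of the prices in a non-empty itemset."""
    prices = [PRICE[i] for i in itemset]
    return max(prices) - min(prices)

def supersets(itemset, universe):
    """Yield every superset of `itemset` drawn from `universe`, itself included."""
    rest = sorted(set(universe) - set(itemset))
    for k in range(len(rest) + 1):
        for extra in combinations(rest, k):
            yield frozenset(itemset) | frozenset(extra)

ab = frozenset("ab")

# Monotone case, C: range(S.Price) >= 15. ab satisfies C, so every superset should too.
monotone_ok = all(price_range(s) >= 15 for s in supersets(ab, PRICE))

# Antimonotone case, C: range(S.Price) <= 15. ab violates C, so every superset should too.
antimonotone_ok = all(price_range(s) > 15 for s in supersets(ab, PRICE))

print(price_range(ab), monotone_ok, antimonotone_ok)  # 40 True True
```

Adding items to an itemset can only widen the gap between its largest and smallest price, which is why both checks succeed.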
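Constraints

| Constraint                  | Antimonotone       | Monotone           |
|-----------------------------|--------------------|--------------------|
| v ∈ S                       | no                 | yes                |
| S ⊇ V                       | no                 | yes                |
| S ⊆ V                       | yes                | no                 |
| min(S) ≤ v                  | no                 | yes                |
| min(S) ≥ v                  | yes                | no                 |
| max(S) ≤ v                  | yes                | no                 |
| max(S) ≥ v                  | no                 | yes                |
| count(S) ≤ v                | yes                | no                 |
| count(S) ≥ v                | no                 | yes                |
| sum(S) ≤ v (∀a ∈ S, a ≥ 0)  | yes                | no                 |
| sum(S) ≥ v (∀a ∈ S, a ≥ 0)  | no                 | yes                |
| range(S) ≤ v                | yes                | no                 |
| range(S) ≥ v                | no                 | yes                |
| avg(S) θ v, θ ∈ {=, ≤, ≥}   | no but convertible | no but convertible |
| support(S) ≥ ξ              | yes                | no                 |
| support(S) ≤ ξ              | no                 | yes                |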
Apriori algorithm + any constraint
[Figure: Apriori run on database D, alternating scans of D with the candidate sets C1, C2, C3 and the frequent sets L1, L2, L3.]
Constraint: sum(S.price) < 5, where item price equals item id.
Apriori algorithm + antimonotone constraint
- Prunes the search space.
[Figure: Apriori run on database D, alternating scans of D with the candidate sets C1, C2, C3 and the frequent sets L1, L2, L3.]
Constraint: sum(S.price) < 5, where item price equals item id.
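
A minimal sketch of how an antimonotone constraint can be pushed into Apriori is given below. It is illustrative only: the transactions in `D` and the threshold `MIN_SUPPORT` are assumptions (the database in the slide's figure is not recoverable), while the constraint sum(S.price) < 5 with item price equal to item id is the one from the slide. The idea is that "frequent and satisfies C" is itself antimonotone, so an itemset that fails either test is dropped and never extended.

```python
from itertools import combinations

# Assumed database and threshold (the slide's figure did not survive extraction).
D = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
MIN_SUPPORT = 2

def constraint(itemset):
    """Antimonotone constraint from the slide: sum(S.price) < 5, price = item id."""
    return sum(itemset) < 5

def support(itemset):
    """Number of transactions in D that contain `itemset`."""
    return sum(1 for t in D if itemset <= t)

def apriori_antimonotone():
    items = sorted({i for t in D for i in t})
    # Level 1: keep items that satisfy C and are frequent.
    level = [frozenset([i]) for i in items
             if constraint({i}) and support(frozenset([i])) >= MIN_SUPPORT]
    result = list(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unions of two surviving (k-1)-itemsets that have size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must have survived the previous level.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # The constraint is tested first, so violating candidates cost no scan of D.
        level = [c for c in candidates
                 if constraint(c) and support(c) >= MIN_SUPPORT]
        result.extend(level)
        k += 1
    return result

result = apriori_antimonotone()
print(sorted(sorted(s) for s in result))  # [[1], [1, 3], [2], [3]]
```

With this assumed database, item 5 is discarded at level 1 because its price alone violates the constraint, so no candidate containing 5 is ever generated.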
Apriori algorithm + monotone constraint
- Does not prune the search space but avoids constraint checking.
[Figure: Apriori run on database D, alternating scans of D with the candidate sets C1, C2, C3 and the frequent sets L1, L2, L3. Some itemsets are marked ☺: not in the output, since they don't satisfy the constraint.]
Constraint: sum(S.price) ≥ 5, where item price equals item id.
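
The following sketch illustrates, under assumptions, what "avoids constraint checking" can look like in code. The transactions in `D` and `MIN_SUPPORT` are assumed, as is the bookkeeping dictionary `satisfies`; only the constraint sum(S.price) ≥ 5 comes from the slide. Apriori runs on support alone, and a frequent itemset inherits "satisfies C" from any subset already known to satisfy it, so the constraint is evaluated explicitly only when no subset does.

```python
from itertools import combinations

# Assumed database and threshold (the slide's figure did not survive extraction).
D = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
MIN_SUPPORT = 2

def support(itemset):
    """Number of transactions in D that contain `itemset`."""
    return sum(1 for t in D if itemset <= t)

def apriori_monotone():
    """Return (itemsets in the output, number of explicit constraint checks)."""
    checks = 0
    satisfies = {}  # frequent itemset -> does it satisfy C?

    def constraint(itemset):
        # Monotone constraint from the slide: sum(S.price) >= 5, price = item id.
        nonlocal checks
        checks += 1
        return sum(itemset) >= 5

    items = sorted({i for t in D for i in t})
    level = [frozenset([i]) for i in items
             if support(frozenset([i])) >= MIN_SUPPORT]
    for s in level:
        satisfies[s] = constraint(s)
    k = 2
    while level:
        prev = set(level)
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # No constraint-based pruning here: violating itemsets may have
        # satisfying supersets, so they must stay in the search.
        level = [c for c in candidates if support(c) >= MIN_SUPPORT]
        for c in level:
            # If some (k-1)-subset satisfies C, so does c: skip the check.
            inherited = any(satisfies[frozenset(s)]
                            for s in combinations(c, k - 1))
            satisfies[c] = True if inherited else constraint(c)
        k += 1
    output = [s for s, ok in satisfies.items() if ok]
    return output, checks

output, checks = apriori_monotone()
print(sorted(sorted(s) for s in output), checks)
# [[2, 3], [2, 3, 5], [2, 5], [3, 5], [5]] 6
```

On this assumed database there are nine frequent itemsets but only six explicit constraint evaluations; the other three inherit the result from a subset.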
FP-growth algorithm + antimonotone constraint
[Figure: FP-growth pseudocode, annotated as follows.]
- Similar to Apriori: prunes the search space.
- Specific to FP-growth: avoids constraint checks.
FP-growth algorithm + monotone constraint
- If C(α) then do not check C(.) in TDB|α, because every itemset derived from TDB|α is a superset of α.
Constraints
- avg(S.Price) ≤ v and avg(S.Price) ≥ v are neither monotone nor antimonotone.
- Convertible monotone
  - If there exists an item order R such that
    - If C(A) then C(B) for all A and B respecting R such that A is a suffix of B.
  - E.g. avg(S.Price) ≥ v wrt decreasing price order.
- Convertible antimonotone
  - If there exists an item order R such that
    - If C(A) then C(B) for all A and B respecting R such that B is a suffix of A.
    - Or, if not C(B) then not C(A) for all A and B respecting R such that B is a suffix of A.
  - E.g. avg(S.Price) ≥ v wrt increasing price order.
Constraints
- avg(X) ≥ 25 is convertible monotone wrt the descending item price order R: <a, f, g, d, b, h, c, e>.
  - If an itemset d satisfies a constraint C, so do itemsets fd and afd, which have d as a suffix.
- avg(X) ≥ 25 is convertible antimonotone wrt the ascending item price order R⁻¹: <e, c, h, b, d, g, f, a>.
  - If an itemset dfa satisfies a constraint C, so do itemsets fa and a, which are suffixes of dfa.
- Thus, avg(X) ≥ 25 is strongly convertible.
- Check that avg(X) ≤ 25 is also strongly convertible.
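
Both claims about avg(X) ≥ 25 can be verified exhaustively on the eight-item price table used earlier in the slides. The sketch below is illustrative; the helper names `ordered_itemsets` and `proper_suffixes` are invented here, and the constraint is tested as sum ≥ 25 · |X| to avoid floating-point division.

```python
from itertools import combinations

# Price table from the slides.
PRICE = {"a": 40, "b": 0, "c": -20, "d": 10, "e": -30, "f": 30, "g": 20, "h": -10}

def C(itemset):
    """avg(X) >= 25, written as sum(X) >= 25 * |X|."""
    return sum(PRICE[i] for i in itemset) >= 25 * len(itemset)

def ordered_itemsets(order):
    """Yield every non-empty itemset as a tuple that respects `order`."""
    for k in range(1, len(order) + 1):
        yield from combinations(order, k)  # combinations keeps the input order

def proper_suffixes(seq):
    """Non-empty suffixes of `seq` other than `seq` itself."""
    return [seq[i:] for i in range(1, len(seq))]

R_desc = sorted(PRICE, key=PRICE.get, reverse=True)  # a, f, g, d, b, h, c, e
R_asc = R_desc[::-1]                                 # e, c, h, b, d, g, f, a

# Convertible monotone wrt R_desc: if a suffix A of B satisfies C, so does B.
conv_monotone = all(C(b)
                    for b in ordered_itemsets(R_desc)
                    for a in proper_suffixes(b) if C(a))

# Convertible antimonotone wrt R_asc: if A satisfies C, so does every suffix B of A.
conv_antimonotone = all(C(b)
                        for a in ordered_itemsets(R_asc) if C(a)
                        for b in proper_suffixes(a))

print(conv_monotone, conv_antimonotone)           # True True
print(C(("d", "f", "a")), C(("f", "a")), C("a"))  # True True True
```

The second line reproduces the slide's example: avg(dfa) = 80/3 ≈ 26.7, avg(fa) = 35 and avg(a) = 40 are all at least 25.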
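Constraints

| Constraint                                        | Convertible antimonotone | Convertible monotone | Strongly convertible |
|---------------------------------------------------|--------------------------|----------------------|----------------------|
| avg(S) ≤, ≥ v                                     | Yes                      | Yes                  | Yes                  |
| median(S) ≤, ≥ v                                  | Yes                      | Yes                  | Yes                  |
| sum(S) ≤ v (items could be of any value, v ≥ 0)   | Yes                      | No                   | No                   |
| sum(S) ≤ v (items could be of any value, v ≤ 0)   | No                       | Yes                  | No                   |
| sum(S) ≥ v (items could be of any value, v ≤ 0)   | Yes                      | No                   | No                   |
| ……                                                |                          |                      |                      |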
Constraints
[Figure: classification of constraints, with regions labeled Monotone, Antimonotone, Convertible monotone, Convertible antimonotone, Strongly convertible, and Inconvertible (e.g. avg(S) - median(S) = 0).]
FP-growth algorithm + convertible antimonotone constraint
- Instead of ordering the items by decreasing frequency, the items are now ordered according to the order R of the constraint.
[Figure: FP-growth pseudocode, annotated as follows.]
- False: such items can appear not only as a suffix.
- False: no check is needed for those itemsets that are a suffix of α ∪ β. The check is needed for the rest of the items.
- True: α will be added as a suffix to any itemset derived from TDB|α, and the result respects R.
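
The role of the order R can be illustrated with a simplified pattern-growth sketch. This is not FP-growth itself: it uses plain lists as projected databases instead of an FP-tree, and the transactions in `D`, the threshold `MIN_SUPPORT`, and the bound `V` are all assumptions. Itemsets are grown by prepending items that come earlier in R (ascending price), so the current pattern is always a suffix of everything derived from it; once a pattern violates avg(S.Price) ≥ V, the whole branch is dropped.

```python
# Price table from the slides.
PRICE = {"a": 40, "b": 0, "c": -20, "d": 10, "e": -30, "f": 30, "g": 20, "h": -10}

# Assumed transaction database, support threshold, and bound.
D = [{"a", "f", "d"}, {"a", "f", "g", "b"}, {"a", "d", "b", "h"},
     {"f", "g", "c"}, {"a", "f", "g"}]
MIN_SUPPORT = 2
V = 25

# Order R of the constraint: ascending price, i.e. e, c, h, b, d, g, f, a.
R = sorted(PRICE, key=PRICE.get)

def satisfies(itemset):
    """avg(S.Price) >= V, written as sum >= V * |S|."""
    return sum(PRICE[i] for i in itemset) >= V * len(itemset)

def grow(suffix, projected, pos, found):
    """Extend `suffix` with items that precede position `pos` in R.

    `projected` holds the transactions containing every item of `suffix`.
    """
    for j in range(pos):
        pattern = (R[j],) + suffix  # still respects R, with `suffix` as a suffix
        proj = [t for t in projected if R[j] in t]
        if len(proj) < MIN_SUPPORT:
            continue  # support is antimonotone: prune the branch
        if not satisfies(pattern):
            continue  # convertible antimonotone wrt R: prune the branch
        found.append(pattern)
        grow(pattern, proj, j, found)

found = []
grow((), D, len(R), found)
print(sorted("".join(p) for p in found))
# ['a', 'da', 'f', 'fa', 'ga', 'gf', 'gfa']
```

Note that the single item d is pruned (its average is 10) and yet da is found, because da is grown from the suffix a, not from d. Pruning d only discards itemsets in which d is the highest-priced item.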
FP-growth algorithm + convertible monotone constraint
- With a monotone constraint
  - If C(α) then do not check C(.) in TDB|α.
- With a convertible monotone constraint
  - Instead of ordering the items by decreasing frequency, the items are now ordered according to the order R of the constraint.
  - If C(α) then do not check C(.) in TDB|α, because α will be added as a suffix to any itemset derived from TDB|α and the result respects R.
Exercise
- How would you incorporate convertible constraints in the Apriori algorithm?