Using Attribute Value Lattice to Find Closed Frequent Itemsets – Lin, Hu, Louie
New Approach to Data Mining
• Find closed frequent itemsets
• Search only the attribute-value lattice
– Enables finding only the non-redundant association rule set
Frequent and Closed Itemsets
• Frequent Itemset: An itemset that occurs in at least a user-specified percentage of the tuples in the database
• Closed Itemset: An itemset A that is identical to its closure Cl(A)
• Closure of an Itemset Cl(A): the set of all items that appear in every tuple that contains A
– E.g., for the database
    1  ACTW
    2  CDW
    3  ACTWHG
    4  ACDWHF
    5  ACDTWH
  the tuples containing A are {1, 3, 4, 5}, those containing C are {1, 2, 3, 4, 5}, and those containing W are {1, 2, 3, 4, 5}
– Their intersection is {1, 3, 4, 5}, and the only items common to all four of those tuples are A, C, and W, so Cl(ACW) = ACW
– ACW is a closed frequent itemset
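The closure definition can be checked directly on the example database. The following is a minimal sketch, not code from the slides: the names `DB`, `tidset`, `closure`, `is_closed`, and `support` are illustrative, and returning an empty closure for an itemset that occurs in no tuple is a simplifying assumption.

```python
# Example database from the slide: tuple ID -> set of items.
DB = {
    1: set("ACTW"),
    2: set("CDW"),
    3: set("ACTWHG"),
    4: set("ACDWHF"),
    5: set("ACDTWH"),
}


def tidset(itemset, db=DB):
    """IDs of the tuples that contain every item of `itemset`."""
    wanted = set(itemset)
    return {tid for tid, items in db.items() if wanted <= items}


def closure(itemset, db=DB):
    """Items that appear in every tuple containing `itemset`."""
    tids = tidset(itemset, db)
    if not tids:
        # Assumption: treat the closure of an unsupported itemset as empty.
        return set()
    return set.intersection(*(db[tid] for tid in tids))


def is_closed(itemset, db=DB):
    """An itemset is closed when it equals its own closure."""
    return closure(itemset, db) == set(itemset)


def support(itemset, db=DB):
    """Fraction of tuples that contain `itemset`."""
    return len(tidset(itemset, db)) / len(db)


print(sorted(tidset("A")))    # tuples containing A
print(sorted(closure("A")))   # items shared by all of those tuples
print(is_closed("ACW"), support("ACW"))
```

Here A alone is not closed, because every tuple that contains A also contains C and W, while ACW is closed and occurs in 4 of the 5 tuples.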
Partial order and lattice
• Partial Order: A binary relation that is reflexive (a <= a), antisymmetric (if a <= b and b <= a, then a = b), and transitive (if a <= b and b <= c, then a <= c)
• Lattice: A partially ordered set in which every nonempty finite subset has a least upper bound and a greatest lower bound
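As an illustration of these definitions, the subsets of a small item set ordered by inclusion form a lattice, with union as the least upper bound and intersection as the greatest lower bound. The sketch below checks this by brute force; the helper names and the choice of the items A, C, W are assumptions made for the example.

```python
from itertools import combinations


def is_partial_order(elements, leq):
    """Check reflexivity, antisymmetry, and transitivity of `leq`."""
    reflexive = all(leq(a, a) for a in elements)
    antisymmetric = all(
        a == b
        for a in elements for b in elements
        if leq(a, b) and leq(b, a)
    )
    transitive = all(
        leq(a, c)
        for a in elements for b in elements for c in elements
        if leq(a, b) and leq(b, c)
    )
    return reflexive and antisymmetric and transitive


def least_upper_bound(a, b, elements, leq):
    """Smallest element that is above both a and b, or None."""
    uppers = [u for u in elements if leq(a, u) and leq(b, u)]
    least = [u for u in uppers if all(leq(u, v) for v in uppers)]
    return least[0] if least else None


def greatest_lower_bound(a, b, elements, leq):
    """Largest element that is below both a and b, or None."""
    lowers = [l for l in elements if leq(l, a) and leq(l, b)]
    greatest = [l for l in lowers if all(leq(v, l) for v in lowers)]
    return greatest[0] if greatest else None


def subset(a, b):
    return a <= b


# All subsets of {A, C, W}, ordered by set inclusion.
ITEMSETS = [frozenset(c) for r in range(4) for c in combinations("ACW", r)]

print(is_partial_order(ITEMSETS, subset))
print(least_upper_bound(frozenset("A"), frozenset("C"), ITEMSETS, subset))
print(greatest_lower_bound(frozenset("AC"), frozenset("CW"), ITEMSETS, subset))
```

Strict inclusion (`a < b`) would fail the reflexivity check, which is why the definition uses `<=`.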
Lin’s Algorithm
• Attribute value lattice constructed from database
  1. Construct bitmap B(Ii) of each frequent itemset Ii
  2. Set level number Li of Ii to 1
  3. Nodes contain Ii, Li, and B(Ii), where the bit count of B(Ii) > threshold
  4. Sort the items in nodes based on bit count
  5. For each node (Ii, Li, B(Ii)) in nodes
     1. For each sibling Ij
     2. I = Ii ∪ Ij and Bcomb = B(Ii) ∩ B(Ij)
     3. If the bit count of Bcomb > threshold
        1. If B(Ii) = B(Ij): remove Ij from nodes and replace Ii with I
        2. If B(Ii) ⊂ B(Ij): create an edge from Ii to Ij and set Lj = max(Lj, Li + 1)
        3. If B(Ij) ⊂ B(Ii): create an edge from Ij to Ii and set Li = max(Li, Lj + 1)
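The construction steps can be sketched in code under several assumptions, since the slide lost its operators in extraction: the combined itemset is taken to be a union, the combined bitmap a bitwise AND, and the two edge cases proper-subset tests on bitmaps. The ascending sort order, the pairwise treatment of "siblings", and the redirection of edges when two nodes merge are also choices made here, not details confirmed by the slides. `Node`, `bitcount`, and `build_lattice` are hypothetical names.

```python
class Node:
    """One lattice node: an itemset, its level number, and its tuple bitmap."""

    def __init__(self, items, bits):
        self.items = frozenset(items)
        self.level = 1  # step 2: every node starts at level 1
        self.bits = bits

    def label(self):
        return "".join(sorted(self.items))


def bitcount(bits):
    return bin(bits).count("1")


def build_lattice(db, threshold):
    # Step 1: one bitmap per item; bit k is set if the k-th tuple contains it.
    bitmaps = {}
    for pos, tid in enumerate(sorted(db)):
        for item in db[tid]:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << pos)

    # Step 3: keep only items whose bit count exceeds the threshold.
    nodes = [Node({item}, bits) for item, bits in bitmaps.items()
             if bitcount(bits) > threshold]
    # Step 4: sort by bit count (ascending order and label tie-break assumed).
    nodes.sort(key=lambda n: (bitcount(n.bits), n.label()))

    edges = set()
    i = 0
    while i < len(nodes):  # step 5
        j = i + 1
        while j < len(nodes):
            a, b = nodes[i], nodes[j]
            combined = a.bits & b.bits
            if bitcount(combined) > threshold:
                if a.bits == b.bits:
                    # Same tuples: merge b into a and drop b.
                    a.items = a.items | b.items
                    a.level = max(a.level, b.level)
                    # Assumption: edges that touched b now point at a.
                    edges = {(a if x is b else x, a if y is b else y)
                             for x, y in edges}
                    del nodes[j]
                    continue
                if combined == a.bits:    # B(Ii) is a proper subset of B(Ij)
                    edges.add((a, b))
                    b.level = max(b.level, a.level + 1)
                elif combined == b.bits:  # B(Ij) is a proper subset of B(Ii)
                    edges.add((b, a))
                    a.level = max(a.level, b.level + 1)
            j += 1
        i += 1
    return nodes, sorted((x.label(), y.label()) for x, y in edges)


DB = {1: set("ACTW"), 2: set("CDW"), 3: set("ACTWHG"),
      4: set("ACDWHF"), 5: set("ACDTWH")}
nodes, edges = build_lattice(DB, threshold=2)
for node in nodes:
    print(node.label(), node.level, format(node.bits, "05b"))
print(edges)
```

On the five-tuple example database with a threshold of 2, this reading merges C and W into a single node CW, because both occur in every tuple, and leaves D, H, and T at level 1, A at level 2, and CW at level 3.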
Lin’s Algorithm
• Searches the attribute value lattice to find closed frequent itemsets