Tutorial 4 Data Mining – Association Rules





























- Slides: 29

Contents
• Review Questions
  ◦ Question 1: Data Mining and Metrics
• Algorithm Questions
  ◦ Question 2: Applying Apriori Algorithm
  ◦ Question 3: Finding Association Rules

Review Questions

Question 1: Data Mining and Metrics
• What is an Association Rule?
An association rule states that, given a set of records, each of which contains some number of items from a given collection, there is a dependency rule that predicts the occurrence of an item based on the occurrences of other items in the transaction. In other words, if every transaction that contains coke also contains milk, then there is a rule {coke} -> {milk} (but not necessarily the other way around, since not all milk is bought with coke).

Question 1: Data Mining and Metrics
• What are the metrics for evaluating association rules?
The evaluation metrics for an association rule X -> Y are “Support” (s) and “Confidence” (c). Support is the fraction of the transactions that contain both X and Y. Confidence measures how often the items in Y appear in transactions that contain X.

Question 1: Data Mining and Metrics
• What are the metrics for evaluating association rules?
For example, given the following table, these are the support and confidence values:

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Example Association Rule: {Milk, Diaper} => {Beer}
s = count(Milk, Diaper, Beer) / Total Transactions = 2/5 = 0.4
c = count(Milk, Diaper, Beer) / count(Milk, Diaper) = 2/3 = 0.67
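The support and confidence arithmetic for the example rule can be checked with a few lines of Python. This is a minimal sketch: the five transactions are copied from the table, while the helper names `support_count`, `support`, and `confidence` are illustrative and not part of the tutorial.

```python
# Transactions copied from the example table (TID 1 to 5).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(x, y, transactions):
    """Fraction of all transactions that contain both X and Y."""
    return support_count(x | y, transactions) / len(transactions)


def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    return support_count(x | y, transactions) / support_count(x, transactions)


x, y = {"Milk", "Diaper"}, {"Beer"}
print(support(x, y, transactions))               # 0.4
print(round(confidence(x, y, transactions), 2))  # 0.67
```

Representing each transaction as a set keeps the containment test (`itemset <= t`) to a single subset check.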

Algorithm Questions

Question 2: Applying the Apriori Algorithm
• Apply the Apriori algorithm to find all itemsets with support >= 0.2 from the following data:

Transaction: 1 2 3 4 5 6 7 8 9 10
Items in Transaction:
Milk, Bread, Eggs
Milk, Juice, Butter
Milk, Bread, Eggs
Coffee, Juice
Milk, Bread, Cookies, Eggs
Cookies, Butter
Milk, Bread

Question 2: Applying the Apriori Algorithm
• Apriori Principle Step 1: Count up the occurrences of 1 item:
Itemset: Milk, Bread, Eggs, Juice, Butter, Coffee, Cookies
Count: 5 4 4 3 2
*Note: since there are 10 transactions, a support of 0.2 means the itemset appears at least twice in the list.

Question 2: Applying the Apriori Algorithm
• Apriori Principle Step 2: Look for frequent occurrences of 2 items (in bold, not strikethrough):
Itemset: {Milk, Bread}, {Milk, Eggs}, {Milk, Juice}, {Milk, Cookies}, {Bread, Eggs}, {Bread, Cookies}, {Eggs, Coffee}, {Eggs, Cookies}, {Juice, Butter}, {Juice, Coffee}, {Butter, Cookies}
Count: 4 3 1 1 1 1

Question 2: Applying the Apriori Algorithm
• Apriori Principle Step 3: Look for frequent occurrences of 3 items (in bold, not strikethrough):
Itemset: {Milk, Bread, Eggs}
Count: 3
Therefore, the largest frequent itemset is {Milk, Bread, Eggs}.
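The level-by-level procedure in Steps 1 to 3 can be sketched in Python as follows. This is an illustrative implementation, not the tutorial's own code: the function name `apriori` is hypothetical, and because the Question 2 table is not fully reproduced here, the demonstration runs on the five-transaction table from Question 1 with an assumed threshold of 0.4.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: count} for every itemset meeting min_support."""
    n = len(transactions)

    # Step 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c / n >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {
            a | b for a, b in combinations(list(current), 2) if len(a | b) == k
        }
        # Prune (Apriori principle): every (k-1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count the surviving candidates against the transactions.
        current = {}
        for c in candidates:
            count = sum(1 for t in transactions if c <= t)
            if count / n >= min_support:
                current[c] = count
        frequent.update(current)
        k += 1
    return frequent


# Demonstration data: the Question 1 table (an assumption of this sketch).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

frequent = apriori(transactions, min_support=0.4)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The prune step is what makes this Apriori and not brute-force counting: a candidate is counted only if all of its smaller subsets were already found frequent.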

Question 3: Applying the Apriori Algorithm
• Using the data set in Question 2 and its frequent itemset {Milk, Bread, Eggs}, find all the association rules with support >= 0.2 and confidence >= 0.8.
• Each rule has the form “{Milk, Bread} -> {Eggs}”, where {Milk, Bread} is X and {Eggs} is Y.
• Support = (number of transactions containing X and Y) / (total number of transactions)
• Confidence = (number of transactions containing X and Y) / (number of transactions containing X)
• To do this, we check each candidate rule that can be formed from the frequent itemsets.

Question 3: Applying the Apriori Algorithm
Association Rules for {Milk, Bread, Eggs}:
{Milk, Bread} -> {Eggs}: Support = 3/10 = 0.3, Confidence = 3/4 = 0.75
{Milk, Eggs} -> {Bread}: Support = 3/10 = 0.3, Confidence = 3/3 = 1
{Eggs, Bread} -> {Milk}: Support = 3/10 = 0.3, Confidence = 3/3 = 1

Question 3: Applying the Apriori Algorithm
Association Rules for {Milk, Bread}:
{Milk} -> {Bread}: Support = 4/10 = 0.4, Confidence = 4/5 = 0.8
{Bread} -> {Milk}: Support = 4/10 = 0.4, Confidence = 4/4 = 1

Question 3: Applying the Apriori Algorithm
Association Rules for {Milk, Eggs}:
{Milk} -> {Eggs}: Support = 3/10 = 0.3, Confidence = 3/5 = 0.6
{Eggs} -> {Milk}: Support = 3/10 = 0.3, Confidence = 3/4 = 0.75

Question 3: Applying the Apriori Algorithm
Association Rules for {Bread, Eggs}:
{Bread} -> {Eggs}: Support = 3/10 = 0.3, Confidence = 3/4 = 0.75
{Eggs} -> {Bread}: Support = 3/10 = 0.3, Confidence = 3/4 = 0.75

Question 3: Applying the Apriori Algorithm
Therefore, the only association rules that satisfy the restriction of having support >= 0.2 and confidence >= 0.8 are:
• {Milk, Eggs} -> {Bread} (s=0.3, c=1)
• {Eggs, Bread} -> {Milk} (s=0.3, c=1)
• {Milk} -> {Bread} (s=0.4, c=0.8)
• {Bread} -> {Milk} (s=0.4, c=1)
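The rule check can also be reproduced mechanically. The sketch below starts from the support counts implied by the fractions on the slides (for example, Confidence = 4/5 for {Milk} -> {Bread} gives a count of 5 for Milk) and does not recount the transactions. The function name `rules` is illustrative. Unlike the slides, it also tries single-item antecedents with two-item consequents, such as {Milk} -> {Bread, Eggs}; under these counts none of those reach a confidence of 0.8, so the final list is unchanged.

```python
from itertools import combinations

N = 10  # total number of transactions in Question 2

# Support counts read off the fractions shown on the slides.
counts = {
    frozenset({"Milk"}): 5,
    frozenset({"Bread"}): 4,
    frozenset({"Eggs"}): 4,
    frozenset({"Milk", "Bread"}): 4,
    frozenset({"Milk", "Eggs"}): 3,
    frozenset({"Bread", "Eggs"}): 3,
    frozenset({"Milk", "Bread", "Eggs"}): 3,
}


def rules(counts, n, min_support, min_confidence):
    """List (X, Y, support, confidence) for each rule X -> Y meeting both thresholds."""
    found = []
    for itemset, count in counts.items():
        if len(itemset) < 2:
            continue  # a rule needs at least one item on each side
        # Try every non-empty proper subset of the itemset as the antecedent X.
        for size in range(1, len(itemset)):
            for x in map(frozenset, combinations(sorted(itemset), size)):
                s = count / n          # support of X and Y together
                c = count / counts[x]  # confidence of X -> Y
                if s >= min_support and c >= min_confidence:
                    found.append((x, itemset - x, s, c))
    return found


result = rules(counts, N, min_support=0.2, min_confidence=0.8)
for x, y, s, c in result:
    print(sorted(x), "->", sorted(y), f"s={s:.1f}", f"c={c:.2f}")
```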

Questions?