Measuring the Quality of Rules

• Support and confidence are the standard measures used to assess the quality of an association rule.
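As a minimal sketch of these two measures, the Python below computes support as the fraction of transactions containing an itemset, and confidence as the support of the whole rule divided by the support of its antecedent. The function names and the `baskets` data are illustrative assumptions, not part of the slides.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent => consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical market-basket data, made up for illustration only.
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "beer"},
    {"chips"},
    {"cola"},
    {"beer", "pretzels"},
]

print(support(baskets, {"chips", "cola"}))       # 2 of 5 baskets
print(confidence(baskets, {"chips"}, {"cola"}))  # roughly 0.67
```

Note that `confidence` divides by the antecedent's support, so it assumes the antecedent occurs in at least one transaction.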

• Concepts such as surprise and interest have also been used to evaluate the quality or usefulness of rules.
• Ex: A person who purchases potato chips is highly likely to purchase cola. Such a rule is not really of interest because it is not surprising.
• With correlation rules, we saw that correlation may be used to measure the relationship between the items in a rule.

• This may also be expressed as the lift or interest: lift(A=>B) = P(A,B) / (P(A) P(B)).
• The measure takes into account both P(A) and P(B).
• The problem with this measure is that it is symmetric: lift(A=>B) = lift(B=>A), so it says nothing about the direction of the rule.
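The symmetry of lift can be checked directly. The sketch below assumes the usual definition, P(A,B) / (P(A) P(B)), estimated from transaction frequencies. The helper names and the sample baskets are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def lift(transactions, a, b):
    """P(A,B) / (P(A) * P(B)); a value of 1 indicates independence."""
    p_ab = support(transactions, set(a) | set(b))
    return p_ab / (support(transactions, a) * support(transactions, b))


# Hypothetical data, made up for illustration only.
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "beer"},
    {"chips"},
    {"cola"},
    {"beer", "pretzels"},
]

forward = lift(baskets, {"chips"}, {"cola"})   # roughly 1.11
backward = lift(baskets, {"cola"}, {"chips"})
print(forward, backward)  # the two values agree, whichever way the rule points
```

A lift above 1 suggests the items occur together more often than independence would predict; a lift below 1 suggests the opposite.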

• As with lift, conviction takes into account both P(A) and P(B).
• From logic we know that the implication A=>B is equivalent to ¬(A ∧ ¬B).
• A measure of the independence of the negation of the implication, then, is P(A,¬B) / (P(A) P(¬B)).

• To take into account the negation, the conviction measure inverts this ratio. The formula for conviction is [BMS 77]: conviction(A=>B) = P(A) P(¬B) / P(A,¬B).
• Conviction has a value of 1 if A and B are not related. Rules that always hold have a value of infinity.
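The two boundary values mentioned above (1 for unrelated items, infinity for rules that always hold) can be reproduced with a small sketch. It assumes conviction is defined as P(A) P(¬B) / P(A,¬B), with probabilities estimated from transaction counts. The function name and both toy datasets are illustrative assumptions.

```python
import math


def conviction(transactions, antecedent, consequent):
    """P(A) * P(not B) / P(A, not B) for the rule A => B."""
    a, b = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= set(t))
    n_not_b = sum(1 for t in transactions if not (b <= set(t)))
    n_a_not_b = sum(1 for t in transactions if a <= set(t) and not (b <= set(t)))
    if n_a_not_b == 0:
        # The rule is never violated in the data (this also covers the
        # degenerate case where A never occurs).
        return math.inf
    return (n_a / n) * (n_not_b / n) / (n_a_not_b / n)


# Hypothetical data in which A and B are statistically independent.
independent = [{"A", "B"}, {"A"}, {"B"}, set()]
# Hypothetical data in which every transaction containing A also contains B.
always_holds = [{"A", "B"}, {"A", "B"}, {"B"}, set()]

print(conviction(independent, {"A"}, {"B"}))   # 1.0
print(conviction(always_holds, {"A"}, {"B"}))  # inf
print(conviction(always_holds, {"B"}, {"A"}))  # 1.5
```

The last two lines show that, unlike lift, conviction is not symmetric: A=>B and B=>A receive different scores on the same data.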

• The usefulness of discovered association rules may be tied to the amount of surprise associated with them.
• Surprise is a measure of the change in correlation between items over time.
• Ex: If you know that beer and pretzels are often purchased together, it would be a surprise if this relationship weakened significantly.

• Another proposed technique measures the significance of rules using the chi-squared test for independence.
• It accounts for both the presence (occurrence) and the absence (non-occurrence) of items in sets.
• Suppose the set of items is I = {I1, I2, ..., Im}.
• A transaction tj can then be viewed as a member of the Cartesian product {I1, ¬I1} × {I2, ¬I2} × ... × {Im, ¬Im}.

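• Any possible itemset X can also be viewed as a subset of the Cartesian product. The chi-squared statistic is then calculated for X as χ² = Σ (O(X) − E[X])² / E[X], where the sum runs over every combination of presence and absence of the items and E[X] is the count expected if the items were independent.
• O(X) is the count of the number of transactions that contain the items in X.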
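To make the statistic concrete, the sketch below enumerates every presence/absence combination of the chosen items, counts the observed transactions in each cell, and compares that count with the one expected under independence. It assumes the standard Pearson form, the sum of (observed − expected)² / expected over all cells. The function name and the toy datasets are hypothetical.

```python
from itertools import product


def chi_squared(transactions, items):
    """Pearson chi-squared statistic over all presence/absence cells of `items`."""
    items = list(items)
    n = len(transactions)
    # Marginal probability that each item is present in a transaction.
    p_present = [sum(1 for t in transactions if i in t) / n for i in items]

    stat = 0.0
    for cell in product([True, False], repeat=len(items)):
        # Observed count: transactions matching this exact presence/absence pattern.
        observed = sum(
            1
            for t in transactions
            if all((i in t) == present for i, present in zip(items, cell))
        )
        # Expected count if the items were independent.
        expected = n
        for p, present in zip(p_present, cell):
            expected *= p if present else 1 - p
        if expected > 0:  # skip cells that are impossible under the marginals
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical data: independent items, then perfectly associated items.
independent = [{"A", "B"}, {"A"}, {"B"}, set()]
associated = [{"A", "B"}, {"A", "B"}, set(), set()]

print(chi_squared(independent, ["A", "B"]))  # 0.0
print(chi_squared(associated, ["A", "B"]))   # 4.0
```

A value of 0 means the observed counts match independence exactly; larger values indicate stronger evidence of dependence, to be judged against a chi-squared critical value for the appropriate degrees of freedom.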