APRIORI Algorithm
Discovering Hidden Knowledge
Student: Nina Tatomir
Professor: Veljko Milutinović

Apriori Algorithm: Basics
A data mining algorithm that finds association rules by analyzing transactions.
Concepts:
- Frequent itemsets (large itemsets)
- Apriori property (the Apriori trick)
- apriori-gen function
Nina Tatomir, ninatatomir@gmail.com

Apriori Algorithm: Outline
- Finding frequent itemsets
  - Every subset of a frequent itemset must itself be frequent.
  - Frequent itemsets of size 1 through k are found iteratively.
- Generating association rules

apriori-gen function
- Join step: Ck is formed by joining elements of Lk-1 with each other.
- Prune step: a (k-1)-itemset that is not frequent cannot be a subset of a frequent k-itemset, so every candidate containing one is removed.
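
The two steps can be sketched in Python as follows. This is a minimal illustration rather than the slides' own implementation: the name `apriori_gen` and the frozenset representation are assumptions, and the join here unions any two (k-1)-itemsets whose union has exactly k items, which is a simplification of a prefix-based join. After pruning, both variants leave the same candidates.

```python
from itertools import combinations


def apriori_gen(prev_frequent):
    """Build candidate k-itemsets from the frequent (k-1)-itemsets.

    prev_frequent: an iterable of frozensets, all of the same size k-1.
    (Hypothetical helper; the name and signature are assumptions.)
    """
    prev_frequent = set(prev_frequent)
    if not prev_frequent:
        return set()
    k = len(next(iter(prev_frequent))) + 1

    # Join step: union two (k-1)-itemsets when the result has exactly k items.
    joined = {
        a | b
        for a, b in combinations(prev_frequent, 2)
        if len(a | b) == k
    }

    # Prune step: drop a candidate if any of its (k-1)-subsets is not frequent.
    return {
        c
        for c in joined
        if all(frozenset(s) in prev_frequent for s in combinations(c, k - 1))
    }


l2 = {frozenset(p) for p in [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")]}
c3 = apriori_gen(l2)
print(c3)  # only {a, b, c} survives: {a, d} and {c, d} are not in l2
```

The join alone also produces {a, b, d} and {b, c, d}; the prune step discards them without touching the transaction data.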

Pseudocode
    L1 = {frequent 1-itemsets};
    for ( k = 2; Lk-1 ≠ ∅; k++ ) do begin
        Ck = apriori-gen(Lk-1);
        forall transactions t ∈ D do begin
            Ct = subset(Ck, t);
            forall candidates c ∈ Ct do
                c.count++;
        end
        Lk = {c ∈ Ck | c.count ≥ minsup}
    end
    Answer = ∪k Lk;
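
A runnable counterpart of the pseudocode might look like the sketch below. It is written under a few assumptions: the function name `apriori`, the `min_count` parameter (an absolute support count) and the small `demo` data set are all illustrative, and candidate generation is inlined in a simplified union-based form.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: support count} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]

    # L1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {c: n for c, n in counts.items() if n >= min_count}

    answer = dict(level)
    k = 2
    while level:
        # Ck = apriori-gen(Lk-1): join, then prune by the Apriori property.
        prev = set(level)
        candidates = {
            a | b for a, b in combinations(prev, 2) if len(a | b) == k
        }
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }

        # Count the candidates contained in each transaction.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        # Lk = candidates that reach the minimum support count.
        level = {c: n for c, n in counts.items() if n >= min_count}
        answer.update(level)
        k += 1
    return answer


# Illustrative data, not taken from the slides.
demo = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent = apriori(demo, min_count=3)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

The loop stops as soon as a level comes back empty, which mirrors the `Lk-1 ≠ ∅` condition in the pseudocode.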

Example
TID     Items
T100    bread, milk, peanuts
T200    milk, eggs
T300    milk, beer
T400    bread, milk, eggs
T500    bread, beer
T600    milk, beer
T700    bread, beer
T800    bread, milk, beer, peanuts
T900    bread, milk, beer

Example: Frequent 1-itemsets
C1:
Itemset      Sup. count
{bread}      6
{milk}       7
{beer}       6
{eggs}       2
{peanuts}    2
Every candidate satisfies supp ≥ min_supp (a support count of 2 is enough in this example), so L1 contains the same five itemsets with the same counts.
Frequent 1-itemsets: L1 = {{bread}, {milk}, {beer}, {eggs}, {peanuts}}
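
The first pass is a plain frequency count, which can be checked with a short Python sketch. The item names are English renderings of the example's items (bread, milk, beer, eggs, peanuts), and `MIN_COUNT = 2` is an assumption inferred from which itemsets the example keeps.

```python
from collections import Counter

# The nine example transactions, keyed by TID.
transactions = {
    "T100": {"bread", "milk", "peanuts"},
    "T200": {"milk", "eggs"},
    "T300": {"milk", "beer"},
    "T400": {"bread", "milk", "eggs"},
    "T500": {"bread", "beer"},
    "T600": {"milk", "beer"},
    "T700": {"bread", "beer"},
    "T800": {"bread", "milk", "beer", "peanuts"},
    "T900": {"bread", "milk", "beer"},
}
MIN_COUNT = 2  # assumed minimum support count

# Count in how many transactions each single item occurs.
item_counts = Counter(item for items in transactions.values() for item in items)

# L1: the items whose count reaches the threshold.
l1 = {item: n for item, n in item_counts.items() if n >= MIN_COUNT}

for item, n in sorted(l1.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{{{item}}}: {n}")
```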

Example: Frequent 2-itemsets
Join step, giving C2:
Itemset             Supp
{bread, milk}       4
{bread, beer}       4
{bread, eggs}       1
{bread, peanuts}    2
{milk, beer}        4
{milk, eggs}        2
{milk, peanuts}     2
{beer, eggs}        0
{beer, peanuts}     1
{eggs, peanuts}     0
L2 (candidates with supp ≥ min_supp):
Itemset             Supp
{bread, milk}       4
{bread, beer}       4
{bread, peanuts}    2
{milk, beer}        4
{milk, eggs}        2
{milk, peanuts}     2
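
The pair counts can be reproduced with a short Python sketch. As before, the English item names and `MIN_COUNT = 2` are assumptions made for illustration; since every single item is frequent, C2 is simply every pair of items.

```python
from itertools import combinations

transactions = [
    {"bread", "milk", "peanuts"},          # T100
    {"milk", "eggs"},                      # T200
    {"milk", "beer"},                      # T300
    {"bread", "milk", "eggs"},             # T400
    {"bread", "beer"},                     # T500
    {"milk", "beer"},                      # T600
    {"bread", "beer"},                     # T700
    {"bread", "milk", "beer", "peanuts"},  # T800
    {"bread", "milk", "beer"},             # T900
]
MIN_COUNT = 2  # assumed minimum support count
items = ["bread", "milk", "beer", "eggs", "peanuts"]

# C2: all pairs of frequent items, each starting with a count of zero.
c2 = {frozenset(pair): 0 for pair in combinations(items, 2)}

# One pass over the data: a pair is supported by every transaction containing it.
for t in transactions:
    for pair in c2:
        if pair <= t:
            c2[pair] += 1

# L2: the pairs that reach the threshold.
l2 = {pair: n for pair, n in c2.items() if n >= MIN_COUNT}

for pair, n in c2.items():
    status = "kept" if pair in l2 else "dropped"
    print(sorted(pair), n, status)
```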

Example: Frequent 3-itemsets
L2:
    {bread, milk}
    {bread, beer}
    {bread, peanuts}
    {milk, beer}
    {milk, eggs}
    {milk, peanuts}
Join step, giving C3':
    {bread, milk, beer}
    {bread, milk, peanuts}
    {bread, beer, peanuts}
    {milk, beer, eggs}
    {milk, beer, peanuts}
    {milk, eggs, peanuts}
Prune step (the Apriori trick), giving C3:
    {bread, milk, beer}
    {bread, milk, peanuts}
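
To see why four of the six joined candidates are pruned, the following Python sketch lists, for each candidate, the 2-item subsets that are missing from L2. The variable names and English item names are illustrative assumptions.

```python
from itertools import combinations

# L2 from the previous step.
l2 = {
    frozenset(p)
    for p in [
        ("bread", "milk"), ("bread", "beer"), ("bread", "peanuts"),
        ("milk", "beer"), ("milk", "eggs"), ("milk", "peanuts"),
    ]
}

# C3': the candidates produced by the join step.
c3_joined = [
    ("bread", "milk", "beer"),
    ("bread", "milk", "peanuts"),
    ("bread", "beer", "peanuts"),
    ("milk", "beer", "eggs"),
    ("milk", "beer", "peanuts"),
    ("milk", "eggs", "peanuts"),
]

# For every candidate, collect the 2-subsets that are not frequent.
missing = {
    cand: [s for s in combinations(cand, 2) if frozenset(s) not in l2]
    for cand in c3_joined
}

# C3: candidates with no infrequent subset survive the prune step.
c3 = [cand for cand in c3_joined if not missing[cand]]

for cand in c3_joined:
    if missing[cand]:
        print(cand, "pruned, infrequent subsets:", missing[cand])
    else:
        print(cand, "kept")
```

No transaction is read during pruning; the decision uses only L2.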

Example: Frequent itemsets
C3:
Itemset                   Supp
{bread, milk, beer}       2
{bread, milk, peanuts}    2
Both candidates satisfy supp ≥ min_supp, so L3 = C3.
Join step, giving C4':
    {bread, milk, beer, peanuts}
Prune step, giving C4: empty, because the subset {bread, beer, peanuts} is not in L3.
Frequent itemsets -> Association rules

Example: Association rules
For a frequent itemset I and a nonempty proper subset s of I, the rule s -> (I - s) is accepted when
    conf(s -> (I - s)) = supp(I) / supp(s) ≥ min_conf
Frequent itemsets:
    L = {{bread}, {milk}, {beer}, {eggs}, {peanuts},
         {bread, milk}, {bread, beer}, {bread, peanuts}, {milk, beer}, {milk, eggs}, {milk, peanuts},
         {bread, milk, beer}, {bread, milk, peanuts}}
Minimum confidence: min_conf = 70%

Example: Association rules for I = {bread, milk, beer}
R1: bread ^ milk -> beer
    conf = supp(bread, milk, beer) / supp(bread, milk) = 2/4 = 50% < 70%, R1 rejected
R2: bread ^ beer -> milk
    conf = supp(bread, milk, beer) / supp(bread, beer) = 2/4 = 50% < 70%, R2 rejected
R3: milk ^ beer -> bread
    conf = supp(bread, milk, beer) / supp(milk, beer) = 2/4 = 50% < 70%, R3 rejected

Example: Association rules for I = {bread, milk, beer} (continued)
R4: bread -> milk ^ beer
    conf = supp(bread, milk, beer) / supp(bread) = 2/6 = 33% < 70%, R4 rejected
R5: milk -> bread ^ beer
    conf = supp(bread, milk, beer) / supp(milk) = 2/7 = 29% < 70%, R5 rejected
R6: beer -> bread ^ milk
    conf = supp(bread, milk, beer) / supp(beer) = 2/6 = 33% < 70%, R6 rejected
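
The confidence test can also be scripted from the support counts of the example. The sketch below is an illustration under assumptions: the helper name `rules`, the `support` dictionary layout and the English item names are not from the slides. It applies the test to both frequent 3-itemsets.

```python
from itertools import combinations

# Support counts of the frequent itemsets needed for the two 3-itemsets.
support = {
    frozenset(["bread"]): 6,
    frozenset(["milk"]): 7,
    frozenset(["beer"]): 6,
    frozenset(["peanuts"]): 2,
    frozenset(["bread", "milk"]): 4,
    frozenset(["bread", "beer"]): 4,
    frozenset(["bread", "peanuts"]): 2,
    frozenset(["milk", "beer"]): 4,
    frozenset(["milk", "peanuts"]): 2,
    frozenset(["bread", "milk", "beer"]): 2,
    frozenset(["bread", "milk", "peanuts"]): 2,
}
MIN_CONF = 0.70


def rules(itemset, support, min_conf):
    """Yield (antecedent, consequent, confidence) for rules meeting min_conf."""
    itemset = frozenset(itemset)
    for size in range(1, len(itemset)):
        for s in combinations(sorted(itemset), size):
            s = frozenset(s)
            # conf(s -> I - s) = supp(I) / supp(s)
            conf = support[itemset] / support[s]
            if conf >= min_conf:
                yield s, itemset - s, conf


beer_rules = list(rules(["bread", "milk", "beer"], support, MIN_CONF))
peanut_rules = list(rules(["bread", "milk", "peanuts"], support, MIN_CONF))

print(len(beer_rules), "rules accepted for {bread, milk, beer}")
for s, rest, conf in peanut_rules:
    print(sorted(s), "->", sorted(rest), f"conf = {conf:.0%}")
```

With these counts, no rule from {bread, milk, beer} reaches 70%, while {bread, milk, peanuts} yields three rules, each with a confidence of 2/2.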

Complexity: O(∑ |Ck| · n · p)

Methods for improving efficiency
- Limiting the number of candidates in each iteration
- Limiting the maximum rule depth K
- Reducing the number of transactions that are analyzed
- Reducing the number of distinct items
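
The slides do not say how the number of transactions is reduced. One common option, shown here purely as an assumption, is to skip any transaction that contains no frequent k-itemset, since it cannot contain a frequent (k+1)-itemset either. The helper name `reduce_transactions` is hypothetical.

```python
def reduce_transactions(transactions, frequent_k):
    """Keep only the transactions that contain at least one frequent k-itemset."""
    return [t for t in transactions if any(c <= t for c in frequent_k)]


transactions = [
    frozenset(t)
    for t in [
        {"bread", "milk", "peanuts"},          # T100
        {"milk", "eggs"},                      # T200
        {"milk", "beer"},                      # T300
        {"bread", "milk", "eggs"},             # T400
        {"bread", "beer"},                     # T500
        {"milk", "beer"},                      # T600
        {"bread", "beer"},                     # T700
        {"bread", "milk", "beer", "peanuts"},  # T800
        {"bread", "milk", "beer"},             # T900
    ]
]

# L3 from the worked example.
l3 = [
    frozenset({"bread", "milk", "beer"}),
    frozenset({"bread", "milk", "peanuts"}),
]

remaining = reduce_transactions(transactions, l3)
print(len(remaining), "of", len(transactions), "transactions remain")
```

On the example data only T100, T800 and T900 contain a frequent 3-itemset, so a later pass would scan three transactions instead of nine.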

Questions?