Data Mining Sample Questions Data Mining lectures Dr
- Slides: 59
Data Mining Sample Questions
Data Mining lectures, Dr. Mohammad Hossein Nadimi, Faculty of Computer Engineering, Najafabad Branch, Islamic Azad University
• Star schema
• Snowflake schema
• Fact constellation schema
Fact table: a large central table that holds the data without redundancy.
Dimension tables: a set of smaller tables, one for each dimension.
7) State the four different views of a data warehouse.
11) Name the types of OLAP servers.
1) Relational OLAP (ROLAP) servers
2) Multidimensional OLAP (MOLAP) servers
3) Hybrid OLAP (HOLAP) servers
12) Name the data warehousing tools.
1) Access and retrieval tools
Scan D for the count of each candidate (C1):
Itemset | Sup
{A} | 4
{B} | 5
{C} | 4
{D} | 4
{E} | 2
{F} | 3

Compare each candidate's support with min-sup (L1):
Itemset | Sup
{A} | 4
{B} | 5
{C} | 4
{D} | 4
{F} | 3

Generate C2 from L1 and scan D for the counts (C2):
Itemset | Sup
{A, B} | 4
{A, C} | 3
{A, D} | 3
{A, F} | 3
{B, C} | 4
{B, D} | 4
{B, F} | 3
{C, D} | 4
{C, F} | 2
{D, F} | 2
L2:
Itemset | Sup
{A, B} | 4
{A, C} | 3
{A, D} | 3
{A, F} | 3
{B, C} | 4
{B, D} | 4
{B, F} | 3
{C, D} | 4

C3 and L3 (every candidate meets min-sup):
Itemset | Sup
{A, B, C} | 3
{A, B, D} | 3
{A, B, F} | 3
{A, C, D} | 3
{B, C, D} | 4

C4 and L4:
Itemset | Sup
{A, B, C, D} | 3

1) Closed frequent itemsets: {B}, {A, B}, {B, C, D}, {A, B, F}, {A, B, C, D}
2) Maximal frequent itemsets: {A, B, F}, {A, B, C, D}
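The closed and maximal itemsets can be checked mechanically from the support counts. The following is a minimal sketch, not the lecturer's method: the supports are read off the tables above ({A, F}: 3 is taken from the C2 table), and the function names are illustrative. An itemset is closed if no proper superset has the same support, and maximal if no proper superset is frequent, so every maximal itemset is also closed.

```python
# Support counts of the frequent itemsets, read off the L1-L4 tables.
support = {
    frozenset("A"): 4, frozenset("B"): 5, frozenset("C"): 4,
    frozenset("D"): 4, frozenset("F"): 3,
    frozenset("AB"): 4, frozenset("AC"): 3, frozenset("AD"): 3,
    frozenset("AF"): 3, frozenset("BC"): 4, frozenset("BD"): 4,
    frozenset("BF"): 3, frozenset("CD"): 4,
    frozenset("ABC"): 3, frozenset("ABD"): 3, frozenset("ABF"): 3,
    frozenset("ACD"): 3, frozenset("BCD"): 4,
    frozenset("ABCD"): 3,
}


def closed_itemsets(support):
    """Frequent itemsets with no proper superset of equal support."""
    return {
        s for s, sup in support.items()
        if not any(s < t and support[t] == sup for t in support)
    }


def maximal_itemsets(support):
    """Frequent itemsets with no frequent proper superset."""
    return {s for s in support if not any(s < t for t in support)}


def show(itemsets):
    """Render itemsets as sorted strings such as 'ABF' for printing."""
    return sorted("".join(sorted(s)) for s in itemsets)


closed = closed_itemsets(support)
maximal = maximal_itemsets(support)
print("closed: ", show(closed))
print("maximal:", show(maximal))
```

Only frequent supersets need to be inspected, because a superset with the same support as a frequent itemset is itself frequent.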
14) a) Draw the FP-tree for the database below and apply the FP-growth algorithm to it.
• Min-Sup = 2
L = {(a: 8), (b: 7), (c: 6), (d: 5), (e: 3)}

Header table:
Item ID | Sup count | Node-link
a | 8 |
b | 7 |
c | 6 |
d | 5 |
e | 3 |
Item | Conditional pattern base | Conditional FP-tree | Frequent patterns generated
e | {{a, d: 1}, {a, c, d: 1}, {b, c: 1}} | <a: 2> | {a, e: 2}
d | {{a, b, c: 1}, {a, b: 1}, {a, c: 1}, {a: 1}, {b, c: 1}} | <a: 4, b: 2> | {a, d: 4}, {a, b: 2}, {a, b, d: 2}
c | {{a, b: 3}, {a: 1}, {b: 2}} | <a: 3, b: 3>, <b: 2> | {a, c: 3}, {b, c: 3}, {a, b, c: 2}
b | {{a: 5}} | <a: 5> | {a, b: 5}
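The conditional pattern bases can be reproduced with a short sketch. The transaction table is not part of this transcript, so the ten transactions below are an assumption, reconstructed from the support counts and TID-sets that appear elsewhere in these slides. The helper name is illustrative. The sketch takes the prefix of each frequency-ordered transaction, which is the same path the FP-tree would store.

```python
from collections import Counter

# ASSUMPTION: transactions reconstructed from the TID-sets in these slides;
# the original database table is not reproduced in the transcript.
transactions = {
    1: "ab", 2: "bcd", 3: "acde", 4: "ade", 5: "abc",
    6: "abcd", 7: "a", 8: "abc", 9: "abd", 10: "bce",
}
min_sup = 2

counts = Counter(item for items in transactions.values() for item in items)
# Frequent items in descending support order (ties broken alphabetically).
order = sorted((i for i in counts if counts[i] >= min_sup),
               key=lambda i: (-counts[i], i))
rank = {item: pos for pos, item in enumerate(order)}


def conditional_pattern_base(item):
    """Count the prefix paths that precede `item` in ordered transactions."""
    base = Counter()
    for items in transactions.values():
        if item not in items:
            continue
        ordered = sorted((i for i in items if i in rank), key=rank.get)
        prefix = tuple(ordered[:ordered.index(item)])
        if prefix:  # an empty prefix contributes nothing to the base
            base[prefix] += 1
    return dict(base)


print(order)
for item in reversed(order[1:]):
    print(item, conditional_pattern_base(item))
```

Under this assumed database the item order is a, b, c, d, e with supports 8, 7, 6, 5, 3, matching the list L.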
14) b) Apply the Eclat algorithm to the database from the previous question.
• Min-Sup = 2
1-itemsets in vertical data format:
Itemset | TID-set
{a} | 1, 3, 4, 5, 6, 7, 8, 9
{b} | 1, 2, 5, 6, 8, 9, 10
{c} | 2, 3, 5, 6, 8, 10
{d} | 2, 3, 4, 6, 9

2-itemsets in vertical data format:
Itemset | TID-set
{a, b} | 1, 5, 6, 8, 9
{a, c} | 3, 5, 6, 8
{a, d} | 3, 4, 6, 9
{a, e} | 3, 4
{b, c} | 2, 5, 6, 8, 10
{b, d} | 2, 6, 9
{b, e} | 10
{c, d} | 2, 3, 6
{c, e} | 3, 10
{d, e} | 3, 4

3-itemsets in vertical data format:
Itemset | TID-set
{a, b, c} | 5, 6, 8
{a, b, d} | 6, 9
{a, c, d} | 3, 6
{b, c, d} | 2, 6
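Eclat builds each (k+1)-itemset by intersecting the TID-sets of two k-itemsets that share a prefix. The sketch below is a minimal illustration under stated assumptions: the TID-sets for a, b, c, and d are the ones listed in the 1-itemset table, while the TID-set for e is not listed there and is inferred here from the 2-itemset rows that contain e. The function name `eclat` is illustrative.

```python
from itertools import combinations

# 1-itemset TID-sets from the vertical-format table.
# ASSUMPTION: e = {3, 4, 10} is inferred from the 2-itemset rows with e.
tidsets = {
    ("a",): {1, 3, 4, 5, 6, 7, 8, 9},
    ("b",): {1, 2, 5, 6, 8, 9, 10},
    ("c",): {2, 3, 5, 6, 8, 10},
    ("d",): {2, 3, 4, 6, 9},
    ("e",): {3, 4, 10},
}
min_sup = 2


def eclat(level, min_sup):
    """Return every frequent itemset as {itemset tuple: TID-set}."""
    frequent = {}
    level = {k: v for k, v in level.items() if len(v) >= min_sup}
    while level:
        frequent.update(level)
        next_level = {}
        pairs = combinations(sorted(level.items(), key=lambda kv: kv[0]), 2)
        for (x, tids_x), (y, tids_y) in pairs:
            # Join two k-itemsets that agree on their first k-1 items.
            if x[:-1] == y[:-1]:
                tids = tids_x & tids_y
                if len(tids) >= min_sup:
                    next_level[x + (y[-1],)] = tids
        level = next_level
    return frequent


frequent = eclat(tidsets, min_sup)
for itemset, tids in sorted(frequent.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print(itemset, sorted(tids))
```

The support of an itemset is simply the size of its TID-set, so no further database scan is needed after the vertical format has been built.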
Answer: the distance between each pair of the three patients Jack, Mary, and Jim:
18) Suppose we have the sample data below. Draw its dissimilarity matrix.
d(i, j) is 0 when i and j are the same and 1 when they differ. Therefore:
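The 0/1 rule can be sketched for a single nominal attribute. The sample data for this question is not reproduced in the transcript, so the attribute values below are hypothetical placeholders, and the function name is illustrative.

```python
# HYPOTHETICAL values of one nominal attribute for four objects;
# the sample table from the slide is not available in this transcript.
values = ["code-A", "code-B", "code-C", "code-A"]


def nominal_dissimilarity(values):
    """Lower-triangular matrix with d(i, j) = 0 if equal, else 1."""
    return [
        [0 if values[i] == values[j] else 1 for j in range(i + 1)]
        for i in range(len(values))
    ]


matrix = nominal_dissimilarity(values)
for row in matrix:
    print(row)
```

Only the lower triangle is stored because d(i, j) = d(j, i) and d(i, i) = 0.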
19) Suppose we have 2 variables, x = (1, 1, 0, 0) and y = (0, 1, 1, 0). Compute the similarity between x and y using the cosine similarity equation.
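As a worked check, cosine similarity is the dot product of the two vectors divided by the product of their Euclidean norms. The sketch below applies that standard definition to the x and y given in the question. Here the dot product is 1 and each norm is the square root of 2, so the similarity is about 0.5.

```python
import math


def cosine_similarity(x, y):
    """Dot product of x and y divided by the product of their norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


x = (1, 1, 0, 0)
y = (0, 1, 1, 0)
similarity = cosine_similarity(x, y)
print(round(similarity, 4))
```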
21) Using the training data set below, compute the probability of playing tennis under the following conditions:
<Outlk=sun, Temp=cool, Humid=high, Wind=strong>?
Answer:
P(yes) = 9/14, P(no) = 5/14
P(Wind=strong|yes) = 3/9
P(Wind=strong|no) = 3/5
…
P(y) P(sun|y) P(cool|y) P(high|y) P(strong|y) = .005
P(n) P(sun|n) P(cool|n) P(high|n) P(strong|n) = .021
• Therefore this new instance is classified as "no".
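The two products can be verified with exact fractions. The slide gives the priors and the Wind likelihoods; the remaining conditional probabilities below are an assumption, taken from the standard 14-row PlayTennis data set, which is not reproduced in this transcript. The names `priors`, `likelihoods`, and `score` are illustrative.

```python
from fractions import Fraction as F
from math import prod

priors = {"yes": F(9, 14), "no": F(5, 14)}

# Wind likelihoods are from the slide. ASSUMPTION: the other values come
# from the standard PlayTennis training set, not from this transcript.
likelihoods = {
    "yes": {"Outlk=sun": F(2, 9), "Temp=cool": F(3, 9),
            "Humid=high": F(3, 9), "Wind=strong": F(3, 9)},
    "no": {"Outlk=sun": F(3, 5), "Temp=cool": F(1, 5),
           "Humid=high": F(4, 5), "Wind=strong": F(3, 5)},
}

instance = ["Outlk=sun", "Temp=cool", "Humid=high", "Wind=strong"]


def score(label):
    """Naive Bayes score: prior times the product of the likelihoods."""
    return priors[label] * prod(likelihoods[label][f] for f in instance)


scores = {label: score(label) for label in priors}
prediction = max(scores, key=scores.get)
for label, value in scores.items():
    print(label, round(float(value), 3))
print("predicted:", prediction)
```

The scores are unnormalized, which is enough for classification because both share the same denominator P(x).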
Officer Drew IS a female!
c) FOIL's information gain
R0: ∅ → + (p0 = 100, n0 = 400)
R1: p1 = 4, n1 = 1
R2: p2 = 30, n2 = 10
R3: p3 = 100, n3 = 90
R3 is the best rule and R1 is not a good rule.
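The ranking can be checked numerically. The slide's own formula is not reproduced in this transcript, so the sketch below assumes the usual textbook definition, FOIL_Gain = p1 × (log2(p1 / (p1 + n1)) − log2(p0 / (p0 + n0))), where p0 and n0 are the counts covered by R0. The function name is illustrative.

```python
from math import log2


def foil_gain(p0, n0, p1, n1):
    """FOIL gain of a rule covering p1/n1 tuples relative to p0/n0."""
    return p1 * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)))


p0, n0 = 100, 400
rules = {"R1": (4, 1), "R2": (30, 10), "R3": (100, 90)}
gains = {name: foil_gain(p0, n0, p, n) for name, (p, n) in rules.items()}
for name, gain in gains.items():
    print(name, round(gain, 2))
```

Under that definition the gains come out near 8 for R1, 57.2 for R2, and 139.6 for R3, which agrees with the conclusion on the slide.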
d) The likelihood ratio statistic
Expected frequencies of positive and negative tuples for R1:
pos: 5 × 100/500 = 1
neg: 5 × 400/500 = 4
The likelihood ratio for R1 is: 2 × [4 × log2(4/1) + 1 × log2(1/4)] = 12
Expected frequencies of positive and negative tuples for R2:
pos: 40 × 100/500 = 8
neg: 40 × 400/500 = 32
The likelihood ratio for R2 is: 2 × [30 × log2(30/8) + 10 × log2(10/32)] = 80.85

Expected frequencies of positive and negative tuples for R3:
pos: 190 × 100/500 = 38
neg: 190 × 400/500 = 152
The likelihood ratio for R3 is: 2 × [100 × log2(100/38) + 90 × log2(90/152)] = 143.09

R3 is the best rule and R1 is not a good rule.
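The three statistics can be recomputed with a few lines. This sketch follows the arithmetic shown on the slides: the expected frequency of each class is the number of tuples the rule covers times the class proportion in the 500 training tuples. The function name and parameter names are illustrative.

```python
from math import log2


def likelihood_ratio(pos, neg, total_pos=100, total_neg=400):
    """2 * sum over classes of observed * log2(observed / expected)."""
    covered = pos + neg
    total = total_pos + total_neg
    expected_pos = covered * total_pos / total
    expected_neg = covered * total_neg / total
    return 2 * (pos * log2(pos / expected_pos)
                + neg * log2(neg / expected_neg))


rules = {"R1": (4, 1), "R2": (30, 10), "R3": (100, 90)}
ratios = {name: likelihood_ratio(p, n) for name, (p, n) in rules.items()}
for name, value in ratios.items():
    print(name, round(value, 2))
```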
26) By what criteria can classification and prediction methods be evaluated?
1) Predictive accuracy
2) Speed
3) Robustness
4) Interpretability
5) Scalability
6) Goodness of rules
X1 = Acid Durability (seconds) | X2 = Strength (kg/square meter) | Square distance to query instance (3, 7)
7 | 7 | (7 - 3)^2 + (7 - 7)^2 = 16
7 | 4 | (7 - 3)^2 + (4 - 7)^2 = 25
3 | 4 | (3 - 3)^2 + (4 - 7)^2 = 9
1 | 4 | (1 - 3)^2 + (4 - 7)^2 = 13
3) Sort the distances and determine the nearest neighbors based on the K smallest distances.

X1 = Acid Durability (seconds) | X2 = Strength (kg/square meter) | Square distance to query instance (3, 7) | Rank (minimum distance)
7 | 7 | 16 | 3
7 | 4 | 25 | 4
3 | 4 | 9 | 1
1 | 4 | 13 | 2
4) Gather the category Y of the nearest neighbors. Note that in the second row the category Y is not included, because the rank of that row is greater than K (= 3).

X1 = Acid Durability (seconds) | X2 = Strength (kg/square meter) | Square distance to query instance (3, 7) | Rank (minimum distance) | Included in 3-nearest neighbors? | Y = category of nearest neighbor
7 | 7 | 16 | 3 | Yes | Bad
7 | 4 | 25 | 4 | No | -
3 | 4 | 9 | 1 | Yes | Good
1 | 4 | 13 | 2 | Yes | Good
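Steps 3 and 4 can be sketched in a few lines: compute the squared distance to the query, keep the K = 3 closest rows, and take a majority vote over their categories. The category of the row (7, 4) is not shown on the slide, so it is marked "?" here; it falls outside the three nearest neighbors and does not affect the vote. Function names are illustrative.

```python
from collections import Counter

# Training rows as ((X1, X2), category). The category of (7, 4) is not
# shown on the slide, so "?" is a placeholder; that row is ranked 4th.
training = [((7, 7), "Bad"), ((7, 4), "?"), ((3, 4), "Good"), ((1, 4), "Good")]


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def knn_classify(training, query, k):
    """Majority category among the k rows closest to the query."""
    ranked = sorted(training, key=lambda row: squared_distance(row[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


query = (3, 7)
distances = [squared_distance(point, query) for point, _ in training]
print(distances)
print(knn_classify(training, query, k=3))
```

With two "Good" votes against one "Bad" vote, the majority category for the query instance (3, 7) is "Good".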
33) Name the hierarchical clustering methods and briefly describe each one.
Thank you