Machine Learning with WEKA

Eibe Frank
Department of Computer Science, University of Waikato, New Zealand

• WEKA: A Machine Learning Toolkit
• The Explorer
  • Classification and Regression
  • Clustering
  • Association Rules
  • Attribute Selection
  • Data Visualization
• The Experimenter
• The Knowledge Flow GUI
• Conclusions

WEKA: the bird
Copyright: Martin Kramer (mkramer@wxs.nl)
6/14/2021 University of Waikato

WEKA: the software
• Machine learning/data mining software written in Java (distributed under the GNU General Public License)
• Complements “Data Mining” by Witten & Frank
• Main features:
  • Comprehensive set of data pre-processing tools, learning algorithms and evaluation methods
  • Graphical user interfaces (incl. data visualization)
  • Environment for comparing learning algorithms

WEKA only deals with “flat” files

@relation heart-disease-simplified

@attribute age numeric
@attribute sex {female, male}
@attribute chest_pain_type {typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina {no, yes}
@attribute class {present, not_present}

@data
63, male, typ_angina, 233, no, not_present
67, male, asympt, 286, yes, present
67, male, asympt, 229, yes, present
38, female, non_anginal, ?, no, not_present
...
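To make the flat-file layout concrete, here is a minimal Python sketch of a reader for the subset of ARFF shown on this slide (numeric and nominal attributes, "?" for missing values). It is not WEKA's own loader; the function names `parse_arff` and `convert` are hypothetical, and the `no` values in the first and last data rows are an assumption made so that every row has one value per declared attribute.

```python
ARFF_TEXT = """\
@relation heart-disease-simplified
@attribute age numeric
@attribute sex {female, male}
@attribute chest_pain_type {typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina {no, yes}
@attribute class {present, not_present}
@data
63, male, typ_angina, 233, no, not_present
67, male, asympt, 286, yes, present
67, male, asympt, 229, yes, present
38, female, non_anginal, ?, no, not_present
"""


def convert(value, kind):
    """Turn one raw field into a float, a checked nominal label, or None."""
    if value == "?":  # "?" marks a missing value
        return None
    if kind == "numeric":
        return float(value)
    if isinstance(kind, list) and value not in kind:
        raise ValueError(f"{value!r} is not one of {kind}")
    return value


def parse_arff(text):
    """Parse a small ARFF subset: @relation, @attribute, @data, '%' comments."""
    relation = None
    attributes = []  # (name, "numeric") or (name, [allowed nominal values])
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            if len(values) != len(attributes):
                raise ValueError(f"expected {len(attributes)} values: {line!r}")
            rows.append([convert(v, kind) for v, (_, kind) in zip(values, attributes)])
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):  # nominal: list of allowed labels
                kind = [v.strip() for v in spec.strip("{}").split(",")]
            else:
                kind = spec.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(ARFF_TEXT)
```

Each instance becomes one fixed-length row, which is what "flat" means here: one table, one row per instance, one column per attribute.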

Explorer: pre-processing the data
• Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary
• Data can also be read from a URL or from an SQL database (using JDBC)
• Pre-processing tools in WEKA are called “filters”
• WEKA contains filters for:
  • Discretization, normalization, resampling, attribute selection, transforming and combining attributes, …
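As an illustration of what two of these filter types do to a numeric attribute, the following Python sketch normalizes values to [0, 1] and discretizes them into equal-width bins. It is a stand-alone approximation, not WEKA's filter code; the function names are hypothetical, and the input is the three cholesterol values from the ARFF slide.

```python
def normalize(values):
    """Rescale numeric values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant attribute: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize_equal_width(values, bins):
    """Replace each value by the index (0..bins-1) of its equal-width interval."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / bins
    # The maximum falls exactly on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


cholesterol = [233, 286, 229]
normalized = normalize(cholesterol)
binned = discretize_equal_width(cholesterol, bins=3)
```

Both functions take their range from the data they are given, so in practice the range learned on training data would also have to be applied to test data.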

Explorer: building “classifiers”
• Classifiers in WEKA are models for predicting nominal or numeric quantities
• Implemented learning schemes include:
  • Decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …
• “Meta”-classifiers include:
  • Bagging, boosting, stacking, error-correcting output codes, locally weighted learning, …
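To show what an instance-based classifier from the list above does, here is a minimal k-nearest-neighbour sketch in Python. It is not WEKA's implementation; the function name `knn_predict`, the toy training points and the labels "a" and "b" are all made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a nominal class by majority vote among the k closest training instances.

    train is a list of (feature_vector, label) pairs with numeric features.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-class toy data: two well-separated groups of points.
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]
prediction_near_a = knn_predict(train, (1.1, 1.0))
prediction_near_b = knn_predict(train, (5.0, 4.8))
```

The "model" here is simply the stored training set, and all the work happens at prediction time. A meta-classifier such as bagging would wrap a base learner like this one and combine the votes of several copies trained on resampled data.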

Explorer: clustering data
• WEKA contains “clusterers” for finding groups of similar instances in a dataset
• Implemented schemes are:
  • k-Means, EM, Cobweb, X-means, FarthestFirst
• Clusters can be visualized and compared to “true” clusters (if given)
• Evaluation based on log-likelihood if clustering scheme produces a probability distribution
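The k-Means scheme named above can be sketched in a few lines of Python. This is a simplified illustration, not WEKA's clusterer: it assumes numeric features, uses the first k points as initial centroids (a deliberately naive choice that keeps the run deterministic), and the toy points are made up.

```python
import math


def k_means(points, k, iterations=100):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its points, until assignments settle."""
    centroids = [points[i] for i in range(k)]  # naive initialisation (assumption)
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:  # converged: nothing changed
            break
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignment


# Hypothetical toy data with two obvious groups.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (0.9, 1.1), (5.2, 4.9), (4.8, 5.1)]
centroids, assignment = k_means(points, k=2)
```

The returned assignment gives one cluster index per instance, which is the information that would be compared against "true" clusters when class labels are available.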

Explorer: finding associations
• WEKA contains an implementation of the Apriori algorithm for learning association rules
  • Works only with discrete data
• Can identify statistical dependencies between groups of attributes:
  • milk, butter ⇒ bread, eggs (with confidence 0.9 and support 2000)
• Apriori can compute all rules that have a given minimum support and exceed a given confidence
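A compact Python sketch of the Apriori idea follows: find all itemsets that meet a minimum support count level by level, then derive rules whose confidence reaches a threshold. It is an illustration under stated assumptions, not WEKA's implementation: the function names and the five toy baskets are made up, support is an absolute count as on the slide, and the confidence test uses "at least" rather than "exceeds".

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset found in at least
    min_support transactions, built level by level (1-itemsets, 2-itemsets, ...)."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent = {}
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates and keep only those
        # whose k-subsets are all frequent (the Apriori pruning step).
        keys = list(level)
        size = len(keys[0]) + 1 if keys else 0
        candidates = {a | b for a in keys for b in keys if len(a | b) == size}
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) for each rule with
    confidence = support(itemset) / support(antecedent) >= min_confidence."""
    out = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    out.append((antecedent, itemset - antecedent, support, confidence))
    return out


# Hypothetical market baskets, for illustration only.
baskets = [
    {"milk", "butter", "bread", "eggs"},
    {"milk", "butter", "bread", "eggs"},
    {"milk", "butter", "bread"},
    {"milk", "eggs"},
    {"bread"},
]
frequent = frequent_itemsets(baskets, min_support=2)
found = {(a, c): conf for a, c, _, conf in association_rules(frequent, 0.6)}
```

In this toy data the rule milk, butter ⇒ bread, eggs has support 2 (two baskets contain all four items) and confidence 2/3, because three baskets contain milk and butter and two of those also contain bread and eggs.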

Conclusion: try it yourself!
• WEKA is available at http://www.cs.waikato.ac.nz/ml/weka
  • Also has a list of projects based on WEKA