Weka: Experimenter and Knowledge Flow interfaces
Neil Mac Parthaláin
email: ncm@aber.ac.uk

What is different from Explorer?
• Experimenter is used for ‘batches’ of experiments
• Can only be used for Classification and Regression problems*
• Results are generated in a different way
  – Explorer: % correct = (sum of correctly classified instances over all test folds) / (total no. of instances in dataset)
  – Experimenter: % correct = average of the per-fold % correct over all folds
*Possible to perform attribute selection, but not covered here
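The two ways of computing % correct can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not WEKA code; the fold counts and function names are made up for the example. When every test fold has the same size the two figures coincide, and they drift apart when fold sizes differ.

```python
# Hypothetical per-fold results: (correctly classified, instances in test fold).
# The last fold is deliberately smaller so the two figures differ.
folds = [(9, 10), (8, 10), (10, 10), (3, 5)]


def pooled_percent_correct(folds):
    """Explorer-style: pool all test folds, then divide once."""
    correct = sum(c for c, _ in folds)
    total = sum(n for _, n in folds)
    return 100.0 * correct / total


def mean_fold_percent_correct(folds):
    """Experimenter-style: compute % correct per fold, then average."""
    per_fold = [100.0 * c / n for c, n in folds]
    return sum(per_fold) / len(per_fold)


print(round(pooled_percent_correct(folds), 2))   # 30 of 35 correct -> 85.71
print(mean_fold_percent_correct(folds))          # (90 + 80 + 100 + 60) / 4 -> 82.5
```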

Experimenter - Overview
• Compare the performance of different learning schemes easily
• Allows better analysis than Explorer
• Results: write-to-file or database
• Evaluation: cross-validation, learning curve, or hold-out
• Ability to iterate over different parameter settings
• Statistical significance tests “for free”!

Experimenter - Overview
• The interface essentially has three ‘panes’:
  – Setup: Configure experiments
  – Run: Generate results files
  – Analyse: Analyse the results of the experiments

Always use ‘corrected’ T-Tester! Use this to decide how you compare results
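Why ‘corrected’? Results from cross-validation folds are not independent (the training sets overlap), so a plain paired t-test tends to be over-confident. As far as we are aware, the corrected tester is based on the corrected resampled t-test of Nadeau and Bengio; the sketch below shows that statistic next to the standard one. It is an illustration under that assumption, not WEKA's implementation, and the function name and sample differences are hypothetical.

```python
import math


def paired_t_statistics(diffs, n_train, n_test):
    """Return (standard_t, corrected_t) for per-fold score differences.

    diffs   : per-fold differences in % correct between two schemes
    n_train : number of training instances per fold
    n_test  : number of test instances per fold
    Assumes the differences are not all identical (non-zero variance).
    """
    k = len(diffs)
    mean = sum(diffs) / k
    var = sum((d - mean) ** 2 for d in diffs) / (k - 1)  # sample variance
    standard_t = mean / math.sqrt(var / k)
    # The correction inflates the variance term by n_test / n_train.
    corrected_t = mean / math.sqrt((1.0 / k + n_test / n_train) * var)
    return standard_t, corrected_t


# Hypothetical differences from one run of 10-fold CV on 100 instances.
diffs = [2.0, 1.0, 3.0, 2.5, 1.5, 2.0, 3.5, 0.5, 2.0, 2.0]
standard_t, corrected_t = paired_t_statistics(diffs, n_train=90, n_test=10)
print(standard_t, corrected_t)
```

The corrected statistic is always smaller in magnitude than the standard one, so fewer differences are reported as significant.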

Lots of different ways to compare results!

Knowledge Flow Interface
• Intuitive, visual “drag-and-drop” user interface for WEKA
• Java-Beans-based
• Can do everything that Explorer does (plus a bit more), but not as comprehensively as Experimenter
• Data sources, classifiers, etc. are beans and can be connected graphically
• Data “flows” through modules: e.g., “data source” -> “filter” -> “classifier” -> “evaluator”
• KF layouts can be saved and re-used later

Knowledge Flow: An Example
• What we want to do:
  – Take a dataset
  – Do some attribute selection
  – Perform some classification on the reduced data using 10-fold CV
  – Examine the subsets selected for each CV fold
  – Visualise the results in text format and as ROC curves

Getting Started

A few ‘hidden’ steps…

Add the Classifier learner

…and the performance evaluator

TextViewers can be used for visualisation of results as well as examining the processes – more later…

‘Right-clicking’ on each ‘block’ allows you to configure it as well as ‘wire it up’ to others…

Connect: dataSet to CrossValidationFoldMaker

Continue to ‘wire-up’ each ‘block’…

…and so on

To see the results output: ‘dump’ the text to a TextViewer…

When you have finished ‘wiring-up’, it’s time to configure each of the components/blocks…

Set the path/filename(s) of the datasets you would like to load…

Once all is configured, you are ready to start…

Once all experiments have finished, we can visualise the results…

Output is similar to that of the console window of Explorer

But there are also ways to save these results if we want to keep them for later…

TextViewer components are also useful for ‘looking-inside’ processes…

For example: attribute selection…

It is also possible to visualise data in a similar way to Explorer… e.g. ROC/threshold curves

Some problems you may encounter…
Often caused by incorrectly defined .arff files…
- Too many attributes defined in the header
- Incorrectly labelled @attribute types
Be aware also that WEKA labels the dataset by whatever name you put in the @relation field!
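A quick sanity check for the first problem can be sketched in Python. This is a minimal illustration, not part of WEKA: the function name and the sample file are hypothetical, and it assumes a simple dense .arff file with no quoted values containing commas and no sparse-format rows. It counts the @attribute declarations and reports any data row whose number of values does not match.

```python
def check_arff(text):
    """Return (relation, n_attributes, bad_rows) for a simple dense ARFF string.

    bad_rows lists the 1-based line numbers of data rows whose number of
    comma-separated values differs from the number of @attribute lines.
    """
    relation = None
    n_attributes = 0
    bad_rows = []
    in_data = False
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        lower = line.lower()
        if in_data:
            if len(line.split(",")) != n_attributes:
                bad_rows.append(lineno)
        elif lower.startswith("@relation"):
            parts = line.split(None, 1)
            relation = parts[1] if len(parts) > 1 else ""
        elif lower.startswith("@attribute"):
            n_attributes += 1
        elif lower.startswith("@data"):
            in_data = True
    return relation, n_attributes, bad_rows


# Hypothetical file: four attributes declared, but the last row has three values.
sample = """@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute humidity numeric
@attribute play {yes, no}
@data
sunny,85,85,no
overcast,83,yes
"""

print(check_arff(sample))  # -> ('weather', 4, [8])
```

The first element of the result is the name WEKA would use to label the dataset.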

Some problems you may encounter…
You may experience an error related to Java heap size if:
- The heap size available to the JVM is too small
- You load a large dataset
- You attempt to run a large number of experiments
Can be fixed by starting the JVM with a larger maximum heap size: java -Xmx2048m ...

Write your own algorithms…
• WEKA is Open Source!
• Much of the work is already done for you
• Take advantage of the WEKA framework
• Writing code and contributing to the WEKA project is now easier than before; see: http://weka.wikispaces.com/How+can+I+contribute+to+WEKA%3F

Conclusion
• Experimenter and Knowledge Flow:
  – Offer useful and flexible ways to perform a range of batches of experiments
  – Beware of the way in which results are generated!
  – KF is particularly useful for visualisation
  – Experimenter is more suited to learning
• Just a snapshot of the capabilities of WEKA!
• Want more info? Email me (or Richard)
• These slides are available at: http://users.aber.ac.uk/ncm/weka_slides