Introduction to Weka
CS 4705 – Natural Language Processing
Thursday, September 28
What is Weka?
● Java-based machine learning tool
● 3 modes of operation
  – GUI
  – Command line
  – API (not discussed here)
● To run:
  – java -Xmx1024M -jar ~cs4705/bin/weka.jar &
Weka Homepage
● http://www.cs.waikato.ac.nz/ml/weka/
.arff file format
● http://www.cs.waikato.ac.nz/~ml/weka/arff.html
@relation name
@attribute attrName {numeric, string, <nominal>, date}
...
@data
a, b, c, d, e
● <nominal> := {class1, class2, ..., classN}
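A minimal sketch of what a complete file in this format might look like. The relation name, attribute names, and data rows are made up for illustration and are not taken from the course data. The snippet writes the file and then counts the declared attributes and the data rows as a quick sanity check.

```shell
# Write a tiny, made-up ARFF file (hypothetical relation and attribute names).
workdir=$(mktemp -d)
cat > "$workdir/toy.arff" <<'EOF'
@relation toy_weather

@attribute temperature numeric
@attribute outlook {sunny,rainy}
@attribute play {yes,no}

@data
85,sunny,no
70,rainy,yes
68,sunny,yes
EOF

# Count declared attributes (header) and non-empty rows after @data.
n_attrs=$(grep -c '^@attribute' "$workdir/toy.arff")
n_rows=$(awk 'in_data && NF { n++ } /^@data/ { in_data = 1 } END { print n + 0 }' "$workdir/toy.arff")
echo "attributes: $n_attrs, instances: $n_rows"
```

Here the last attribute, play, is the nominal one intended as the class to predict; each data row lists one value per attribute, in declaration order.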
Example ARFF Files
● http://sourceforge.net/projects/weka
● iris.arff
● cmc.arff
To Classify with Weka GUI
1. Run Weka GUI
2. Click 'Explorer'
3. 'Open file...'
4. Select 'Classify' tab
5. 'Choose' a classifier
6. Confirm options
7. Click 'Start'
8. Wait...
9. Right-click on Result list entry
   a. 'Save result buffer'
   b. 'Save model'
Classify
● Some classifiers to start with:
  – NaiveBayes
  – JRip
  – J48
  – SMO
● Find references by selecting a classifier
● Use cross-validation!
Analyzing Results
● Important tools for Homework 2
  – Accuracy
    ● “Correctly classified instances”
  – Confusion matrix
  – Save model
  – Visualization
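Accuracy and the confusion matrix are closely related: with rows as the actual class and columns as the predicted class, the correctly classified instances sit on the diagonal. The sketch below recomputes accuracy from a matrix under that assumption. The counts are hypothetical, not results from any real run.

```shell
# Hypothetical 3-class confusion matrix: rows = actual class, columns = predicted class.
matrix='50 0 0
0 47 3
0 2 48'

# Accuracy = sum of the diagonal / sum of all cells, reported as a percentage.
accuracy=$(printf '%s\n' "$matrix" | LC_ALL=C awk '
  { for (j = 1; j <= NF; j++) { total += $j; if (j == NR) correct += $j } }
  END { printf "%.2f", 100 * correct / total }')
echo "Correctly classified: $accuracy%"
```

Off-diagonal cells show which classes get confused with each other, which the single accuracy number hides.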
Running Weka from the Command Line
● Running an N-fold cross-validation experiment
  – java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -t trainingdata.arff -x N
● Using a predefined test set
  – java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -t trainingdata.arff -T testingdata.arff
● Saving the model
  – java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -t trainingdata.arff -d output.model
● Classifying a test set
  – java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -l input.model -T testingdata.arff
● Analyzing results
  – Get predictions from test data
    ● java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -l input.model -T testingdata.arff -p range
  – Then DIY with scripts
    ● awk and sed will be your friends
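A minimal sketch of the "DIY with scripts" step, using awk. It assumes each prediction line has been reduced to three whitespace-separated fields (instance number, actual label, predicted label). That layout is an assumption for illustration: the real -p output is laid out differently and varies between Weka versions, so the field numbers would need adjusting. The sample lines are made up.

```shell
# ASSUMPTION: each line is "instance actual predicted"; real Weka -p output
# uses a different, version-dependent layout, so adjust the field numbers.
predictions='1 yes yes
2 no yes
3 no no
4 yes no
5 yes yes'

# Print the misclassified instances, then a one-line summary.
report=$(printf '%s\n' "$predictions" | awk '
  $2 != $3 { print "instance " $1 ": actual=" $2 " predicted=" $3; wrong++ }
  END { printf "%d of %d misclassified\n", wrong, NR }')
echo "$report"
```

In practice the predictions variable would be replaced by the saved output of the command above, after sed or awk has trimmed it down to the fields of interest.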
● Getting predictions from cross-validation
  – “Output Predictions” doesn't cut it.
  – export CLASSPATH=~cs4705/bin/:~cs4705/bin/weka.jar
  – java callClassifier weka.classifiers.bayes.NaiveBayes -t trainingdata.arff
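Per-instance predictions are possible under cross-validation because every instance lands in the test portion of exactly one fold. The sketch below illustrates that idea with a simplified round-robin split of 10 instances into 3 folds. It is an illustration only: it does no shuffling or stratification, which real tools typically add, and it is not how callClassifier is implemented.

```shell
# Simplified sketch: deal instance numbers round-robin into n folds.
# ASSUMPTION: no shuffling or stratification; the counts 10 and 3 are arbitrary.
assignment=$(awk -v n=3 -v total=10 'BEGIN {
  for (i = 1; i <= total; i++) {
    fold = (i - 1) % n + 1          # fold in which instance i is held out for testing
    size[fold]++
    print "instance " i " is tested in fold " fold
  }
  for (f = 1; f <= n; f++) print "fold " f " tests " size[f] " instances"
}')
echo "$assignment"
```

Each fold's model is trained on the other folds, so collecting the held-out predictions from all folds yields one prediction per training instance.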