Introduction to NLTK
Text Analytics
Giuseppe Attardi
Università di Pisa
Installing NLTK
• Download and install
  ▪ http://nltk.org/install.html
• Download NLTK data
  >>> import nltk
  >>> nltk.download()
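After installation, it can be useful to check whether a given data package is already present before downloading it. The sketch below is one possible way to do that; `ensure_resource` is a hypothetical helper name, not part of NLTK, and the resource path and package id in the commented call are assumptions (the package name for the ‘punkt’ data may differ between NLTK releases).

```python
import nltk


def ensure_resource(resource_path, package_id, download=True):
    """Return True if the NLTK resource is available locally.

    Hypothetical helper: looks the resource up with nltk.data.find and,
    if it is missing and download=True, tries nltk.download.
    """
    try:
        nltk.data.find(resource_path)
        return True
    except LookupError:
        if not download:
            return False
        # nltk.download returns a truthy value on success.
        return bool(nltk.download(package_id, quiet=True))


print("NLTK version:", nltk.__version__)

# Example call (needs network access, so it is left commented out;
# the names below are assumptions and may vary with the NLTK version):
# ensure_resource("tokenizers/punkt", "punkt")
```

Calling `nltk.download()` with no arguments, as on the slide, opens the interactive downloader instead.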
Jupyter Notebook
• Register with your UniPi credentials to activate your free G Suite account at:
  ▪ this page.
• Start your Jupyter Notebook here:
  ▪ https://attardi-4.di.unipi.it:8000/
NLTK
• Suite of classes for several NLP tasks
• Parsing, POS tagging, classifiers…
• Several text processing utilities, corpora
  ▪ Brown, Penn Treebank corpus…
  ▪ Your data was divided into sentences using ‘punkt’
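As a small illustration of the parsing classes mentioned above, here is a minimal sketch that parses one sentence with a toy context-free grammar. The grammar and the sentence are invented for this example; no corpus data needs to be downloaded.

```python
import nltk

# Toy grammar, made up for illustration only.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the'
N -> 'dog' | 'cat'
V -> 'chased'
""")

parser = nltk.ChartParser(grammar)
tokens = "the dog chased the cat".split()

# parse() yields every tree licensed by the grammar for these tokens.
trees = list(parser.parse(tokens))
for tree in trees:
    print(tree)
```

With this grammar the sentence has exactly one parse, rooted in `S`.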
NLTK
• Text material
  ▪ Raw text
  ▪ Annotated text
• Tools
  ▪ Part-of-speech taggers
  ▪ Semantic analysis
• Resources
  ▪ WordNet, Treebanks
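To make the link between annotated text and part-of-speech taggers concrete, the sketch below reads a few hand-written `word/TAG` strings and trains a unigram tagger on them. The two training sentences are invented for illustration; a real tagger would be trained on an annotated corpus such as a treebank.

```python
from nltk.tag import DefaultTagger, UnigramTagger, str2tuple

# Tiny hand-annotated sample (an assumption for this example).
annotated_sents = [
    "the/DT dog/NN barks/VBZ ./.",
    "a/DT cat/NN sleeps/VBZ ./.",
]

# Turn each "word/TAG" string into a (word, tag) pair.
train_sents = [[str2tuple(tok) for tok in sent.split()] for sent in annotated_sents]

# Unknown words fall back to the default tag NN.
tagger = UnigramTagger(train_sents, backoff=DefaultTagger("NN"))

tagged = tagger.tag("the bird sleeps".split())
print(tagged)
```

Here `bird` never occurs in the training data, so it receives the backoff tag.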
Linguistic Tasks
• Part of Speech Tagging
• Parsing
• WordNet
• Named Entity Recognition
• Information Retrieval
• Sentiment Analysis
• Document Clustering
• Topic Segmentation
• Classification
• Authoring
• Machine Translation
• Summarization
• Information Extraction
• Spoken Dialog Systems
• Natural Language Generation
• Word Sense Disambiguation
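Several of these tasks, such as classification and sentiment analysis, can be approached with NLTK's classifiers. The following is only a toy sketch: the four training texts, the labels, and the `features` function are all invented for illustration, and a usable model would need far more data and richer features.

```python
import nltk


def features(text):
    """Hypothetical feature extractor: two boolean word-presence features."""
    words = set(text.lower().split())
    return {"has_good": "good" in words, "has_bad": "bad" in words}


# Invented training examples.
train = [
    ("a good film", "pos"),
    ("good acting", "pos"),
    ("a bad film", "neg"),
    ("bad plot", "neg"),
]
train_set = [(features(text), label) for text, label in train]

classifier = nltk.NaiveBayesClassifier.train(train_set)
prediction = classifier.classify(features("really good story"))
print(prediction)
```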
‘import nltk’
• You need to import the necessary modules before you can create objects and call their methods
  ▪ import is roughly like include: it makes objects from pre-built packages available
• FreqDist and ConditionalFreqDist are in nltk.probability
• PlaintextCorpusReader is in nltk.corpus
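The imports above can be tried out as in the sketch below, which builds a one-file corpus in a temporary directory and counts its words. The file name and its text are made up for the example; `words()` uses the reader's default regular-expression word tokenizer, so no extra data download should be needed.

```python
import os
import tempfile

from nltk.corpus import PlaintextCorpusReader
from nltk.probability import ConditionalFreqDist, FreqDist

with tempfile.TemporaryDirectory() as root:
    # Write a tiny sample file (invented content).
    with open(os.path.join(root, "sample.txt"), "w", encoding="utf-8") as f:
        f.write("the cat sat on the mat . the dog sat too .")

    # Read every .txt file under root as a plain-text corpus.
    reader = PlaintextCorpusReader(root, r".*\.txt")
    words = list(reader.words())

# Frequency of each word.
fdist = FreqDist(words)
print(fdist.most_common(3))

# Frequencies conditioned on word length: condition = len(word).
cfd = ConditionalFreqDist((len(w), w) for w in words)
print(sorted(cfd.conditions()))
```

`FreqDist` behaves like a counter over samples, while `ConditionalFreqDist` keeps one `FreqDist` per condition.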
Basic NLTK usage
• Load the notebook ‘Intro to NLTK’ using:
  ▪ File > Open > Text Analytics > Intro to NLTK
• Explore the examples by advancing through them with the ► button
Exercise 1.
• Run examples from Chapter 1 of the NLTK book:
  ▪ http://nltk.googlecode.com/svn/trunk/doc/book/ch01.html
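As a warm-up in the spirit of that chapter, the sketch below wraps a token list in `nltk.Text` and computes a simple lexical diversity score. The sentence is an arbitrary stand-in for the book's texts, which require downloaded data.

```python
import nltk

# Arbitrary sample tokens (the book's own texts need nltk.download()).
tokens = "to be or not to be that is the question".split()
text = nltk.Text(tokens)


def lexical_diversity(text):
    """Ratio of distinct tokens to total tokens."""
    return len(set(text)) / len(text)


print(len(text), text.count("be"), lexical_diversity(text))
```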
Exercise 2.
• Run examples from Chapter 3 of the NLTK book:
  ▪ http://nltk.googlecode.com/svn/trunk/doc/book/ch03.html
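That chapter deals with processing raw text. A minimal sketch of two typical steps, tokenization and stemming, is given below; the input string is invented, and the tokenizer and stemmer were chosen here because they work without any downloaded data.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

raw = "The cats were running, and the dogs chased them."

# Split into alphabetic and punctuation tokens with a regular expression.
tokens = wordpunct_tokenize(raw)

# Keep alphabetic tokens only, lowercased, then reduce each to its stem.
stemmer = PorterStemmer()
words = [t.lower() for t in tokens if t.isalpha()]
stems = [stemmer.stem(w) for w in words]

print(tokens)
print(stems)
```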