Overview LING 5200 Computational Corpus Linguistics Martha Palmer
























What’s a corpus?
- McEnery & Wilson:
  - (i) (loosely) any body of text
  - (ii) (most commonly) a body of machine-readable text
  - (iii) (more strictly) a finite collection of machine-readable text, sampled to be maximally representative of a language or variety
LING 5200, 2006. Based on Kevin Cohen’s LING 5200.
What’s corpus linguistics?
- “the study of language based on examples of ‘real life’ language use” (McEnery & Wilson)
  - A methodology, not a branch of linguistics
- Biber et al.:
  - Uses computers
  - “Natural” texts
  - Large & principled collection
  - Both quantitative and qualitative
What was Chomsky’s complaint?
- Linguistics should model competence, not performance. What are the underlying rules that allow us to generate language?
- Context: structuralists believed in collecting linguistic data about a language without taking meaning and communication into consideration. This mirrors the debate between the rationalists and the empiricists. But does Chomsky account for meaning? (see Searle)
Which branches of linguistics can make use of corpus linguistics?
- Phonetics
- Phonology
- Morphology
- Syntax
- Semantics
- Pragmatics
- Psycholinguistics
- Computational linguistics
- Descriptive linguistics
- Historical linguistics
- Sociolinguistics
Corpus linguistics in context
[Diagram relating three fields (Natural Language Processing, Corpus Linguistics, Computational Linguistics) to three labels (data, applications, models)]
What’s LING 5200 Corpus Linguistics?
- Tools
- Techniques
Overview
- Quick intro to Unix
- A little corpus design
- Quick tour of corpora and annotation
- Tools for working with corpora
- Programming in Python
- Some software engineering
Why Python?
- It works
- Many advantages
- It’s a bona fide programming language
- You’ll need it for CSCI 5832
Administrative things
- Textbooks: Unix, Python
- Office hours: Mon 5-6, Tues 1-2
- verbs.colorado.edu/mpalmer/ling5200
- Prerequisites: none
- Grades: homeworks/project
- Accounts on babel
Logging on for the first time
- First thing to do: change your password.
- Type passwd
- Give it your current password, then your new password.
- Repeat the new one (to catch typos).
Connecting with another computer
- Type ssh -l your_name babel.colorado.edu
- You are prompted to log in.
Logging on for the first time, again
- First thing to do: change your password.
- Type passwd
- Give it your current password, then your new password.
- Repeat the new one. (Why?)
Where am I?
- Type pwd
- You see something like this: /home/mpalmer
What's that mean?
Important directories
[Diagram: directory tree rooted at /, containing the directories bin, home, etc, usr, mpalmer, local, ling5200, bin, and RCS. Two paths are highlighted: /home/mpalmer/ling5200 and /usr/local/bin]
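A path such as /home/mpalmer/ling5200 just names the directories you pass through on the way down from the root. As a minimal sketch in Python (the language this course turns to later), the standard pathlib module can take the two highlighted paths apart. The variable names are illustrative only, and nothing here touches a real file system.

```python
from pathlib import PurePosixPath

# The two paths highlighted on the slide.
course_dir = PurePosixPath("/home/mpalmer/ling5200")
tools_dir = PurePosixPath("/usr/local/bin")

# .parts lists the root followed by each directory name along the way.
print(course_dir.parts)   # ('/', 'home', 'mpalmer', 'ling5200')
print(tools_dir.parts)    # ('/', 'usr', 'local', 'bin')

# .parent is the directory one level up; .name is the last component.
print(course_dir.parent)  # /home/mpalmer
print(course_dir.name)    # ling5200
```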
Navigating directories
- ls to list contents, cd to change directory
  - Directories are just like Windows folders
- /home/mpalmer shortcut: ~
- “the directory above this one”: ..
- “this directory”: .
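To make the three shortcuts concrete, here is a small Python sketch that resolves ~, . and .. by hand. The function expand and the constant HOME are hypothetical names invented for this illustration, and the home directory is assumed to be /home/mpalmer as on the slide. The resolution is purely textual, so it ignores symbolic links, which a real shell may treat differently.

```python
import posixpath

HOME = "/home/mpalmer"  # assumed home directory, taken from the slide


def expand(path, cwd, home=HOME):
    """Resolve ~, . and .. in a Unix-style path without touching the disk."""
    if path == "~" or path.startswith("~/"):
        path = home + path[1:]
    # Relative paths are interpreted starting from the current directory.
    return posixpath.normpath(posixpath.join(cwd, path))


cwd = "/home/mpalmer/ling5200"
print(expand("~", cwd))         # /home/mpalmer
print(expand("..", cwd))        # /home/mpalmer
print(expand(".", cwd))         # /home/mpalmer/ling5200
print(expand("../tools", cwd))  # /home/mpalmer/tools
```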
What's in the neighborhood?
- Type ls
- You see a list of directories and files that are contained within the current directory:
  Homework_1.txt  tools  buglog.txt
I'd like to go somewhere else…
- Type pwd
- Type cd
  - Where are you?
- Type cd ..
  - Where are you?
- Type cd your_user_id
  - Where are you?
Unix is a verb-initial language
cd ..
- cd is the verb: "go"
- .. is the argument: where to go
Unix is a verb-initial language
cd
- cd is the verb: "go"
- If no argument, I assume you mean "home"
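The "verb first, then where to go" pattern can be sketched as a toy command interpreter in Python. This is only an illustration of the idea, not how a real shell is implemented: the function run and the home directory /home/your_user_id are hypothetical, and only pwd and cd are handled.

```python
import posixpath
import shlex

HOME = "/home/your_user_id"  # hypothetical home directory


def run(command, cwd):
    """Interpret one `pwd` or `cd` command; return (new_cwd, output)."""
    verb, *args = shlex.split(command)  # the verb always comes first
    if verb == "pwd":
        return cwd, cwd
    if verb == "cd":
        # With no argument, cd means "go home".
        target = args[0] if args else HOME
        return posixpath.normpath(posixpath.join(cwd, target)), ""
    raise ValueError(f"unsupported command: {verb}")


cwd = HOME
history = []
for command in ["cd", "cd ..", "cd your_user_id"]:
    cwd, _ = run(command, cwd)
    history.append(cwd)
    print(f"{command!r:20} -> {cwd}")
```

Each command is split into a verb and its arguments, and the verb alone decides what happens next, which is the sense in which the command line is "verb-initial".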
Making a new directory
- Type cd
- Type ls
- Type mkdir ling5200
- Type ls
- Go to the directory you just made (how?)
- Type pwd
- Type ls
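The same list, create, list again sequence can be tried from Python without risk, as a rough sketch. It works inside a temporary scratch directory (an assumption made here so that nothing in a real home directory is modified), and the variable names are illustrative.

```python
import os
import tempfile

# Work inside a throwaway directory so nothing real is modified.
with tempfile.TemporaryDirectory() as scratch:
    before = sorted(os.listdir(scratch))   # like `ls`: nothing there yet
    new_dir = os.path.join(scratch, "ling5200")
    os.mkdir(new_dir)                      # like `mkdir ling5200`
    after = sorted(os.listdir(scratch))    # `ls` again: the new directory shows up
    contents = os.listdir(new_dir)         # `ls` inside the new directory

print(before)    # []
print(after)     # ['ling5200']
print(contents)  # []
```

A freshly made directory is empty, which is why the final listing shows nothing.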