CORPUS LINGUISTICS
- Slides: 33
1) A revision of corpus linguistics
2) Language corpora in the ESL/EFL classroom
WHAT IS A CORPUS?
A corpus can be defined as a collection of texts assumed to be representative of a given language put together so that it can be used for linguistic analysis. Usually the assumption is that the language stored in a corpus is naturally-occurring, that it is gathered according to explicit design criteria, with a specific purpose in mind, and with a claim to represent natural chunks of language selected according to a specific typology. Tognini-Bonelli (2001: 2)
“nowadays the term 'corpus' nearly always implies the additional feature of 'machine-readable'”. McEnery & Wilson, Corpus Linguistics. Online manual.
English language corpora: General vs. Specific
ENGLISH CORPORA: GENERAL LANGUAGE CORPORA
First generation corpora:
-Brown Corpus of Written American English
-Lancaster-Oslo/Bergen (LOB) Corpus of Written British English
-500 texts of around 2,000 words each
-no spoken data
-wide variety of written texts
ENGLISH CORPORA: GENERAL LANGUAGE CORPORA
Second generation corpora:
-Bank of English
  -monitor corpus
  -both spoken and written text
  -different regional varieties of English
-British National Corpus (BNC)
  -90 million written words
  -10 million spoken words
  -freely accessible: Mark Davies' interface
OTHER TYPES OF ENGLISH LANGUAGE CORPORA
-speech corpora:
  -sound recordings
  -SPOKEN ENGLISH CORPUS
  -detailed description of spoken phenomena: phonology, prosody (stress, tone units…), etc.
-multimedia corpora:
  -transcripts synchronised with audio/video recordings
  -TALKBANK website: SANTA BARBARA CORPUS OF SPOKEN AMERICAN ENGLISH (SBCSAE)
[Screenshot labels: audiovisual element; some markup for context; space for our own annotation]
OTHER TYPES OF ENGLISH LANGUAGE CORPORA
-parsed corpora:
  -syntactically analysed
  -SURFACE AND UNDERLYING STRUCTURAL ANALYSES OF NATURALISTIC ENGLISH (SUSANNE) CORPUS
-historical corpora:
  -English of earlier periods
  -may cover specific historical periods or genres
  -track and describe how language has evolved
  -A REPRESENTATIVE CORPUS OF HISTORICAL ENGLISH REGISTERS (ARCHER)
OTHER TYPES OF ENGLISH LANGUAGE CORPORA
-specialised corpora:
  -focus on concrete genres/domains
  -BUSINESS LETTERS CORPUS (BLC)
-lingua franca corpora:
  -ENGLISH AS A LINGUA FRANCA IN ACADEMIC SETTINGS (ELFA) CORPUS
  -intercultural exchanges among speakers who use English as a lingua franca
OTHER TYPES OF ENGLISH LANGUAGE CORPORA
-developmental language corpora:
  -output of non-adult native speakers of English
  -language not as proficient as that found in adult native-speaker corpora
  -POLYTECHNIC OF WALES (POW) CORPUS
-ESL/EFL learner corpora:
  -output of learners of English
  -learners share one L1 background or have different mother tongues
  -JAPANESE EFL LEARNER CORPUS (JEFLL)
WORDSMITH: FLEXIBLE CORPUS
-Computer program which permits users to compile their own corpus
-Texts must be in .txt format
-Any text can be subjected to the same process of analysis that official corpora undergo: concordance lines, word lists, etc.
-No need to pre-process such texts in advance
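The two analyses named above, word lists and concordance lines, can be illustrated with a minimal Python sketch. This is not how WordSmith works internally; the tokenizer, the function names (`tokenize`, `word_list`, `concordance`) and the sample sentence are all assumptions made for illustration only.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens.
    Apostrophes inside a word are kept, so "don't" stays one token."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def word_list(tokens):
    """Return (word, frequency) pairs, most frequent first."""
    return Counter(tokens).most_common()

def concordance(tokens, node, width=3):
    """Return keyword-in-context lines: up to `width` tokens
    on either side of every occurrence of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Invented sample text; a real corpus would be read from a .txt file.
sample = ("I made a mistake. Don't make the same mistake twice; "
          "everyone makes a mistake sometimes.")
tokens = tokenize(sample)

print(word_list(tokens)[:3])
for line in concordance(tokens, "mistake"):
    print(line)
```

To try this on a text of your own, the sample string could be replaced with the contents of a plain-text file, for example `open("my_corpus.txt", encoding="utf-8").read()` (the file name is hypothetical).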
Corpus linguistics
-Insights into the internal workings of real language
-Knowledge in turn also used in other fields of enquiry
-Planning, designing, compiling and tagging
-Frequency lists and concordance lines (+ further analysis)
-Sinclair's (2003) “degeneralisation”:
  -sceptical about 'received' descriptions
  -patterns found in the data: more precise or alternative descriptions
-Corpus-based dictionaries and grammars
  -how lexis and grammar are “really” used
  -COLLINS COBUILD LEARNER'S DICTIONARY
  -THE LONGMAN GRAMMAR OF SPOKEN AND WRITTEN ENGLISH
CORPORA IN THE ESL/EFL CLASSROOM: PEDAGOGICAL FOUNDATIONS
-Mixture of instructional and naturalistic language learning (LL)
-Fulfilment of both the input and output hypotheses
-“Scaffolding” (though loosely speaking)
-Insights concerning English culture(s)
-Student-centred and related to constructivism: mastering corpora = learning autonomy
CORPUS-BASED ESL/EFL ACTIVITIES
-Focus on lexis, grammar and register
-Introductory notions concerning collocation, colligation, and formal vs. informal
-For already motivated students: BNC
Activity one: contractions, formal or informal? Spoken or written?
The key: * ? ’ ?
Quotation marks!
Activities two and three: Corpora as a source of knowledge concerning collocation and colligation
[v*] mistakes
[aj*] powerful, not strong!!!
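The searches above rely on the part-of-speech tags of the BNC interface (`[v*]` for verbs, `[aj*]` for adjectives). As a rough, untagged approximation of the same idea, the sketch below simply counts the words that occur immediately to the left of a node word. The function name `left_collocates` and the sample sentences are invented for illustration, and no POS filtering is attempted.

```python
import re
from collections import Counter

def left_collocates(text, node, span=1):
    """Count the words found within `span` tokens to the left of `node`."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            # Slice the window that ends just before the node word.
            counts.update(tokens[max(0, i - span):i])
    return counts

# Invented sample; a learner might compare "make" vs. "do" + mistakes.
sample = ("She made mistakes. We all make mistakes. He made mistakes again, "
          "but nobody does mistakes.")
collocates = left_collocates(sample, "mistakes")
print(collocates.most_common())
```

In a real corpus the counts for forms of "make" would be expected to dwarf those for "do", which is the point students are meant to discover; the toy sample only shows the mechanics.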
Activity four: meaning via collocations and co-text
For non-motivated students: WordSmith
-Contact with the English language: input (at least lexis-wise)
-Popular culture: MUSIC IN ENGLISH!!!
Activity one: music corpora, lexis, and the BNC for grammar accuracy
author corpus reference corpus
Select the text you want a list of
Save both lists to compare them with KeyWords
author corpus list reference corpus list
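Comparing an author word list against a reference word list amounts to asking which words are unusually frequent in the author corpus. A minimal sketch of that comparison follows, using the log-likelihood statistic, which is one measure commonly used for keyness. It is an assumption that this matches the settings used in the activity, and the code does not reproduce WordSmith's implementation; the function names and the tiny word lists are invented.

```python
import math
from collections import Counter

def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score for one word.
    a, b: frequency of the word in the study and reference corpus.
    c, d: total number of tokens in the study and reference corpus."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

def keywords(study, reference, top=5):
    """Rank words that are relatively more frequent in `study`
    than in `reference`, highest keyness first."""
    c, d = sum(study.values()), sum(reference.values())
    scored = []
    for word, a in study.items():
        b = reference.get(word, 0)
        if a / c > b / d:  # keep positive keywords only
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top]

# Toy word lists standing in for the author and reference corpus lists.
author = Counter("love love love baby baby the the night".split())
reference = Counter("the the the the of of and love night house".split())

for word, score in keywords(author, reference):
    print(f"{word}\t{score:.2f}")
```

With real song lyrics as the author corpus, the top of such a list would be expected to surface the vocabulary typical of that artist, which is what makes the output usable as lexical input for students.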
That was all! The nightmare is over! Thank you for listening! ^. ^