Research Methods Workshop Introducing Corpus Linguistics Techniques (1): Making the Most of the VIEW

A reminder
• Corpus Linguistics is a methodology, which tends to:
  – involve the analysis of “actual” language use in natural texts (but the analysis of literary texts is also possible)
  – utilise a large and principled collection of natural texts, known as a “corpus”, as the basis for analysis
  – make extensive use of computers, utilising both automatic and interactive techniques
  – depend on both quantitative and qualitative analytical techniques: “The goal of corpus-based investigations is not simply to report quantitative findings, but to explore the importance of these findings for learning about the patterns of language use” (adapted from Biber et al. 1998: 4-5)

Concordancing
• Concordance = an alphabetical listing of the words in a text, given together with the contexts in which they appear.
  – The most common form of concordance today is the Keyword-in-Context (KWIC) index:

Figure 1: Concordance of poor in Tale of Two Cities, Book 1

1320   taste it is that such  poor  cattle always have in their mouths
 948          of sparing the  poor  child the inheritance of any part of
 778    small property of my  poor  father, whom I never saw--so long
1870    desolate, while your  poor  heart pined away, weep for it
1947            Miss, if the  poor  lady had suffered so intensely
1884          the love of my  poor  mother hid his torture from me
1615  stockings, and all his  poor  tatters of clothes, had, in a long
1577       faded away into a  poor  weak stain. So sunken and
1001      on your way to the  poor  wronged gentleman, and, with a
1036     detachment from the  poor  young lady, by laying a brawny
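
To make the KWIC idea concrete, here is a minimal sketch in Python. It is not how VIEW or any particular concordancer is implemented; the function name `kwic`, the `width` parameter and the way the sample text is stitched together from three fragments of Figure 1 are all assumptions made for illustration.

```python
import re

def kwic(text, keyword, width=30):
    """Return (left context, node word, right context) for every hit of `keyword`."""
    hits = []
    # Whole-word, case-insensitive match; `keyword` is escaped so it is treated literally.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        hits.append((left, match.group(0), right))
    return hits

# Hypothetical sample: three fragments taken from Figure 1, joined with "...".
sample = (
    "of sparing the poor child the inheritance of any part of ... "
    "the love of my poor mother hid his torture from me ... "
    "faded away into a poor weak stain."
)

for left, node, right in kwic(sample, "poor", width=20):
    # Right-align the left context so the node word lines up in a column.
    print(f"{left.strip():>20} {node} {right.strip()}")
```

The fixed-width character window is the simplest choice; a window counted in words rather than characters would avoid cutting words in half.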

What do concordancers let you do?
  – Let you look at a word in context, see how common it is, and see the style associated with it.
  – Let you compare your usage with that of others (very useful in EFL)
  – Let you compare usage across different genres/registers (very useful in ESL)
  – More advanced users can explore attitudes (the thought processes that lie behind the words)

The recall problem: Although concordancers allow you to specify search words, it’s worth remembering that …
  – Some tools will only give you the results for what you said you were looking for, which may not be the same thing as what you thought you were looking for.
  – You notice only what you get back; you will not notice what you did not find.

Becoming familiar with VIEW
• VIEW = Variation In English Words and Phrases
• You can find it at: http://corpus.byu.edu/bnc/
• So what’s so good about VIEW?
  – It allows you to quickly and easily search for a wide range of words and phrases of English in the 100-million-word BNC.
  – BNC = British National Corpus, which represents modern English of the late 20th century
  – As with some other BNC interfaces, you can search for words and phrases by
    • exact word or phrase
    • wildcard or part of speech
    • combinations of word/phrase and wildcard/part of speech.
  – Time permitting, we’re going to master the first two on the list.

Search: ‘corporation’
• Clicking on the word brings up a concordance
• We can search the whole of the BNC, or just a small part of it (i.e. W_commerce), but remember to tick the “limit” box!

KWIC concordance of ‘corporation’ (in w_commerce)

Sorting our entries …
• We can sort our entries according to ‘left’ and ‘right’ context by adding an asterisk (*) to the search, e.g. corporation * or * corporation.
• However, if we want to look at the same results, we have to pick the appropriate register in the left-hand column, e.g. w_commerce.

Results for … corporation *
• What strikes you about the results?

Results for … * corporation
• What strikes you about these results?
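
The two wildcard searches can be thought of as tallying the word immediately to the right or left of the search word. The sketch below only mimics that idea; it is an assumption about what such a search amounts to, not a description of VIEW's internals. The function name `neighbours` and the toy sentence are invented for illustration and are not BNC data.

```python
from collections import Counter

def neighbours(tokens, node, offset):
    """Count the tokens found `offset` positions away from each occurrence of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        j = i + offset
        # Skip hits whose neighbour would fall outside the text.
        if token == node and 0 <= j < len(tokens):
            counts[tokens[j]] += 1
    return counts

# Invented toy text, lower-cased and split on whitespace.
tokens = (
    "the corporation tax was cut but the corporation said "
    "the new corporation tax applies to every corporation"
).split()

right = neighbours(tokens, "corporation", +1)  # analogous to: corporation *
left = neighbours(tokens, "corporation", -1)   # analogous to: * corporation

print(right.most_common())
print(left.most_common())
```

Grouping by neighbour in this way is what makes recurring patterns (here, the invented ‘corporation tax’) stand out from one-off contexts.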

Group Task
• How is ‘corporation’ used in newspaper tabloids?
• Is it used in similar ways to the use of ‘corporation’ in W_commerce?
• Let’s explore some other words … You choose!

Check out the “CHART” button …
• CHART is useful when you want to see the extent to which specific words are utilised in the different genres.
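
When comparing genres of different sizes, raw counts can mislead, so frequencies are usually normalised (for example, per million words). A minimal sketch of that arithmetic follows; the counts and section sizes are made up for illustration and are not BNC figures, and `per_million` is a hypothetical helper name.

```python
def per_million(count, section_size):
    """Normalise a raw count to occurrences per million words."""
    return count * 1_000_000 / section_size

# Hypothetical (raw count, section size in words) pairs; NOT real BNC data.
sections = {
    "spoken": (120, 10_000_000),
    "newspaper": (450, 9_000_000),
}

for name, (count, size) in sections.items():
    print(f"{name:<10} raw={count:>4}  per million={per_million(count, size):.1f}")
```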

Using VIEW to search for collocates
We use the ‘surrounding’ display … remembering to:
• Make sure we’re in the TABLE display
• Define the size of the window (the smaller the window, the closer the collocates will be to the search word)
• Set ‘min freq’ to a number between 2 and 7
• Tick the ‘limit’ box
• Choose the register

Using VIEW to search for collocates
• Search word = market
• Click on ‘surrounding’
• Register = W_commerce
• Tick limit box
What strikes you about these results?
Replicate the search on your computer, and then answer the following:
• Are there any collocates that are predictable in your view?
• Do any of the collocates of ‘market’ surprise you?
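
The ‘window’ and ‘min freq’ settings can be illustrated with a small sketch: collect every word within a fixed number of positions of the search word, then keep only those seen at least `min_freq` times. This is a simplified assumption about what a collocate search does, not VIEW's actual method; the function name, parameters and toy sentence are invented.

```python
from collections import Counter

def collocates(tokens, node, left=4, right=4, min_freq=2):
    """Count words within `left`/`right` positions of `node`; keep those with count >= min_freq."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        start = max(0, i - left)
        # Words before the node plus words after it (the node itself is excluded).
        window = tokens[start:i] + tokens[i + 1:i + 1 + right]
        counts.update(window)
    return {word: n for word, n in counts.most_common() if n >= min_freq}

# Invented toy text; not BNC data.
tokens = (
    "the stock market fell as the labour market tightened "
    "and the stock market recovered"
).split()

print(collocates(tokens, "market", left=2, right=2, min_freq=2))
```

Raising `min_freq` removes one-off neighbours; narrowing the window keeps only words that sit close to the search word.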

We can use a similar process to search for antonyms and synonyms …

Comparing synonyms
• Set ‘Search String’ to worker/employee
• Turn ‘Surrounding’ on (5/5 window)
• Set ‘Register’ to W_commerce
• Turn ‘Limit’ on
• Set ‘Min freq.’ to 5

Comparing synonyms
[Screenshot labels: your chosen words; number of times that word X appears near the chosen words]

Group task: The collocates of worker/employee
• What collocates with ‘worker’?
• What collocates with ‘employee’?
• Change your search so that you use the whole BNC (Register 1 = -- IGNORE --)
  – Have the collocates for ‘worker’ remained the same?
  – Have the collocates for ‘employee’ remained the same?
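
Once you have two collocate lists (for two words, or for the same word in two registers), the comparison is a set operation: what is shared and what is unique to each. A minimal sketch follows. The collocate sets are placeholders invented for illustration, not search results, and `compare_collocates` is a hypothetical helper.

```python
def compare_collocates(first, second):
    """Split two collocate collections into shared and unique items."""
    first, second = set(first), set(second)
    return {
        "shared": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }

# Placeholder collocates; replace with what your own searches return.
worker_collocates = {"skilled", "manual", "pay"}
employee_collocates = {"pay", "share", "scheme"}

result = compare_collocates(worker_collocates, employee_collocates)
for label, words in result.items():
    print(f"{label:<12} {words}")
```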

Lexical priming and semantic prosody
• Lexical priming: “Every word is primed for use in discourse as a result of the cumulative effects of an individual's encounters with the word ... Every word is primed to occur with particular words; these are its collocates.” (Hoey 2005)
• Semantic prosody: … occurs when the habitual collocates of a word (or phrase) colour its meaning so it can no longer be seen in isolation from its semantic prosody.
Some questions to ponder …
• How do we study 'semantic prosody'?
• What can it tell us?
• Where can we find it?
• How can we find it?

Searching for meaningful patterns
Patterns contribute to the creation of a network of textual meanings; computers and human interpretation can be used in conjunction to identify (and make sense of) these patterns ...

residual/core meaning
  – DENOTATION = literal meaning
  – COLLOCATION = patterns of words appearing together
  – COLLIGATION = collocation patterns based on syntactic groups rather than individual words
  – SEMANTIC ASSOCIATION:
    • semantic PREFERENCE = tendency of a word to keep company with a semantic set or class; some members of this set or class will usually be collocates.
    • semantic PROSODY = colouring of meaning (permanently?)
textual meaning

Group Task
• Do a search for the following:
  – “slump”, “slumped”, “slumps”, “jinxed”, “shortfall”, “demand”
  – How are they used in context and are they always negative?
  – Are the meanings of any of these terms “coloured” (i.e. can they no longer be seen in isolation from their semantic prosody)?

Now let’s explore parts of speech
• What do you think the most common noun in English is?
  – Write down your answers on a piece of paper
  – Now do the following search to find out whether your “hunch” was correct: [nn*]

The most frequent nouns in the BNC
• We search for nouns by including [n*] here …
• What strikes you about the results?

Most frequent nouns in spoken section of BNC (= 10 million words)
• Notice that TIME is now the second most frequent noun … but there are a lot of other nouns relating to periods of time …
• Indeed, YEAR, DAY, YEARS, WEEK, NIGHT and MORNING are all in the top 25!
• Question: How much does this result suggest we are preoccupied with time in Britain?

Other parts of speech worthy of exploration
• [vv*] (lexical verbs)
• [aj*] (adjectives)
• [av*] (adverbs)
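
The bracketed searches are wildcard patterns over part-of-speech tags. The sketch below shows the idea on a handful of hand-tagged tokens using BNC-style (CLAWS5) tags; the token list is invented, `most_frequent` is a hypothetical helper, and the matching logic is an assumption about how such a pattern behaves rather than VIEW's own code.

```python
from collections import Counter
from fnmatch import fnmatchcase

# Invented toy sample of (word, tag) pairs using CLAWS5-style tags.
tagged = [
    ("time", "NN1"), ("people", "NN0"), ("went", "VVD"), ("years", "NN2"),
    ("time", "NN1"), ("good", "AJ0"), ("London", "NP0"), ("really", "AV0"),
]

def most_frequent(tagged_tokens, tag_pattern, n=10):
    """Return the n most frequent words whose tag matches a pattern such as '[nn*]'."""
    # Drop the surrounding brackets and upper-case, so '[nn*]' becomes 'NN*'.
    pattern = tag_pattern.strip("[]").upper()
    counts = Counter(
        word.lower() for word, tag in tagged_tokens if fnmatchcase(tag, pattern)
    )
    return counts.most_common(n)

print(most_frequent(tagged, "[nn*]"))  # common nouns only
print(most_frequent(tagged, "[n*]"))   # any tag starting with N, so NP0 is included too
print(most_frequent(tagged, "[vv*]"))
```

In this toy matcher, the broader pattern [n*] also picks up the proper noun tagged NP0, which [nn*] leaves out; that difference is worth bearing in mind when comparing noun searches.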

CL: Best Practice
• We need to balance a quantitative approach with a qualitative approach
• We need to know our data, or be prepared to become very familiar with it!
• We need to be prepared to engage with theory

References
Biber, D., Conrad, S., and Reppen, R. (1998) Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.
Barnbrook, G. (1996) Language and Computers. Edinburgh: Edinburgh University Press.
Hoey, M. (2005) Lexical Priming: A New Theory of Words and Language. London: Routledge.
Nelson, M. ‘Computers and Semantic Prosody’. Online paper, available at http://www.kielikanava.com/semantic.html
Sinclair, J. (2004) Trust the Text. London: Routledge.
Stubbs, M. (1996) Text and Corpus Analysis. Oxford: Blackwell.