CORPORA-2013
A study of inflectional morpheme development in English-speaking children using CHILDES Corpus
Myung Sook Min (1), Sun-Young Lee (2), Jong-Sup Jun (1)
(1) Hankuk University of Foreign Studies, (2) Cyber Hankuk University of Foreign Studies
26 June 2013, The International Conference on Corpus Linguistics CORPORA-2013

Research Goal
§ Using the CHILDES (Child Language Data Exchange System) database, this study investigated the order of acquisition of inflectional morphemes and the overregularization found in English-speaking children’s L1 acquisition.

1. Introduction
Background
§ Children’s L1 development proceeds by regularizing the linguistic knowledge acquired through diverse input from caregivers.
§ In English, language development can be measured by the use of inflectional morphemes such as -ing and -(e)d.
§ Brown (1973) proposed a mean order of acquisition for 14 morphemes, and Marcus et al. (1992) confirmed U-shaped development in the acquisition of the English irregular past tense.
Research purpose
§ Using the whole CHILDES database, this study tests the findings of previous studies of inflectional morpheme development in child language, which examined only a limited number of subjects.

2. Literature Review
2.1 Acquisition order of inflectional morphemes
Berko (1958)
§ Studied the acquisition of morphemes in 4- to 7-year-old American children using the Wug Test, which examines children’s ability to apply inflectional morphemes to nonsense words.
§ Order of acquisition of inflections: (1) Present progressive (-ing) (2) Past regular (-ed) (3) Third person regular (-s) (4) Possessive (-’s)
Brown (1973)
§ Studied the acquisition of grammatical morphemes by analyzing the spontaneous utterances produced by 3 children.
§ Order of acquisition of inflections: (1) Present progressive (2) Plural (3) Past irregular (4) Possessive (5) Past regular (6) Third person regular (7) Third person irregular

2. Literature Review
2.2 Overregularization
Marcus et al. (1992): past tense (-ed)
§ Studied the overregularization of the past tense morpheme in the spontaneous utterances produced by 83 subjects.
§ The overregularization rate was not high, but the tendency existed.
§ Overregularization errors were found from the age of 2 until the beginning of school age.
§ U-shape development confirmed.
Kuczaj (1977): present progressive (-ing)
§ Studied the overregularization of the present progressive morpheme in the spontaneous utterances produced by 15 subjects.
§ Overregularization was rarely found.
§ Claimed that this is because irregular verbs have no irregular present progressive form.

2. Literature Review
2.3 Research questions
Limits of previous studies
§ Because of the limited number of participants, the results of previous studies are insufficient for generalizing about children’s language development.
Research questions
1) Do children apply the inflectional morphemes to more diverse verbs as they get older?
2) Are overregularization errors found? And is the U-shape developmental pattern found in children’s language acquisition?
3) Related to questions 1-2 above, is there a difference between the language development of UK and USA children? If so, is it due to mothers’ input?

2. Literature Review
2.4 Research Method
§ The number of inflected word types, their token frequency, the type-token ratio (TTR), and D, a measure of lexical diversity, were calculated to measure the development of inflectional morphemes by age.
§ D indicates the lexical diversity of randomly sampled text. The higher D is, the more diverse the words to which the children apply the inflectional morphemes.
§ D is calculated with the VOCD command in CLAN, which can be applied to CHILDES texts of different lengths.
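The two measures above can be illustrated with a small Python sketch. It is not CLAN's VOCD implementation: the function names, the sample sizes of 35 to 50 tokens, the 100 trials per size, and the grid search are all assumptions chosen to mirror the commonly described D procedure, in which the curve TTR = (D/N)(sqrt(1 + 2N/D) - 1) is fitted to the mean TTR of random subsamples (see Malvern et al. 2004 in the references).

```python
import math
import random


def ttr(tokens):
    """Type-token ratio: number of distinct word forms divided by number of tokens."""
    return len(set(tokens)) / len(tokens)


def model_ttr(n, d):
    """TTR predicted by the D curve for a sample of n tokens."""
    return (d / n) * (math.sqrt(1 + 2 * n / d) - 1)


def estimate_d(tokens, sizes=range(35, 51), trials=100, seed=0):
    """Rough D estimate: fit model_ttr to the mean TTR of random subsamples.

    Requires at least max(sizes) tokens. Illustrative only, not CLAN's algorithm.
    """
    rng = random.Random(seed)
    observed = {}
    for n in sizes:
        # Mean TTR over `trials` random samples of n tokens drawn without replacement
        observed[n] = sum(ttr(rng.sample(tokens, n)) for _ in range(trials)) / trials
    best_d, best_err = None, float("inf")
    for step in range(1, 3001):  # grid search over D = 0.1 .. 300.0
        d = step / 10
        err = sum((observed[n] - model_ttr(n, d)) ** 2 for n in sizes)
        if err < best_err:
            best_d, best_err = d, err
    return best_d


# TTR from type and token counts, e.g. 1,229 types over 39,759 tokens
print(round(1229 / 39759, 3))
```

Because TTR falls as a text gets longer, a curve-based measure such as D is usually preferred when samples differ in size, which is the case across the age groups in this corpus.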

3. Corpus Study
3.1 CHILDES Corpus
§ The CHILDES Corpus is one of the most frequently used corpora for research on language acquisition and on the influence of caregivers’ input.
§ The entire CHILDES Corpus was reorganized for ease of analysis, focusing on data from ages 1 to 7, which account for 97% of the entire corpus.
§ 7,841 files were created: 2,272 files from 275 UK children and 5,569 files from 1,355 USA children.
§ 35,130 word types with 1,937,624 tokens from the UK children and 63,705 word types with 2,771,312 tokens from the USA children were extracted with the FREQ command in CLAN.

3. Corpus Study
3.3 Analysis
§ First, the 4,700,000 words were classified by regular inflectional morphemes such as -(e)d; irregular inflected forms such as ‘wore’ were then extracted and merged with the regular inflected words.
(1) Present progressive (-ing)
(2) Regular and irregular past tense (-(e)d, irr)
(3) Comparative and superlative (-er, -est, irr)
(4) Third person singular present / plural (-(e)s, irr)
(5) Possessive singular and plural (-’s, -s’)
(6) Pronoun
§ Calculated Type, Token and TTR with the FREQ command in CLAN
- Command: freq +t*CHI +u +f @ file
§ Calculated D with the VOCD command in CLAN
- Command: vocd +t"CHI" +r6 +s"@C:CHILDESCLANlib17133_ed_d_irr_2556.cut" +u +f @ file
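For readers without CLAN, the counting step can be approximated in Python. The sketch below is only an illustration of what a type and token count over the child's utterances involves. It assumes CHAT transcripts in which the child's main tier starts with `*CHI:`, uses a naive regular-expression tokenizer, and selects words by suffix rather than by the word lists the study passed to CLAN. The function names and the sample lines are hypothetical.

```python
import re
from collections import Counter


def child_word_counts(chat_lines, speaker="CHI"):
    """Count word forms on one speaker's main tiers (lines starting with '*CHI:')."""
    prefix = "*" + speaker + ":"
    counts = Counter()
    for line in chat_lines:
        if line.startswith(prefix):
            utterance = line[len(prefix):].lower()
            # Naive tokenization: runs of letters and apostrophes only
            counts.update(re.findall(r"[a-z']+", utterance))
    return counts


def suffix_stats(counts, suffix):
    """Return (types, tokens, TTR) for word forms ending in the given suffix."""
    forms = {word: n for word, n in counts.items() if word.endswith(suffix)}
    types, tokens = len(forms), sum(forms.values())
    return types, tokens, (types / tokens if tokens else 0.0)


# Hypothetical transcript lines, not taken from CHILDES
sample = [
    "*CHI:\tI going home .",
    "*MOT:\tare you going ?",
    "*CHI:\tdoggie running and going .",
]
print(suffix_stats(child_word_counts(sample), "ing"))
```

A plain suffix match also catches uninflected words such as "thing" or "king", which is one reason a curated word list is the safer choice for real counts.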

3. Corpus Study
3.4 Results
§ Extracted 13,528 inflected word types and 1,221,916 tokens
§ TTR and D of inflectional morphemes by country

Inflectional morphemes   UK Type   UK Token   UK TTR   UK D     USA Type   USA Token   USA TTR   USA D
-ing                     1,229     39,759     0.031    33.26    1,084      47,458      0.023     27.38
-d_ed_irr(V)             1,006     82,474     0.012    10.78    1,472      132,405     0.011     19.23
-er_-est_irr(A)          217       11,499     0.008    0.77     198        13,978      0.014     1.57
-es_-s_irr(N)            4,245     114,524    0.037    18.18    3,905      172,631     0.023     30.78
pronoun                  52        165,778    0.000    2.64     51         321,019     0.000     1.82
-s'_-'s                  1,359     49,219     0.028    4.48     1,904      71,172      0.027     3.77
Total                    8,108     463,253    0.019    11.685   8,614      758,663     0.016     14.09

3. Corpus Study
3.4.1 Present progressive (-ing)
§ TTR and D

Age     UK Type   UK Token   UK TTR   UK D     USA Type   USA Token   USA TTR   USA D
1       97        719        0.135    13.83    290        3,378       0.086     27.60
2       1,158     33,264     0.035    33.27    637        15,904      0.040     28.14
3       256       3,071      0.083    20.18    558        10,924      0.051     23.93
4       131       642        0.204    21.84    569        11,343      0.050     26.67
5       154       1,009      0.153    27.07    383        3,740       0.102     28.95
6       83        264        0.314    27.07    244        1,210       0.202     36.05
7       118       790        0.149    22.48    200        959         0.209     27.23
Total   1,997     39,759     0.153    23.68    2,881      47,458      0.106     28.37

3. Corpus Study
3.4.1 Present progressive (-ing)
§ No difference in D was found between the UK and the USA children.
§ The correlations between D and the children’s age were not significant, which suggests that children already apply the present progressive morpheme to diverse verbs from the age of 1.
- UK children: r = 0.025, p > .05 / USA children: r = 0.385, p > .05
§ 80-90% of the 50 most frequent words in the children’s speech were also among the 50 most frequent words in the mothers’ speech.
§ Overregularization errors were rarely found.
- Noun + -ing (tennising, swording, appetizing): once or twice each
- Adjective + -ing (noticeabling): only once
- However, because the present progressive and the gerund share the same form, their usage needs to be reviewed in a further study.

3. Corpus Study
3.4.2 Past tense (-(e)d_irr(V))
§ TTR and D

Age     UK Type   UK Token   UK TTR   UK D     USA Type   USA Token   USA TTR   USA D
1       94        1,800      0.052    5.24     223        5,145       0.043     16.34
2       913       68,022     0.013    11.45    726        33,190      0.022     19.92
3       262       6,474      0.040    12.63    757        33,020      0.023     21.24
4       166       1,600      0.104    15.68    820        38,482      0.021     21.00
5       181       2,327      0.078    14.82    547        13,106      0.042     22.60
6       108       610        0.177    12.34    352        4,755       0.074     24.52
7       162       1,641      0.099    13.39    321        4,707       0.068     20.05
Total   1,886     82,474     0.080    12.22    3,746      132,405     0.042     20.81

3. Corpus Study
3.4.2 Past tense (-(e)d_irr(V))
§ The D of past tense increased with age up to age 5 or 6 and decreased at age 7 in both the UK and the USA.
§ A marginal correlation was found between D and the children’s age (the critical value of the correlation coefficient for significance was 0.68). This suggests that children tend to apply past tense morphemes to more diverse verbs as they get older.
- UK children: r = 0.643, p > .05 / USA children: r = 0.66, p > .05
§ In all age groups, the D of the USA children is higher than that of the UK children.
§ The D of past tense is lower than that of the present progressive, which is consistent with the order of grammatical morpheme development proposed by Brown (1973).
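The reported coefficients can be checked with a few lines of Python. This sketch assumes that the correlation was computed as a plain Pearson r between age (1 to 7) and the per-age D values from the past tense table. The slides do not state the exact procedure, so the function and variable names here are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


ages = [1, 2, 3, 4, 5, 6, 7]
# D of past tense by age, as read from the past tense table
d_uk = [5.24, 11.45, 12.63, 15.68, 14.82, 12.34, 13.39]
d_usa = [16.34, 19.92, 21.24, 21.00, 22.60, 24.52, 20.05]

r_uk = pearson_r(ages, d_uk)
r_usa = pearson_r(ages, d_usa)
print(round(r_uk, 3), round(r_usa, 3))
```

Under that assumption the values come out close to the r = 0.643 and r = 0.66 given above.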

3. Corpus Study
3.4.2 Past tense (-(e)d_irr(V))
§ The most frequent past tense forms are mostly irregular verbs. Among the 50 most frequent forms, irregular verbs were at least four times as numerous as regular verbs in both countries.
- UK: 25 irregular verbs, 8 irregular verbs whose bare form is identical to the past and the past participle, 7 auxiliary verbs, 6 regular verbs, 4 words with regular past tense morphemes but probably used as adjectives
- USA: 25 irregular verbs, 9 irregular verbs whose bare form is identical to the past and the past participle (such as put), 5 auxiliary verbs, 5 regular verbs, 6 words with regular past tense morphemes but probably used as adjectives

3. Corpus Study
3.4.2 Past tense (-(e)d_irr(V))
§ ‘go’ and ‘fall’ were the irregular verbs most often overregularized with the regular past tense morpheme ‘-(e)d’.
§ Overregularization error type and frequency of irregular verb ‘go’

Age   Country   went   gone    Correct subtotal   goed   goned   wented   Overreg. subtotal   Total
1     UK        -      580     580                1      -       -        1                   581
1     USA       29     239     268                -      -       -        -                   268
2     UK        784    4,978   5,762              17     3       -        20                  5,782
2     USA       572    627     1,199              38     2       1        41                  1,240
3     UK        73     192     265                3      -       -        3                   268
3     USA       675    142     817                52     -       -        52                  869
4     UK        23     21      44                 -      -       -        -                   44
4     USA       860    109     969                4      -       -        4                   973
5     UK        33     22      55                 -      -       -        -                   55
5     USA       286    30      316                -      -       -        -                   316
6     UK        24     2       26                 -      -       -        -                   26
6     USA       158    6       164                -      -       -        -                   164
7     UK        100    15      115                -      -       -        -                   115
7     USA       74     22      96                 -      -       -        -                   96

3. Corpus Study
3.4.2 Past tense (-(e)d_irr(V))
§ Overregularization errors were not found at the age of 1 but appeared between the ages of 2 and 3, and then began to disappear from the age of 4 or 5.
§ U-shape developmental pattern of irregular verb ‘go’
[Figure: line chart by age (1-7) for the UK and the USA; vertical axis from 91% to 100%]
§ The overregularization rate of ‘go’ differed significantly between the UK and the USA according to the Pearson chi-square test.
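A minimal Python sketch of the two calculations behind this comparison follows. It computes the overregularization rate as overregularized forms divided by all past tense attempts, then runs a Pearson chi-square test on a 2x2 table of country by outcome. The counts are the age-2 figures for ‘go’ as best they can be read from the flattened table (UK 5,762 correct and 20 overregularized, USA 1,199 and 41). The slides do not say whether the authors tested per-age or pooled counts, so the numbers are illustrative only.

```python
import math


def overregularization_rate(correct, overregularized):
    """Share of past tense attempts that were overregularized."""
    return overregularized / (correct + overregularized)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p). With 1 degree of freedom, p = erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Age-2 counts for 'go' as read from the table (assumption, see lead-in)
uk_correct, uk_over = 5762, 20
usa_correct, usa_over = 1199, 41

uk_rate = overregularization_rate(uk_correct, uk_over)
usa_rate = overregularization_rate(usa_correct, usa_over)
chi2, p = chi_square_2x2(uk_correct, uk_over, usa_correct, usa_over)
print(f"UK {uk_rate:.2%}, USA {usa_rate:.2%}, chi2 = {chi2:.1f}, p = {p:.3g}")
```

The charted U-shape is consistent with plotting 1 minus this rate per age group, although the slides do not define the vertical axis.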

3. Corpus Study
3.4.2 Past tense (-(e)d_irr(V))
§ Overregularization error type and frequency of irregular verb ‘fall’

Age   Country   fell   fallen   Correct subtotal   Overreg. subtotal   Total
1     UK        3      -        3                  -                   3
1     USA       73     -        73                 1                   74
2     UK        315    290      605                67                  672
2     USA       462    2        464                99                  563
3     UK        37     16       53                 4                   57
3     USA       324    5        329                36                  365
4     UK        15     2        17                 1                   18
4     USA       264    1        265                9                   274
5     UK        16     2        18                 -                   18
5     USA       94     -        94                 3                   97
6     UK        7      -        7                  1                   8
6     USA       32     1        33                 -                   33
7     UK        5      1        6                  -                   6
7     USA       15     1        16                 2                   18

Overregularized forms (UK/USA): age 2: falled 57/94, felled 6/5, fallened 4/-; age 3: falled 4/34, felled -/2; age 4: falled 1/8, felled -/1; age 5: falled -/2, felled -/1.
§ More types of overregularization error for ‘fall’ were found in the UK children.

3. Corpus Study
3.4.2 Past tense (-(e)d_irr(V))
§ Overregularization errors in the irregular past tense tended to appear at the age of 2, began to decrease from the age of 3, and disappeared at the age of 4 or 5.
§ U-shape developmental pattern of irregular verb ‘fall’
[Figure: line chart by age (1-7) for the UK and the USA; vertical axis from 80% to 100%]
§ The overregularization rate of ‘fall’ did not differ significantly between the UK and the USA according to the Pearson chi-square test.

3. Corpus Study
3.4.3 Comparative and Superlative (-er_-est_irr(A))
§ TTR and D

Age     UK Type   UK Token   UK TTR   UK D    USA Type   USA Token   USA TTR   USA D
1       11        661        0.017    0.10    19         1,660       0.011     0.26
2       76        9,940      0.008    0.74    70         4,242       0.017     0.92
3       26        420        0.062    1.24    99         2,758       0.036     2.03
4       21        126        0.167    2.89    122        3,349       0.036     2.70
5       28        194        0.144    2.92    84         1,211       0.069     3.44
6       20        57         0.351    5.82    47         376         0.125     4.06
7       17        101        0.168    2.54    41         382         0.107     2.17
Total   199       11,499     0.131    2.32    482        13,978      0.057     2.23

3. Corpus Study
3.4.3 Comparative and Superlative (-er_-est_irr(A))
§ The D of comparative -er and superlative -est increased with age up to age 6 and decreased slightly at age 7 in both the UK and the USA. This confirms that children apply comparative and superlative forms to more diverse adjectives as they get older.
§ Strong correlations were found between D and the children’s age. UK: r = 0.779, p < .05 / USA: r = 0.776, p < .05
§ The Ds of the UK and the USA children did not differ noticeably.
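Why r = 0.779 reaches p < .05 while the past tense coefficients of about 0.65 do not can be seen by converting r to a t statistic, t = r * sqrt((n - 2) / (1 - r^2)). The sketch below assumes n = 7 (one D value per age group) and a two-tailed test at the 5% level, for which the critical t with 5 degrees of freedom is 2.571. The slides do not state which test the authors used, and their quoted critical r of 0.68 suggests a somewhat different criterion.

```python
import math

T_CRIT_DF5 = 2.571  # two-tailed 5% critical value of Student's t with 5 degrees of freedom


def t_from_r(r, n):
    """t statistic for testing a Pearson correlation r from n pairs against zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))


reported = {
    "comparative/superlative, UK": 0.779,
    "comparative/superlative, USA": 0.776,
    "past tense, UK": 0.643,
    "past tense, USA": 0.66,
}
for label, r in reported.items():
    t = t_from_r(r, 7)  # n = 7 age groups (assumption)
    verdict = "significant" if abs(t) > T_CRIT_DF5 else "not significant"
    print(f"{label}: r = {r}, t = {t:.2f}, {verdict}")
```

With only seven data points the threshold is high, so even fairly large coefficients can fall short of significance.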

3. Corpus Study
3.4.3 Comparative and Superlative (-er_-est_irr(A))
§ The most frequent words are more and better, followed by last, bigger, higher in the UK children and cleaner, higher, bigger, later in the USA children.
§ Overregularization error type and frequency of ‘little’ (correct form: less; overregularized forms: littler, littlest)

Total by age   1   2    3    4    5    6   7
UK             0   4    3    1    1    2   0
USA            0   12   18   16   11   7   1

3. Corpus Study
3.4.3 Comparative and Superlative (-er_-est_irr(A))
§ The overregularization errors were found until the age of 6 but still show the U-shape developmental pattern.
§ U-shape developmental pattern of ‘little’
[Figure: line chart by age (1-7) with four series: UK_littler, UK_littlest, USA_littler, USA_littlest; vertical axis from 0% to 100%]
§ The overregularization rate of ‘little’ differed significantly between the UK and the USA according to the Pearson chi-square test.

3. Corpus Study
3.4.3 Comparative and Superlative (-er_-est_irr(A))
§ In 4 files, littler was produced by both the child and the mother. In these files, the children produced it 8 times and the mothers 17 times.
§ This finding points to a possible influence of mothers’ input on child language.

4. Discussion
1) Do children apply the inflectional morphemes to more diverse verbs as they get older?
§ D: Present progressive (23.68~28.37) > Past tense (12.22~20.81) > Comparative/Superlative (2.32~2.23)
- D is consistent with the order of grammatical morpheme development proposed by Brown (1973).
§ The developmental pattern of each inflectional morpheme differed as children got older.
§ Among the 50 most frequent past tense forms, irregular verbs were more than four times as numerous as regular verbs, which supports Brown’s (1973) claim that children acquire irregular verbs earlier than regular verbs.

4. Discussion
2) Are overregularization errors found? And is the U-shape developmental pattern found in children’s language acquisition?
§ Overregularization errors were found, and the U-shape developmental pattern claimed in previous studies such as Brown (1973) and Marcus et al. (1992) was confirmed on a large scale in the CHILDES Corpus.
§ Overregularization errors were most frequent in the past tense and rare in the present progressive.

4. Discussion
3) Related to questions 1-2 above, is there a difference between the language development of UK and USA children? If so, is it due to mothers’ input?
§ Similarities
(1) As children get older, they apply the inflectional morphemes to more diverse words.
(2) U-shape developmental patterns were found in both the UK and the USA.
§ Differences
(1) The overregularization error rate of the UK children was lower than that of the USA children.
(2) The data suggest a possible influence of mothers’ input on children’s language.

5. Conclusion
§ This study investigated inflectional morpheme development in child language using CHILDES Corpus data from children aged 1 to 7.
§ Our findings are:
1) Children tended to apply the inflectional morphemes to more diverse words as they got older.
2) The U-shape developmental pattern was confirmed.
3) Overregularization errors were found when children applied the inflectional morphemes to words.
4) With Ds, this study supports the grammatical developmental order proposed by Brown (1973).

References
[1] Berko, Jean (1958), The child’s learning of English morphology. Word 14, p. 47-56
[2] Brown, Roger (1973), A First Language: The Early Stages, Harvard University Press
[3] CHILDES (http://childes.psy.cmu.edu/)
[4] Johansson, Victoria (2008), Lexical diversity and lexical density in speech and writing: a developmental perspective, Lund University, Dept. of Linguistics and Phonetics, Working Papers 53, p. 61-79
[5] Kuczaj, Stan A. (1977), Why do children fail to overgeneralize the progressive inflection?, Journal of Child Language 5, p. 167-171
[6] MacWhinney, B. & Snow, C. E. (2000), The Child Language Data Exchange System: An Update. Journal of Child Language 17, p. 457-472
[7] Marcus, Gary F.; Pinker, Steven; Ullman, Michael; Hollander, Michelle; Rosen, T. John; and Xu, Fei (1992), Overregularization in Language Acquisition, Monographs of the Society for Research in Child Development, Serial No. 228, Vol. 57
[8] Malvern, David; Brian Richards; Ngoni Chipere & Pilar Duran (2004), Lexical diversity and language development: quantification and assessment. New York: Palgrave Macmillan
[9] Maslen, Robert J C; Theakston, Anna L; Lieven, Elena V M; Tomasello, Michael (2004), A Dense Corpus Study of Past Tense and Plural Overregularization in English, Journal of Speech, Language, and Hearing Research 47.6, p. 1319-1333
[10] McCarthy, Philip M. & Jarvis, S. (2004), vocd: A theoretical and empirical evaluation, Language Testing 24.4, p. 459-488
[11] Richards, Brian J. & David Malvern (1997), Quantifying lexical diversity in the study of language development. Reading: Faculty of Education and Community Studies
[12] Templin, M. C. (1957), Certain language skills in children. Minneapolis: University of Minnesota Press

Contact Info.
§ Myung Sook Min (michelleclick@empal.com)
§ Sun-Young Lee (alohasylee@cufs.ac.kr)
§ Jong-Sup Jun (jongsupjun@korea.com)