Schema Theorem in Language Acquisition: A Rags to Riches Story
BOOT-LA, Indiana University, April 23, 2003

Poverty of the Stimulus
“The poverty-of-the-stimulus argument, otherwise known as Plato’s Problem, claims that the nature of language knowledge is such that it could not have been acquired from the actual samples of language available to the human child.” Cook & Newson (1996: 86)

Poverty of the Stimulus
What counts as evidence?
• positive evidence requirement: no correction, explanation, etc.
• occurrence requirement: must occur in normal language situations
• uniformity requirement: must be available to all children regardless of culture, class, or language
• take-up requirement: must be used by children

Poverty of the Stimulus
Rational Steps for Inclusion in UG/LAD
A. A native speaker of a particular language knows a particular aspect of syntax. Ex. structure-dependency, Binding Principles, etc.
B. This aspect of syntax could not have been acquired from the language input available to children.
C. This aspect of syntax is not learnt from outside.
D. This aspect of syntax is built in to the mind.
Cook & Newson (1996: 86)

Poverty of the Stimulus
A Problem:
A. A native speaker of a particular language knows a particular aspect of syntax. Ex. structure-dependency, Binding Principles, etc.
B. This aspect of syntax could not have been acquired from the language input available to children.
C. This aspect of syntax is not learnt from outside.
D. This aspect of syntax is built in to the mind.

Poverty of the Stimulus
• “Step B” is in practice assumed, and rarely rigorously demonstrated
• increasingly we find existence proofs that acquisition tasks previously believed impossible can be accomplished via statistical, data-driven methods (e.g. Chalmers, 1990; Elman, 1995)

Poverty of the Stimulus
Faulty “Step B” Reasoning:
a) Helen said that Jane_i voted for herself_i.
b) *Helen_i said that Jane voted for herself_i.
Cook & Newson (1996: 84)
• “no context could let them unerringly distinguish the binding of anaphors and of pronominals.”
• implicitly assumes that at this point, the only utterances / experience the child has access to are these two possible interpretations
• in fact, by the time children produce / understand sentences of this level of complexity, they’ve had extensive experience producing and interpreting anaphors and pronominals (O’Grady, 1997)
• moreover, from the outset children show a bias towards binding to the nearest antecedent – they have the most trouble with sentences like: *Helen said that Jane_i voted for her_i.

Poverty of the Stimulus
Faulty “Step B” Reasoning:
a) It is likely that John will be delayed.
b) It is probable that John will be delayed.
c) John is likely to be delayed.
d) *John is probable to be delayed.
O’Grady (1997: 246)
• common argument against analogy as a learning method
• denies analogy based on anything but these specific cases – by the time a child produces / understands sentences such as these, they already have extensive linguistic knowledge that would preclude such naive analogies
• Other studies have shown analogy can be a useful technique for the acquisition of categories and grammatical structure (McLennan, ms.; Tomasello, 2000, for example)

What to do?
• Simply denying UG doesn’t solve our problem, since traditional linguists’ intuitions about the input remain unchanged and lead us back to the same conclusions
• Genetic Algorithms seem to have a similar problem – they look more efficient than they possibly could be – similar sense of “getting something for nothing”

Genetic Algorithms
• problem solving technique which is capable of assessing an extremely large and complicated problem space on the basis of a restricted “impoverished” input set
• Three primary elements:
1. a population of “chromosomes” (bit strings)
2. a fitness function (judges “goodness”)
3. mating and procreation
(Holland, 1975; Mitchell, 1996)
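
The three elements listed above can be sketched in a few lines of Python. This is only an illustrative toy, not the setup used in the talk: the fitness function (counting 1s, often called "OneMax"), the parameter values, and all function names are assumptions made for the example.

```python
import random

def fitness(chrom: str) -> int:
    # Assumed toy fitness: number of 1s in the bit string ("OneMax").
    return chrom.count("1")

def select(pop, rng):
    # Fitness-biased selection of two parents; +1 keeps every weight positive.
    weights = [fitness(c) + 1 for c in pop]
    return rng.choices(pop, weights=weights, k=2)

def crossover(a, b, rng):
    # Single-point crossover ("mating"): head of one parent, tail of the other.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(chrom, rng, rate=0.01):
    # Flip each bit independently with a small probability.
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in chrom
    )

def run_ga(length=20, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    # 1. a population of random bit-string "chromosomes"
    pop = ["".join(rng.choice("01") for _ in range(length)) for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # 2. fitness biases selection; 3. mating and procreation build the next generation
        pop = [mutate(crossover(*select(pop, rng), rng), rng) for _ in range(pop_size)]
        best = max(pop + [best], key=fitness)  # remember the best chromosome seen so far
    return best

best = run_ga()
print(best, fitness(best))
```

Note that nothing in the loop refers to categories or substructure; selection only ever sees whole chromosomes and their fitness scores.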

Genetic Algorithms
• from purely random beginnings a solution emerges very quickly – even for optimizations that can’t be performed by traditional serial computational methods

Genetic Algorithms
• Schema Theorem: explanation of how GAs work
• 101 is an instantiation of the categories (schemata): {***, 1**, *0*, **1, 10*, 1*1, *01, 101} (of a possible 27)
• 1** is a category representation of {100, 101, 110, 111, (1*1, 1*0, 11*, 10*)}
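
The schema sets on this slide can be reproduced mechanically. The sketch below is illustrative only; the helper names `schemata_of` and `instances_of` are made up for this example.

```python
from itertools import product

def schemata_of(bits: str) -> list[str]:
    """Every schema (string over 0, 1, *) that the bit string instantiates."""
    # At each position the schema either keeps the bit or generalizes it to '*'.
    choices = [(bit, "*") for bit in bits]
    return ["".join(s) for s in product(*choices)]

def instances_of(schema: str) -> list[str]:
    """Every fully specified bit string that belongs to the schema."""
    choices = [("0", "1") if c == "*" else (c,) for c in schema]
    return ["".join(s) for s in product(*choices)]

schemata = schemata_of("101")
print(sorted(schemata))             # the 8 schemata that 101 instantiates
print(len(schemata), "of", 3 ** 3)  # 8 of 27 possible schemata of length 3
print(instances_of("1**"))          # ['100', '101', '110', '111']
```

A string of length l instantiates 2^l of the 3^l possible schemata, which is where the "8 of a possible 27" figure comes from.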

Genetic Algorithms
• If “101” is judged as being 75% fit, it simultaneously guesstimates {***, 1**, *0*, **1, 10*, 1*1, *01, 101} as being 75% fit
• Given a population with multiple instantiations, implicit calculation of category fitness becomes more accurate
• Fuzzy judgments are still useful
• Selection, biased by fitness, selects not for highly fit individuals but (implicitly) for highly fit categories by targeting highly fit individuals
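
One way to picture the implicit category estimates described above is to compute them explicitly, which a GA itself never does. In this sketch the population and its fitness values are invented for illustration (only the 0.75 for 101 echoes the slide), and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import product

def schemata_of(bits):
    # Each position either keeps its bit or is generalized to '*'.
    return ["".join(s) for s in product(*[(bit, "*") for bit in bits])]

def estimate_schema_fitness(population):
    """Average the observed fitness of every individual that instantiates each schema.

    population: list of (bit string, fitness) pairs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for bits, fit in population:
        for schema in schemata_of(bits):
            totals[schema] += fit
            counts[schema] += 1
    return {schema: totals[schema] / counts[schema] for schema in totals}

# Made-up fitness values, for illustration only.
population = [("101", 0.75), ("100", 0.50), ("011", 0.25)]
estimates = estimate_schema_fitness(population)

print(estimates["101"])  # 0.75: a single instance, so the estimate is that instance's fitness
print(estimates["1**"])  # 0.625: mean of 101 and 100
print(estimates["***"])  # 0.5: mean over the whole population
```

With one individual, every schema it instantiates inherits that individual's score; as more instances accumulate, each schema's estimate becomes an average over more evidence.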

Genetic Algorithms
The profound insight: GAs make use of category information without explicit category definitions, explicit biases, or explicit reference to category information. A GA implicitly acts on categories through category instantiations.
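
For reference, the Schema Theorem is commonly stated as an inequality along the following lines (this is the form usually attributed to Holland and presented in Mitchell, 1996; the notation is not taken from these slides and should be read as an assumption of this note):

```latex
E\big[m(H, t+1)\big] \;\ge\; \frac{\hat{u}(H, t)}{\bar{f}(t)}\, m(H, t)
\left(1 - p_c \frac{d(H)}{l - 1}\right) (1 - p_m)^{o(H)}
```

Here m(H, t) is the number of instances of schema H in the population at generation t, \hat{u}(H, t) is the observed average fitness of those instances, \bar{f}(t) is the average fitness of the population, p_c and p_m are the crossover and mutation probabilities, d(H) is the defining length of H, o(H) is its order (number of fixed positions), and l is the string length. Roughly, short, low-order schemata with above-average observed fitness are expected to gain instances from one generation to the next, even though the algorithm never represents them explicitly.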

Genetic Algorithms
• taken in this light it is easier to see how GAs skip a great deal of the computational load through implicit parallelism
• Critical characteristics:
  • use a population of tokens (parallelism)
  • a selection process that targets / discovers salient / relevant dimensions of substructure within those tokens
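
Implicit parallelism can be made concrete by counting how many distinct schemata a small population samples at once. The three-string population below is an invented example and the function names are hypothetical.

```python
from itertools import product

def schemata_of(bits):
    # Each position either keeps its bit or is generalized to '*'.
    return ["".join(s) for s in product(*[(bit, "*") for bit in bits])]

def distinct_schemata(population):
    """Set of all schemata instantiated by at least one member of the population."""
    seen = set()
    for bits in population:
        seen.update(schemata_of(bits))
    return seen

population = ["101", "100", "011"]  # invented example population
sampled = distinct_schemata(population)
print(len(population), "tokens sample", len(sampled), "of", 3 ** 3, "schemata")
# prints: 3 tokens sample 18 of 27 schemata
```

Evaluating three tokens therefore yields evidence about eighteen categories. In general, n strings of length l sample between 2^l and n·2^l distinct schemata, depending on how much the strings overlap.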

Wealth of the Stimulus

Schema Theorem    GAs                 Acquisition
tokens            chromosomes         experience
evaluation        fitness function    learning
outcome           optimal solution    grammar

Wealth of the Stimulus
Experiences
• entire sensory experiences that include linguistic stimuli
• importantly, all sensory information impacts memory and is available to be correlated
• infants are exquisitely sensitive to detailed and correlated sensory information – at least until they learn what to ignore (Rovee-Collier, 1991)
• “population” because stored distributed within the same neural structures – continuous, not digital

Wealth of the Stimulus
Learning
• in most basic neural sense – continuous, correlative, passive
• reduces “sensory noise” – reinforces correlated multimodal sensory experience
• a type of “selection” process because salient dimensions emerge through the process

Wealth of the Stimulus
Grammar
• Schematic / analogical (following Tomasello, 2000; Hofstadter; and usage-based models)
• More subtle correlations, or higher-level correlations, will take more time to be distinguished from “noise” – results in a course of development
• Acquisitional prerequisites may exist, but it’s a mistake to believe that relevant information isn’t being collected long before certain phenomena appear – all input has a physiological impact

Wealth of the Stimulus
Traditional Progression
1. infants attend to phonetic features
2. allow access to phonological system
3. access to phonology allows access to words and short phrases
4. access to words gives access to syntax
• matches the observed developmental increase in grammatical complexity
• input is only informative to the linguistic module acquired at each stage
• linguistic evidence sets innate parameters
• serial, computationally expensive (thus UG)

Wealth of the Stimulus
Schema Theorem Based Progression
1. Every utterance an infant hears provides a tiny bit of information about the phonetics, phonotactics, phonology, morphology, word categories, syntax, tense and aspect system, pragmatics, semantic categories, deixis, references – every aspect of their language
• will also match the observed developmental increase in grammatical complexity
• input is informative to every aspect of language even though its contribution may not clearly surface or be attended to immediately
• parallel, computationally efficient, flexible, adaptable
• in line with what’s going on in other fields

Conclusion
A population of tokens implicitly carries exponentially more information about the set than the tokens themselves represent.
Parallel systems (of which GAs and the brain are examples) that act on that population can make use of category information that is not explicitly stated. Formal systems cannot.
Without changing our observations of the input, development, or the outcome, by taking a more biologically plausible perspective on the information processing going on, we can see that the linguistic environment is rich rather than impoverished.