Ling 411 – 17 (1) Learning (2) The proximity principle and “evolutionary learning”

Schedule of Presentations Tu Apr 13 Th Apr 15 Tu Apr 20 Th Apr 22 Delclos Planum Temp Banneyer Categories Ruby Tso Writing Bosley Synesthesia Rasmussen 2nd language Brown Bilingualism Tsai Tones

REVIEW Operations in relational networks § Relational networks are dynamic § Activation moves along lines and through nodes § Links have varying strengths • A stronger link carries more activation, other things being equal § All nodes operate on two principles: • Integration § Of incoming activation • Broadcasting § To other nodes
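The two node operations reviewed above can be illustrated with a small sketch. This is only an illustration under assumptions: the function names `integrate` and `broadcast`, the weighted-sum rule, and the numeric values are hypothetical and are not part of the lecture's model.

```python
def integrate(inputs, strengths):
    """Integration: sum incoming activation, each input weighted by its link strength."""
    return sum(a * w for a, w in zip(inputs, strengths))


def broadcast(activation, threshold, out_strengths):
    """Broadcasting: if the threshold is satisfied, send activation along every
    outgoing link; a stronger link carries more activation."""
    if activation < threshold:
        return {target: 0.0 for target in out_strengths}
    return {target: activation * w for target, w in out_strengths.items()}


# Two active inputs arrive over links of different strengths (assumed values).
level = integrate([1.0, 1.0], [0.5, 0.25])
print(level)
print(broadcast(level, threshold=0.5, out_strengths={"x": 1.0, "y": 0.5}))
```

With these assumed values the stronger outgoing link (to "x") carries twice the activation of the weaker one, other things being equal.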

Review Operation of the Network in terms of cortical columns § The linguistic system operates as distributed processing of multiple individual components • “Nodes” in an abstract model • These nodes are implemented as cortical columns § Columnar Functions • Integration: A column is activated if it receives enough activation from other columns § Can be activated to varying degrees § Can keep activation alive for a period of time • Broadcasting: An activated column transmits activation to other columns § Excitatory – contribution to higher level § Inhibitory – dampens competition at same level

Additional operations: Learning § Links get stronger when they are successfully used (Hebbian learning) • Learning consists of strengthening them • Hebb 1949 § Threshold adjustment • When a node is recruited its threshold increases • Otherwise, nodes would be too easily satisfied

Requirements that must be assumed (implied by the Hebbian learning principle) § Links get stronger when they are successfully used (Hebbian learning) • Learning consists of strengthening them § Prerequisites: • Initially, connection strengths are very weak § Term: Latent Links • They must be accompanied by nodes § Term: Latent Nodes • Latent nodes and latent connections must be available for learning anything learnable § The Abundance Hypothesis • Abundant latent links • Abundant latent nodes

Support for the abundance hypothesis § Abundance is a property of biological systems generally • Cf.: Acorns falling from an oak tree • Cf.: A sea tortoise lays thousands of eggs § Only a few will produce viable offspring • Cf. Edelman: “silent synapses” § The great preponderance of cortical synapses are “silent” (i.e., latent) • Electrical activity sent from a cell body to its axon travels to thousands of axon branches, even though only one or a few of them may lead to downstream activation

Learning – The Basic Process Latent nodes Latent links Dedicated nodes and links

Learning – The Basic Process Latent nodes Let these links get activated

Learning – The Basic Process Latent nodes Then these nodes will get activated

Learning – The Basic Process That will activate these links

Learning – The Basic Process This node gets enough activation to satisfy its threshold

Learning – The Basic Process This node is therefore recruited A B These links now get strengthened and the node’s threshold gets raised

Learning – The Basic Process This node is now dedicated to function AB AB A B

Learning Next time it gets activated it will send activation on these links to next level AB A B
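The recruitment sequence walked through in the preceding slides (latent links activated, threshold satisfied, links strengthened, threshold raised) can be sketched as a toy simulation. Everything numeric here is an assumption made for illustration: the starting strengths, the `gain` and `threshold_step` parameters, and the helper names `make_latent_node` and `present` are hypothetical.

```python
def make_latent_node():
    """A latent node with weak links from A, B and C (assumed starting values)."""
    return {
        "threshold": 0.5,
        "strengths": {"A": 0.25, "B": 0.25, "C": 0.25},
        "dedicated": False,
    }


def present(node, active_inputs, gain=0.5, threshold_step=0.5):
    """One trial: return True if the node's threshold is satisfied."""
    total = sum(node["strengths"][name] for name in active_inputs)
    if total < node["threshold"]:
        return False
    # Successful use: strengthen only the links that carried activation.
    for name in active_inputs:
        s = node["strengths"][name]
        node["strengths"][name] = s + gain * (1.0 - s)
    # Raise the threshold so the node is not too easily satisfied later.
    node["threshold"] += threshold_step
    node["dedicated"] = True
    return True


node = make_latent_node()
print(present(node, ["A"]))        # A alone does not satisfy the threshold
print(present(node, ["A", "B"]))   # A and B together recruit the node
print(node["strengths"], node["threshold"])
print(present(node, ["A"]))        # after recruitment, A alone is still insufficient
```

In this sketch the link from C stays at its latent strength, and the raised threshold means the node now responds to the combination AB rather than to either input alone.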

Learning: more terms AB Child nodes Potential Actual A B Parent nodes

Learning: Deductions from the basic process § Learning is generally bottom-up. § The knowledge structure as learned by the cognitive network is hierarchical — has multiple layers § Hierarchy and proximity: • Logically adjacent levels in a hierarchy can be expected to be locally adjacent § Excitatory connections are predominantly from one layer of a hierarchy to the next § Higher levels will tend to have larger numbers of nodes than lower levels

Learning in cortical networks: A Darwinian process § A trial-and-error process: • Thousands of possibilities available § The abundance hypothesis • Strengthen those few that succeed § “Neural Darwinism” (Edelman) § The abundance hypothesis • Needed to allow flexibility of learning • Abundant latent nodes § Must be present throughout cortex • Abundant latent connections of a node § Every node must have abundant latent links

Learning – Enhanced understanding § This “basic process” is not the full story § The nodes of this depiction: • Are they minicolumns, maxicolumns, or what? • Most likely, a bundle of contiguous columns • Perhaps usually a maxicolumn or hypercolumn

REVIEW Columns of different sizes § Minicolumn • Basic anatomically described unit • 70-110 neurons (avg 75-80) • Diameter barely more than that of pyramidal cell body (30-50 μ) § Maxicolumn (term used by Mountcastle) • Diameter 300-500 μ • Bundle of 100 or more contiguous minicolumns § Hypercolumn – up to 1 mm diameter • Can be long and narrow rather than cylindrical • Bundle of contiguous maxicolumns § Functional column • Intermediate between minicolumn and maxicolumn • A contiguous group of minicolumns

REVIEW Hypercolumns: Modules of maxicolumns A homotypical area in the temporal lobe of a macaque monkey

Functional columns vis-à-vis minicolumns and maxicolumns § Maxicolumn • About 100 minicolumns • About 300-500 microns in diameter § Functional column • A group of one to several contiguous minicolumns within a maxicolumn • Established during learning • Initially it might be an entire maxicolumn

Learning in a system with columns of different sizes § At early learning stage, maybe a whole hypercolumn gets recruited § Later, maxicolumns for further distinctions § Still later, functional columns as subcolumns within maxicolumns § New term: Supercolumn – a group of minicolumns of whatever size, hypercolumn, maxicolumn, functional column § Links between supercolumns will thus consist of multiple fibers

Question on cortical columns E-mail from Kelly Banneyer: …. I understand that a minicolumn is the smallest unit and maxicolumns are composed of minicolumns and functional columns are intermediate in size while hypercolumns are composed of several maxicolumns. I wonder if there can exist a minicolumn or functional column in the brain that is not part of a larger type of column. For example, I know that there exists hierarchical structure, but is there maybe some concept so exact and unrelated to anything else that a mini/functional column exists that is not part of a maxicolumn?

REVIEW Functional columns in phonological recognition: A hypothesis § Demisyllable (e.g., /de-/) activates a maxicolumn § Different functional columns within the maxicolumn for syllables with this demisyllable • /ded/, /deb/, /det/, /dek/, /den/, /del/

REVIEW Functional columns in phonological recognition: A hypothesis (Figure: a maxicolumn of ca. 100 minicolumns responding to [de-], divided into functional columns deb, ded, det, den, del, dek. Note that all respond to /de-/.)

REVIEW Phonological hypercolumns (a hypothesis) § Maybe we have • Hypercolumn of contiguous maxicolumns for /e/ • With maxicolumns for /de-/, /be-/, etc. • Each such maxicolumn subdivided into functional columns for different finals § /det/, /ded/, /den/, /deb/, /dem/, /dek/ § (N.B.: This is just a hypothesis) • Maybe someday soon we’ll be able to test with sensitive brain imaging

REVIEW Adjacent maxicolumns in phonological cortex? (Figure: a hypercolumn, i.e. a module of contiguous maxicolumns de-, te-, be-, pe-, ge-, ke-. Each of these maxicolumns is divided into functional columns. Note that the entire module responds to [-e-].)

REVIEW Adjacent maxicolumns in phonological cortex? (Figure: a module of six contiguous maxicolumns de-, te-, be-, pe-, ge-, ke-; the de- maxicolumn is divided into functional columns deb, ded, det, den, del, dek. The entire maxicolumn responds to [de-]. The entire module responds to [-e-].)

Revisit the diagram: Each node of the diagram represents a group of minicolumns – a supercolumn Latent supercolumns Bundles of latent links Dedicated supercolumns and links

Learning – The Basic Process Let these links get activated

Learning – The Basic Process: Refined view Then these supercolumns get activated

Learning – The Basic Process: Refined view That will activate these links

Learning – Refined view This supercolumn gets enough activation to satisfy its threshold

Learning – Refined view This supercolumn is recruited for function AB AB A B

Learning: Refined view Next time it gets activated it will send activation on these links to next level AB A B

Learning Refined view Can get subdivided for finer distinctions AB A B

A further enhancement § Minicolumns within a supercolumn have mutual horizontal excitatory connections § Therefore, some minicolumns can get activated from their neighbors even if they don’t receive activation from outside

Learning: Refined view AB A Hypercolumn composed of 3 maxicolumns Can get subdivided for finer distinctions B

Learning: refined view AB A If, later, C is activated along with A and B, then maxicolumn ABC is recruited for ABC B C

Learning: refined view And the connection from C to ABC is strengthened – it is no longer latent AB A ABC B C

Learning phonological distinctions: A hypothesis (Figure: a hypercolumn with maxicolumns de-, te-, be-, pe-, ge-, ke-; the de- maxicolumn contains functional columns deb, ded, det, den, del, dek.) 1. In learning, this hypercolumn gets established first, responding to [-e-] 2. It gets subdivided into maxicolumns for demisyllables 3. The maxicolumn gets divided into functional columns
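The hypothesized nesting of hypercolumn, maxicolumns and functional columns can be written down as a small data structure. This is a sketch under assumptions: the dictionary layout and the function name `responding_units` are hypothetical, and since the slides list finals only for de-, the other maxicolumns are left unsubdivided here.

```python
# Hypothetical encoding of the hypothesized hierarchy; names are illustrative.
HYPERCOLUMN_E = {
    "vowel": "e",
    "maxicolumns": {
        "de-": ["deb", "ded", "det", "den", "del", "dek"],
        # Finals are given in the slides only for de-; the rest stay unsubdivided.
        "te-": [], "be-": [], "pe-": [], "ge-": [], "ke-": [],
    },
}


def responding_units(syllable, hypercolumn=HYPERCOLUMN_E):
    """Return the units, from coarsest to finest, that respond to a syllable."""
    units = []
    if hypercolumn["vowel"] not in syllable:
        return units
    units.append("-" + hypercolumn["vowel"] + "-")        # whole hypercolumn
    for demisyllable, functional_columns in hypercolumn["maxicolumns"].items():
        if syllable.startswith(demisyllable.rstrip("-")):
            units.append(demisyllable)                    # one maxicolumn
            if syllable in functional_columns:
                units.append(syllable)                    # one functional column
    return units


print(responding_units("det"))  # ['-e-', 'de-', 'det']
print(responding_units("tek"))  # ['-e-', 'te-']
```

The second call shows the earlier learning stage in miniature: a maxicolumn that has not yet been subdivided responds only at the demisyllable level.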

Remaining problems – lateral inhibition § When a hypercolumn is first recruited, no lateral inhibition among its internal subdivisions § Later, when finer distinctions are learned, they get reinforced by lateral inhibition § Problem: How does this work?

REVIEW Hypothesis applied to conceptual categories § A whole maxicolumn gets activated for the category • Example: DRINKING-VESSEL § Different functional columns within the maxicolumn for subcategories • CUP, GLASS, etc. § Adjacent maxicolumns for categories related to DRINKING VESSEL • BOWL, JAR, etc.

Locating Functions: The Proximity Principle § Related functions tend to be in close proximity • If very closely related, they tend to be adjacent § Areas which integrate properties of different subsystems (e.g., different sensory modalities) tend to be in locations intermediate between those subsystems

Consequences of the Proximity Principle § Nodes in close competition will tend to be neighbors • And their mutual competition is preordained even though the properties they are destined to integrate will only be established through the learning process § Therefore, inhibitory connections should exist predominantly among nodes of the same hierarchical level • The presence of their mutual inhibitory connections could be genetically specified

Learning and the Proximity Principle § Start with the observation: • Related areas tend to be adjacent to each other § Primary auditory and Wernicke’s area § V1 and V2, etc. § Wernicke’s area and lexical-conceptual information – angular gyrus, SMG, MTG § Thus we have the ‘proximity principle’ § Question: Why – How to explain?

Two aspects of the proximity principle 1. A node that integrates a combination of properties of different subsystems can be expected to lie in a location intermediate between those subsystems 2. A node that integrates a combination of properties of the same subsystem should be within the same subsystem, and maximally close to the properties it integrates

How to Explain the Proximity Principle? § Factors responsible for observations of proximity in cortical structure 1. Economic necessity 2. Genetic factors 3. Experience – provides details of localization within the limits imposed by genetic factors

Proximity: Economic necessity § Question: Could a given column be connected to any other column anywhere in the cortex? § That would require a huge number of available latent connections § Way more than are present § Hence there are strict limits on intercolumn connectivity § Therefore, proximity is necessary just for economy of representation

Limits on intercolumn connectivity § Number of cortical minicolumns: • If 27 billion neurons in entire cortex • If avg. 77 neurons per minicolumn • Then about 350 million minicolumns in the cortex § Extent of available latent connections to other columns • Perhaps 35,000 to 350,000 • Do the math... § A given column has available latent connections to between 1/10,000 and 1/1,000 of the other columns in the cortex
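The arithmetic behind these estimates can be checked directly. The input figures are the ones given on the slide; the variable and function names are only illustrative.

```python
neurons_in_cortex = 27_000_000_000   # figure from the slide
neurons_per_minicolumn = 77          # average from the slide

minicolumns = neurons_in_cortex / neurons_per_minicolumn
print(f"minicolumns in cortex: about {minicolumns / 1e6:.0f} million")


def reachable_fraction(latent_connections, total_columns):
    """Fraction of all columns that one column could reach via its latent links."""
    return latent_connections / total_columns


for latent_connections in (35_000, 350_000):
    fraction = reachable_fraction(latent_connections, minicolumns)
    print(f"{latent_connections:,} latent connections: roughly 1/{1 / fraction:,.0f} of all columns")
```

The division gives a little over 350 million minicolumns, so the two connection counts correspond to roughly one ten-thousandth and one thousandth of the other columns.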

Locations of available latent connections § Local • Surrounding area • Horizontal connections (grey matter) § Intermediate • Short-distance fibers in white matter • For example from one gyrus to neighboring gyrus § Long-distance • Long-distance fiber bundles • At ends, considerable branching

The role of long-distance fibers § Arcuate fasciculus • Genetically determined • Limits location of phonological recognition area § Interhemispheric fibers • Also genetically determined • Wernicke’s area – RH homolog of W’s area • Broca’s area – RH homolog of B’s area • Etc.

Two Factors in Localization § Genetic factors determine general area for a particular type of knowledge § Within this general area the learning-based proximity factors select a more narrowly defined location § Thus the exact localization depends on experience of the individual § When part of the system is damaged, learning-based factors can take over and result in an abnormal location for a function – plasticity

Genetically determined proximity § Genetically-determined proximity would have developed over a long period of evolution • Many features are shared with other mammals § This process could be called ‘evolutionary learning’ § According to standard evolutionary theory... • A process of trial-and-error: § Trial: • Produce varieties § Error: • Most varieties will not survive/reproduce • The others – the best among them – are selected § Other genetic factors supplement proximity • Long-distance fiber bundles

Some innate factors relating to localization § Primary areas § Long-distance fiber bundles

Innate factors relating to primary areas § Location • Genetically determined locations § But there are exceptions • Malformation • Damage § Structure • Genetically determined structures adapted to sensory modality (they have to be where they are) § Heterotypical structures • Found in primary areas » Primary visual » Primary auditory

REVIEW A Heterotypical (i.e., genetically built-in) structure Visual motion perception An area in the posterior bank of the superior temporal sulcus of a macaque monkey (“V-5”) A heterotypical area 400-500 μ Albright et al. 1984

REVIEW A Heterotypical structure: Auditory areas in a cat’s cortex AAF – Anterior auditory field; A1 – Primary auditory field; PAF – Posterior auditory field; VPAF – Ventral posterior auditory field

Innate factors relating to localization § The primary areas § Long-distance fiber bundles • Interhemispheric – via corpus callosum • Longitudinal – from front to back § Arcuate fasciculus is part of the superior longitudinal fasciculus § They allow for exceptions to proximity • Areas closely related yet not neighboring

Applying the proximity principle § For both types (genetic and experience-based) we can make predictions of where various functions are most likely to be located, based on the proximity principle • Broca’s area near the inferior precentral gyrus • Wernicke’s area near the primary auditory area § Such predictions are possible even in cases where we don’t know whether genetics or learning is responsible • maybe both

Implications of the proximity principle § System level • Functionally related subsystems will tend to be close to one another • Neighboring subsystems will probably have related functions § Cortical column level • Nodes for similar functions should be physically close to one another • Nodes that are physically close to one another probably have similar functions § Therefore... • Neighboring nodes are likely to be competitors • They need to have mutually inhibitory connections
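Mutual inhibition among neighboring competitors can be illustrated with one round of a simple subtraction rule. This is a sketch only: the subtractive rule, the `inhibition` parameter, and the activation values are assumptions chosen for illustration, not a mechanism stated in the lecture.

```python
def compete(activations, inhibition=0.5):
    """One round of mutual inhibition among neighboring nodes.

    Each node's activation is reduced by a fraction of the summed activation
    of its neighbors; results are floored at zero.
    """
    total = sum(activations.values())
    return {
        name: max(0.0, a - inhibition * (total - a))
        for name, a in activations.items()
    }


# Three neighboring functional columns with assumed activation levels.
neighbors = {"det": 1.0, "ded": 0.5, "deb": 0.25}
print(compete(neighbors))  # {'det': 0.625, 'ded': 0.0, 'deb': 0.0}
```

In this toy case the most strongly activated neighbor survives the round while its competitors are suppressed, which is the sharpening effect that inhibitory connections among same-level nodes would provide.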

Deriving location from proximity hypothesis § The cortex has to provide for “decoding” speech input § Speech input enters the cortex in the primary auditory area § Results of the “decoding” (recognition of syllables etc.) are represented in Wernicke’s area § Why is Wernicke’s area where it is?

Speech Recognition in the Left Hemisphere Phonological Production Primary Auditory Area Phonological Recognition Wernicke’s Area

Exercise: Location of Wernicke’s area § Why is phonological recognition in the posterior superior temporal gyrus? • Alternatives to consider: § Anterior to primary auditory cortex • Advantage: would be close to phonological production § Inferior to primary auditory cortex § (There are two reasons)

Answer: Location of Wernicke’s area § Wernicke’s area pretty much has to be where it is to take advantage of the arcuate fasciculus § The location of W.’s area makes it close to angular gyrus, likely area for noun lemmas (morphemes and complex morphemes) § Also, close to SMG, presumed area for phonological monitoring • (Why? Because it is adjacent to primary somatosensory area)

More exercises § Explaining likely locations of morphemes • verb morphemes in the frontal lobe • noun morphemes in the angular gyrus and/or middle temporal gyrus § The dorsal (where) pathway of visual perception

Experience-based proximity § Can be expected to be operative • more at higher (more abstract) levels, less at lower levels • for areas of knowledge that have developed too recently for evolution to have played a role § Reading § Writing § Higher mathematics § Physics, computer technology, etc.

Innate features that support language § Columnar structure § Coding of frequencies in Heschl’s gyrus § Arcuate fasciculus § Interhemispheric connections (via corpus callosum) – e.g., connect Wernicke’s area with RH homolog § Spread of myelination from primary areas to successively higher levels § Left-hemisphere dominance for grammar etc.

end