Modular Neural Networks: SOM, ART, and CALM
Jaap Murre
University of Amsterdam
University of Maastricht
jaap@murre.com
http://www.neuromod.org

Modular neural networks
• Why modularity?
• Kohonen’s Self-Organizing Map (SOM)
• Grossberg’s Adaptive Resonance Theory (ART)
• Categorizing And Learning Module, CALM (Murre, Phaf, & Wolters, 1992)

Modularity: limitations on connectivity
[Figure: word nodes LAP, CAP, CAB and letter-position nodes L.., .A., ..P, ..B]

Modularity
• Scalability
• Re-use in design and evolution
• Coarse steering of development; learning provides fine structure
• Improved generalization because of fewer connections
• Strong evidence from neurobiology

Self-Organizing Maps (SOMs) Topological Representations

Map formation in the brain
• Topographic maps are omnipresent in the sensory regions of the brain
  – retinotopic maps: neurons ordered according to the location of their visual field on the retina
  – tonotopic maps: neurons ordered according to the tone to which they are sensitive
  – maps in somatosensory cortex: neurons ordered according to the body part to which they are sensitive
  – maps in motor cortex: neurons ordered according to the muscles they control

Auditory cortex has a tonotopic map that is hidden in the transverse temporal gyrus

Somatosensory maps

Somatosensory maps II © Kandel, Schwartz & Jessell, 1991

Many maps show continued plasticity Reorganization of sensory maps in primate cortex

Kohonen maps
• Teuvo Kohonen was the first to show how maps may develop
• Self-Organizing Maps (SOMs)
• Demonstration: the ordering of colors (colors are vectors in a 3-dimensional space of brightness, hue, and saturation)

Kohonen algorithm
• Finding the activity bubble
• Updating the weights for the nodes in the active bubble

Finding the activity bubble Lateral inhibition

Finding the activity bubble II
• Find the winner
• Activate all nodes in the neighbourhood of the winner

Updating the weights
• Move the weight vector of the winner towards the input vector
• Do the same for the active neighbourhood nodes
  → weight vectors of neighbouring nodes will start resembling each other

Simplest implementation
• Weight vectors and input patterns all have length 1 (i.e., $\sum_j (w_{ij})^2 = 1$)
• Find the node whose weight vector has minimal distance to the input vector: $\min_i \sum_j (a_j - w_{ij})^2$
• Activate all nodes within neighbourhood radius $N_t$
• Update the weights of active nodes by moving them towards the input vector:
  $\Delta w_{ij} = \alpha_t (a_j - w_{ij})$
  $w_{ij}(t+1) = w_{ij}(t) + \alpha_t (a_j - w_{ij}(t))$
  where $\alpha_t$ is the learning rate at time $t$
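
The steps on the "Simplest implementation" slide can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Kohonen's original code: the one-dimensional map, the linear decay of the learning rate and radius, the re-normalization after every update, and all names (`normalize`, `find_winner`, `train_som`, `alpha0`, `radius0`) are choices made for the example.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length (the slide assumes length-1 vectors)."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0 else [x / norm for x in v]


def find_winner(weights, a):
    """Return the index of the node whose weight vector is closest to input a."""
    def sq_dist(w):
        return sum((aj - wj) ** 2 for aj, wj in zip(a, w))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def train_som(inputs, n_nodes=10, epochs=50, alpha0=0.5, radius0=3, seed=0):
    """Train a 1-D map; alpha and radius shrink linearly (assumed schedule)."""
    rng = random.Random(seed)
    dim = len(inputs[0])
    weights = [normalize([rng.random() + 1e-9 for _ in range(dim)])
               for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        alpha = alpha0 * frac
        radius = int(round(radius0 * frac))
        for a in inputs:
            win = find_winner(weights, a)
            # The "activity bubble": the winner plus its neighbours on the map.
            lo, hi = max(0, win - radius), min(n_nodes - 1, win + radius)
            for i in range(lo, hi + 1):
                # w(t+1) = w(t) + alpha * (a - w(t)), then restore unit length.
                moved = [w + alpha * (aj - w) for w, aj in zip(weights[i], a)]
                weights[i] = normalize(moved)
    return weights
```

Calling `train_som` on a few normalized three-component vectors (for example colors) returns one unit-length weight vector per map node; `find_winner` then gives the map position of any new input.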

Results of Kohonen © Kohonen, 1982

Influence of neighbourhood radius Larger neighbourhood size leads to faster learning © Kohonen, 1982

Results II: the phonetic typewriter
humppila (Finnish)
© Kohonen, 1988

Conclusions for SOM
• Elegant
• Prime example of unsupervised learning
• Biologically relevant and plausible
• Very good at discovering structure:
  – discovering categories
  – mapping the input onto a topographic map

Adaptive Resonance Theory (ART) Stephen Grossberg (1976)

Grossberg’s ART
• Stability-Plasticity Dilemma
• How to disentangle overlapping patterns?

ART-1 Network

Phases of Classification
(a) Initial pattern
(b) Little support from F2
(c) Reset: second try starts
(d) Different category (F2) node gives sufficient support: resonance
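
The four phases above can be illustrated with a simplified ART-1 search loop. The slides do not give the equations, so this sketch assumes the commonly used fast-learning formulation for binary patterns: a choice function `overlap / (beta + |w|)`, a vigilance test `overlap / |I| >= vigilance`, and prototype learning by intersection. The names `art1_classify`, `vigilance`, and `beta` and their default values are hypothetical.

```python
def art1_classify(pattern, prototypes, vigilance=0.7, beta=0.5):
    """Present a binary pattern; return (category index, number of resets).

    `prototypes` is a list of binary lists and is updated in place.
    """
    n_ones = sum(pattern)
    if n_ones == 0:
        raise ValueError("pattern must contain at least one active bit")

    def overlap(w):
        return sum(p & q for p, q in zip(pattern, w))

    # (a) The pattern activates F2: rank category nodes by choice strength.
    order = sorted(
        range(len(prototypes)),
        key=lambda j: overlap(prototypes[j]) / (beta + sum(prototypes[j])),
        reverse=True,
    )
    resets = 0
    for j in order:
        # (b) Check how well the chosen prototype supports the input.
        if overlap(prototypes[j]) / n_ones >= vigilance:
            # (d) Resonance: learn by keeping only the shared bits.
            prototypes[j] = [p & q for p, q in zip(pattern, prototypes[j])]
            return j, resets
        # (c) Reset: this node is disabled and the next one is tried.
        resets += 1
    # No existing category resonates: recruit a new one.
    prototypes.append(list(pattern))
    return len(prototypes) - 1, resets
```

With prototypes `[1,0,0,0,0,0]` and `[1,1,1,1,1,1]`, the input `[1,1,1,1,0,0]` first selects the small prototype, fails the vigilance test, is reset, and then resonates with the second category.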

Categorizing And Learning Module (CALM) Murre, Phaf, and Wolters (1992)

CALM: Categorizing And Learning Module
• The CALM module is the basic unit in multimodular networks
• Categorizes arbitrary input activation patterns and retains this categorization over time
• CALM was developed for unsupervised learning but also works with supervision
• Motivated by psychological, biological, and practical considerations

Important design principles in CALM
• Modularity
• Novelty-dependent categorization and learning
• Wiring scheme inspired by the neocortical minicolumn

Elaboration versus activation
• Novelty-dependent categorization and learning derived from memory psychology (Graf and Mandler, 1984)
  – Elaboration learning: active formation of new associations
  – Activation learning: passive strengthening of pre-existing associations
• In CALM: the relative novelty of a pattern determines which type of learning takes place

How elaboration learning is implemented in CALM
• Novel pattern
  → Much competition
  → High activation of Arousal Node
  → High activation of External Node
  → High learning parameter
  → High noise amplitude on Representation Nodes
• Elaboration learning drives:
  – Self-induced noise
  – Self-induced learning
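
The last two links of this chain can be sketched numerically. The slides do not state the formulas, so the code assumes that the learning parameter is a base rate plus a term proportional to External Node activation, and that each Representation Node receives uniform random input scaled by that activation. The constants are taken from the parameter slide (`d`, `wµE`, `ER weight`); reading `d` as the base learning rate, and the function names, are assumptions.

```python
import random

D_BASE = 0.01   # "d" on the parameter slide, read here as the base learning rate
W_MU_E = 0.05   # "wµE": assumed scaling from External Node activation to learning rate
W_ER = 0.25     # "ER weight": assumed scaling of noise sent to Representation Nodes


def learning_rate(a_e):
    """Learning parameter for External Node activation a_e in [0, 1]."""
    return D_BASE + W_MU_E * a_e


def noise_to_r_nodes(a_e, n_nodes, rng):
    """One random input per Representation Node; amplitude grows with a_e."""
    return [W_ER * a_e * rng.random() for _ in range(n_nodes)]


rng = random.Random(0)
familiar = (learning_rate(0.0), noise_to_r_nodes(0.0, 3, rng))  # old pattern
novel = (learning_rate(1.0), noise_to_r_nodes(1.0, 3, rng))     # novel pattern
```

Under these assumptions a familiar pattern (no External Node activation) learns at the base rate with no noise, while a fully novel pattern learns at the highest rate and receives the strongest noise.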

Self-induced noise (cf. Boltzmann Machine)
• Non-specific activations from sub-cortical structures in cortex
• Optimal level of arousal for optimal learning performance (Yerkes-Dodson Law)
• Noise drives the search for new representations
• Noise breaks symmetry deadlocks
• Noise may lead to convergence in deeper attractors

Self-induced learning
• Possible role of hippocampus and basal forebrain (cf. modulatory system in TraceLink)
• Shift from implicit to explicit memory
• Remedy for the Stability-Plasticity Dilemma

Stability-Plasticity Dilemma or the Problem of Real-Time Learning
• “How can a learning system be designed to remain plastic, or adaptive, in response to significant events and yet remain stable in response to irrelevant events?” (Carpenter and Grossberg, 1988, p. 77)

Novelty-dependent categorization
• A novel pattern implies a search for new representations
• The search process is driven by novelty-dependent noise

Novelty-dependent learning
• Novel pattern: increased learning rate
• Old pattern: base-rate learning

Learning rule derived from Grossberg’s ART
• Extension of the Hebb Rule
• Increases and decreases in weight
• Only applied to excitatory connections (no sign changes allowed)
• Weights are bounded between 0 and 1
• Allows separation of complex patterns from their composing subpatterns
• In contrast to ART: weight change is influenced by weighted neighbor activations

CALM Learning Rule
[Equation: image not preserved; annotations below]
• Weight from node j to node i
• Neighbor activations k dampen the weight change
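
Since the equation itself did not survive on this slide, the following Python sketch assumes the form of the rule published in Murre, Phaf, & Wolters (1992): the weight change is the learning rate times the receiving node's activation times a Hebbian growth term minus a term in which weighted neighbor activations pull the weight down. `K` and `L` are taken from the parameter slide; the function name `calm_delta_w` and the argument layout are hypothetical.

```python
def calm_delta_w(w_i, acts, a_i, j, mu, K=1.0, L=1.0):
    """Change in the weight from input node j to node i (assumed rule form).

    w_i  : weights of all connections arriving at node i
    acts : activations of the sending nodes (same order as w_i)
    a_i  : activation of the receiving node i
    mu   : current (novelty-dependent) learning rate
    """
    # Weighted activations of the neighbors k != j dampen the change.
    neighbors = sum(w_i[k] * acts[k] for k in range(len(acts)) if k != j)
    growth = (K - w_i[j]) * acts[j]       # pushes w_ij towards the bound K
    damping = L * w_i[j] * neighbors      # pushes w_ij towards 0
    return mu * a_i * (growth - damping)
```

For example, with weights `[0.5, 0.5]`, sender activations `[1.0, 0.0]`, an active receiver, and `mu = 0.1`, the weight from the active sender grows while the weight from the silent sender shrinks, which is how a pattern can be separated from overlapping ones.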

Learning rule

Avoid neurobiologically implausible architectures
• Random organization of excitatory and inhibitory connections
• Learning may change a connection's sign
• Single nodes may give off both excitatory and inhibitory connections

Neurons form a dichotomy (Dale’s Law)
• Neurons involved in long-range connections in cortex give off excitatory connections
• Inhibitory neurons in cortex give off only inhibitory connections

CALM: Categorizing And Learning Module By Murre, Phaf, & Wolters (1992)

Activation rule
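
The slide shows the activation rule only as an image. As a hedged sketch, the code below assumes the rule given in Murre, Phaf, & Wolters (1992): activation decays by a factor `(1 - k)` and is then driven towards 1 by positive net input or towards 0 by negative net input, with the net input squashed so that activations stay between 0 and 1. The value `k = 0.05` comes from the parameter slide; the function name is hypothetical.

```python
def calm_activation(a, e, k=0.05):
    """Next activation of a node, given current activation a and net input e."""
    decayed = (1.0 - k) * a
    if e >= 0:
        # Excitatory input moves the node a fraction e/(1+e) of the way to 1.
        return decayed + e / (1.0 + e) * (1.0 - decayed)
    # Inhibitory input removes a fraction |e|/(1+|e|) of the decayed activation.
    return decayed + e / (1.0 - e) * decayed
```

Under this form a silent node with net input 1.0 jumps to 0.5, and a fully active node with no input decays to 0.95.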

CALM Parameters
Possible parameters for the CALM module. They do not need to be adjusted for each new architecture.
Parameter      Value
Up weight       0.5
Down weight    -1.2
Cross weight   10.0
Flat weight    -1.0
High weight    -0.6
Low weight      0.4
AE weight       1.0
ER weight       0.25
wµE             0.05
k               0.05
K               1.0
L               1.0
d               0.01

Main processes in the CALM module

Inhibition between nodes
• Example: inhibition in CALM