
Neural Networks Ellen Walker Hiram College

Connectionist Architectures
• Characterized by (Rich & Knight):
– Large number of very simple neuron-like processing elements
– Large number of weighted connections between these elements
– Highly parallel, distributed control
– Emphasis on automatic learning of internal representations (weights)

Basic Connectionist Unit (Perceptron)

Classes of Connectionist Architectures
• Constraint networks
– Positive and negative connections denote constraints between the values of nodes
– Weights set by programmer
• Layered networks
– Weights represent contribution from one intermediate value to the next
– Weights are learned using feedback

Hopfield Network
• A constraint network
• Every node is connected to every other node
– If the weight is 0, the connection doesn’t matter
• To use the network, set the values of the nodes and let the nodes adjust their values according to the weights.
• The “result” is the set of all values in the stabilized network.

Hopfield Network as CAM
• Nodes represent features of objects
• Compatible features support each other (weights > 0)
• Stable states (local minima) are “valid” interpretations
• Noise features (incompatible) will be suppressed (the network falls into the nearest stable state)

Hopfield Net Example

Relaxation
• Algorithm to find a stable state for a Hopfield network (serial or parallel):
– Pick a node
– Compute the sum of [incoming weights] * [neighbor values]
– If the sum > 0, set the node to 1; else set it to -1
• When values stop changing, the network is stable
• The result can depend on the order in which nodes are chosen
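To make the procedure concrete, here is a minimal Python sketch of serial relaxation, assuming ±1 node values; the 4-node weight matrix is an invented example, not one from the slides:

```python
import numpy as np

# Hypothetical 4-node symmetric weight matrix (zero diagonal: no self-connections).
# Positive weights link compatible nodes; negative weights link incompatible ones.
W = np.array([[ 0,  1,  1, -1],
              [ 1,  0,  1, -1],
              [ 1,  1,  0, -1],
              [-1, -1, -1,  0]])

def relax(W, state, max_sweeps=100, rng=None):
    """Serial relaxation: repeatedly pick a node, compute the weighted sum of its
    neighbors' values, and set it to 1 or -1, until no value changes."""
    rng = np.random.default_rng() if rng is None else rng
    state = np.array(state, dtype=int)
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(len(state)):   # node order can affect the result
            total = W[i] @ state                # [incoming weights] * [neighbor values]
            new_value = 1 if total > 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:                         # nothing changed: the network is stable
            break
    return state

# A noisy starting pattern falls into a nearby stable state,
# e.g. [1, 1, 1, -1] or [-1, -1, -1, 1] depending on the update order.
print(relax(W, [1, 1, -1, 1]))
```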

Line Labeling and Relaxation
• Given an object, each vertex constrains the labels of its connected lines

Hopfield Network for Labeling
Each gray box contains 4 mutually exclusive nodes (with negative links between them)
Lines denote positive links between compatible labels

Boltzmann Machine
• Alternative training method for a Hopfield network, based on simulated annealing
• Goal: to find the most stable state (rather than the nearest)
• The Boltzmann rule is probabilistic, based on the “temperature” of the system

Deterministic vs. Boltzmann
• Deterministic update rule
• Probabilistic update rule
– As the temperature decreases, the probabilistic rule approaches the deterministic one
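A sketch of the two rules side by side, assuming ±1 node values and the usual logistic form of the Boltzmann rule (the slides show the formulas as images, so the exact form here is an assumption):

```python
import math
import random

def deterministic_update(weighted_sum):
    """Hopfield rule: the node becomes 1 if its weighted input is positive, else -1."""
    return 1 if weighted_sum > 0 else -1

def boltzmann_update(weighted_sum, temperature):
    """Probabilistic rule (assumed logistic form): the node becomes 1 with a
    probability that depends on its weighted input and the current temperature."""
    p_on = 1.0 / (1.0 + math.exp(-weighted_sum / temperature))
    return 1 if random.random() < p_on else -1

# As the temperature decreases, the probabilistic rule approaches the deterministic one:
for T in (5.0, 1.0, 0.1):
    p = 1.0 / (1.0 + math.exp(-2.0 / T))        # P(node = 1) when the weighted input is 2
    print(f"T = {T}: P(node = 1) = {p:.3f}")    # 0.599, 0.881, then essentially 1.0
```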

Networks and Function Fitting
• We earlier talked about function fitting
– Finding a function that approximates a set of data so that:
• The function fits the data well
• The function generalizes to fit additional data

What Can a Neuron Compute?
• n inputs (i_0 = 1, i_1 … i_n)
• n+1 weights (w_0 … w_n)
• 1 output:
– 1 if g(i) > 0
– 0 if g(i) < 0
– g(i) = w_0·i_0 + w_1·i_1 + … + w_n·i_n
• g(i) = 0 defines a linear surface, and the output is 1 if the point lies above this surface
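A small Python sketch of this computation; the AND weights at the end are hand-picked for illustration, not taken from the slides:

```python
import numpy as np

def neuron_output(weights, inputs):
    """One neuron: inputs i_1..i_n plus the fixed input i_0 = 1, weights w_0..w_n.
    The output is 1 when g(i) = w_0*i_0 + ... + w_n*i_n > 0, and 0 otherwise."""
    i = np.concatenate(([1.0], inputs))   # prepend the constant input i_0 = 1
    g = weights @ i                       # g(i): a linear function of the inputs
    return 1 if g > 0 else 0

# Example: weights chosen so the neuron computes logical AND (fires only for 1, 1).
w = np.array([-1.5, 1.0, 1.0])
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", neuron_output(w, np.array([a, b], dtype=float)))
```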

Classification by a Neuron

Training a Neuron
1. Initialize weights randomly
2. Collect all misclassified examples
3. If there are none, we’re done
4. Else compute the gradient & update the weights
   1. Add all points that should have fired, subtract all points that should not have fired
   2. Add a constant C (0 < C < 1) times the gradient back to the weights
5. Repeat steps 2–4 until done (guaranteed to converge: the loop will end)
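A sketch of these steps in Python, under the assumption that the “gradient” is the signed sum of the misclassified examples as described above; the toy data at the end is invented:

```python
import numpy as np

def train_neuron(examples, labels, c=0.5, max_iters=1000, rng=None):
    """Batch perceptron training following the steps above.
    examples: input vectors (without the constant input); labels: 1 = should fire, 0 = should not."""
    rng = np.random.default_rng() if rng is None else rng
    X = np.hstack([np.ones((len(examples), 1)), examples])   # add the constant input i_0 = 1
    w = rng.normal(size=X.shape[1])                           # 1. initialize weights randomly
    for _ in range(max_iters):
        fired = (X @ w > 0).astype(int)
        wrong = fired != labels                                # 2. collect misclassified examples
        if not wrong.any():                                    # 3. none left: done
            return w
        # 4. gradient: add points that should have fired, subtract points that should not have
        gradient = (X[wrong & (labels == 1)].sum(axis=0)
                    - X[wrong & (labels == 0)].sum(axis=0))
        w = w + c * gradient                                   # add C * gradient (0 < C < 1)
    return w   # if the data is not linearly separable, the loop may hit max_iters instead

# Invented, linearly separable toy data: fire when x + y > 1.5.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 2]], dtype=float)
y = np.array([0, 0, 0, 1, 1])
print(train_neuron(X, y))
```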

Training Example

Perceptron Problem
• We have a model and a training algorithm, but we can only compute linearly separable functions!
• Most interesting functions are not linearly separable.
• Solution: use more than one line (multiple perceptrons)
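For example, XOR is not linearly separable, but it can be computed by two perceptrons (OR and NAND) feeding a third (AND). The weights below are hand-picked for illustration:

```python
import numpy as np

def step(weights, inputs):
    """Threshold unit: fires (outputs 1) when w_0 + w_1*i_1 + w_2*i_2 > 0."""
    return 1 if weights @ np.concatenate(([1.0], inputs)) > 0 else 0

w_or   = np.array([-0.5,  1.0,  1.0])   # fires if a OR b
w_nand = np.array([ 1.5, -1.0, -1.0])   # fires unless a AND b
w_and  = np.array([-1.5,  1.0,  1.0])   # fires if both hidden units fired

def xor(a, b):
    x = np.array([a, b], dtype=float)
    hidden = np.array([step(w_or, x), step(w_nand, x)], dtype=float)
    return step(w_and, hidden)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))    # 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0
```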

Multilayered Network
(Diagram: input, hidden, and output layers)
Layered, fully-connected (between layers), feed-forward

Backpropagation Training
• Compute a result: input -> hidden -> output
• Compute the error for each output node, based on the desired result
• Propagate the errors back: output -> hidden, hidden -> input
– Weights are adjusted using the gradient

Backpropagation Training (cont’d)
• Repeat the above for every example in the training set (one epoch)
• Repeat until a stopping criterion is reached:
– Good enough average performance on the training set
– Little enough change in the network
• Typically hundreds of epochs…
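A minimal batch backpropagation sketch with one hidden layer of sigmoid units, trained for a fixed number of epochs; the XOR task and all hyperparameters here are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def add_bias(A):
    """Append a constant 1 to every example (acts as the threshold/bias input)."""
    return np.hstack([A, np.ones((A.shape[0], 1))])

def train_backprop(X, targets, n_hidden=3, rate=0.5, epochs=10000, rng=None):
    """Forward pass (input -> hidden -> output), backward pass (output -> hidden -> input),
    repeated over the whole training set for many epochs."""
    rng = np.random.default_rng(0) if rng is None else rng
    W1 = rng.normal(scale=0.5, size=(X.shape[1] + 1, n_hidden))        # input -> hidden
    W2 = rng.normal(scale=0.5, size=(n_hidden + 1, targets.shape[1]))  # hidden -> output
    for _ in range(epochs):                          # one pass over all examples = one epoch
        hidden = sigmoid(add_bias(X) @ W1)           # forward: input -> hidden
        output = sigmoid(add_bias(hidden) @ W2)      # forward: hidden -> output
        out_delta = (targets - output) * output * (1 - output)        # output error
        hid_delta = (out_delta @ W2[:-1].T) * hidden * (1 - hidden)   # error propagated back
        W2 += rate * add_bias(hidden).T @ out_delta  # adjust weights along the gradient
        W1 += rate * add_bias(X).T @ hid_delta
    return W1, W2

# Illustrative task: XOR, which a single perceptron cannot compute.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)
W1, W2 = train_backprop(X, t)
outputs = sigmoid(add_bias(sigmoid(add_bias(X) @ W1)) @ W2)
print(np.round(outputs, 2))   # should end up close to [[0], [1], [1], [0]]
```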

Generalization
• If the network is trained correctly, results will generalize to unseen data
• If overtrained, the network will “memorize” the training data and give essentially random outputs on anything else
• Tricks to avoid memorization:
– Limit the number of hidden nodes
– Insert noise into the training data
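One way to realize the noise trick, sketched here with Gaussian noise added to the inputs each epoch (the noise level is a hypothetical choice, to be tuned for the data at hand):

```python
import numpy as np

rng = np.random.default_rng()

def noisy_copy(inputs, sigma=0.05):
    """Perturb the training inputs slightly so the network cannot fit them exactly."""
    return inputs + rng.normal(scale=sigma, size=inputs.shape)

# Inside the training loop, train on a fresh noisy copy each epoch, e.g.:
#   X_epoch = noisy_copy(X_train)
```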

Unsupervised Network Learning
• Kohonen network for classification

Training a Kohonen Network
• Create inhibitory links among the nodes of the output layer (“winner take all”)
• For each item in the training data:
– Determine an input vector
– Run the network and find the maximum output node
– Reinforce (increase) the weights to the maximum node
– Normalize the weights so they sum to 1
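A sketch of these steps, using a dot-product winner and simple normalization; the two-cluster data and all constants are invented for illustration:

```python
import numpy as np

def train_kohonen(data, n_outputs, rate=0.2, epochs=10, rng=None):
    """Winner-take-all training: for each input, find the output node with the
    largest response, reinforce its weights, then normalize them to sum to 1."""
    rng = np.random.default_rng(0) if rng is None else rng
    w = rng.random((n_outputs, data.shape[1]))
    w /= w.sum(axis=1, keepdims=True)          # start with normalized weight vectors
    for _ in range(epochs):
        for x in data:                         # each item in the training data
            winner = np.argmax(w @ x)          # run the network: maximum output node
            w[winner] += rate * x              # reinforce weights to the winning node
            w[winner] /= w[winner].sum()       # normalize so the weights sum to 1
    return w

# Invented data: two clusters of 2-D feature vectors, classified by two output nodes.
data = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
w = train_kohonen(data, n_outputs=2)
print(np.argmax(w @ data.T, axis=0))   # e.g. [1 1 0 0]: each cluster won by one node
```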

Representations in Networks
• Distributed representation
– Concept = pattern
– Examples: Hopfield, backpropagation
• Localist representation
– Concept = single node
– Example: Kohonen
• Distributed can be more robust, also more efficient