Topic 8: Machine Learning, Neural Networks

• Machine learning basics
• Types of learning
• Principles of learning for game design
• Learning distinctions
• How the (natural) brain works
• Neural networks
• Perceptrons
• Training a perceptron
• Evaluating a perceptron
• The problem with perceptrons

Reading: Champandard, Chapters 35, 17, 19; links to neural networks on the website; the JNNS manual

Machine learning basics

• Nature shows that organisms do better if they can change their behaviour in response to their environments, and thus profit from experience
• Learning is "any change in a system that allows it to perform better the second time on repetition of the same task or on another task drawn from the same population" (Simon, 1983)
• Researchers distinguish many different types of machine learning:
  - rote learning (memorisation) – have the machine store new behaviour-controlling data
  - taking advice (direct instruction) – communicate procedures to the machine, and have it follow them
  - learning from problem solving – generating new ways of solving problems from one's own experience, i.e. without an instructor or advisor
  - learning from examples (induction) – observe examples of solutions to problems, then adapt these to one's own problems

Types of learning (cont'd)

  - discovery (deduction) – like learning from problem solving, but focussed on logical inference from facts to answers
  - explanation-based learning – extracts the concept behind the explanation for one example, then generalises to other instances; requires domain-specific knowledge
  - analogy – similar to induction, but uses analogical knowledge structures, e.g. "infectious disease is like an enemy invasion"
  - neural nets and genetic algorithms (more on these later)
• Here we will take a fairly pragmatic, engineering approach, distinguishing two goals of learning: adaption vs optimisation
• adaption – the nature and goal of the problem changes over time, and the behaviour continues to change incrementally (dynamic)
• optimisation – attempts to find a solution to a known problem, usually by adjusting some parameters to an ideal value; the nature and goals of the problem do not change (static)

Principles of learning for game design

• Learn what? What might an animat need to learn to play better?
• Facts, e.g. "behind the gold door there's a dangerous monster"
• Behaviours, e.g. swimming when in water so as not to drown
• Sometimes it's not clear whether the knowledge we are interested in is about facts (declarative), behaviours (procedural), or both
• For example, shooting is partly about knowing where to aim, and partly about the skill required to aim properly
• From a technology point of view, it's possible to use a good representation system for both. For example, a frame can be used both to store facts and to launch actors, which generate behaviour
• To make sensible distinctions between these things, one has to get specific about a particular representation technique

Principles of learning for game design

• In practice, games developers generally allow factual knowledge acquired during the game to be queried by a process that's fixed at design-time, e.g. with a problem-solving RBS, the rules are learned but the reasoning process is programmed in
• Behaviours, on the other hand, are generally learned continuously and dynamically as the game is played, possibly with a different kind of AI technique, such as a neural network
• The important distinction here is at what stage of game development the learning is applied: offline, while the game is being written, vs online, while the game is being played
• Optimisation is most naturally associated with offline learning, while adaption is most naturally carried out online
• Both should be considered for any animat, but Champandard argues that offline learning is underused and should be considered first for many behaviours

More distinctions

• Another distinction is that of batch learning, in which facts or learned representations are extracted all at once from a historical record of many experiences...
• ...vs incremental learning, in which sample experiences are processed one by one, with small changes made to facts or learned representations at each step
• Batch and incremental learning are independent of online or offline methods (but batch learning is generally done offline, while incremental learning is generally done online)
• Yet another distinction is that of supervised vs unsupervised learning
• In supervised learning, example data is provided in the form of pairs mapping input to output, x -> y, where y is the desired output
• In unsupervised learning, general mathematical principles are applied to the input data so as to allow a representation to self-organise. This representation may then be interpreted by other AI methods for meaning

How the (natural) Brain Works

• A neural network can be thought of as a model of the human brain
• The brain consists of a very densely interconnected set of nerve cells, called neurons
• The human brain incorporates nearly 10 billion neurons and 60 trillion connections, called synapses, between them
• Our brain can be considered a highly complex, non-linear and massively parallel information processing system
• Information is stored and processed in a neural network simultaneously throughout the whole network, rather than at specific locations
• Learning is a fundamental and essential characteristic of biological neural networks
• The ease with which they can learn led to attempts to emulate a biological neural network in a machine

[Image: a2zpsychology.com]

How the (natural) Brain Works

[Figure: the neuron]

An artificial 'neuron'

• A simple mathematical model of the biological neuron (first proposed in 1943 by McCulloch & Pitts) consists of a summing function, an activation function (internal threshold), and "weighted" inputs, as shown below

[Figure: weighted inputs feeding a summing function, followed by an activation function]
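
A minimal sketch of this model in Python (the weights and zero threshold below are illustrative assumptions, not values from the slides):

    # A McCulloch & Pitts-style unit: weighted inputs, a summing function,
    # and a step activation (fire if the sum exceeds the threshold).
    def step(x, threshold=0.0):
        return 1 if x > threshold else 0

    def neuron(inputs, weights, threshold=0.0):
        total = sum(i * w for i, w in zip(inputs, weights))  # summing function
        return step(total, threshold)                        # activation function

    print(neuron([1, 0], [0.6, 0.6]))  # weighted sum 0.6 > 0, so the unit fires: 1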

Artificial Neural Networks

• An artificial neural network consists of a number of very simple processing units which are like the biological neurons in the brain. Today this is almost always simulated in software, using a conventional computer
• The 'neuron' units are connected by a network of weighted links passing signals (numbers) from one neuron to another
• The outgoing connection splits into a number of branches that all transmit the same output signal. The outgoing branches are the incoming connections of other neurons in the network
• Neural networks generally have inputs that come from sensory cells in the outside world. Others send their output to the outside world. Processor units deep in the network that are not connected to inputs or outputs are called hidden units
• The weighted links can be made to change systematically in response to patterns of input applied to the machine by means of an algorithm – hence the machine can exhibit a kind of learning
• One can experiment with different activation functions for the neurons, with different network architectures, and with different learning algorithms
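
To make the wiring concrete, here is a small sketch of signals passing through layered step units; the layer sizes and weights are invented for illustration, not taken from the slides:

    # Forward flow through input -> hidden -> output layers.
    # Each row of a weight matrix holds one unit's incoming link weights.
    def step(x):
        return 1 if x > 0 else 0

    def forward(layers, signal):
        for weights in layers:
            signal = [step(sum(w * s for w, s in zip(row, signal)))
                      for row in weights]
        return signal

    hidden = [[0.5, 0.5], [-0.5, 0.5]]   # two hidden units, two inputs each
    output = [[1.0, 1.0]]                # one output unit
    print(forward([hidden, output], [1, 0]))  # -> [1]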

Artificial Neural Networks

• A typical architecture is shown here:

[Figure: input patterns from sensors feed an input layer, which connects through a hidden layer to an output layer producing an output code for classes A, B, C and D]

• There are three (vertical) layers: input, output and one hidden layer. Each neuron in a layer is connected to the layer before it
• In a typical experiment, we might want the network to learn to classify patterns arriving from input, i.e. it would output a code representing the class it chose for its current input pattern
• Each processing unit would be set to compute a neuronal function
• First, the network would be put into training mode, meaning the weights between neurons would be allowed to change
• Then a set of training patterns – consisting of pairs (example inputs and their correct outputs) – would be applied, and the weights incrementally changed by a learning algorithm

Artificial Neural Networks

• After the network had been trained, the weights would be fixed, and the learning algorithm turned off – this is recognition mode
• Old or new patterns arriving at the inputs would then be allowed to generate a class code. The performance score of the network would be the proportion of patterns correctly classified
• For example, we might feed an 8 x 10 black-and-white 'retina' in as input (80 inputs), and ask the network to distinguish numbers, upper-case characters and lower-case characters

[Figure: an 8 x 10 'retina' of 80 inputs feeding the network, with example inputs 6, a and Q; the output code distinguishes numbers, lower-case letters, upper-case letters and 'no meaning']
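
A sketch of the performance score just described; `classify` here is a hypothetical stand-in for a trained, fixed-weight network:

    # Proportion of patterns correctly classified in recognition mode.
    def score(classify, patterns):
        """patterns: (input, expected_class) pairs; weights stay fixed."""
        correct = sum(1 for x, expected in patterns if classify(x) == expected)
        return correct / len(patterns)

    # e.g. score(trained_network, test_pairs) -> 0.95 would mean 95% correct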

Perceptrons

• The earliest type of network to be studied was the perceptron, a single layer of processing elements (Rosenblatt, 1959)

[Figure: a perceptron with inputs x1..x4, weights w1..w4 and bias w0, producing a single output y]

  Input vector x = [x1 ... xn]
  Weight vector w = [w0 ... wn] (w0 is a bias)
  Output vector y = [y1 ... ym] – a perceptron can have multiple outputs; here there is only one

• Data types for inputs and outputs may be binary (0, 1) or continuous (real numbers). Weights are usually continuous (typically 0.0 - 1.0)
• ANNs in FEAR use 32-bit floating point numbers for weights
• Now that we have a network architecture, we need to specify an activation function for neurons, an algorithm for recognition mode, and a weight-change algorithm for training mode

Perceptrons

[Figure]

Perceptron computations

• Operating the perceptron in recognition mode is done in 2 steps:
• 1) Combine the weighted inputs (compute the net sum ζ):

    ζ = w0 + Σ(i=1..n) wi xi

• 2) Apply the activation function σ to this sum to produce the output y:

    y = σ(ζ)

• The activation function here will be a step or threshold function (others are possible):

    σ(x) = 1 if x > 0, 0 otherwise

[Figure: plot of the step function σ(x), rising from 0 to 1 at x = 0]

• In practice, smoother, more continuous functions are preferred because they allow continuous outputs and make training easier
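
A short worked example of the two steps, with made-up weights and inputs:

    w0 = -0.5                # bias
    w = [0.5, 0.25]          # weights w1, w2
    x = [1, 1]               # input vector

    zeta = w0 + sum(wi * xi for wi, xi in zip(w, x))  # step 1: net sum
    y = 1 if zeta > 0 else 0                          # step 2: step activation
    print(zeta, y)                                    # 0.25 1 -- the unit fires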

Perceptron recognition mode

• So the following algorithm may be applied to simulate the output of a single perceptron. Assume initialised vectors input and weight; for convenience we treat w0 as a (fixed) input at input[0]

    netsum = 0
    for all i
        netsum = netsum + input[i] * weight[i]
    end
    output = activation(netsum)

• With a random set of weights, the perceptron will do a poor job of approximating the function we want. There will be an ideal set of weights that makes it approximate the required function well
• We find those weights by repeatedly clamping the input and output of the perceptron to a known good example pair, then somehow changing the weights to get closer to that mapping. This is supervised learning
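
The same pseudocode transcribed into runnable Python; the specific inputs and weights are invented for illustration:

    def activation(netsum):
        return 1 if netsum > 0 else 0   # step function from the previous slide

    def recognise(inputs, weights):
        """Simulate one perceptron: weighted sum of inputs, then activation."""
        netsum = 0.0
        for i in range(len(inputs)):
            netsum += inputs[i] * weights[i]
        return activation(netsum)

    inputs  = [1, 0.25, 0.75]          # inputs[0] = 1 carries the bias
    weights = [-0.5, 0.5, 1.0]         # weights[0] = w0 (the bias weight)
    print(recognise(inputs, weights))  # net sum 0.375 -> output 1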

Training a Perceptron

• One way of doing this is to use the delta rule to optimise the weights incrementally as good training pairs are applied
• The delta rule works on the principles that:
  - each weight contributes to the output by a variable amount
  - the scale of this contribution depends on the input
  - the error of the output can be assigned to (blamed on) the weights
• At each example pair, we compare the actual output of the perceptron, y, with the ideal output, t, which gives the relative error E. This is computed by an error function, commonly least mean square (LMS):

    E = ½ (t - y)²

• Then adjust each weight so that the actual output will be slightly closer to the ideal (lower LMS) next time
• If the error function is well-behaved (descends monotonically as we get closer to the ideal perceptron function) we can use this to learn
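
A tiny worked example of the LMS error for one training pair (the target and output values are made up):

    t = 1.0    # ideal output for this example pair
    y = 0.5    # actual perceptron output
    E = 0.5 * (t - y) ** 2
    print(E)   # 0.125 -- the error the delta rule will try to reduce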

Training a Perceptron

• The delta rule:

    Δwi = -η ∂E/∂wi = η xi (t - y),  where η is a small learning rate

• Algorithm to apply the delta rule. It uses an objective function – a test which determines the overall quality of the output (e.g. [total weight adjustment > some threshold], i.e. not much change in the weights) – to decide when to stop!

    initialise weights randomly
    while (objective function is unsatisfied)
        for each sample pair
            simulate the perceptron
            if result is in error          // e.g. E > some small value
                delta = desired - output
                for all inputs i
                    weight[i] += eta * delta * input[i]
                end for
            end if
        end for
    end while
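
Here is a runnable sketch of the algorithm, training a perceptron on the (linearly separable) OR function; the learning rate, the 100-epoch cap and the zero-errors stopping test are illustrative choices, not prescribed by the slides:

    import random

    def simulate(x, w):
        return 1 if sum(xi * wi for xi, wi in zip(x, w)) > 0 else 0

    # OR training pairs; x[0] = 1 carries the bias weight w[0]
    samples = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 1)]
    w = [random.uniform(-0.5, 0.5) for _ in range(3)]
    eta = 0.1

    for epoch in range(100):              # crude objective function: stop when
        errors = 0                        # an epoch makes no classification errors
        for x, desired in samples:
            delta = desired - simulate(x, w)
            if delta != 0:                # result is in error
                errors += 1
                for i in range(len(w)):
                    w[i] += eta * delta * x[i]
        if errors == 0:
            break

    print([simulate(x, w) for x, _ in samples])  # [0, 1, 1, 1] once converged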

The Problem with Perceptrons

• But there is an entire family of patterns which perceptrons cannot learn at all
• In 1969, Minsky & Papert published an influential book in which they proved mathematically that a perceptron cannot learn functions whose patterns are not linearly separable, e.g. XOR
• In essence, the argument from these classical AI supporters was that if a function as simple as XOR could not be learned, neural networks were not worth pursuing
• The book was very influential, and for a decade, funding for neural net projects all but evaporated
• With an extra (hidden) layer, a neural network could handle such functions – but then there would be no way of computing an error term for the weights of the hidden layer, so the network could not be made to learn
• That is, there was no known solution to the credit assignment problem
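
A short sketch makes the failure concrete: the same delta-rule trainer that converges on OR above never settles on XOR, because no single line separates the two output classes (the numeric choices here are illustrative):

    import random

    def simulate(x, w):
        return 1 if sum(xi * wi for xi, wi in zip(x, w)) > 0 else 0

    # XOR training pairs; x[0] = 1 carries the bias weight
    samples = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 0)]
    w = [random.uniform(-0.5, 0.5) for _ in range(3)]

    for epoch in range(1000):
        errors = 0
        for x, desired in samples:
            delta = desired - simulate(x, w)
            if delta != 0:
                errors += 1
                for i in range(len(w)):
                    w[i] += 0.1 * delta * x[i]
        if errors == 0:
            break

    print(errors)  # never 0: some XOR pattern is always misclassified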

References

• Minsky, M. & Papert, S. Perceptrons: An Introduction to Computational Geometry. Cambridge, MA: MIT Press, 1969.
• McCulloch, W. & Pitts, W. "A logical calculus of the ideas immanent in nervous activity." Bulletin of Mathematical Biophysics, 1943, 5: 115-133.
• Rosenblatt, F. Principles of Neurodynamics. Spartan Books, 1958.
• Simon, H. "Why should machines learn?", in Machine Learning: An Artificial Intelligence Approach, R. S. Michalski, J. G. Carbonell & T. M. Mitchell (eds.), Berlin: Springer-Verlag, 1983, pp. 25-37.