Holland Goodman Caltech Banbury 2001 Holland Goodman Caltech
Holland Goodman – Caltech – Banbury 2001
When an autonomous embodied system, with a difficult animal-like mission in a difficult environment, has a sufficiently high level of intelligence (i.e. is able to achieve that mission well), then it may exhibit consciousness, either as a necessary component for achieving the mission, or as a by-product.
Robots
A Simple Robot
• The Khepera miniature robot
• Features
  • 8 IR sensors, which allow it to detect objects
  • Two independently controlled motors
• Dimension label from the figure: 5.5 cm
Webots – Khepera
• Embodied simulators allow faster operation than real robots, particularly if learning is involved.
• Simulator complexity is acceptable for a simple robot like the Khepera, but for more complex robots the simulator may be too complex, or may not simulate the real world accurately.
A Generic Robot Controller Architecture
[Figure: Recurrent Neural Machine. Labels: sensory inputs (including motors and effectors), INPUT UNITS, HIDDEN UNITS, STATE UNITS, OUTPUT UNITS, controller outputs to motors and effectors.]
• The controller of the robot is an artificial neural network with recurrent feedback, capable of forming internal representations of sensory information in the form of a neural state machine.
• Sensory inputs (vision, sound, smell, etc.) from sensors are fed to this structure.
• Sensory inputs also include feedback from the motors and effectors.
• Controller outputs drive the locomotion and manipulators of the robot.
• The neural controller learns to perform a task, using neural network and genetic algorithm techniques.
• But the internal model of the controller is implicit and therefore hidden from us.
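To make the input, hidden, state, and output units concrete, here is a minimal sketch of one tick of an Elman-style recurrent network in plain Python. It is only an illustration under assumptions: the class name, layer sizes, tanh activations, and random untrained weights are all hypothetical, and the training by neural network and genetic algorithm techniques mentioned on the slide is not shown.

```python
import math
import random


class RecurrentController:
    """Tiny Elman-style network: hidden activity is fed back as 'state units'.

    Hypothetical illustration; weights are random and untrained.
    """

    def __init__(self, n_inputs, n_hidden, n_outputs, seed=0):
        rng = random.Random(seed)

        def matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.w_in = matrix(n_hidden, n_inputs)     # input units -> hidden units
        self.w_state = matrix(n_hidden, n_hidden)  # state units -> hidden units
        self.w_out = matrix(n_outputs, n_hidden)   # hidden units -> output units
        self.state = [0.0] * n_hidden              # copy of previous hidden activity

    def step(self, sensors):
        """Advance one tick: sensors (incl. motor feedback) in, motor drives out."""
        hidden = []
        for row_in, row_state in zip(self.w_in, self.w_state):
            total = sum(w * x for w, x in zip(row_in, sensors))
            total += sum(w * s for w, s in zip(row_state, self.state))
            hidden.append(math.tanh(total))
        self.state = hidden  # becomes the state-unit input on the next tick
        return [math.tanh(sum(w * h for w, h in zip(row, hidden))) for row in self.w_out]


controller = RecurrentController(n_inputs=10, n_hidden=6, n_outputs=2)
reading = [0.1] * 8 + [0.0, 0.0]   # 8 IR values plus 2 motor feedback values
first = controller.step(reading)
second = controller.step(reading)  # same input again, but the internal state differs
```

Feeding the same reading twice generally gives different outputs, because the state units carry the previous hidden activity forward. That carried-over state is the implicit internal model the slide refers to.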
Understanding the internal model
• Introduce a second recurrent neural network, separate from the first system, which learns the inverse relationship between the internal activity of the controller and the sensory input space.
[Figure: forward Recurrent Neural Machine (INPUT UNITS, HIDDEN UNITS, STATE UNITS, OUTPUT UNITS) with sensory inputs, including feedback from motors and effectors, and motor and effector drive outputs; an INVERSE Recurrent Neural Machine whose outputs (labelled OBSERVE) are in the same sensory space as the inputs of the forward controller.]
• This mechanism will allow us to represent the hidden internal state of the controller in terms of the sensory inputs that correspond to that state.
• Thus we may claim to know something of “what the robot is thinking”.
• We assume that the controller is learned first, and that, once it is learned and reasonably stable, the inverse can be learned.
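As a rough illustration of learning an inverse from internal state back to sensory space, the sketch below fits a plain linear map with the delta (LMS) rule. This is a deliberately simplified stand-in, not the recurrent inverse network the slide describes; the function names and the recorded (state, sensors) pairs are hypothetical.

```python
def train_linear_inverse(pairs, n_state, n_sensors, rate=0.1, epochs=200):
    """Fit weights mapping controller state -> sensory vector (delta rule)."""
    weights = [[0.0] * n_state for _ in range(n_sensors)]
    for _ in range(epochs):
        for state, sensors in pairs:
            for i in range(n_sensors):
                predicted = sum(w * s for w, s in zip(weights[i], state))
                error = sensors[i] - predicted
                for j in range(n_state):
                    weights[i][j] += rate * error * state[j]
    return weights


def invert(weights, state):
    """Express a controller state as the sensory vector that corresponds to it."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


# Hypothetical recorded (hidden state, sensory input) pairs from a stable controller.
recorded = [
    ([1.0, 0.0], [0.9, 0.1, 0.0]),
    ([0.0, 1.0], [0.0, 0.2, 0.8]),
]
inverse_weights = train_linear_inverse(recorded, n_state=2, n_sensors=3)
imagined = invert(inverse_weights, [1.0, 0.0])  # close to [0.9, 0.1, 0.0]
```

The forward controller is held fixed while the inverse is trained, matching the slide's assumption that the controller is learned first and the inverse afterwards.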
Simplified Inverse
• In this experiment, we utilize a controller model which is much less powerful than the recurrent controllers described above, but allows us to illustrate the principle, and in particular makes “inversion” of the forward controller extremely simple.
• The crucial simplification we make is that the controller will learn its representation directly in the input space. Thus there is no inverse to learn: the internal representation learned by the robot is directly visible as an input space vector.
• The first phase is to learn or program the forward model or robot controller. In this simple experiment we program in a simple reactive wall-following behavior, rather than learn a complex behavior. The robot starts with no internal model, and adaptively learns its internal representation in an unsupervised manner as it performs its wall-following behavior.
The Learning Algorithm (based on the Linaker and Niklasson 2000 ARAVQ algorithm)
• A 10-dimensional feature space is formed from the 8 Khepera IR sensor signals plus the 2 motor drive signals.
• Clusters feature vectors by change detection, to form prototype feature vector “models”.
• Unsupervised
• Adds new models based on two criteria:
  • Novelty: large distance from existing models
  • Stability: low variance in buffered history of features
• Adapts existing models over time
• We program in a simple “wall following” behavior to act as a “teacher”.
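The two criteria above can be sketched as a small change-detecting vector quantizer. This is a simplified illustration in the spirit of the slide, not the published ARAVQ algorithm: the class name, the thresholds, the buffer size, and the exact stability and novelty measures are all assumptions chosen for readability.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class ChangeDetectingQuantizer:
    """Hypothetical simplification: add a model when input is stable AND novel."""

    def __init__(self, buffer_size=5, novelty=0.5, stability=0.1, rate=0.05):
        self.buffer_size = buffer_size
        self.novelty = novelty      # minimum distance from every existing model
        self.stability = stability  # maximum spread allowed in the buffer
        self.rate = rate            # adaptation rate for existing models
        self.buffer = []
        self.models = []

    def update(self, features):
        """Feed one feature vector; return the winning model index, or None."""
        self.buffer.append(list(features))
        if len(self.buffer) > self.buffer_size:
            self.buffer.pop(0)
        if len(self.buffer) < self.buffer_size:
            return None  # not enough history yet

        dims = len(features)
        mean = [sum(v[i] for v in self.buffer) / self.buffer_size for i in range(dims)]
        # Stability: average spread of the buffered history around its mean.
        spread = sum(distance(v, mean) for v in self.buffer) / self.buffer_size
        # Novelty: distance from the buffered mean to the nearest existing model.
        winner, nearest = None, float("inf")
        for k, model in enumerate(self.models):
            d = distance(model, mean)
            if d < nearest:
                winner, nearest = k, d

        if spread < self.stability and nearest > self.novelty:
            self.models.append(mean)  # stable and novel: new prototype
            return len(self.models) - 1
        if winner is not None and spread < self.stability:
            model = self.models[winner]  # stable but familiar: adapt the winner
            for i in range(dims):
                model[i] += self.rate * (mean[i] - model[i])
        return winner


# Hypothetical 10-dimensional features: 8 IR values followed by 2 motor drives.
right_wall = [1.0, 0, 0, 0, 0, 0, 0, 0.0, 0.5, 0.5]
corridor = [1.0, 0, 0, 0, 0, 0, 0, 1.0, 0.5, 0.5]
quantizer = ChangeDetectingQuantizer()
labels = [quantizer.update(v) for v in [right_wall] * 10 + [corridor] * 10]
```

In this toy run a second model appears only once the buffer has settled on the new situation, so the transition ticks are still labelled with the old model. Requiring stability is what keeps brief transients from becoming prototypes of their own.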
Learning in action
Colors show learned concepts:
• Black – right wall
• Blue – ahead wall
• Green – 45 degree right wall
• Red – corridor
• Light Blue – outside corner
Running with the model
• Switch off the wall follower.
• The robot “sees” features as it moves.
• Choose the closest learned model vector at each tick.
• Use the model vector's motor drive values to actually drive the motors.
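The run-time loop above amounts to a nearest-prototype lookup followed by reading off the motor part of the winning vector. The sketch below assumes the match is made over the full 10-dimensional vector and that the last two components are the motor drives; the function names and the example vectors are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance (enough for picking the nearest vector)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_model(models, features):
    """Return the learned model vector nearest to the current features."""
    return min(models, key=lambda model: squared_distance(model, features))


def motor_command(models, features, n_sensors=8):
    """Pick the winning model and return its motor-drive components."""
    winner = closest_model(models, features)
    return winner[n_sensors:]


# Hypothetical learned models: 8 IR values followed by 2 motor drive values.
models = [
    [0.9, 0, 0, 0, 0, 0, 0, 0.0, 0.4, 0.6],  # e.g. "right wall"
    [0.9, 0, 0, 0, 0, 0, 0, 0.9, 0.5, 0.5],  # e.g. "corridor"
]
features = [0.8, 0, 0, 0, 0, 0, 0, 0.7, 0.5, 0.5]
command = motor_command(models, features)  # motor drives of the "corridor" model
```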
Running with the model
Color indicates which is the current “best” model feature.
Run the model in the real robot
Invert the motor signals back to sensory signals to infer an egocentric “map” of the environment as “seen” by the robot.
Keeping it Real
• Mapping with the real robot
Manipulating the model “mentally” to make a decision - “planning”
• Take the sequence of learned model feature vectors and cluster sub-sequences into higher-level concepts.
• For example:
  • Blue-Green-Black = Left Corner
  • Red = Corridor
  • Black = Right Wall
• At any instant, ask the robot to go “home”.
• Run the model forwards mentally to decide if it is shorter to go ahead or to go back.
• Take appropriate action.
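One way to picture the “ahead or back” decision is to treat the learned higher-level concepts as a closed loop and count steps to home in each direction. This is a hedged sketch only: the assumption that the route is cyclic, the one-step-per-concept cost, the concept labels, and the function names are all hypothetical and not taken from the slides.

```python
def steps_to_goal(route, start, goal):
    """Mentally step forwards along the route and count ticks until the goal."""
    steps = 0
    position = start
    while route[position] != goal:
        position = (position + 1) % len(route)  # assumed closed loop
        steps += 1
        if steps > len(route):
            raise ValueError("goal concept not found in the learned route")
    return steps


def decide(route, start, goal):
    """Return 'ahead' or 'back', whichever reaches the goal in fewer steps."""
    ahead = steps_to_goal(route, start, goal)
    # Walking forwards through the reversed route is walking backwards.
    behind = steps_to_goal(route[::-1], len(route) - 1 - start, goal)
    return "ahead" if ahead <= behind else "back"


# Hypothetical loop of higher-level concepts learned while wall following.
route = ["corridor", "left corner", "right wall",
         "corridor corner", "right wall", "right wall"]
choice_near_start = decide(route, 1, "corridor corner")  # 2 ahead vs 4 back
choice_past_home = decide(route, 4, "corridor corner")   # 5 ahead vs 1 back
```

The resulting choice could then drive the action shown on the robot, for example rotating when home is behind and continuing when home is ahead.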
Decision Time
• Corridor corner is home
• Rotate = Home is behind me
• Flash LEDs = Home is ahead of me
Inverse Predictor Architecture
[Figure labels: CONTROLLER, INVERSE, Switch (twice), Real World Sense Signals, Model World Sense Signals, Motor Signals To Real Robot.]
• We now allow the inverse to be fed back into the controller via the switch.
• Thus the controller has an image of its internal hidden state or “self” in the same feature space as its real sensory inputs.
• Thus it can “see” what it “itself” is thinking.
• As before, “we” can also observe what the machine is “thinking”.
Consequences of the architecture
• In “normal” mode, the controller is producing motor signals based on the sensory input it “sees” (including motor/effector feedback). Normally we expect to see what it is seeing. The inverse allows for detecting a mismatch between a predicted and an actual sensory input, thus indicating a novel experience, which in turn could focus attention and learning in the main controller. Noisy, ambiguous, and partial inputs can be “completed”.
• In “thinking or planning” mode, the real world is disconnected from the controller input, and the mental images being output by the inverse are input to the controller instead. Thus sequences of planned action towards a goal can take place in mental space, and then be executed as action. Note that by switching between normal mode and “thinking” mode in some way, we can emulate the robot doing both reactive control and thinking at the same time (really, multiplexed), much as humans do when driving a car on “automatic” while “thinking” of something else.
• In “sleeping” mode, we shut off the sensory input and allow noise to be input. Then the inverse will output “mental images”, which themselves can be fed back into the input (because they have the same representation), producing a complex series of “imagined” mental images or “dreams”. Note that we can use this “sleeping” mode to actually learn (or at least update) the inverse. The input noise vector is a “sensory input” vector like any other (whether it is structured accordingly or not), thus the inverse should be able to output this vector like any other from the state and motor signals. Thus we can use the error to update the inverse.
• If we do not disconnect the motors during “dreaming”, we will have “sleepwalking” or “twitching”.
• If we assume that the controller is continually learning, then the inverse must be continually updated. If they get too far out of synchronization, we could get irrational sequences in “thinking” mode or, worse, in execution mode: an analog of “madness”.
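The switch between modes and the mismatch check can be sketched in a few lines. This is an illustrative assumption about how the switch might be coded, not the authors' implementation: the mode names follow the slide, while the function names, the uniform noise source, and the mismatch threshold are hypothetical.

```python
import random


def controller_input(mode, real_sensors, imagined_sensors, rng=random):
    """Select what the controller 'sees' on this tick, depending on the mode."""
    if mode == "normal":
        return real_sensors          # real world drives the controller
    if mode == "thinking":
        return imagined_sensors      # inverse output replaces the real world
    if mode == "sleeping":
        return [rng.random() for _ in real_sensors]  # noise replaces the senses
    raise ValueError(f"unknown mode: {mode}")


def is_novel(real_sensors, predicted_sensors, threshold=0.2):
    """Flag a mismatch between the predicted and the actual sensory input."""
    error = max(abs(r - p) for r, p in zip(real_sensors, predicted_sensors))
    return error > threshold


seen = controller_input("normal", [0.1, 0.9], [0.1, 0.5])
thought = controller_input("thinking", [0.1, 0.9], [0.1, 0.5])
surprised = is_novel([0.1, 0.9], [0.1, 0.5])   # large mismatch on one sensor
```

Alternating the mode argument between "normal" and "thinking" on successive ticks would give the multiplexed behavior described above.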
Where’s the Consciousness?
• Not there yet
• More complex robots
• More complex environments
• More complex architecture
SONY DREAM ROBOT
• Head: 2 degrees of freedom
• Body: 2 degrees of freedom
• Arms: 4 degrees of freedom (x 2)
• Legs: 6 degrees of freedom (x 2)
• Total of 24 degrees of freedom
Increasing complexity
Environment:
• Fixed environment
• Moving objects
• Movable objects
• Objects with different values
• Other agents – prey
• Other agents – predators
• Other agents – competitors
• Other agents – collaborators
• Other agents – mates
• Etc.
Agent:
• Movable body
• More sensors
• Effectors
• Articulated body
• Metabolic state
• Acquired skills
• Tools
• Imitative learning
• Language
• Etc.
Multi-stage planning
At each step:
- what actions could it take?
- what actions should it take?
- what actions would it take?
The planning system needs:
- a good and current model of the world
- a good and current model of the agent’s abilities, expressible in terms of their effects on the model world
- an associated executive system to use the information generated by the planning system
A framework?
[Figure: block diagram with labels “Self Model”, “Environment Model”, “Updates” (twice), and “To executive”.]
Speculation…
There may be something it is like to be such a self-model linked to such a world model in a robot with a mission.