Toward Grounding Knowledge in Prediction, or Toward a Computational Theory of Artificial Intelligence
Rich Sutton, AT&T Labs
with thanks to Satinder Singh and Doina Precup
It’s Hard to Build Large AI Systems
• Brittleness
• Unforeseen interactions
• Scaling
• Requires too much manual complexity management
  – people must understand, intervene, patch, and tune
  – like programming
• Need more autonomy
  – learning, verification
  – internal coherence of knowledge and experience
Marr’s Three Levels of Understanding
• Marr proposed three levels at which any information-processing machine must be understood
  – Computational Theory Level: what is computed and why
  – Representation and Algorithm Level
  – Hardware Implementation Level
• We have little computational theory for intelligence
  – many methods for knowledge representation, but no theory of knowledge
  – no clear problem definition
  – logic
Reinforcement Learning Provides a Little Computational Theory
• Policies (controllers): π : States → Pr(Actions)
• Value functions
• 1-step models
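The three objects above can be sketched concretely. This is a minimal illustration in Python (all names and the toy MDP are made up, not from the talk): a stochastic policy over states, a value function, and a 1-step model mapping each state–action pair to an expected reward and a next-state distribution, tied together by one sweep of value iteration.

```python
import random

def policy(state):
    """pi: States -> Pr(Actions); here a fixed stochastic choice."""
    return {0: 0.7, 1: 0.3}  # action -> probability

def sample_action(state):
    """Draw an action from the policy's distribution."""
    dist = policy(state)
    r, cum = random.random(), 0.0
    for a, p in dist.items():
        cum += p
        if r < cum:
            return a
    return a

# Value function: expected total (discounted) reward from each state.
V = {0: 0.0, 1: 0.0}

# 1-step model: (state, action) -> (expected reward, next-state distribution).
model = {(0, 0): (1.0, {1: 1.0}),
         (0, 1): (0.0, {0: 1.0}),
         (1, 0): (0.0, {0: 1.0}),
         (1, 1): (2.0, {1: 1.0})}

# One sweep of value iteration ties the three objects together:
gamma = 0.9
for s in list(V):
    V[s] = max(r + gamma * sum(p * V[s2] for s2, p in dist.items())
               for (s0, a), (r, dist) in model.items() if s0 == s)
```

After the sweep, V holds the one-sweep backed-up values computed from the model alone, with no interaction with a real environment.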
Outline of Talk
• Experience
• Knowledge = Prediction
• Macro-Predictions
• Mental Simulation
…offering a coherent candidate computational theory of intelligence
Experience
• An AI agent should be embedded in an ongoing interaction with a world
  [diagram: Agent → actions → World → observations → Agent]
• Experience = these two time series
• Enables a clear definition of the AI problem
  – let {reward_t} be a function of {observation_t}
  – choose actions to maximize total reward (cf. textbook definitions)
• Experience provides something for knowledge to be about
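The agent–world interface and problem definition above can be sketched as a loop. This is an illustrative Python fragment (the world dynamics and reward function are invented for the example): experience is nothing but the two interleaved time series of actions and observations, and reward is defined purely as a function of observations.

```python
import random

def world_step(action):
    """Black-box world: returns an observation in response to an action."""
    return action + random.choice([0, 1])  # toy dynamics

def reward(observation):
    """{reward_t} is defined as a function of {observation_t}."""
    return 1.0 if observation >= 1 else 0.0

actions, observations, total_reward = [], [], 0.0
for t in range(100):
    a = random.choice([0, 1])   # agent chooses an action
    o = world_step(a)           # world responds with an observation
    actions.append(a)
    observations.append(o)
    total_reward += reward(o)   # the quantity the agent should maximize
```

Everything the agent can ever know about this world must be expressible in terms of the two lists `actions` and `observations`.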
What is Knowledge?
• Deny the physical world
• Deny existence of objects, people, space…
• Deny all non-answers, correspondence theories
• All we really know about is our experience
Knowledge must be in terms of experience
Grounded Knowledge
• “A is always followed by B”
  – if o_t = A then o_{t+1} = B    (A, B observations)
  – if A(o_t) then B(o_{t+1})    (A, B predicates)
• Action conditioning: if A(o_t) and C(a_t) then B(o_{t+1})
All of these are predictions.
World Knowledge = Predictions
• The world is a black box, known only by its I/O behavior (observations in response to actions)
• Therefore, all meaningful statements about the world are statements about the observations it generates
• The only observations worth talking about are future ones
Therefore: the only meaningful things to say about the world are predictions
Non-predictive “Knowledge”
• Mathematical knowledge, theorems and proofs
  – always true, but tell us nothing about the world
  – not world knowledge
• Uninterpreted signals, e.g., useful representations
  – real and useful, but not by themselves world knowledge, only an aid to acquiring it
• Knowledge of the past
• Policies
  – could be viewed as predictions of value
  – but by themselves are more like uninterpreted signals
Predictions capture “regular”, descriptive world knowledge
Grounded Knowledge
• 1-step predictions:
  – if o_t = A then o_{t+1} = B    (A, B observations)
  – if A(o_t) then B(o_{t+1})    (A, B predicates)
  – action conditioning: if A(o_t) and C(a_t) then B(o_{t+1})
Still a pretty limited kind of knowledge. Can’t say anything beyond one step!
Grounded Knowledge
• 1-step predictions:
  – if o_t = A then o_{t+1} = B    (A, B observations)
  – if A(o_t) then B(o_{t+1})    (A, B predicates)
  – action conditioning: if A(o_t) and C(a_t) then B(o_{t+1})
• Macro-predictions: if A(o_t) and <arbitrary experiment> then B(<outcome>)
  – the outcome may come many steps later; the experiment may be many steps long
  – A(o_t) provides the prior grounding; B(<outcome>) the posterior grounding
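A 1-step prediction of the kind above is grounded at both ends: its condition and its claimed outcome are both defined directly on the observation stream, so the prediction can be checked mechanically against experience. A small Python sketch (the data and the predicates A, B are invented for illustration):

```python
# Made-up experience: a short observation time series.
observations = [0, 1, 2, 1, 2, 0, 1, 2, 1, 2]

A = lambda o: o == 1   # prior grounding: condition on the current observation
B = lambda o: o == 2   # posterior grounding: predicted property of the next one

# The prediction "if A(o_t) then B(o_{t+1})", tested on every trial where
# the condition held:
trials = [(o, o_next)
          for o, o_next in zip(observations, observations[1:])
          if A(o)]
confirmed = sum(1 for _, o_next in trials if B(o_next))
accuracy = confirmed / len(trials)  # empirical truth of the prediction
```

Because both ends are grounded in the data, no external human interpretation is needed to decide whether the prediction is true; the machine itself holds all the information required.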
Both Prior and Posterior Grounding are Needed
• “Classical” AI systems omit prior grounding
  – e.g., “Tweety is a bird”, “John loves Mary”
  – sometimes called the “symbol grounding problem”
• Modern AI systems tend to skimp on the posterior
  – supervised learning, Bayes nets, robotics…
• It is not OK to leave posterior grounding to external, human observers
  – the information is just not in the machine
  – we don’t understand it; we haven’t done our job!
• Yet this is such an appealing shortcut that we have almost always done it
Outline of Talk
• Experience
• Knowledge = Prediction
• Macro-Predictions
• Mental Simulation
…offering a coherent candidate computational theory of intelligence
Macro-Predictions (Options)
à la Sutton, Precup & Singh, 1999, et al.
• Let π : States → Pr(Actions) be an arbitrary policy
• Let β : States → Pr({0, 1}) be a termination condition
• Then <π, β> is a kind of experiment:
  – do π until β = 1
  – measure something about the resulting experience
• Suppose we measure as the outcome:
  – the state at the end of the experiment
  – the total reward during the experiment
• Then the macro-prediction for <π, β> would predict Pr(end-state), E{total reward}, given the start-state
This is a very general, expressive form of prediction.
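The experiment <π, β> can be made concrete with a toy example. This is a sketch under invented assumptions (a 1-D corridor world with a step cost of −1; none of the specifics come from the talk): the option's policy always moves right, it terminates at the last state, and its macro-prediction — end-state distribution and expected total reward — is estimated by running the experiment repeatedly.

```python
N = 5  # corridor states 0..4; state 4 is where the option terminates

def pi(state):
    """The option's policy: always move right."""
    return +1

def beta(state):
    """Termination condition: 1 at the end of the corridor, else 0."""
    return 1 if state == N - 1 else 0

def run_option(start):
    """Execute the experiment <pi, beta> once from `start`;
    return (end_state, total_reward)."""
    s, total = start, 0.0
    while not beta(s):
        s = min(s + pi(s), N - 1)
        total += -1.0  # assumed cost of -1 per step
    return s, total

# Monte Carlo estimate of the macro-prediction from state 0:
samples = [run_option(0) for _ in range(10)]
end_states = {s for s, _ in samples}
expected_total_reward = sum(r for _, r in samples) / len(samples)
```

In this deterministic toy the estimates are exact; with stochastic dynamics the same averaging would converge to Pr(end-state) and E{total reward}.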
Rooms Example (Sutton, Precup, & Singh, 1999)
[figure: policy of one option]
Planning with Macro-Predictions
Learning Path-to-Goal with and without Hallway Macros (Options)
Mental Simulation
• Knowledge can be gained from experience
  – by actually performing experiments
• But knowledge can also be gained without overt experience
  – we call this thinking, reasoning, planning, cognition…
• This can be done through “thought experiments”
  – internal simulation of experience
  – generated from predictive knowledge
  – subject to learning methods as before
• Much thought can be achieved this way…
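One concrete instance of this idea is Dyna-style planning: a 1-step model is filled in from real experience, and then simulated transitions drawn from the model drive the same learning rule with no further contact with the world. The sketch below uses an invented 4-state world and Q-learning updates; all names and dynamics are illustrative, not from the talk.

```python
import random

def real_step(s, a):
    """Made-up deterministic world: 4 states; reward for reaching state 3."""
    s2 = (s + a) % 4
    return (1.0 if s2 == 3 else 0.0), s2

Q = {(s, a): 0.0 for s in range(4) for a in (0, 1)}
alpha, gamma = 0.5, 0.9

def q_update(s, a, r, s2):
    """One Q-learning backup; applied identically to real and simulated steps."""
    Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in (0, 1)) - Q[(s, a)])

# 1. A little real experience, recorded in a 1-step model:
model = {}  # (state, action) -> (reward, next_state)
for s in range(4):
    for a in (0, 1):
        r, s2 = real_step(s, a)
        model[(s, a)] = (r, s2)
        q_update(s, a, r, s2)

# 2. Mental simulation: replay transitions from the model only;
#    the real world is never touched again.
for _ in range(200):
    s, a = random.choice(list(model))
    r, s2 = model[(s, a)]
    q_update(s, a, r, s2)
```

The simulated phase refines the value estimates purely from stored predictive knowledge, which is exactly the sense in which thought experiments substitute for overt experience.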
Illustration: Dynamic Mission Planning for UAVs
[figures: mission map with site rewards and base; bar chart of expected reward per mission under high and low fuel, comparing RL planning w/strategies and real-time control, RL planning w/strategies, and a static replanner]
• Mission: fly over (observe) the most valuable sites and return to base
• Stochastic weather affects observability (cloudy or clear) of sites
• Limited fuel
• Intractable with classical optimal control methods
• Temporal scales:
  – tactics: which way to fly now
  – strategies: which site to head for
• Strategies compress space and time
  – reduce the number of states from ~10^11 to ~10^6
  – reduce tour length from ~600 to ~6
• Reinforcement learning with strategies and real-time control outperforms an optimal tour planner that assumes static weather
Barto, Sutton, and Moll, Adaptive Networks Laboratory, University of Massachusetts
What to Compute and Why
[diagram linking Reward, Policy, Value Functions, and Knowledge/Predictions]
The ultimate goal is reward, but our AI spends most of its time with knowledge.
A Candidate Computational Theory of Artificial Intelligence
• The AI agent should be focused on finding general macro-predictions of experience
• Especially seeking predictions that enable rapid computation of values and optimal actions
• Predictions and their associated experiments are the coin of the realm
  – they have a clear semantics, can be tested and learned
  – can be combined to produce other predictions, e.g., values
• Mental simulation (plus learning)
  – makes new predictions from old
  – the start of a computational theory of knowledge use
Conclusions
• World knowledge must be expressed in terms of the data
• Such posterior grounding is challenging:
  – lose expressiveness in the short term
  – lose external (human) coherence, explainability
• But it can be done step by step,
• And it brings palpable benefits:
  – autonomous learning/verification/extension of knowledge
  – autonomous complexity management due to internal coherence
  – knowledge suited to a general reasoning process: mental simulation
• We must provide this grounding!