An Application of Reinforcement Learning to Aerobatic Helicopter
Greg McChesney
Texas Tech University
Greg.mcchesney@ttu.edu
Apr 08, 2009
CS 5331: Autonomous Mobile Robots
Overview
- Creating a robot that can fly autonomously
- Software developed at Stanford as part of their AI lab
- This paper is slightly outdated, as many new maneuvers have been created since.
Learning Approach
- Apprenticeship
  - Collect data from a human pilot trying the maneuver (multiple times)
  - Learn a model from the data
  - Find a controller that works in simulation based on the model
  - Test on the helicopter (pray it doesn't crash)
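
The "learn a model from the data" step can be sketched as a least-squares fit. This is only an illustration, not the method from the paper: it assumes a scalar linear model `next_state = a * state + b * input`, and the function name `fit_linear_model` and the synthetic "flight log" are made up for this example.

```python
def fit_linear_model(states, inputs, next_states):
    """Least-squares fit of next_state ~ a * state + b * input (scalar case)."""
    sxx = sum(x * x for x in states)
    sxu = sum(x * u for x, u in zip(states, inputs))
    suu = sum(u * u for u in inputs)
    sxy = sum(x * y for x, y in zip(states, next_states))
    suy = sum(u * y for u, y in zip(inputs, next_states))
    det = sxx * suu - sxu * sxu
    if abs(det) < 1e-12:
        raise ValueError("data does not excite state and input independently")
    # Solve the 2x2 normal equations by Cramer's rule.
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b


# Synthetic, noise-free log generated from an assumed "true" model.
A_TRUE, B_TRUE = 0.9, 0.5
inputs = [1.0, -1.0, 0.5, 0.0, -0.5, 1.0, 0.25, -0.75]
states = [1.0]
for u in inputs:
    states.append(A_TRUE * states[-1] + B_TRUE * u)

a_hat, b_hat = fit_linear_model(states[:-1], inputs, states[1:])
```

With real flight data the log would be noisy and the state would be a vector, so the fit would be done with a matrix least-squares routine instead of the closed-form 2x2 solve used here.
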
Helicopter State
- Position
- Velocity
- Angular velocity
- Controlled with 4 dimensions
  - Cyclic pitch
  - Tail rotor
- Take gravity out when calculating the model
Controller Design
- Use a Markov decision process
- Sextuple (S, A, T, H, s(0), R)
  - S: set of states
  - A: set of actions (inputs)
  - T: dynamics model, a set of probability distributions for the next state
  - H: horizon, or number of time steps of interest
  - s(0): initial state
  - R: reward function
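
The sextuple above maps naturally onto a small data structure. The sketch below is a toy under stated assumptions: states and actions are plain floats (so S and A are implicit), and the names `FiniteHorizonMDP`, `rollout_return`, and `make_toy_mdp` are hypothetical, not from the paper.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass
class FiniteHorizonMDP:
    # S and A are implicit here: states and actions are plain floats.
    transition: Callable[[float, float, random.Random], float]  # T: samples the next state
    horizon: int                                                # H
    initial_state: float                                        # s(0)
    reward: Callable[[float, float], float]                     # R


def rollout_return(mdp, policy, seed=0):
    """Run one episode of length H and return the summed reward."""
    rng = random.Random(seed)
    state = mdp.initial_state
    total = 0.0
    for t in range(mdp.horizon):
        action = policy(state, t)
        total += mdp.reward(state, action)
        state = mdp.transition(state, action, rng)
    return total


def make_toy_mdp(noise_std=0.0, horizon=3):
    """Hypothetical toy problem: drive a scalar state toward zero."""
    return FiniteHorizonMDP(
        transition=lambda s, a, rng: s + a + rng.gauss(0.0, noise_std),
        horizon=horizon,
        initial_state=1.0,
        reward=lambda s, a: -(s * s),
    )


toy_return = rollout_return(make_toy_mdp(), policy=lambda s, t: -0.5 * s)
```

Reinforcement learning then amounts to searching for the `policy` that maximizes the expected value of `rollout_return` for the given MDP.
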
Differential Dynamic Programming (DDP)
- Compute the linear approximation
- Compute the optimal solution to the linear quadratic regulator
  - Must take into account error state
  - Cost for change in input (needed in real testing)
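
The linear quadratic regulator step can be illustrated with a scalar backward Riccati recursion. This is a minimal sketch, assuming dynamics `x' = a*x + b*u` and per-step cost `q*x**2 + r*u**2`; it leaves out the error state and the cost on change in input mentioned above, and the function name `lqr_gains` is made up for this example. DDP would repeat a step like this around a re-linearized trajectory.

```python
def lqr_gains(a, b, q, r, horizon):
    """Finite-horizon scalar LQR: return feedback gains K_0..K_{H-1} and P_0.

    The optimal input at time t is u_t = -K_t * x_t.
    """
    p = q  # terminal cost-to-go coefficient P_H
    gains = []
    for _ in range(horizon):
        k = (a * b * p) / (r + b * b * p)   # gain that minimizes the one-step lookahead
        p = q + a * a * p - a * b * p * k   # Riccati update for the cost-to-go
        gains.append(k)
    gains.reverse()  # the loop ran backward in time
    return gains, p


one_step_gains, one_step_cost = lqr_gains(a=1.0, b=1.0, q=1.0, r=1.0, horizon=1)
long_gains, _ = lqr_gains(a=1.0, b=1.0, q=1.0, r=1.0, horizon=50)
```

For a long horizon the first gain settles to a constant, which is the usual steady-state LQR gain for this toy system.
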
DDP (Continued)
- 2 phases
  - DDP to find the open-loop input sequence
  - Use DDP again, refining the inputs as a deviation from the nominal open-loop input sequence
- Integral control: take into account wind and errors in the model
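
Why an integral term helps against wind can be seen in a toy simulation. This is a generic illustration, not the controller from the paper: it assumes a scalar system `x' = x + u + wind` with a constant wind disturbance, and the gains `kp` and `ki` are arbitrary values chosen for the example.

```python
def simulate(kp, ki, wind, steps=200):
    """Return the final tracking error of a P or PI controller under constant wind."""
    x = 1.0  # tracking error
    z = 0.0  # running sum (integral) of the error
    for _ in range(steps):
        z += x
        u = -kp * x - ki * z
        x = x + u + wind
    return x


p_only = simulate(kp=0.5, ki=0.0, wind=0.2)         # settles at wind / kp
with_integral = simulate(kp=0.5, ki=0.1, wind=0.2)  # settles at zero
```

Without the integral term the error settles at a nonzero offset; with it, the accumulated error builds up until the input cancels the wind.
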
Rewards
- 24 features
- Used inverse reinforcement learning
- Rewards from inverse reinforcement learning usually did not produce the correct result
- Took the inverse reinforcement learning results and manually tuned them to get good results
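
A feature-based reward of this kind can be sketched as a weighted sum. The slide only says there are 24 features, so everything specific here is an assumption for illustration: the linear form, the three example features, and the names `quadratic_features` and `reward`.

```python
def quadratic_features(position_error, velocity_error, input_change):
    """Three hypothetical features; the real controller used 24."""
    return [position_error ** 2, velocity_error ** 2, input_change ** 2]


def reward(weights, features):
    """Negative weighted sum of features: larger errors give lower reward."""
    if len(weights) != len(features):
        raise ValueError("need one weight per feature")
    return -sum(w * f for w, f in zip(weights, features))


example_reward = reward([1.0, 0.5, 2.0], quadratic_features(1.0, 2.0, 0.5))
```

In this picture, inverse reinforcement learning proposes the `weights` from the pilot's demonstrations, and the manual tuning described above means adjusting those weights by hand.
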
Helicopter
- Xcell Tempest
- 54” long
- 19” high
- 13 lbs
- Two-stroke engine
- Orientation sensors
- GPS (doesn't work during flips)
Flip
Roll
Tail-In Funnel
Nose-In Funnel
Questions
- Motivations / Who pays for it?
  - I can see applications in the defense sector
  - DARPA
- Could more maneuvers be done just by changing some parameters?
  - Probably not, because the filter is learned based on a model, so you would need to create a new model
More Questions
- What's the relationship between reinforcement learning and MDP?
  - Not sure
- Could a helicopter like this operate in the West Texas wind storms?
Fun Stuff
- Videos:
  - http://heli.stanford.edu/
  - http://www.youtube.com/watch?v=VCdxqn0fcnE
- Helicopter
  - http://www.miniatureaircraftusa.com/helicopterkits/1025_Spectra_G/1025_kit_main.asp