An Application of Reinforcement Learning to Aerobatic Helicopter Flight

Greg McChesney
Texas Tech University
Greg.mcchesney@ttu.edu
Apr 08, 2009
CS 5331: Autonomous Mobile Robots

Overview
- Creating a robot that can fly autonomously
- Software developed at Stanford as part of their AI lab
- This paper is slightly outdated, as many new maneuvers have been created since

Learning Approach
- Apprenticeship
  - Collect data from a human pilot attempting the maneuver (multiple times)
  - Learn a model from the data
  - Find a controller that works in simulation based on the model
  - Test on the helicopter (pray it doesn't crash)
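The "learn a model from the data" step can be sketched as a least-squares fit. This is a minimal sketch under loud assumptions: the generic linear model s[t+1] = A s[t] + B a[t], the toy 2-state system, and the names `fit_linear_model`, `A_true`, and `B_true` are all illustrative and are not the helicopter model used in the paper.

```python
import numpy as np

def fit_linear_model(states, actions):
    """Least-squares fit of s[t+1] ~ A s[t] + B a[t] from one logged trajectory."""
    X = np.hstack([states[:-1], actions])        # regressors, shape (T, n + m)
    Y = states[1:]                               # next states, shape (T, n)
    theta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    n = states.shape[1]
    return theta[:n].T, theta[n:].T              # A is (n, n), B is (n, m)

# Hypothetical "pilot demonstration" generated from a known toy system.
rng = np.random.default_rng(0)
A_true = np.array([[1.0, 0.1], [0.0, 0.9]])
B_true = np.array([[0.0], [0.1]])
T = 200
actions = rng.normal(size=(T, 1))                # stand-in for logged stick inputs
states = np.zeros((T + 1, 2))
for t in range(T):
    states[t + 1] = A_true @ states[t] + B_true @ actions[t]

A_hat, B_hat = fit_linear_model(states, actions)
print(np.round(A_hat, 3))
print(np.round(B_hat, 3))
```

With noise-free toy data the fit recovers the generating matrices; real flight logs would be noisy and would need several demonstrations, as the slide notes.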

Helicopter State
- Position
- Velocity
- Angular velocity
- Controlled with 4 dimensions
  - Cyclic pitch
  - Tail rotor
- Take gravity out when calculating the model
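One way to read "take gravity out" is to subtract the gravity vector from the measured acceleration and express the remainder in the helicopter's body frame, so the model only has to explain the forces the helicopter itself produces. The sketch below assumes a z-down world frame and a body-to-world rotation matrix; the function name and frame convention are assumptions, not taken from the slides.

```python
import numpy as np

# Assumed convention: world frame with z pointing down, so gravity is +9.81 in z.
G_WORLD = np.array([0.0, 0.0, 9.81])

def body_accel_without_gravity(accel_world, R_body_to_world):
    """Remove gravity from a world-frame acceleration, then rotate into the body frame."""
    return R_body_to_world.T @ (accel_world - G_WORLD)

# Hover: zero net acceleration, so the rotor must supply exactly -g (upward).
hover = body_accel_without_gravity(np.zeros(3), np.eye(3))
# Free fall: the only acceleration is gravity, so nothing is left to model.
free_fall = body_accel_without_gravity(G_WORLD, np.eye(3))
print(hover, free_fall)
```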

Controller Design
- Use a Markov decision process (MDP)
- Sextuple (S, A, T, H, s(0), R)
  - S: set of states
  - A: set of actions (inputs)
  - T: dynamics model, a set of probability distributions over the next state
  - H: horizon, or number of time steps of interest
  - s(0): initial state
  - R: reward function
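The sextuple above maps naturally onto a small data structure. The following is a minimal sketch on a hypothetical 5-cell toy problem; the `MDP` class, `rollout`, and `slippery_step` are illustrative names and have nothing to do with the helicopter's actual state space.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List

@dataclass
class MDP:
    states: List[Hashable]                                             # S
    actions: List[Hashable]                                            # A
    transition: Callable[[Hashable, Hashable], Dict[Hashable, float]]  # T
    horizon: int                                                       # H
    initial_state: Hashable                                            # s(0)
    reward: Callable[[Hashable], float]                                # R

def rollout(mdp, policy, rng):
    """Sample one H-step trajectory and return the sum of rewards along it."""
    s, total = mdp.initial_state, 0.0
    for _ in range(mdp.horizon):
        total += mdp.reward(s)
        next_dist = mdp.transition(s, policy(s))   # distribution over next states
        s = rng.choices(list(next_dist), weights=list(next_dist.values()))[0]
    return total

# Hypothetical toy problem: walk along cells 0..4, the goal is cell 4.
def slippery_step(s, a):
    """The intended move succeeds with probability 0.8, otherwise the state is unchanged."""
    target = min(max(s + a, 0), 4)
    if target == s:
        return {s: 1.0}
    return {target: 0.8, s: 0.2}

toy = MDP(states=list(range(5)), actions=[-1, +1], transition=slippery_step,
          horizon=10, initial_state=0, reward=lambda s: -abs(s - 4))
total = rollout(toy, policy=lambda s: +1, rng=random.Random(0))
print(total)
```

A controller (policy) is then judged by the expected sum of rewards over the horizon H, which is what the rollout estimates from a single sample.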

Differential Dynamic Programming (DDP)
- Compute a linear approximation of the dynamics around the current trajectory
- Compute the optimal solution to the resulting linear quadratic regulator (LQR) problem
  - Must take the error state into account
  - Cost for change in input (needed in real testing)
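The LQR step can be sketched with the standard finite-horizon Riccati recursion. This is only the inner LQR solve on a hypothetical time-invariant double integrator; full DDP would re-linearize around the new trajectory and repeat, and the matrices `A`, `B`, `Q`, `R` below are made-up values for illustration.

```python
import numpy as np

def lqr_gains(A, B, Q, R, H):
    """Backward Riccati recursion; returns gains K[0..H-1] for u[t] = -K[t] @ x[t]."""
    P = Q.copy()                                  # terminal cost-to-go
    gains = []
    for _ in range(H):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)             # cost-to-go one step earlier
        gains.append(K)
    return gains[::-1]                            # reorder from t = 0 to t = H - 1

# Hypothetical linearized error dynamics: position and velocity error, force input.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)                                     # penalty on the error state
R = 0.1 * np.eye(1)                               # penalty on the input
H = 50

gains = lqr_gains(A, B, Q, R, H)
x = np.array([1.0, 0.0])                          # start 1 unit away from the target
for K in gains:
    x = A @ x + B @ (-K @ x)                      # closed-loop step
final_error = np.linalg.norm(x)
print(final_error)
```

The slide's "cost for change in input" would be handled by adding the previous input to the state so that the change can be penalized in Q; that extension is omitted here.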

DDP (Continued)
- 2 phases
  - Use DDP to find an open-loop input sequence
  - Use DDP again, refining the inputs as a deviation from the nominal open-loop input sequence
- Integral control: takes into account wind and errors in the model
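Why integral control helps against wind can be seen on a toy example. The sketch below is a plain proportional-integral loop on a hypothetical 1-D system with a constant wind term; it stands in for the idea only and is not the DDP formulation from the paper. All gains and the `simulate` function are assumptions.

```python
def simulate(ki, steps=2000, dt=0.01, kp=2.0, wind=0.5):
    """Toy 1-D system x' = u + wind with target x = 0; returns the final position."""
    x, integral = 1.0, 0.0
    for _ in range(steps):
        err = x                                   # error relative to the target 0
        integral += err * dt                      # accumulated error
        u = -kp * err - ki * integral             # proportional + integral terms
        x += (u + wind) * dt                      # Euler step with constant wind
    return x

without_integral = simulate(ki=0.0)               # settles at wind / kp, not at 0
with_integral = simulate(ki=1.0)                  # the integral term cancels the wind
print(without_integral, with_integral)
```

Without the integral term the controller settles at a constant offset; with it, the accumulated error grows until it cancels the disturbance.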

Rewards
- 24 features
- Used inverse reinforcement learning
- Rewards from inverse reinforcement learning alone usually did not produce the correct result
- Took the inverse reinforcement learning results and manually tuned them to get good results
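A reward built from weighted features, with weights first estimated and then hand-tuned, might look like the sketch below. The weight values are random placeholders and the features are assumed to be non-negative penalty terms (for example squared errors); neither comes from the paper or from real inverse reinforcement learning output.

```python
import numpy as np

def reward(features, weights):
    """Negative weighted sum of non-negative penalty features."""
    return -float(np.dot(weights, features))

rng = np.random.default_rng(0)
irl_weights = rng.uniform(0.1, 1.0, size=24)      # placeholder, NOT real IRL output
tuned_weights = irl_weights.copy()
tuned_weights[0] *= 5.0                           # hand-tune: penalize feature 0 harder

features = np.ones(24)                            # placeholder feature values
print(reward(features, irl_weights), reward(features, tuned_weights))
```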

Helicopter
- Xcell Tempest
- 54" long
- 19" high
- 13 lbs
- Two-stroke engine
- Orientation sensors
- GPS (doesn't work during flips)


Flip

Roll

Tail-In Funnel

Nose-In Funnel

Questions
- Motivations / who pays for it?
  - I can see applications in the defense sector
  - DARPA
- Could more maneuvers be done just by changing some parameters?
  - Probably not, because the filter is learned based on a model, so you would need to create a new model

More Questions
- What's the relationship between reinforcement learning and MDPs?
  - Not sure
- Could a helicopter like this operate in the West Texas wind storms?

Fun Stuff
- Videos:
  - http://heli.stanford.edu/
  - http://www.youtube.com/watch?v=VCdxqn0fcnE
- Helicopter:
  - http://www.miniatureaircraftusa.com/helicopterkits/1025_Spectra_G/1025_kit_main.asp