SEQUENTIAL TESTING AND SIMULATION VALIDATION FOR AUTONOMOUS SYSTEMS

Jim Simpson, May 1, 2020

T&E of Autonomous Systems*
Autonomy / AI make testing hard when they are part of system attributes that make behavior unpredictable.
Attributes and DOE/Test Science:
- State space explosion: many factors / variables and a vast test space
- Non-smooth, fractal response: complex, high-order, discontinuous surfaces
- Transparency lacking: less opportunity to establish cause and effect
- System behavior changes: dynamic, evolving models and a time series focus
- Emergent behaviors: unstable input-process-output platform
* Tate, D., Sparrow, D. and IDA Tech Reports (2019).

Adaptive Testing for Autonomous Systems – Sequential and Simulation Validation

T&E Process Questions: Testing Autonomous Systems
- Plan Sequentially for Discovery: How Many Tests and Factors?
- Design to Control Risks of Wrong Conclusions and Span Battlespace: Which Points Based on Phase?
- Execute to Control Uncertainty: How to Sequence?
- Analyze Statistically to Model Performance: What was Learned?

The Span of Testing
- System Complexity: Subsystems, Systems, System of Systems
- Example Mission or Capability: Air-to-air, Close Air Support, Interdiction
- Test Phase: Early DT, Mid DT, DT/OT, OT

Simulation’s Role in T&E
- At each stage of development, testers seek empirical evidence to support decisions
- Ultimate question: how will this system function in service?
- The various simulations of a system differ in fidelity and cost
- Test goals differ (screen, optimize, characterize, robust design, troubleshoot)
- Goal: distinguish truth from fiction. What matters? What doesn’t?

HH-60G Hover System Example
Purpose: to assess the operational effectiveness and suitability of the Improved Altitude Hold and Hover Stabilization System (IAHHSS) for HH-60G personnel recovery and combat search and rescue (CSAR) missions. The test team must assess the adequacy of the IAHHSS for fielding and for developing tactics, techniques and procedures.
Response: deviation from desired altitude over time

Factors and Levels (up to six levels per factor)
- Altitude (in feet AGL): Low, Medium, High
- Environment: Desert, Mountain, Sea, Forest / Jungle, Urban, Snow
- Time of Day: Night
- Mode (Hover related): Cruise, Decel to HOV, HOV, Ground Approach, Land, AIE
- Visibility: Reduced, Clear

IAHHSS Autonomy Issues
Is this just a standard DOE, with some factors and levels to vary efficiently and effectively using a randomized design?
Autonomy Considerations
- The factors are mostly environmental, so they say very little about the hold and hover system itself
- The factors are selected to challenge the IAHHSS
- What if the system starts to deviate from expected conditions?
- How and when should the pilot respond?
- What other factors should be considered when the system is not predictable?
- Is a sequential design relevant?
- Has a validated simulation/simulator been exercised?

Methods for Sequential Testing of Autonomous Systems

Sequential Concept
- Conducting a series of experiments is often the most efficient and effective way to test
- Involves combining knowledge of design augmentation strategies with knowledge gained about the factors during test
[Figure: iterative learning cycle alternating deduction and induction between Idea (model, hypothesis, theory, conjecture) and Data (facts, phenomena). From Box, Hunter (2005), Statistics for Experimenters, Wiley, NY.]

Sequential and Adaptive for Autonomous Systems
- Autonomous systems bring considerable additional complexities that should supplement the traditional plan, design, execute, analyze process
- Near real-time sequential test design may benefit T&E for autonomous systems: new process and policy for TEMPs
- Plan a combined DOE / observational study for recorded variables
- Responses of interest may change during the test event
- System factors and factor levels may change: add / delete, morph
- Analysis should be iterative, build on previous knowledge, and adapt and respond quickly

Sequential Planning
[Figure: continuous learning cycle plotted against Factors and N (run counts N1, N2).]
- Plan: Responses, Factors, Covariates, N
- Design: Space Coverage, Power
- Execute: Design and Observational Study
- Analyze: Existing and Proposed Model

Sequential Construct
- Design progression: Fractional, Full Factorial, Response Surface, Nested CCD
- Model progression: Main Effects, ME + Interactions, 2nd-order nonlinear, 3rd or 4th order
- Adaptations along the way: Add Factors, Observed Variables, Change Levels
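The design progression on this slide can be illustrated with a minimal Python sketch, assuming three two-level factors in coded units. The function names, the three-factor example, and the rotatable axial distance are illustrative assumptions, not taken from the briefing, and the nested CCD stage is not shown.

```python
from itertools import product

def half_fraction(k):
    """2^(k-1) half fraction: the last factor is the product of the others."""
    runs = []
    for base in product((-1, 1), repeat=k - 1):
        last = 1
        for x in base:
            last *= x
        runs.append(base + (last,))
    return runs

def complementary_fraction(fraction):
    """Flip the generated column to obtain the other half of the factorial."""
    return [run[:-1] + (-run[-1],) for run in fraction]

def ccd_augment(k, n_center=3):
    """Axial and center points that extend a 2^k factorial to a CCD."""
    alpha = (2 ** k) ** 0.25  # assumed rotatable axial distance
    axial = []
    for i in range(k):
        for sign in (-1, 1):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(tuple(point))
    center = [tuple([0.0] * k)] * n_center
    return axial + center

# Stage 1: fractional design, enough runs for main effects.
stage1 = half_fraction(3)
# Stage 2: add the other half to reach the full factorial (ME + interactions).
stage2 = stage1 + complementary_fraction(stage1)
# Stage 3: add axial and center points to support a second-order model.
stage3 = stage2 + ccd_augment(3)

print(len(stage1), len(stage2), len(stage3))
```

Each stage reuses every run from the stage before it, which is the point of the sequential construct: earlier data are kept and the design grows only as the model needs it.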

Methods for Simulation Validation of Autonomous Systems

Test Design Space: Simulation vs. Live
Strategies for selecting the design space and points for live and simulation-based testing
- Ideal: the live design encompasses the simulation design space, so comparisons between them are interpolations, not extrapolations.
- Limitation in live space: often due to practical constraints that exist in live testing. Here the domain of the live testing should span the maximum possible domain of the simulation experiment, and regions of extrapolation should be clearly identified in the validation limitations.
[Figure panels: Ideal; Limited Live]
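One simple way to identify regions of extrapolation is to compare each simulation point against the factor ranges covered by the live runs. The sketch below is a minimal illustration under stated assumptions: the factor names x1 and x2 and all run values are hypothetical, and a per-factor range check is a simplification (a point can lie inside every range and still fall outside the region the live runs jointly cover).

```python
def factor_ranges(runs):
    """Return {factor: (min, max)} over a list of runs given as dicts."""
    names = runs[0].keys()
    return {n: (min(r[n] for r in runs), max(r[n] for r in runs)) for n in names}

def extrapolated_factors(point, ranges):
    """List the factors for which this point lies outside the live range."""
    return [n for n, (lo, hi) in ranges.items() if not lo <= point[n] <= hi]

# Hypothetical live and simulation runs in two made-up factors.
live_runs = [
    {"x1": 0.0, "x2": 10.0},
    {"x1": 5.0, "x2": 30.0},
    {"x1": 2.0, "x2": 20.0},
]
sim_runs = [
    {"x1": 1.0, "x2": 15.0},  # inside the live ranges
    {"x1": 8.0, "x2": 25.0},  # x1 beyond the live range
    {"x1": 3.0, "x2": 40.0},  # x2 beyond the live range
]

live_ranges = factor_ranges(live_runs)
flags = [extrapolated_factors(p, live_ranges) for p in sim_runs]
for point, flagged in zip(sim_runs, flags):
    status = "extrapolation in " + ", ".join(flagged) if flagged else "interpolation"
    print(point, status)
```

The flagged points are the ones that would be listed under validation limitations when the live space is limited.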

Example: Golden Horde Swarm Effectiveness
- Air-to-ground munitions and decoys are launched, then cooperate autonomously in flight to maximize effectiveness against targets
- Weapon systems are AGM-158 JASSM, GBU-53 A SDB-II and ADM-160 MALD
- Weapons respond to pop-up threats; images sent via datalink just before impact provide BDA used to modify the targets of weapons still in flight

Truth & Models & Live Testing
[Diagram: True System Behavior, observed with Noise, is approached along two paths joined by Sim V&V, each with its own test design and analysis.]
- Simulation inputs: Referent Data, Legacy Models, Monte Carlo Variables, Fidelity, Autonomy Algorithms
- Contributors: physicist, subsystem engineer, software engineer, analyst
- Live test levels: component, bench; subsystem, captive; live end-to-end

M&S and Live and Statistical VV&A
[Diagram: True System Behavior, observed with Noise, through Simulation and Live test.]
- Simulation models: M1 Aero, M2 Propulsion, M3 Guidance, M4 Seeker, M5 Fuse, M6 Target
- Live tests: T1 Model (lab, bench, tunnel); T2 Subsystem (captive, SCTV, GTV); T3 System (AUR free flight)
- Statistical Validation of M&S: Activity and Use, Referent, Acceptability Criteria, Design & Test Plan, Analysis Plan, Execution, Analysis and Reporting

The Process – Statistical M&S V&V
- Plan Sequentially for Discovery: Factors, Responses and Levels
- Design With Type I Risk and Power to Span the Battlespace: N, α, Power, Test Matrices
- Execute to Control Uncertainty: Randomize, Block, Replicate
- Analyze Statistically to Model Performance: Model, Predictions, Bounds
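The link between N, α and power in the design step can be made concrete with a textbook normal-approximation calculation. This is a hedged sketch, not the briefing's method: it assumes a two-sided comparison of two means (for example live versus simulation) with equal group sizes and known common standard deviation, and the function names and example values are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def runs_per_group(alpha, power, effect_size):
    """Runs per group to detect a standardized difference (delta / sigma)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

def approx_power(n_per_group, alpha, effect_size):
    """Approximate power for a given run count (ignores the far tail)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(effect_size * sqrt(n_per_group / 2) - z_alpha)

# Assumed example: Type I risk 0.05, power 0.80, difference of one sigma.
n = runs_per_group(alpha=0.05, power=0.80, effect_size=1.0)
print(n, round(approx_power(n, alpha=0.05, effect_size=1.0), 3))
```

Halving the detectable difference roughly quadruples the required runs, which is why the acceptable difference between live and simulation results has to be settled before the test matrix is sized.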

Hybrid in 2 Stages – Evaluate Design
Start with a space-fill design and augment for a quadratic model
[Figure panels: SF; SF + I-optimal]
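The first stage can be sketched with a Latin hypercube, one common space-filling design; the briefing does not say which space-fill design was used, so that choice is an assumption. The I-optimal augmentation of the second stage is not shown here, since it is normally produced by an optimal-design routine in statistical software.

```python
import random

def latin_hypercube(n_runs, n_factors, seed=0):
    """Latin hypercube on [0, 1)^n_factors: one run per stratum per factor."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_factors):
        # One random point inside each of n_runs equal-width strata ...
        column = [(i + rng.random()) / n_runs for i in range(n_runs)]
        # ... then shuffle so strata are paired at random across factors.
        rng.shuffle(column)
        columns.append(column)
    return [tuple(col[r] for col in columns) for r in range(n_runs)]

# Assumed example size: 10 runs in 2 factors, scaled to coded units [-1, 1).
design = latin_hypercube(n_runs=10, n_factors=2)
coded = [tuple(2 * v - 1 for v in run) for run in design]
for run in coded:
    print(tuple(round(v, 3) for v in run))
```

Every factor is sampled once in each tenth of its range, so the stage-one runs cover the space evenly before any model-driven points are added.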

Comparing Live Test to Simulation

Hybrid plus Augmentation – Locating Exceptional Performance
1. Construct Hybrid Design
2. Execute Runs, Analyze Data
3. Locate Exceptional Region(s)
4. Augment Design
5. Execute and Analyze
6. Estimate Performance
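The six steps above can be sketched as a small loop. Everything in this example is an assumption made for illustration: a coarse grid stands in for the hybrid design, toy_response is a made-up response with a single peak, and the "exceptional region" is taken to be a small box around the best first-stage run.

```python
def grid(center, half_width, n_per_axis, lo=0.0, hi=1.0):
    """Two-factor grid centered on a point, clipped to the design space."""
    step = 2 * half_width / (n_per_axis - 1)
    axes = []
    for c in center:
        axis = [c - half_width + i * step for i in range(n_per_axis)]
        axes.append([min(hi, max(lo, v)) for v in axis])
    return [(x, y) for x in axes[0] for y in axes[1]]

def toy_response(point):
    """Hypothetical response surface peaking at (0.62, 0.31)."""
    return -((point[0] - 0.62) ** 2 + (point[1] - 0.31) ** 2)

# Steps 1-2: construct the first-stage design and execute it.
stage1 = grid(center=(0.5, 0.5), half_width=0.5, n_per_axis=5)
# Step 3: locate the exceptional region (here, the best observed run).
stage1_best = max(stage1, key=toy_response)
# Step 4: augment with a finer grid around that run.
stage2 = grid(center=stage1_best, half_width=0.125, n_per_axis=5)
# Steps 5-6: execute the new runs and re-estimate the best performance.
best = max(stage1 + stage2, key=toy_response)

print(stage1_best, best)
```

In a real test the response would come from executed runs and a fitted model rather than a known function, and the augmentation points would be chosen by a design criterion rather than a fixed local grid.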

Sequential Summary: Benefits and Challenges
Benefit
- Design can be dynamic and adapt to increased understanding of how system autonomy functions, how to better cover the state space, and what data need to be collected to best improve the current statistical model
Challenges
- How to design in the presence of limited understanding of how the system will react to situations and environment
- How best to handle numerous covariates, with the state space continually growing and shifting
- Factors in/out, levels changing, variables in/out, responses in/out
- New measures needed for coverage and power

Simulation Validation Summary: Benefits and Challenges
Benefits
- Simulation routinely plays an integral role in the development of our complex systems
- Advances in statistically based methods for simulation validation increase confidence that intended uses are viable for T&E
Challenges
- Complexity of autonomy forces developers to include fidelity where it is most important
- The needed frequency of validation engagements can be insurmountable
- When to reengage with validation
- Prioritize validation focus based on decided levels of fidelity
- ...
Challenges are rich and future research plentiful. Join the team!