Augmenting Definitive Screening Designs for Estimating Quadratic Models

- Slides: 36

Augmenting Definitive Screening Designs for Estimating Quadratic Models Abigael C. Nachtsheim Research Training Group 2017 SCHOOL OF MATHEMATICAL AND STATISTICAL SCIENCES

Motivating Example: Laser Etching
• Screening experiment
• 6 factors: line width, pulse rate, scan speed, laser power, height, color
• Response: etch clarity
• The experimenter wants to use a Definitive Screening Design (DSD)

6-Factor DSD
[Figure: design table for the 6-factor DSD]
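A DSD of this size can be built by stacking a conference matrix, its mirror image, and one center run. The sketch below assumes the particular Paley-type 6×6 conference matrix written out in the code; the design table on the slide may use a different but equivalent arrangement of rows, columns, and signs. The names `build_dsd` and `column_dot` are illustrative.

```python
# One 6x6 conference matrix: zero diagonal, +/-1 elsewhere, C'C = 5I.
# Assumption: this specific matrix is chosen for illustration only.
C = [
    [0,  1,  1,  1,  1,  1],
    [1,  0,  1, -1, -1,  1],
    [1,  1,  0,  1, -1, -1],
    [1, -1,  1,  0,  1, -1],
    [1, -1, -1,  1,  0,  1],
    [1,  1, -1, -1,  1,  0],
]


def build_dsd(conf):
    """Stack the conference matrix, its foldover (-C), and one center run."""
    m = len(conf)
    mirror = [[-v for v in row] for row in conf]
    return [row[:] for row in conf] + mirror + [[0] * m]


def column_dot(design, i, j):
    """Inner product of factor columns i and j over all runs."""
    return sum(run[i] * run[j] for run in design)


dsd = build_dsd(C)
m = len(C)

for run in dsd:
    print(" ".join(f"{v:2d}" for v in run))

# Main-effect columns are mutually orthogonal if every off-diagonal
# inner product is zero.
off_diagonal = [column_dot(dsd, i, j) for i in range(m) for j in range(m) if i != j]
print("runs:", len(dsd), "| all main effects orthogonal:", all(v == 0 for v in off_diagonal))
```

Each factor is set to 0 in exactly three runs (once in C, once in its mirror, and in the center run), which is what makes the design three-level.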

Why a DSD?
• Small number of runs required
• Orthogonal for main effects
• Foldover pairs result in statistical independence of main effects and two-factor interactions
• Second-order effects not completely confounded with other second-order effects
• 3-level design: all quadratic effects are estimable
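The foldover property in the third bullet follows from a short calculation. As a sketch, write $x_{ri}$ for the coded setting of factor $i$ in run $r$ (notation introduced here, not taken from the slides) and pair each run with its mirror image, in which every sign is reversed:

```latex
\underbrace{x_{ri}\,(x_{rj}\,x_{rk})}_{\text{run } r}
\;+\;
\underbrace{(-x_{ri})\,(-x_{rj})(-x_{rk})}_{\text{mirror of run } r}
\;=\; x_{ri}x_{rj}x_{rk} - x_{ri}x_{rj}x_{rk} \;=\; 0
```

Summing over all foldover pairs, with the center run contributing zero, gives $\sum_r x_{ri}\,x_{rj}x_{rk} = 0$, so every main-effect column is orthogonal to every two-factor interaction column. Setting $j = k$ gives the same result for the quadratic columns.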

However…
• If the number of active effects is less than n/2, the estimated model form is probably right
• If not, the model is probably wrong
Need to augment the design to find the right second-order model!
*Errore, Jones, Li, and Nachtsheim, Journal of Quality Technology, 2017

Today’s Plan
1) Show how to augment DSDs
2) Perform a simulation study to evaluate the augmented designs
3) Conclusions

Two Possible Goals
1) Estimation (previous work)
   - correctly identify active model terms
2) Prediction (today’s focus)
   - precise estimation of responses

Approach 1: Response Surface Design (RSD)
• Augment to obtain (the equivalent of) a response surface design
• Classical design, widely used for prediction
• Most expensive approach
   - 6-factor central composite response surface design requires 46 runs

Approach 2: Saturated Design (SD)
• Saturated designs are the smallest designs that allow estimation of all possible effects for the model of interest
• For the full quadratic model, n = (m+1)(m+2)/2
• For m = 6, n = 28
• Intermediate cost
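The run count for the saturated design is simply the number of terms in the full quadratic model in m factors. Written out as a worked count (the symbol $p$ for the number of terms matches its later use in the talk):

```latex
p \;=\; \underbrace{1}_{\text{intercept}}
  \;+\; \underbrace{m}_{\text{main effects}}
  \;+\; \underbrace{m}_{\text{quadratic effects}}
  \;+\; \underbrace{\binom{m}{2}}_{\text{two-factor interactions}}
  \;=\; \frac{(m+1)(m+2)}{2},
\qquad
m = 6:\quad 1 + 6 + 6 + 15 = 28 .
```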

Approach 3: Supersaturated Design
• Designs with sample size less than the number of terms in the model of interest: n < p
• DSDs are supersaturated designs for the full quadratic model
• We will examine small augmentations such that the result is still a supersaturated design

Three approaches:
[Figure: axis of run counts from 0 to 46, marking the DSD at 13 runs, the SD at 28 runs, and the RSD at 46 runs; supersaturated designs (SSDs) fall below 28 runs and estimable designs at 28 runs or more]
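The positions on this run axis can be checked with a few lines of arithmetic. The sketch below is illustrative: the function names are hypothetical, `dsd_runs` assumes an even number of factors (so that a conference matrix of order m exists), and the 46-run figure for the RSD is taken from the earlier slide rather than derived.

```python
from math import comb


def full_quadratic_terms(m):
    """Intercept + m main effects + m quadratics + C(m, 2) interactions."""
    return 1 + 2 * m + comb(m, 2)


def dsd_runs(m):
    """Minimum DSD size for even m: m foldover pairs plus one center run."""
    return 2 * m + 1


def classify(n, m):
    """Compare a run count n with the number of full-quadratic terms p.

    Note: n >= p is necessary, not sufficient, for all terms to be estimable.
    """
    p = full_quadratic_terms(m)
    if n < p:
        return "supersaturated"
    if n == p:
        return "saturated"
    return "more runs than terms"


m = 6
designs = {"DSD": dsd_runs(m), "SD": full_quadratic_terms(m), "RSD": 46}
for name, n in designs.items():
    print(f"{name}: n = {n}, {classify(n, m)}")
```

Any augmentation of the 13-run DSD by fewer than 15 runs stays in the supersaturated range for the full quadratic model.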

Design Construction and Evaluation
1) Construct a series of augmented designs
2) Perform simulation study to calculate predictive mean square error for each design
3) Determine the number of augmented runs necessary to effectively fit the correct model

Background: Design Augmentation
• Collect data at n_aug additional points, having already collected data at n points
• Where do we place these points?
• Various methods exist
   - Dykstra (1971), Galil and Kiefer (1980)
   - We focus on Bayesian I-optimal and Bayesian D-optimal augmentation, DuMouchel and Jones (1994)

Bayesian I-optimality
Criterion: Minimize [formula not recovered from slide], where M is the moment matrix and p is the number of terms in the full quadratic model
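The criterion's formula is missing from this slide text. For orientation only, a commonly used statement of the Bayesian I-criterion, in the style of DuMouchel and Jones (1994), is sketched below. Every symbol other than $M$ and $p$ is an assumption introduced here: $X$ is the model matrix of the combined (original plus augmented) design, $K$ is a $p \times p$ diagonal matrix with 0 for primary terms and 1 for potential terms, $\tau$ is the prior scale for potential terms, $f(x)$ is the model expansion of a design point $x$, and $\mathcal{R}$ is the design region. The slide may have used a different parameterization.

```latex
\min_{X}\; \operatorname{tr}\!\left[\left(X^{\top}X + \frac{K}{\tau^{2}}\right)^{-1} M\right],
\qquad
M \;=\; \int_{\mathcal{R}} f(x)\, f(x)^{\top}\, dx
```

Under this form the criterion is proportional to the posterior prediction variance averaged over the design region, which is why it is the natural choice when prediction is the goal.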

Bayesian D-optimality
Criterion: Maximize [formula not recovered from slide]
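To make the augmentation question concrete, the sketch below scores candidate points with a Bayesian D-criterion on a one-factor toy problem. It assumes the form commonly attributed to DuMouchel and Jones (1994), log |X'X + K/τ²|, with K diagonal (0 for primary terms, 1 for potential terms); the slide's own formula is not available, and the function names, the toy design, and τ = 1 are all illustrative choices.

```python
import math


def log_det(a):
    """Log of |det(a)| by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n = len(a)
    total = 0.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return float("-inf")  # singular matrix
        a[c], a[pivot] = a[pivot], a[c]
        total += math.log(abs(a[c][c]))
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return total


def bayes_d(model_rows, potential, tau=1.0):
    """Assumed criterion: log |X'X + K/tau^2| with K = diag(potential)."""
    p = len(potential)
    info = [[sum(r[i] * r[j] for r in model_rows) for j in range(p)]
            for i in range(p)]
    for i in range(p):
        info[i][i] += potential[i] / tau ** 2
    return log_det(info)


def expand(x):
    """One-factor quadratic model row: intercept, x, x^2."""
    return [1.0, x, x * x]


# Toy base design: two runs at the extremes. The quadratic term is treated
# as "potential", the intercept and linear term as "primary".
base = [expand(-1.0), expand(1.0)]
potential = [0, 0, 1]

# Score each candidate for a single augmented run and keep the best one.
candidates = [-1.0, 0.0, 1.0]
scores = {x: bayes_d(base + [expand(x)], potential) for x in candidates}
best = max(scores, key=scores.get)
print("scores:", scores)
print("best augmentation point:", best)
```

In this toy case X'X alone is singular for the two-run base design, yet the criterion is still finite because of the prior term on the potential effect. That is the property that lets a Bayesian criterion rank designs that remain supersaturated after augmentation.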

Simulation Study Cases
[Table: simulation study cases, shown on slide]

Simulation Study
Step 1: Choose active effects at random
Step 2: Generate nonzero beta coefficients having signal-to-noise ratio greater than 3
Step 3: Generate the sign of each coefficient at random
Step 4: Compute the response vector Y = Xβ + ε, where ε is a vector of standard normal errors
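The four steps can be sketched in a few lines. Several details below are assumptions, since the slides do not specify them: active effects are drawn uniformly without any heredity restriction, coefficient magnitudes are 3 plus an exponential draw with mean 1 (so that, with unit error variance, the signal-to-noise ratio is at least 3), the intercept is zero, and the counts of 4 main effects and 6 second-order effects are borrowed from the results slides. The function names are hypothetical.

```python
import random
from itertools import combinations

# 13-run, 6-factor DSD built from one illustrative conference matrix.
C = [
    [0,  1,  1,  1,  1,  1],
    [1,  0,  1, -1, -1,  1],
    [1,  1,  0,  1, -1, -1],
    [1, -1,  1,  0,  1, -1],
    [1, -1, -1,  1,  0,  1],
    [1,  1, -1, -1,  1,  0],
]
DSD = C + [[-v for v in row] for row in C] + [[0] * 6]


def second_order_terms(m):
    """Index pairs for m quadratics (i, i) and all two-factor interactions (i, j)."""
    return [(i, i) for i in range(m)] + list(combinations(range(m), 2))


def simulate_response(design, n_main=4, n_second=6, min_abs=3.0, rng=random):
    m = len(design[0])

    # Step 1: choose the active effects at random.
    active_main = rng.sample(range(m), n_main)
    active_second = rng.sample(second_order_terms(m), n_second)

    # Steps 2 and 3: magnitude of at least min_abs, sign chosen at random.
    def coefficient():
        return rng.choice((-1, 1)) * (min_abs + rng.expovariate(1.0))

    beta_main = {i: coefficient() for i in active_main}
    beta_second = {term: coefficient() for term in active_second}

    # Step 4: y = X beta + epsilon with standard normal errors.
    y = []
    for run in design:
        mean = sum(b * run[i] for i, b in beta_main.items())
        mean += sum(b * run[i] * run[j] for (i, j), b in beta_second.items())
        y.append(mean + rng.gauss(0.0, 1.0))
    return y, beta_main, beta_second


y, beta_main, beta_second = simulate_response(DSD, rng=random.Random(2017))
print("active main effects:", sorted(beta_main))
print("active second-order terms:", sorted(beta_second))
print("responses:", [round(v, 2) for v in y])
```

Passing a seeded `random.Random` instance keeps each simulated response vector reproducible, which helps when the same vectors are to be reused across competing augmented designs.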

Simulation Study Analysis
For each of 100 response vectors:
• Employ forward selection with the AICc criterion to select the model
• Compute the MSE
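The two quantities used in this analysis can be written down compactly. The sketch below assumes a Gaussian linear model fitted by least squares, states AICc up to an additive constant, and counts the error variance among the k estimated parameters; software packages differ on that last convention. The `mse` helper is the plain average squared difference between predicted and true values over whatever evaluation points are chosen, and the slides do not say which points were used.

```python
import math


def aicc(rss, n, k):
    """Small-sample corrected AIC, up to an additive constant.

    rss: residual sum of squares of the fitted model
    n:   number of runs
    k:   number of estimated parameters (coefficients plus error variance)
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return n * math.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)


def mse(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


# With n = 13 runs the correction term grows quickly as k increases,
# which limits how large a model forward selection can reach.
n = 13
for k in (3, 6, 9, 11):
    penalty = 2 * k + 2 * k * (k + 1) / (n - k - 1)
    print(f"k = {k:2d}: penalty = {penalty:.2f}")
```

Because the penalty is undefined once k reaches n - 1, the number of runs places a hard cap on model size under AICc, which is one way to see why adding runs can change which models are selectable.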

6-Factor, 13-Run Base DSD, Bayesian I-optimal Augment
Model with 4 main effects (MEs) and 6 second-order effects
[Figure: MSE versus number of runs n, with the DSD, SD, and RSD run sizes marked]

6-Factor, 13-Run Base DSD, Bayesian D-optimal Augment
Model with 4 main effects (MEs) and 6 second-order effects
[Figure: MSE versus number of runs n, with the DSD, SD, and RSD run sizes marked]

6-Factor, 13-Run Base DSD, Comparison
[Figure: MSE versus number of runs n for Bayesian I-optimal and Bayesian D-optimal augmentation]

Preliminary Conclusions
• No need to use n > (m+1)(m+2)/2 (RSDs are not worth the price!)
• Bayesian D-optimal designs appear to be superior to Bayesian I-optimal designs

Future work
1) More extensive simulation study involving
   - Different numbers of factors
   - Different model selection approaches (e.g., Dantzig selector, Lasso)
   - Different levels of sparsity (true models)
2) Evaluate the effect of augmentation on MSE for sparse models
3) Incorporate into previous work