# Statistical inference

• Slides: 21

Statistical inference
• “Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.” (H. G. Wells, 1946)
• “There are three kinds of lies: white lies, which are justifiable; common lies, which have no justification; and statistics.” (Benjamin Disraeli)
• “Statistics is no substitute for good judgment.” (unknown)

ETM 620 - 09 U

Statistical inference
Suppose:
• A mechanical engineer is considering the use of a new composite material in the design of a vehicle suspension system and needs to know how the material will react under a variety of conditions (heat, cold, vibration, etc.).
• An electrical engineer has designed a radar navigation system to be used in high-performance aircraft and needs to be able to validate performance in flight.
• An industrial engineer needs to validate the effect of a new roofing product on installation speed.
• A motorist must decide whether to drive through a long stretch of flooded road after being assured that the average depth is only 6 inches.

Statistical inference
• What do all of these situations have in common?
• How can we address the uncertainty involved in decision making?
  • a priori
  • a posteriori

Probability
A mathematical means of determining how likely an event is to occur.
• Classical (a priori): Given N equally likely outcomes, the probability of an event A is given by $P(A) = n/N$, where n is the number of different ways A can occur.
• Empirical (a posteriori): If an experiment is repeated M times and the event A occurs $m_A$ times, then the probability of event A is defined as $P(A) = m_A/M$.
We’ll talk more about this next time …
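To make the two definitions concrete, here is a minimal Python sketch. The fair six-sided die, the event "the face is even", the seed, and the helper names `classical_probability` and `empirical_probability` are illustrative assumptions, not part of the slides.

```python
import random
from fractions import Fraction


def classical_probability(outcomes, event):
    """Classical (a priori): n / N for N equally likely outcomes, n of them in A."""
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))


def empirical_probability(trial, event, repetitions):
    """Empirical (a posteriori): m_A / M, the share of M trials in which A occurred."""
    hits = sum(1 for _ in range(repetitions) if event(trial()))
    return hits / repetitions


# Assumed example: a fair die and the event "the face is even".
die_faces = [1, 2, 3, 4, 5, 6]


def is_even(face):
    return face % 2 == 0


rng = random.Random(620)  # fixed seed so the run is repeatable
p_classical = classical_probability(die_faces, is_even)
p_empirical = empirical_probability(lambda: rng.choice(die_faces), is_even, 10_000)

print("classical:", p_classical)  # 1/2
print("empirical:", p_empirical)  # near 0.5; the exact value depends on the seed
```

The classical value comes from counting outcomes before any experiment is run, while the empirical value only settles near it as the number of repetitions grows.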

Descriptive statistics
Numerical values that help to characterize the nature of data for the experimenter.
Example: The absolute error in the readings from a radar navigation system was measured with the following results: 17, 31, 22, 39, 28, 147, and 52
• the sample mean, $\bar{x}$ = _____________
• the sample median, $\tilde{x}$ = _______
• the sample mode = ________
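One way to check the three values for the radar readings is with Python's standard `statistics` module. This is a sketch, and the variable names are my own.

```python
import statistics

# Absolute errors in the radar navigation readings, from the example above.
errors = [17, 31, 22, 39, 28, 147, 52]

mean = statistics.mean(errors)
median = statistics.median(errors)
modes = statistics.multimode(errors)  # every most-common value, in input order

print("mean:", mean)      # 48
print("median:", median)  # 31

# Each reading occurs exactly once, so no single value stands out as a mode.
has_unique_mode = len(modes) == 1
print("has a unique mode:", has_unique_mode)  # False
```

The single large reading (147) pulls the mean well above the median, which is why both are worth reporting.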

Descriptive Statistics
Measure of variability
Our example: 17, 31, 22, 39, 28, 147, and 52
• sample range:
• sample variance:

Variability of the data
• sample variance, $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
• sample standard deviation, $s = \sqrt{s^2}$
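As a worked check on the radar readings, the sketch below computes the range, then the sample variance and standard deviation using the usual n - 1 divisor. The variable names are illustrative.

```python
import math

errors = [17, 31, 22, 39, 28, 147, 52]

# Range: largest reading minus smallest reading.
sample_range = max(errors) - min(errors)

# Sample variance: squared deviations from the mean, divided by n - 1.
n = len(errors)
x_bar = sum(errors) / n
s_squared = sum((x - x_bar) ** 2 for x in errors) / (n - 1)

# Sample standard deviation: square root of the sample variance.
s = math.sqrt(s_squared)

print("range:", sample_range)            # 130
print("variance:", round(s_squared, 2))  # 2037.33
print("std dev:", round(s, 2))           # 45.14
```

The standard deviation is in the same units as the readings, which makes it easier to compare with the mean than the variance is.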

Other descriptors
• Discrete vs. continuous
  • discrete:
  • continuous:
• Categorical and identifying
  • categorical:
  • unit identifying:
• Distribution of the data
  • “What does it look like?”

Graphical methods
• Dot diagram and scatter plot
  • useful for understanding relationships between factor settings and output
  • example (pp. 174-175)
[Figure: dot diagram of pull strength; scatter plot of pull strength vs. wire length]

Using graphical methods …
[Figure: scatter plots of pull strength vs. wire length and pull strength vs. die height]
Which factor(s) (or independent variable(s)) appear to have an effect on the output (or dependent variable), and what does that relationship look like?

Graphical methods (cont.)
• Stem and leaf plot
  • example (radar data): 17, 31, 22, 39, 28, 147, and 52
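A stem and leaf plot for the radar data can be sketched in a few lines of Python. The choice of the tens (and hundreds) digits as the stem and the units digit as the leaf is an assumption, as is the helper name `stem_and_leaf`.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return {stem: [leaves]}, using the units digit of each integer as its leaf."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)


errors = [17, 31, 22, 39, 28, 147, 52]
plot = stem_and_leaf(errors)

# Print every stem between the smallest and largest so that gaps stay visible.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem:>2} | {leaves}")
```

Printing the empty stems between 5 and 14 makes the isolated reading of 147 stand out, which a compressed listing would hide.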

Another example
Bottle-bursting strength data (pg. 176)

| Stem | Leaf | Frequency |
|---|---|---|
| 17 | 6 | 1 |
| 18 | 7 | 2 |
| 19 | 7 | 3 |
| 20 | 0 5 8 | 6 |
| 21 | 0 4 5 | 9 |
| 22 | 0 1 3 8 | 13 |
| 23 | 1 1 4 5 5 5 | 19 |
| 24 | 2 2 3 5 6 8 8 | 26 |
| 25 | 0 0 0 1 3 4 4 7 8 8 8 | 37 |
| 26 | 0 0 1 2 3 3 4 4 5 5 5 5 7 7 8 9 9 | (21) |
| 27 | 0 1 1 2 4 4 5 6 6 7 8 8 | 42 |
| 28 | 0 0 1 1 3 3 6 7 | 28 |
| 29 | 0 3 4 6 8 9 9 | 18 |
| 30 | 0 1 7 8 | 11 |
| 31 | 7 8 | 7 |
| 32 | 1 8 | 5 |
| 33 | 4 7 | 3 |
| 34 | 6 | 1 |

Graphical methods (cont.)
• Frequency distribution (histogram)
  • equal-size class intervals (“bins”)
  • ‘rules of thumb’ for interval size
    • 7-15 intervals per data set
    • √n
    • more complicated rules
  • Identify midpoint
  • Determine frequency of occurrence in each bin
  • Calculate relative frequency
  • Plot frequency vs. midpoint
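The steps above can be sketched as a small Python function. The helper name `frequency_table`, the rounding of √n to the nearest whole number, and the decision to span exactly from the minimum to the maximum are assumptions. The radar readings are used only as a convenient small input.

```python
import math


def frequency_table(values, bin_count=None):
    """Equal-width bins as (lower, upper, midpoint, frequency, relative frequency)."""
    n = len(values)
    if bin_count is None:
        bin_count = max(1, round(math.sqrt(n)))  # the sqrt(n) rule of thumb
    low, high = min(values), max(values)
    if high == low:  # all values identical: a single degenerate bin
        return [(low, high, low, n, 1.0)]
    width = (high - low) / bin_count
    counts = [0] * bin_count
    for v in values:
        # Keep the maximum value in the last bin rather than one past the end.
        index = min(int((v - low) / width), bin_count - 1)
        counts[index] += 1
    rows = []
    for i, count in enumerate(counts):
        lower = low + i * width
        upper = lower + width
        rows.append((lower, upper, (lower + upper) / 2, count, count / n))
    return rows


errors = [17, 31, 22, 39, 28, 147, 52]
table = frequency_table(errors)
for lower, upper, midpoint, freq, rel in table:
    print(f"{lower:7.2f} - {upper:7.2f}  mid {midpoint:7.2f}  F={freq}  rel={rel:.3f}")
```

With only seven readings the √n rule gives three bins, which is below the 7-15 guideline. The rules of thumb are starting points, not substitutes for looking at the result.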

Relative frequency histogram
Example: stride lengths (in inches) of 25 male students were determined, with the following results:
Stride length: 28.60, 26.50, 30.00, 27.10, 27.80, 26.10, 29.70, 27.30, 28.50, 29.30, 28.60, 26.80, 27.00, 27.30, 26.60, 29.50, 27.00, 27.30, 28.00, 29.00, 27.30, 25.70, 28.80, 31.40
What can we learn about the distribution of stride lengths for this sample?

Constructing a histogram
Determining relative frequencies

| Class Interval | Class Midpt. | Frequency, F | Relative frequency |
|---|---|---|---|
| 25.7 - 26.9 | 26.3 | 5 | 0.2 |
| 27.0 - 28.2 | 27.6 | 9 | 0.36 |
| 28.3 - 29.5 | 28.9 | 8 | 0.32 |
| 29.6 - 30.8 | 30.2 | 2 | 0.08 |
| 30.8 - 32.0 | 31.4 | 1 | 0.04 |

Relative frequency graph
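A rough text version of the graph can be produced from the midpoints and frequencies in the stride-length table. This is only a sketch: the bar scaling (one `#` per 2% of the sample) is an arbitrary choice, and a plotting package would normally be used instead.

```python
# (midpoint, frequency) pairs copied from the stride-length class-interval table.
classes = [(26.3, 5), (27.6, 9), (28.9, 8), (30.2, 2), (31.4, 1)]

total = sum(freq for _, freq in classes)
relative = [(mid, freq / total) for mid, freq in classes]

for mid, rel in relative:
    bar = "#" * round(rel * 50)  # one '#' per 2% of the sample
    print(f"{mid:5.1f} | {bar} {rel:.2f}")
```

Even in this crude form the single peak near 27.6 inches and the thinner upper tail are visible.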

What can you see?
• Unimodal, bimodal, or multi-modal distribution
• Recognizable distribution?
• Skewness

Another example …
Bottle-bursting strength data (pg. 176)
[Figures: one from Minitab, one from Excel]

Other useful graphical methods
• Box plot (aka box and whisker plot)
  • bottle-bursting data
  • and another example (viscosity measurement, pg. 181)
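The numbers behind a box plot can be sketched as follows. The slide's own examples (bottle-bursting and viscosity data) are not reproduced here, so the radar readings from earlier in the deck are used instead. The "inclusive" quartile method and the 1.5 × IQR outlier fences are common conventions that I am assuming; other packages use slightly different quartile rules.

```python
import statistics


def box_plot_summary(values):
    """Five-number summary plus outliers beyond the assumed 1.5 * IQR fences."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "iqr": iqr,
        "outliers": outliers,
    }


errors = [17, 31, 22, 39, 28, 147, 52]
summary = box_plot_summary(errors)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Under these conventions the reading of 147 falls outside the upper fence and would be drawn as a separate point rather than as the end of a whisker.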

Other useful graphical methods (cont.)
• Pareto diagram
  • frequency count for categorical data
  • arranged in descending order of frequency of occurrence
  • useful for identifying “high value” targets
    • sources of defects
    • level of effort required in maintenance activities
    • etc.
• Time plot
  • observed values vs. a time scale (hour of day, month, etc.)
  • useful for identifying patterns
    • effect of time of day on electricity usage
    • seasonal effects
    • etc.
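The ordering behind a Pareto diagram can be sketched with `collections.Counter`. The defect categories and counts below are hypothetical, invented only to exercise the function, and `pareto_table` is an illustrative helper name. The cumulative percent column is a common addition to Pareto charts rather than something the slide specifies.

```python
from collections import Counter


def pareto_table(observations):
    """Rows of (category, count, cumulative percent), most frequent first."""
    counts = Counter(observations).most_common()
    total = sum(count for _, count in counts)
    rows = []
    running = 0
    for category, count in counts:
        running += count
        rows.append((category, count, 100 * running / total))
    return rows


# Hypothetical defect log (made-up data, not from the slides or the textbook).
defects = ["scratch"] * 12 + ["dent"] * 5 + ["misalignment"] * 2 + ["stain"] * 1

rows = pareto_table(defects)
for category, count, cum_pct in rows:
    print(f"{category:<13}{count:>3}{cum_pct:>7.1f}%")
```

Reading down the cumulative column shows how few categories account for most of the occurrences, which is the "high value" target idea.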

Your turn** …
• Look at problem 8-8 on page 194
  • do parts a & b
  • draw conclusions
** time permitting (Note: this also makes a good study problem)