# INTRODUCTION TO STATISTICS

• Slides: 28


What Is Statistics?

Statistics is the science of collecting, organizing, and interpreting numerical facts, which we call data.

Examples of data:

1. High and low temperatures for this week in Hayward
2. The number of imported car sales and the number of domestic car sales this year

The Goal of Statistics

The goal of statistics is to gain understanding from data and then use that understanding to make decisions.

1. So you know what to wear.
2. So the government can justify its tax policy on imported cars.

The Rise of Statistics

The ideas and methods of statistics developed gradually as society grew interested in collecting and using data for a variety of applications.

1. The earliest origin is the desire of rulers to measure the value of taxable land in their domains.
2. Starting in the 17th century, the importance of careful measurements of weights, distances, and other physical quantities grew.
3. By the 19th century, the agricultural and life sciences began to rely on data to answer fundamental questions.

WHO USES STATISTICS?

Media

• Television News
  – KTVU Channel 2
• Newspapers
  – USA Today
• Polls
  – Gallup Poll
• Web/TV Commercials

Government

• U.S. Department of Labor
  – Bureau of Labor Statistics
• U.S. Census Bureau
• Department of Health & Human Services
• Others

Other Organizations

• American Heart Association
• American Lung Association
• American Cancer Society
• California Lottery

Data Collection

• Observational studies: Surveys (Sec 2.2, 2.3)
  – A study in which participants are only observed and measured.
  – Cannot support a cause-and-effect conclusion.
• Scientific studies: Randomized (controlled) experiments (Sec 2.4)
  – A study in which treatments or groups are randomly assigned to participants.
  – Can support a cause-and-effect conclusion.

Simple Random Sample

A simple random sample of size n is a sample chosen by a method in which each collection of n population items is equally likely to comprise the sample.

A simple random sample is analogous to a lottery. Suppose that 10,000 lottery tickets are sold and 5 are drawn as the winning tickets. Each collection of 5 tickets that can be formed is equally likely to comprise the group of 5 that is drawn.

Copyright © 2014 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

Example

A physical education professor wants to study the physical fitness levels of 20,000 students enrolled at her university. She obtains a list of all 20,000 students, numbered from 1 to 20,000, and uses a computer random number generator to generate 100 random integers between 1 and 20,000, then invites the 100 students corresponding to those numbers to participate in the study. Is this a simple random sample?

Solution: Yes, this is a simple random sample, since any group of 100 students would have been equally likely to have been chosen.
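
The professor's procedure can be sketched in a few lines of Python. This is only an illustration, not the method used in the textbook: the function name `simple_random_sample` and the `seed` argument are assumptions made for this sketch. It uses `random.sample`, which draws without replacement, so no student number can appear twice.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw `sample_size` distinct ID numbers from 1..population_size."""
    rng = random.Random(seed)  # the seed is only there to make the run repeatable
    # random.sample draws without replacement: every group of
    # `sample_size` IDs is equally likely to be returned.
    return rng.sample(range(1, population_size + 1), sample_size)

# 100 students out of 20,000, as in the example above
student_ids = simple_random_sample(20_000, 100, seed=1)
print(len(student_ids), len(set(student_ids)))  # 100 100
```

Drawing without replacement matters here: generating 100 independent random integers could, in principle, repeat a number and invite fewer than 100 distinct students.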

Example

The professor in the last example now wants to draw a sample of 50 students to fill out a questionnaire about which sports they play. The professor’s 10:00 am class has 50 students. She uses the first 20 minutes of class to have the students fill out the questionnaire. Is this a simple random sample?

Solution: No. A simple random sample is like a lottery, in which each student in the population has an equal chance to be part of the sample. This sample does not meet that criterion.

Samples of Convenience

In some cases, it is difficult or impossible to draw a sample in a truly random way. In these cases, the best one can do is to sample items by some convenient method.

A sample of convenience is a sample that is not drawn by a well-defined random method.

Example

A construction engineer has just received a shipment of 1000 concrete blocks. The blocks have been delivered in a large pile. The engineer wishes to investigate the crushing strength of the blocks by measuring the strengths in a sample of 10 blocks. Explain why it might be difficult to draw a simple random sample of blocks.

Solution: To draw a simple random sample would require removing blocks from the center and bottom of the pile. One way to draw a sample of convenience would be to simply take 10 blocks off the top of the pile.

Problems with Samples of Convenience

The problem with samples of convenience is that they may differ systematically in some way from the population.

If it is reasonable to believe that no important systematic difference exists, then it is acceptable to treat the sample of convenience as if it were a simple random sample.

Stratified Random Sampling

In stratified random sampling, the population is divided up into groups, called strata, then a simple random sample is drawn from each stratum.

Stratified sampling is useful when the strata differ from one another, but the individuals within a stratum tend to be alike.

Example – Stratified Random Sampling

A company has 800 full-time and 200 part-time employees. To draw a sample of 100 employees, a simple random sample of 80 full-time employees is selected and a simple random sample of 20 part-time employees is selected.

[Figure: Group 1 (full-time employees), from which a simple random sample of 80 is chosen, and Group 2 (part-time employees), from which a simple random sample of 20 is chosen, together forming a stratified random sample of 100.]
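
A minimal Python sketch of this example follows. The employee labels (`FT1`, `PT1`, and so on), the function name `stratified_sample`, and the `seed` argument are all hypothetical, chosen only for illustration. Each stratum is sampled separately with `random.sample`, so 80 of 800 and 20 of 200 are both 10% of their stratum.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample of sizes[name] from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, sizes[name])
            for name, members in strata.items()}

# Hypothetical employee labels for the two strata
strata = {
    "full_time": [f"FT{i}" for i in range(1, 801)],  # 800 full-time employees
    "part_time": [f"PT{i}" for i in range(1, 201)],  # 200 part-time employees
}
sample = stratified_sample(strata, {"full_time": 80, "part_time": 20}, seed=1)
total = sum(len(members) for members in sample.values())
print(len(sample["full_time"]), len(sample["part_time"]), total)  # 80 20 100
```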

Cluster Sampling

In cluster sampling, items are drawn from the population in groups, or clusters.

Cluster sampling is useful when the population is too large and spread out for simple random sampling to be feasible.

Example

To estimate the unemployment rate, a government agency draws a simple random sample of households in a county. Someone visits each household and asks how many adults live in the household, and how many of them are unemployed. What are the clusters? Why is this a cluster sample?

Solution

This is a cluster sample because a simple random sample of clusters is selected, and every individual in each selected cluster is part of the sample. The clusters are the groups of adults in each of the households in the county.

[Figure: a random sample of households is taken, and every individual in each selected household is interviewed, giving a cluster sample of individuals.]
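
The household example can be sketched in Python as follows. The household names, the adults listed in them, and the function name `cluster_sample` are made up for illustration. The key point is that the random draw happens at the level of clusters, and then every individual in each chosen cluster is included.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick n_clusters clusters at random, then keep everyone in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    individuals = [person for name in chosen for person in clusters[name]]
    return chosen, individuals

# Hypothetical households (clusters) and the adults living in each
households = {
    "house_A": ["A1", "A2"],
    "house_B": ["B1"],
    "house_C": ["C1", "C2", "C3"],
    "house_D": ["D1", "D2"],
    "house_E": ["E1"],
}
chosen, adults = cluster_sample(households, 2, seed=1)
print(chosen)  # the two households that were drawn
print(adults)  # every adult in those two households
```

Note that the number of individuals in the final sample is not fixed in advance, because it depends on the sizes of the clusters that happen to be drawn.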

Systematic Sampling

In systematic sampling, items are ordered and every kth item is chosen to be included in the sample.

Systematic sampling is sometimes used to sample products as they come off an assembly line, in order to check that they meet quality standards.

Example

Automobiles are coming off an assembly line. It is decided to draw a systematic sample for a detailed check of the steering system. The starting point will be the third car, then every fifth car after that will be sampled. Which cars will be sampled?

Solution

The sample will consist of cars numbered 3, 8, 13, 18, 23, 28, and so on.

[Figure: cars numbered 1 to 30 on the assembly line.]
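
The rule "start at the third car, then take every fifth car after that" can be checked with a short Python sketch. The function name `systematic_sample` and the choice of 30 cars are assumptions for illustration only.

```python
def systematic_sample(n_items, start, step):
    """Return item numbers start, start + step, ... up to n_items."""
    return list(range(start, n_items + 1, step))

# First 30 cars off the line, starting at car 3 and taking every 5th car
cars = systematic_sample(30, start=3, step=5)
print(cars)  # [3, 8, 13, 18, 23, 28]
```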

Voluntary Response Sampling

Voluntary response samples are often used by the media to try to engage the audience. For example, a radio announcer will invite people to call the station to say what they think.

Voluntary response samples are never reliable for the following reasons:

• People who volunteer an opinion tend to have stronger opinions than is typical of the population.
• People with negative opinions are often more likely to volunteer their response.

Who Are Those Speedy Drivers?

Goal: to find the characteristics of speedy drivers, such as gender and age.

• Can we conduct a scientific study to achieve the goal?
• What questions would you ask?

Who Are Those Speedy Drivers?

• In our class, how would you get a proper dataset? Simple random sampling.
• If, on average, male drivers are 2 mph faster than female drivers, would male drivers get more speeding tickets than female drivers? Is this difference “practically” significant?

Does Aspirin Reduce Heart Attack Rates?

Goal: to find the effect of aspirin on heart attack rates.

• Can we conduct a scientific study to achieve the goal?
• How do we conduct such a study? Completely randomized design. (Sec 2.5)
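
As a rough sketch of what a completely randomized design involves, the Python below shuffles a list of participants and deals them into treatment groups purely by chance. The participant labels, the group names `aspirin` and `placebo`, the group sizes, and the function name are all assumptions for illustration; the slides do not describe the actual study design.

```python
import random

def completely_randomized_assignment(participants, treatments, seed=None):
    """Randomly assign each participant to exactly one treatment group."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)  # chance alone decides the order
    groups = {t: [] for t in treatments}
    # Deal the shuffled participants round-robin so group sizes
    # differ by at most one.
    for i, person in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(person)
    return groups

# Hypothetical participants and treatment labels
participants = [f"P{i}" for i in range(1, 21)]
groups = completely_randomized_assignment(
    participants, ["aspirin", "placebo"], seed=1
)
print(len(groups["aspirin"]), len(groups["placebo"]))  # 10 10
```

Because the assignment depends only on the random shuffle, characteristics such as age or gender should, on average, balance out between the groups, which is what allows a cause-and-effect conclusion.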

Reading Assignment

• Chapter 1, Sections 2.1 to 2.4 of textbook
• Find a case study and discuss the study:
  – Is it an observational study or a controlled study?
  – Is the sample a representative sample?
  – What are the responses? Predictors? Units?
  – Is its result practically significant?