Chapter 1 Defining and Collecting Data Copyright © 2015, 2012, 2009 Pearson Education, Inc.

Learning Objectives
In this chapter you learn to:
• Understand the types of variables used in statistics
• Know the different measurement scales
• Know how to collect data
• Know the different ways to collect a sample
• Understand the types of survey errors

Types of Variables
• Categorical (qualitative) variables have values that can only be placed into categories, such as “yes” and “no.”
• Numerical (quantitative) variables have values that represent a counted or measured quantity.
    • Discrete variables arise from a counting process.
    • Continuous variables arise from a measuring process.

Developing Operational Definitions Is Crucial to Avoid Confusion and Errors
• An operational definition is a clear and precise statement that provides a common understanding of meaning.
• In the absence of an operational definition, miscommunications and errors are likely to occur.
• Arriving at operational definitions is a key part of the Define step of DCOVA.

Operational Definitions of Terms
• VARIABLE: A characteristic of an item or individual.
• DATA: The set of individual values associated with a variable.
• STATISTICS: The methods that help transform data into useful information for decision makers.

Types of Variables
• Categorical (defined categories). Examples:
    • Marital status
    • Political party
    • Eye color
• Numerical
    • Discrete (counted items). Examples:
        • Number of children
        • Defects per hour
    • Continuous (measured characteristics). Examples:
        • Weight
        • Voltage

Levels of Measurement
A nominal scale classifies data into distinct categories in which no ranking is implied.

Categorical variable               Categories
Do you have a Facebook profile?    Yes, No
Type of investment                 Growth, Value, Other
Cellular provider                  AT&T, Sprint, Verizon, Other, None

Levels of Measurement (cont.)
An ordinal scale classifies data into distinct categories in which ranking is implied.

Categorical variable              Ordered categories
Student class designation         Freshman, Sophomore, Junior, Senior
Product satisfaction              Very unsatisfied, Fairly unsatisfied, Neutral, Fairly satisfied, Very satisfied
Faculty rank                      Professor, Associate Professor, Assistant Professor, Instructor
Standard & Poor’s bond ratings    AAA, A, BBB, B, CCC, C, DDD, D
Student grades                    A, B, C, D, F

Levels of Measurement (cont.)
• An interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but the measurements do not have a true zero point.
• A ratio scale is an ordered scale in which the difference between the measurements is a meaningful quantity and the measurements have a true zero point.
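The distinction between the two scales can be checked numerically. The sketch below is an illustration that is not taken from the slides: it assumes temperature as the example, with Celsius as an interval scale (0 °C is not an absence of temperature) and Kelvin as a ratio scale. The two readings are made-up values.

```python
# Illustrative readings (assumed values, not from the slides).
celsius_a, celsius_b = 10.0, 20.0

def to_kelvin(c):
    """Convert degrees Celsius to kelvins."""
    return c + 273.15

# Differences are meaningful on both scales and agree with each other.
diff_celsius = celsius_b - celsius_a
diff_kelvin = to_kelvin(celsius_b) - to_kelvin(celsius_a)

# Ratios are only meaningful where a true zero exists (Kelvin).
ratio_celsius = celsius_b / celsius_a   # looks like "twice as hot", but is not
ratio_kelvin = to_kelvin(celsius_b) / to_kelvin(celsius_a)

print(diff_celsius, round(diff_kelvin, 2))
print(ratio_celsius, round(ratio_kelvin, 3))
```

The difference is the same on both scales, while the ratio changes with the choice of zero point, which is why ratio statements are reserved for ratio-scale variables.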

Interval and Ratio Scales
[Figure: slide image not preserved in this transcript]

Establishing a Business Objective Focuses Data Collection
Examples of business objectives:
• A marketing research analyst needs to assess the effectiveness of a new television advertisement.
• A pharmaceutical manufacturer needs to determine whether a new drug is more effective than those currently in use.
• An operations manager wants to monitor a manufacturing process to find out whether the quality of the product being manufactured is conforming to company standards.
• An auditor wants to review the financial transactions of a company in order to determine whether the company is in compliance with generally accepted accounting principles.

Collecting Data Correctly Is a Critical Task
• You need to avoid data flawed by biases, ambiguities, or other types of errors.
• Results from flawed data will be suspect or in error.
• Even the most sophisticated statistical methods are not very useful when the data are flawed.

Sources of Data
• Primary sources: The data collector is the one using the data for analysis.
    • Data from a political survey
    • Data collected from an experiment
    • Observed data
• Secondary sources: The person performing data analysis is not the data collector.
    • Analyzing census data
    • Examining data from print journals or data published on the internet

Sources of Data Fall Into Five Categories
• Data distributed by an organization or an individual
• The outcomes of a designed experiment
• The responses from a survey
• The results of conducting an observational study
• Data collected by ongoing business activities

Examples of Data Distributed by Organizations or Individuals
• Financial data on a company provided by investment services.
• Industry or market data from market research firms and trade associations.
• Stock prices, weather conditions, and sports statistics in daily newspapers.

Examples of Data From a Designed Experiment
• Consumer testing of different versions of a product to help determine which product should be pursued further.
• Material testing to determine which supplier’s material should be used in a product.
• Market testing on alternative product promotions to determine which promotion to use more broadly.

Examples of Survey Data
• A survey asking people which laundry detergent has the best stain-removing abilities.
• Political polls of registered voters during political campaigns.
• People being surveyed to determine their satisfaction with a recent product or service experience.

Examples of Data Collected From Observational Studies
• Market researchers utilizing focus groups to elicit unstructured responses to open-ended questions.
• Measuring the time it takes for customers to be served in a fast food establishment.
• Measuring the volume of traffic through an intersection to determine if some form of advertising at the intersection is justified.

Examples of Data Collected From Ongoing Business Activities
• A bank studies years of financial transactions to help identify patterns of fraud.
• Economists utilize data on searches done via Google to help forecast future economic conditions.
• Marketing companies use tracking data to evaluate the effectiveness of a web site.

Data Is Collected From Either a Population or a Sample
• POPULATION: A population consists of all the items or individuals about which you want to draw a conclusion. The population is the “large group.”
• SAMPLE: A sample is the portion of a population selected for analysis. The sample is the “small group.”

Population vs. Sample
• Population: All the items or individuals about which you want to draw conclusions
• Sample: A portion of the population of items or individuals

Collecting Data Via Sampling Is Used When Selecting a Sample Is
• Less time consuming than selecting every item in the population.
• Less costly than selecting every item in the population.
• Less cumbersome and more practical than analyzing the entire population.

Things to Consider in Potential Sources of Data
• Is the source of data structured or unstructured?
• How is electronic data formatted?
• How is data encoded?

Structured Data Follows an Organizing Principle and Unstructured Data Does Not
• A stock ticker provides structured data:
    • The stock ticker repeatedly reports a company name, the number of shares last traded, the bid price, and the percent change in the stock price.
• Due to their inherent structure, data from tables and forms are structured data.
• E-mails from five people concerning stock trades are an example of unstructured data.
    • In these e-mails you cannot count on the information being shared in a specific order or format.
• This book will deal almost exclusively with structured data.

Almost All of the Methods in This Book Deal With Structured Data
• Some of the methods in Chapter 17 involve unstructured data.
• For many of the questions you might want to answer, the starting point will be tabular data.
• To deal with unstructured data, you will probably need to seek out help with more advanced methods and techniques.

Data Can Be Formatted or Encoded in More Than One Way
• Some electronic formats are more readily usable than others.
• Different encodings can affect the precision of numerical variables and can also affect data compatibility.
• As you identify and choose sources of data, you need to consider and deal with these issues.
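One way to see how an encoding can affect numerical precision is to store the same value in two binary encodings and read it back. The sketch below is an illustration under assumptions of its own: the slides do not name any specific encoding, so 32-bit and 64-bit floating point (via Python's standard `struct` module) are chosen here as a familiar example, and the value 0.1 is arbitrary.

```python
import struct

value = 0.1  # arbitrary example value

# Round-trip the value through a 32-bit and a 64-bit floating-point encoding.
as_float32 = struct.unpack("<f", struct.pack("<f", value))[0]
as_float64 = struct.unpack("<d", struct.pack("<d", value))[0]

print(as_float64 == value)       # the 64-bit round trip returns the same float
print(as_float32 == value)       # the 32-bit round trip does not
print(abs(as_float32 - value))   # size of the precision loss
```

The loss is tiny for a single value, but it is the kind of difference that can surface when files produced by different tools are compared or merged.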

Data Cleaning Is Often a Necessary Activity When Collecting Data
• Collected data often contain “irregularities”:
    • Typographical or data entry errors
    • Values that are impossible or undefined
    • Missing values
    • Outliers
• When found, these irregularities should be reviewed and addressed.
• Both Excel and Minitab can be used to address irregularities.
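The slides point to Excel and Minitab for this work; the same checks can be sketched in a few lines of Python. Everything in the example is hypothetical: the ages are made up, `None` is assumed to mark a missing value, and the outlier cutoff (1.5 times the interquartile range beyond the quartiles) is one common rule of thumb rather than a rule stated in the slides.

```python
from statistics import quantiles

# Hypothetical ages as entered; None marks a missing value.
raw_ages = [34, 29, None, 41, -5, 38, 36, 410, 33]

# Missing values and impossible values (an age cannot be negative).
missing = [i for i, v in enumerate(raw_ages) if v is None]
impossible = [i for i, v in enumerate(raw_ages) if v is not None and v < 0]

# Possible outliers among the remaining values, using the 1.5 * IQR rule.
valid = [v for v in raw_ages if v is not None and v >= 0]
q1, _, q3 = quantiles(valid, n=4)
low, high = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
outliers = [
    i for i, v in enumerate(raw_ages)
    if v is not None and v >= 0 and (v < low or v > high)
]

print(missing, impossible, outliers)
```

The code only flags positions for review. Whether a flagged value is a typing error or a genuine extreme observation still has to be decided by someone who knows the data.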

After Collection It Is Often Helpful to Recode Some Variables
• Recoding a variable can either supplement or replace the original variable.
• Recoding a categorical variable involves redefining categories.
• Recoding a quantitative variable involves changing this variable into a categorical variable.
• When recoding, be sure that the new categories are mutually exclusive (categories do not overlap) and collectively exhaustive (categories cover all possible values).
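As a minimal sketch of recoding a quantitative variable, the function below turns an age into one of three categories. The variable, the cut points, and the category labels are all hypothetical choices made for illustration. Each boundary belongs to exactly one category (mutually exclusive), and every non-negative age falls into some category (collectively exhaustive).

```python
def recode_age(age):
    """Recode a numerical age into one of three hypothetical categories."""
    if age < 0:
        raise ValueError("age cannot be negative")
    if age < 30:
        return "under 30"
    if age < 60:
        return "30 to 59"
    return "60 or older"

ages = [18, 29, 30, 59, 60, 87]
age_groups = [recode_age(a) for a in ages]  # supplements, rather than replaces, ages
print(age_groups)
```

Keeping `ages` alongside `age_groups` corresponds to recoding that supplements the original variable; overwriting `ages` would correspond to replacing it.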

A Sampling Process Begins With a Sampling Frame
• The sampling frame is a listing of items that make up the population.
• Frames are data sources such as population lists, directories, or maps.
• Inaccurate or biased results can result if a frame excludes certain portions of the population.
• Using different frames to generate data can lead to dissimilar conclusions.

Types of Samples
• Non-probability samples
    • Judgment
    • Convenience
• Probability samples
    • Simple random
    • Systematic
    • Stratified
    • Cluster

Types of Samples: Nonprobability Sample
• In a nonprobability sample, items included are chosen without regard to their probability of occurrence.
    • In convenience sampling, items are selected based only on the fact that they are easy, inexpensive, or convenient to sample.
    • In a judgment sample, you get the opinions of preselected experts in the subject matter.

Types of Samples: Probability Sample
• In a probability sample, items in the sample are chosen on the basis of known probabilities.
• Probability samples:
    • Simple random
    • Systematic
    • Stratified
    • Cluster

Probability Sample: Simple Random Sample
• Every individual or item from the frame has an equal chance of being selected.
• Selection may be with replacement (the selected individual is returned to the frame for possible reselection) or without replacement (the selected individual is not returned to the frame).
• Samples are obtained from a table of random numbers or from computer random number generators.
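With a computer random number generator, both kinds of selection are short to write. The sketch below uses Python's standard `random` module. The frame of 850 item numbers, the sample size of 10, and the seed are assumptions made for illustration.

```python
import random

rng = random.Random(2015)       # fixed, arbitrary seed so the sketch is repeatable
frame = list(range(1, 851))     # hypothetical frame: item numbers 1 to 850

# Without replacement: a selected item is not returned to the frame.
without_replacement = rng.sample(frame, k=10)

# With replacement: a selected item may be drawn again.
with_replacement = rng.choices(frame, k=10)

print(without_replacement)
print(with_replacement)
```

`random.sample` can never return the same item twice, while `random.choices` can, which mirrors the distinction drawn in the second bullet.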

Selecting a Simple Random Sample Using a Random Number Table

Sampling frame for a population with 850 items:

Item name    Item #
Bev R.       001
Ulan X.      002
...          ...
Joann P.     849
Paul F.      850

Portion of a random number table:

49280 88924 35779 00283 81163 07275
11100 02340 12860 74697 96644 89439
09893 23997 20048 49420 88872 08401

The first 5 items in a simple random sample (reading the table three digits at a time):
• Item # 492
• Item # 808
• Item # 892 (does not exist, so ignore)
• Item # 435
• Item # 779
• Item # 002
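The table-reading procedure can be sketched as follows, using the first six entries of the random number table shown on the slide. The code assumes the digits are read left to right in consecutive three-digit codes, and that a repeated code would be skipped (sampling without replacement). The slide's example contains no repeats, so that second assumption does not affect the result here.

```python
# First six five-digit entries from the slide's random number table.
table_entries = ["49280", "88924", "35779", "00283", "81163", "07275"]
digits = "".join(table_entries)

frame_size = 850   # items are numbered 001 to 850
sample_size = 5

selected = []
for start in range(0, len(digits) - 2, 3):
    code = int(digits[start:start + 3])
    # Ignore codes with no matching item and codes already selected.
    if 1 <= code <= frame_size and code not in selected:
        selected.append(code)
    if len(selected) == sample_size:
        break

print(selected)  # [492, 808, 435, 779, 2]
```

Code 892 is skipped because the frame ends at 850, which matches the slide's note that the item does not exist.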

Probability Sample: Systematic Sample
• Decide on the sample size: n
• Divide the frame of N individuals into groups of k individuals: k = N / n
• Randomly select one individual from the first group
• Select every kth individual thereafter

Example: N = 40, n = 4, k = 10
[Figure: frame of 40 individuals with the first group of 10 marked]
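The four steps can be sketched in a few lines. The function name `systematic_sample` and the seed are hypothetical, and the sketch assumes N is evenly divisible by n, as in the slide's example (N = 40, n = 4, k = 10).

```python
import random

def systematic_sample(frame, n, rng):
    """Pick one random item from the first group of k, then every kth item."""
    k = len(frame) // n          # group size k = N / n (assumes N divisible by n)
    start = rng.randrange(k)     # random position within the first group
    return frame[start::k][:n]

frame = list(range(1, 41))       # N = 40 individuals, numbered 1 to 40
sample = systematic_sample(frame, n=4, rng=random.Random(1))
print(sample)
```

Whatever the random start, the four selected individuals are always exactly 10 positions apart, so only the first selection is random.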

Probability Sample: Stratified Sample
• Divide the population into two or more subgroups (called strata) according to some common characteristic.
• A simple random sample is selected from each subgroup, with sample sizes proportional to strata sizes.
• Samples from the subgroups are combined into one.
• This is a common technique when sampling a population of voters, stratifying across racial or socio-economic lines.

[Figure: population divided into 4 strata]
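A minimal sketch of proportional stratified sampling is given below. The strata names, their sizes (400, 300, 200, 100), and the overall sample size of 50 are hypothetical, chosen so that each proportional share is a whole number. With other sizes the rounding step could make the total differ slightly from n, which this sketch does not handle.

```python
import random

def stratified_sample(strata, n, rng):
    """Draw a simple random sample from each stratum, sized in proportion."""
    total = sum(len(items) for items in strata.values())
    combined = []
    for items in strata.values():
        size = round(n * len(items) / total)      # proportional allocation
        combined.extend(rng.sample(items, size))  # simple random sample per stratum
    return combined

# Four hypothetical strata of different sizes.
strata = {
    name: [f"{name}{i}" for i in range(size)]
    for name, size in [("A", 400), ("B", 300), ("C", 200), ("D", 100)]
}
sample = stratified_sample(strata, n=50, rng=random.Random(3))
counts = {name: sum(s.startswith(name) for s in sample) for name in strata}
print(counts)  # {'A': 20, 'B': 15, 'C': 10, 'D': 5}
```

Each stratum contributes the same share of the sample as it holds of the population, which is what guarantees representation across strata.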

Probability Sample: Cluster Sample
• The population is divided into several “clusters,” each representative of the population.
• A simple random sample of clusters is selected.
• All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique.
• A common application of cluster sampling involves election exit polls, where certain election districts are selected and sampled.

[Figure: population divided into 16 clusters, with randomly selected clusters for the sample]
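Cluster sampling differs from stratified sampling in that the random selection is applied to whole clusters. In the sketch below, the 16 clusters follow the slide's figure, while the cluster contents, the choice of 4 clusters, and the seed are assumptions. All items in each selected cluster are used, which is the first of the two options the slide describes.

```python
import random

rng = random.Random(7)  # arbitrary seed so the sketch is repeatable

# 16 hypothetical clusters of 5 items each.
clusters = {
    c: [f"cluster{c}-item{i}" for i in range(1, 6)]
    for c in range(1, 17)
}

# Simple random sample of clusters, then take every item in those clusters.
chosen = rng.sample(sorted(clusters), k=4)
sample = [item for c in chosen for item in clusters[c]]

print(chosen)
print(len(sample))  # 20
```

To apply the second option, the inner step would draw a further probability sample within each chosen cluster instead of taking every item.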

Probability Sample: Comparing Sampling Methods
• Simple random sample and systematic sample
    • Simple to use
    • May not be a good representation of the population’s underlying characteristics
• Stratified sample
    • Ensures representation of individuals across the entire population
• Cluster sample
    • More cost effective
    • Less efficient (need a larger sample to acquire the same level of precision)

Evaluating Survey Worthiness
• What is the purpose of the survey?
• Is the survey based on a probability sample?
• Coverage error: is the frame appropriate?
• Nonresponse error: follow up on nonresponses
• Measurement error: good questions elicit good responses
• Sampling error: always exists

Types of Survey Errors
• Coverage error or selection bias
    • Exists if some groups are excluded from the frame and have no chance of being selected
• Nonresponse error or bias
    • People who do not respond may be different from those who do respond
• Sampling error
    • Variation from sample to sample will always exist
• Measurement error
    • Due to weaknesses in question design and/or respondent error

Types of Survey Errors (continued)
• Coverage error: Excluded from frame
• Nonresponse error: Follow up on nonresponses
• Sampling error: Random differences from sample to sample
• Measurement error: Bad or leading question
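Sampling error is the one survey error that remains even when the frame, the response rate, and the questions are all sound. The simulation below is a sketch under made-up assumptions (a synthetic population of 10,000 values, samples of 100, and a fixed seed) showing that repeated simple random samples from the same population give different sample means.

```python
import random
from statistics import mean

rng = random.Random(42)  # arbitrary seed so the sketch is repeatable

# Synthetic population (assumed: roughly bell-shaped, centered near 50).
population = [rng.gauss(50, 10) for _ in range(10_000)]
population_mean = mean(population)

# Five independent simple random samples of 100 items each.
sample_means = [mean(rng.sample(population, 100)) for _ in range(5)]

print(round(population_mean, 2))
print([round(m, 2) for m in sample_means])
```

The sample means cluster around the population mean but do not equal it or each other. That spread is the "random differences from sample to sample" named above.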

Chapter Summary
In this chapter we have discussed:
• The types of variables used in statistics
• The different measurement scales
• How to collect data
• The different ways to collect a sample
• The types of survey errors