Chapter 8 Sections 8.1–8.4
Daily Agenda • Bell Ringer • Review Bell Ringer • I CAN • Chapter 8
The distinction between population and sample is basic to statistics. To make sense of any sample result, you must know what population the sample represents.
8.1 Sampling Students. A political scientist wants to know how college students feel about the Social Security system. She obtains a list of the 3456 undergraduates at her college and mails a questionnaire to 250 students selected at random. Only 104 questionnaires are returned. (a) What is the population in this study? Be careful: about what group does she want information? (b) What is the sample? Be careful: from what group does she actually obtain information? The important message in this problem is that the sample can redefine the population about which information is obtained.
8.2 Student Archaeologists. An archaeological dig turns up large numbers of pottery shards, broken stone implements, and other artifacts. Students working on the project classify each artifact and assign it a number. The counts in different categories are important for understanding the site, so the project director chooses 2% of the artifacts at random and checks the students’ work. What are the population and the sample here?
8.3 Software Survey. A statistical software company is planning on updating Version 8.1 of its software and wants to know what features are most important to users. The company’s managers have the email addresses of 1100 individuals, mostly faculty at universities, for whom they have supplied free courtesy copies of Version 8.1. They email these 1100 individuals and ask them to complete a survey online. A total of 186 of these individuals complete the survey. (a) What is the population of interest to the software company? Do you think the 1100 individuals contacted are representative of the population? Explain your reasons. (b) What is the sample? From what group is information actually obtained?
8.4 Sampling on Campus. You see a student standing in front of the student center, now and then stopping other students to ask them questions. She says that she is collecting student opinions for a class assignment. Explain why this sampling method is almost certainly biased.
8.5 More Sampling on Campus. You would like to start a club for psychology majors on campus, and you are interested in finding out what proportion of psychology majors would join. The dues would be $35 and used to pay for speakers to come to campus. You ask five psychology majors from your senior psychology honors seminar whether they would be interested in joining this club and find that four of the five students questioned are interested. Is this sampling method biased, and if so, what is the likely direction of bias?
8.6 Apartment Living. You are planning a report on apartment living in a college town. You decide to select three apartment complexes at random for in-depth interviews with residents. Use Table B (start at line 128) to select a simple random sample of three of the following apartment complexes: Ashley Oaks, Bay Pointe, Beau Jardin, Bluffs, Brandon Place, Briarwood, Brownstone, Burberry Place, Cambridge, Chauncey Village, Country View, Country Villa, Crestview, Del-Lynn, Fairington, Fairway Knolls, Fowler, Franklin Park, Georgetown, Greenacres, Mayfair Village, Nobb Hill, Pemberly Courts, Peppermill, Pheasant Run, River Walk, Sagamore Ridge, Salem Courthouse, Village Square, Waterford Court.
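Exercise 8.6 can also be worked with software instead of Table B. The following is a minimal Python sketch, assuming labels are assigned in list order and using the standard library's random module. The helper names and the seed are illustrative choices, the sample size is a parameter, and the complexes selected will generally not match those obtained from Table B.

```python
import random

COMPLEXES = [
    "Ashley Oaks", "Bay Pointe", "Beau Jardin", "Bluffs", "Brandon Place",
    "Briarwood", "Brownstone", "Burberry Place", "Cambridge", "Chauncey Village",
    "Country View", "Country Villa", "Crestview", "Del-Lynn", "Fairington",
    "Fairway Knolls", "Fowler", "Franklin Park", "Georgetown", "Greenacres",
    "Mayfair Village", "Nobb Hill", "Pemberly Courts", "Peppermill", "Pheasant Run",
    "River Walk", "Sagamore Ridge", "Salem Courthouse", "Village Square",
    "Waterford Court",
]


def label_items(items):
    """Give every item a label of equal width (01, 02, ...), in list order."""
    width = len(str(len(items)))
    return {str(i).zfill(width): name for i, name in enumerate(items, start=1)}


def simple_random_sample(labels, n, seed=None):
    """Draw n distinct labels so that every set of n labels is equally likely."""
    rng = random.Random(seed)  # the seed only makes the draw repeatable
    return rng.sample(sorted(labels), n)


labels = label_items(COMPLEXES)
chosen = simple_random_sample(labels, n=3, seed=128)  # seed value is arbitrary
for label in chosen:
    print(label, labels[label])
```

Labeling first and sampling labels second mirrors the two steps used with a table of random digits: every unit gets a label of the same length, then labels are drawn without replacement.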
8.7 Minority Managers. A firm wants to understand the attitudes of its minority managers toward its system for assessing management performance. Following is a list of all the firm’s managers who are members of minority groups. Use Table B at line 141 to choose five managers to be interviewed in detail about the performance appraisal system. Adelaja, Ahmadiani, Barnes, Bonds, Burke, Deis, Ding, Draguljic, Fernandez, Fox, Gao, Gemayel, Gupta, Hernandez, Huo, Ippolito, Jiang, Jung, Mani, Mazzeo, Modur, Rettiganti, Rodriguez, Sanchez, Sgambellone, Yajima.
8.8 Sampling Gravestones. The local genealogical society in Coles County, Illinois, has compiled records on all 55,914 gravestones in cemeteries in the county for the years 1825 to 1985. Historians plan to use these records to learn about African Americans in Coles County’s history. They first choose an SRS of 395 records to check their accuracy by visiting the actual gravestones. (a) How would you label the 55,914 records? (b) Use Table B, starting at line 137, to choose the first five records for the SRS.
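The mechanics of reading labels from a table of random digits can be sketched in Python. This is an illustration only: the function name is hypothetical, and the digit string below is arbitrary and is not a line of Table B, so the labels it yields are not the answer to part (b). It assumes the records are labeled 00001 through 55914.

```python
def sample_from_digits(digits, population_size, n):
    """Read fixed-width groups of digits and keep the first n usable labels.

    Labels run from 1 to population_size, written with leading zeros.
    Groups outside that range and repeated labels are skipped.
    """
    width = len(str(population_size))           # 5 digits for 55,914 records
    digits = "".join(ch for ch in digits if ch.isdigit())
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
            if len(chosen) == n:
                break
    return chosen


# Arbitrary digits for illustration only (NOT taken from Table B).
EXAMPLE_DIGITS = (
    "31415 92653 58979 32384 62643 38327 95028 "
    "84197 16939 93751 05820 97494 45923 07816"
)

first_five = sample_from_digits(EXAMPLE_DIGITS, population_size=55914, n=5)
print([str(label).zfill(5) for label in first_five])
```

Groups such as 92653 are skipped because no record carries that label, which is why more than five groups usually have to be read to obtain five records.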
8.9 Ask More People. In the 2012 presidential pre-election surveys, Pew Research sampled 1,112 likely voters during October 4-7, 2012, and asked if they were planning to vote for Obama, and then asked the same question of a sample of 1,495 likely voters taken from October 24-28, 2012. However, in their last survey taken October 31-November 3, 2012, just before the election held on November 6, 2012, they asked this question of a sample of 2,709 likely voters. Why do you think Pew did this?
8.10 How Accurate Is the Poll? The New York Times/CBS News poll conducted during February 19-23, 2014, included 1644 adults, of whom 519 were Republicans, 515 were Democrats, 550 were Independents, and 60 didn’t know or didn’t respond. Each person sampled was asked their opinion on a variety of issues facing the nation, such as, “Do you feel that the distribution of money and wealth in this country is fair, or do you feel that the money and wealth in this country should be more evenly distributed among more people?” The margin of error (we will give more detail in later chapters) was reported as ±3% for the entire sample. When considering the opinions of only the Republicans in the sample, the margin of error was reported as ±6%. What do you think explains the fact that estimates for Republicans were less precise than for the entire sample?
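How sample size drives precision can be illustrated with a rough calculation. The sketch below assumes the textbook approximation for a simple random sample, with a 95% confidence multiplier of 1.96 and the worst-case proportion 0.5. That formula is an assumption here, not the pollsters' method: the poll's reported margins (±3% and ±6%) are larger than these approximations, presumably because the actual survey design was more complex than an SRS.

```python
import math


def approx_margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from an SRS of size n."""
    return z * math.sqrt(p * (1 - p) / n)


# Sample sizes from Exercises 8.9 and 8.10.
for description, n in [
    ("Pew, October 4-7", 1112),
    ("Pew, October 31-November 3", 2709),
    ("NYT/CBS, all adults", 1644),
    ("NYT/CBS, Republicans only", 519),
]:
    print(f"{description}: n = {n}, about ±{approx_margin_of_error(n):.1%}")
```

Whatever the exact formula, the margin shrinks roughly in proportion to one over the square root of the sample size, so a subgroup of 519 cannot be estimated as precisely as the full sample of 1644.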
8.11 Sampling Metro Chicago. Cook County, Illinois, has the second-largest population of any county in the United States (after Los Angeles County, California). Cook County has 30 suburban townships and an additional eight townships that make up the city of Chicago. The suburban townships are: Barrington, Berwyn, Bloom, Bremen, Calumet, Cicero, Elk Grove, Evanston, Hanover, Lemont, Leyden, Lyons, Maine, New Trier, Niles, Northfield, Norwood Park, Oak Park, Orland, Palatine, Palos, Proviso, Rich, River Forest, Riverside, Schaumburg, Stickney, Thornton, Wheeling, Worth. The Chicago townships are: Hyde Park, Jefferson, Lake, Lake View, North Chicago, Rogers Park, South Chicago, West Chicago. Because city and suburban areas may differ, the first stage of a multistage sample chooses a stratified sample of five suburban townships and three of the more heavily populated Chicago townships. Use software, the Simple Random Sample applet, or Table B to choose this sample. (If you use Table B, assign labels in alphabetical order and start at line 116 for the suburbs and at line 126 for Chicago.)
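A stratified sample is simply a separate SRS within each stratum. The sketch below shows the software route for Exercise 8.11 using Python's standard random module. The function name and the seed are illustrative assumptions, and the townships drawn will not match a Table B selection.

```python
import random

SUBURBAN = [
    "Barrington", "Berwyn", "Bloom", "Bremen", "Calumet", "Cicero",
    "Elk Grove", "Evanston", "Hanover", "Lemont", "Leyden", "Lyons",
    "Maine", "New Trier", "Niles", "Northfield", "Norwood Park", "Oak Park",
    "Orland", "Palatine", "Palos", "Proviso", "Rich", "River Forest",
    "Riverside", "Schaumburg", "Stickney", "Thornton", "Wheeling", "Worth",
]
CHICAGO = [
    "Hyde Park", "Jefferson", "Lake", "Lake View",
    "North Chicago", "Rogers Park", "South Chicago", "West Chicago",
]


def stratified_sample(strata, sizes, seed=None):
    """Draw an independent SRS of sizes[name] units from each stratum."""
    rng = random.Random(seed)
    return {
        name: sorted(rng.sample(units, sizes[name]))
        for name, units in strata.items()
    }


sample = stratified_sample(
    strata={"suburban": SUBURBAN, "Chicago": CHICAGO},
    sizes={"suburban": 5, "Chicago": 3},
    seed=116,  # arbitrary seed, used only so the draw can be repeated
)
for stratum, townships in sample.items():
    print(stratum, townships)
```

Sampling each stratum separately guarantees that both city and suburban townships appear in the first-stage sample, which a single SRS of eight townships from all 38 would not.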
8. 12 Academic Dishonesty. A study of academic dishonesty among college students used a two-stage sampling design. The first stage chose a sample of 30 colleges and universities. Then, the study authors mailed questionnaires to a stratified sample of 200 seniors, 100 juniors, and 100 sophomores at each school. One of the schools chosen has 1127 freshmen, 989 sophomores, 943 juniors, and 895 seniors. You have alphabetical lists of the students in each class. Explain how you would assign labels for stratified sampling. Then use software or Table B, starting at line 140, to select the first five students in the sample from each stratum. After selecting five students for a stratum, continue to select the students for the next stratum.
8.13 A Survey of 100,000 Physicians. In 2010, the Physicians Foundation conducted a survey of physicians’ attitudes about health care reform, calling the report “a survey of 100,000 physicians.” The survey was sent to 100,000 randomly selected physicians practicing in the United States: 40,000 via post-office mail and 60,000 via email. A total of 2,379 completed surveys were received. (a) State carefully what population is sampled in this survey and what is the sample size. Could you draw conclusions from this study about all physicians practicing in the United States? (b) What is the rate of nonresponse for this survey? How might this affect the credibility of the survey results? (c) Why is it misleading to call the report “a survey of 100,000 physicians”?
8.14 Gays in the Military. In 2010, a Quinnipiac University Poll and a CNN Poll each asked a nationwide sample about their views on openly gay men and women serving in the military. Here are the two questions: Question A: Federal law currently prohibits openly gay men and women from serving in the military. Do you think this law should be repealed or not? Question B: Do you think people who are openly gay or homosexual should or should not be allowed to serve in the U.S. military? One of these questions had 78% responding “should,” and the other question had only 57% responding “should.” Which wording is slanted toward a more negative response on gays in the military? Why?
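The nonresponse rate is the fraction of the units contacted that never supplied data. As a small worked illustration, the Python sketch below applies that definition to the counts given in Exercises 8.1, 8.3, and 8.13. The function name is an illustrative choice.

```python
def nonresponse_rate(contacted, responded):
    """Fraction of contacted individuals from whom no response was obtained."""
    if contacted <= 0 or not 0 <= responded <= contacted:
        raise ValueError("need 0 <= responded <= contacted and contacted > 0")
    return 1 - responded / contacted


# (number contacted, number who responded), taken from the exercises.
surveys = {
    "8.1 Sampling Students": (250, 104),
    "8.3 Software Survey": (1100, 186),
    "8.13 Survey of Physicians": (100_000, 2_379),
}

for name, (contacted, responded) in surveys.items():
    rate = nonresponse_rate(contacted, responded)
    print(f"{name}: {responded} of {contacted} responded, nonresponse {rate:.1%}")
```

A high rate does not by itself prove bias, but it means the respondents may differ systematically from the people who were asked and stayed silent.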
First, call screening is now common. A large majority of American households have answering machines, voicemail, or caller ID, and many use these methods to screen their calls. Calls from polling organizations are rarely returned. More seriously, the percent of cell-phone-only households is increasing rapidly. By mid-2007, 14% of American households had a cell phone but no landline phone; by the end of 2009, that percent had increased to almost 25%; and in 2012, the percent was almost 36%. It’s clear from these numbers that random digit dialing (RDD) reaching only landline numbers is in trouble. Can surveys just add cell phone numbers? Not easily. Federal regulations prohibit automated dialing to cell phones, which rules out computerized RDD sampling and requires hand dialing of cell phone numbers, which is expensive. A cell phone can be anywhere, and many people keep their cell number despite moving, so stratifying by location becomes difficult. And a cell phone user may be driving or otherwise unable to talk safely.
One alternative is to use web surveys, an increasingly popular survey method, rather than telephone surveys. Web surveys have several advantages over more traditional survey methods. It is possible to collect large amounts of survey data at lower costs than traditional methods allow. Anyone can put survey questions on dedicated sites offering free services; thus large-scale data collection is available to almost every person with access to the Internet. Furthermore, web surveys allow delivery of multimedia survey content to respondents, opening up new realms of survey possibilities that would be extremely difficult to implement using traditional methods. Some argue that eventually web surveys will replace traditional survey methods.
Although web surveys are easy to do, they are not easy to do well. Three major problems are voluntary response, undercoverage, and nonresponse. Voluntary response appears in several forms in online surveys. Example 8.3 is a survey that invited individuals to a particular website to participate in a poll. Other web surveys solicit participation through announcements in newsgroups, email invitations, and banner ads on high-traffic sites. Undercoverage is a serious problem for even careful web surveys, because about 25% of Americans lack Internet access and only about 70% have broadband access. People without Internet access are more likely to be poor, elderly, minority, or rural than the overall population, so the potential for bias in a web survey is clear. There is no easy way to choose a random sample even from people with web access because there is no technology that generates personal email addresses at random in the way that RDD generates residential telephone numbers, and individuals may have several email addresses. Even if such technology existed, etiquette and regulations aimed at spammers would prevent mass emailing. For the present, web surveys work well only for restricted populations, for example, surveying students at your university using the school’s list of student email addresses. Here is an example of a successful web survey.
8.15 NPR Facebook Survey. In 2010, National Public Radio (NPR) conducted a survey of preferences and habits of its Facebook fans by recruiting respondents through messages posted on its Facebook page. The survey was conducted online and deployed July 12-19. A total of 40,043 respondents began the survey, with 33,304 completing all questions. It was found that people accessed NPR on the radio, at NPR.org, through iPhone apps, and several other platforms. Asked about time spent with NPR, about 20% of respondents indicated that they spent more than three hours per day, including radio listening. (a) Here is what NPR says about the survey methodology: “Respondents were self-selected and the resulting sample is non-random—therefore a margin of error cannot be calculated, and the survey results cannot be projected to any population other than the sample itself.” Why can’t inference about any population be made? (b) Suppose that people who spent more time with NPR were more likely to respond to the survey. Do you think the true percentage of NPR’s Facebook fans who spend more than three hours with NPR is higher or lower than the 20% found from the survey? Explain why.
8.16 More on Random Digit Dialing. In the first half of 2013, about 38% of adults lived in households with a cell phone and no landline phone. Among adults aged 25 to 29, this percent was about 65%, while among adults over 65, the percent was only 13%. (a) Write a survey question for which the opinions of adults with landline phones only are likely to differ from the opinions of adults with cell phones only. Give the direction of the difference of opinion. (b) For the survey question in part (a), suppose a survey was conducted using random digit dialing of landline phones only. Would the results be biased? What would be the direction of bias? (c) Most surveys now supplement the landline sample contacted by RDD with a second sample of respondents reached through random dialing of cell phone numbers. The landline respondents are weighted to take account of household size and number of telephone lines into the residence, whereas the cell phone respondents are weighted according to whether they were reachable only by cell phone or also by landline. Explain why it is important to include both a landline sample and a cell phone sample. Why is the number of telephone lines into the residence important? (Hint: How does the number of telephone lines into the residence affect the chance of the household being included in the RDD sample?)
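The idea behind the landline weighting in part (c) can be sketched as an inverse-probability weight. The Python below is a simplified illustration under stated assumptions, not the procedure of any particular polling organization. It assumes one adult is interviewed per household, that a household's chance of being dialed is proportional to its number of landlines, and that the function name and example households are hypothetical.

```python
def landline_weight(adults_in_household, landlines):
    """Relative weight for one landline respondent.

    More lines means a higher chance of being dialed, so the weight goes down.
    More adults means the one respondent stands in for more people,
    so the weight goes up.
    """
    if adults_in_household < 1 or landlines < 1:
        raise ValueError("a sampled household needs at least one adult and one line")
    return adults_in_household / landlines


# Hypothetical respondents: (adults in household, landlines into the residence).
respondents = [(1, 1), (2, 1), (2, 2), (4, 1), (1, 2)]
raw = [landline_weight(adults, lines) for adults, lines in respondents]

# Rescale so the weights average 1, which keeps the effective total at n.
mean_weight = sum(raw) / len(raw)
weights = [w / mean_weight for w in raw]
for (adults, lines), w in zip(respondents, weights):
    print(f"adults = {adults}, lines = {lines}, weight = {w:.2f}")
```

Without such an adjustment, households with several lines and adults living alone would be overrepresented in the landline sample.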