MATB 344 Applied Statistics Chapter 0: Introduction to Statistics
What is Statistics? • Statistics is a branch of mathematics with applications in daily life. It is like a “language” that has to be learned. • It involves the gathering, analysis and presentation of data. • We encounter it constantly in daily life; it is unavoidable. Example: the Department of Statistics, Malaysia… 2
Uses of Statistics • General usage: – a theoretical discipline in its own right – a tool for researchers in other fields – a general tool for drawing conclusions in a large variety of applications • In politics – Forecasting and predicting the winners of elections • where to concentrate campaign appearances • how and when to advertise • where to spend money effectively … • In industry – To market products. • For example, to predict the average life of a light bulb we cannot test every bulb, so we test a sample of bulbs and compute the statistics from it. 3
Statistics in IT? • Analysis of experimental results • Design and development of statistical software • Some machine learning methods are built on statistical concepts – statistical learning theory 4
What’s involved in Statistics? • Collects numbers or data • Systematically organizes or arranges the data • Analyzes the data…extracts relevant information to provide a complete numerical description • Infers general conclusions about the problem using this numerical description 5
Example use of Statistics • A newspaper publishes the results of its survey on terrorism as follows: THE WAR ON TERRORISM • Do you think that the United States war on terrorism will spread to countries other than Afghanistan? YES 64%, NO 34% • Do you think that the United States should be directly involved in negotiating peace agreements in other parts of the world? YES 62%, NO 31% • How do we get the poll? Ask everyone? Is it possible? OF COURSE NOT. 6
Problems in Statistics • We need conclusions and predictions about the whole body of measurements (the POPULATION), e.g. all Malaysians. • But we cannot survey them all. • Sometimes the whole body of measurements is too large to be totally enumerated. • Solution: use a smaller set of measurements (a SAMPLE) to represent the whole body of measurements. 7
Examples • To predict the average life of a light bulb - enumerating the population is destructive: we cannot take every light bulb and test it. - So, select a smaller number of light bulbs as a sample. • To forecast the winner of an election - the population of the whole country is too big, and people change their minds. - So, select a group of people in certain locations to be the sample. 8
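The light-bulb example can be sketched in Python. This is a minimal illustration under loud assumptions: the lifetimes are simulated, not real data, and the population size (10,000 bulbs), the exponential lifetime model, the 1,000-hour average and the sample size of 50 are all hypothetical choices made only to show the idea.

```python
import random
import statistics

random.seed(344)  # fixed seed so the sketch gives the same numbers each run

# Hypothetical population: lifetimes (in hours) of 10,000 bulbs.
# In reality these values are unknown, because measuring a bulb's
# lifetime means burning it out.
population = [random.expovariate(1 / 1000) for _ in range(10_000)]

# Test only a simple random sample of 50 bulbs.
sample = random.sample(population, 50)

sample_mean = statistics.mean(sample)          # what we can actually observe
population_mean = statistics.mean(population)  # known here only because the data are simulated

print(f"sample mean:     {sample_mean:.1f} hours")
print(f"population mean: {population_mean:.1f} hours")
```

The two printed values will usually be close but not equal, which is why an estimate from a sample needs an accompanying measure of reliability.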
Common Terms • Experimental units: the set of items or objects on which measurements are taken. • Sample (or population): the set of measurements taken on the experimental units. • Examples: – Light bulbs • Experimental unit = a bulb • Sample = the measured operating lives of the bulbs selected. – Opinion polls • Experimental unit = a person • Sample = the set of opinions, one per person, on who will win the election. 9
Two kinds of Statistics • Inferential Statistics • Descriptive Statistics 10
Descriptive Statistics • Used to describe sets of measurements. • Examples: bar charts, pie charts, line charts, etc. • Suitable for describing an entire population. • DESCRIPTIVE STATISTICS consists of procedures used to summarize and describe the set of measurements. 11
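As a small illustration of summarizing a set of measurements, here is a sketch in Python using only the standard library. The bulb lifetimes and poll answers below are made-up values for demonstration, not data from the course.

```python
import statistics
from collections import Counter

# Hypothetical measurements: operating life (hours) of eight tested bulbs.
lifetimes = [980, 1010, 1025, 890, 1100, 1010, 950, 1035]

mean_life = statistics.mean(lifetimes)      # arithmetic average
median_life = statistics.median(lifetimes)  # middle value of the sorted data
spread = statistics.stdev(lifetimes)        # sample standard deviation

print(f"mean = {mean_life}, median = {median_life}, std dev = {spread:.1f}")

# Hypothetical categorical data: answers to a yes/no poll question.
answers = ["yes", "no", "yes", "yes", "undecided", "no", "yes"]

# A frequency table like this is what a bar chart or pie chart displays.
counts = Counter(answers)
for answer, count in counts.most_common():
    print(f"{answer:10s} {count}")
```

Numerical summaries (mean, median, standard deviation) and frequency tables play the same role as the charts listed above: they describe the data in hand without drawing conclusions beyond it.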
Inferential Statistics • Used to describe / make inferences about a population based on statistics of the sample. • Used when we cannot enumerate the whole population • INFERENTIAL STATISTICS: Procedures used to draw conclusions or inferences about the population from information contained in the sample. 12
Objective of Inferential Statistics • To make inferences about a population – draw conclusions – make predictions – make decisions from information contained in a sample. • The statistician’s job is to find the best way to do this. 13
But, … Our conclusions could be incorrect… Consider this internet opinion poll: Who makes the best burgers? • McDonald’s: 123 votes (13%) • Burger King: 384 votes (39%) • Wendy’s: 304 votes (31%) • All three have equally good burgers: 72 votes (7%) • None of these have good burgers: 98 votes (10%) • How can we be sure that the poll result is reliable? • We need a measure of reliability. 14
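The percentages in the burger poll follow directly from the vote counts. The short Python sketch below recomputes them from the counts shown on the slide; the variable names are illustrative only.

```python
# Vote counts as reported in the internet poll.
votes = {
    "McDonald's": 123,
    "Burger King": 384,
    "Wendy's": 304,
    "All three have equally good burgers": 72,
    "None of these have good burgers": 98,
}

total = sum(votes.values())

# Percentage of the total for each option, rounded to a whole number.
percent = {option: round(100 * count / total) for option, count in votes.items()}

for option, pct in percent.items():
    print(f"{option:38s} {votes[option]:4d} votes  {pct:3d}%")
print(f"Total votes: {total}")
```

The arithmetic checks out, but correct arithmetic says nothing about reliability: respondents to an internet poll choose themselves, so they are not a random sample of burger eaters.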
Steps in Inferential Statistics • Define the objective of the experiment and the population of interest • Determine the design of the experiment and the sampling plan to be used • Collect and analyze the data • Make inferences about the population from information in the sample • Determine the goodness or reliability of the inference. 15
Step 1: • Define the objective of the experiment and the population of interest. Example: a presidential election. Objective: to determine who will get the most votes. Population: the set of all votes (from all registered voters). Sample: votes of sampled voters from each state in the USA. 16
Step 2: • Determine the design of the experiment and the sampling plan to be used • Decide how to select the sample. • Decide how big a sample to select. • Estimate how much it will cost to select the sample. 17
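One common way to answer "how big a sample?" for a yes/no poll is the normal-approximation formula $n = z^2 p(1-p)/E^2$, where $E$ is the desired margin of error. The slides do not name a specific method, so treat the Python sketch below as an illustration under assumptions: a simple random sample, 95% confidence (z = 1.96), the conservative guess p = 0.5, and a hypothetical cost of 5.00 per respondent.

```python
import math

def sample_size_for_proportion(margin, z=1.96, p=0.5):
    """Smallest n whose normal-approximation margin of error is at most `margin`.

    Assumes a simple random sample; p = 0.5 is the most conservative choice.
    """
    return math.ceil(z**2 * p * (1 - p) / margin**2)

COST_PER_RESPONDENT = 5.00  # hypothetical figure, for illustration only

for margin in (0.05, 0.03):
    n = sample_size_for_proportion(margin)
    print(f"margin {margin:.0%}: n = {n}, cost = {n * COST_PER_RESPONDENT:.2f}")
```

The sketch shows the trade-off behind the sampling plan: a tighter margin of error needs a larger sample, which costs more.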
Step 3: • Collect and analyze the data • Collect information from the sample • Use an appropriate method of analysis 18
Step 4: • Use information from the analysis to make inferences • Many methods exist, but some are more accurate than others. Choose the best. • Make inferences about the population from information in the sample 19
Step 5: • The inference might be wrong because we are not looking at the whole population. • We need a measure of reliability. • Determine the goodness or reliability of the inference. 20
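One widely used measure of reliability for a poll percentage is the margin of error. The Python sketch below uses the normal-approximation 95% interval for a proportion, which is one possible choice rather than a method taken from these slides. It assumes a simple random sample, and the sample size of 1,000 is hypothetical, since the newspaper survey's actual sample size is not given.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion.

    Assumes a simple random sample of size n; z = 1.96 gives about 95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.64  # 64% answered YES in the newspaper survey
n = 1000      # hypothetical sample size (not stated in the source)

moe = margin_of_error(p_hat, n)
low, high = p_hat - moe, p_hat + moe

print(f"estimate: {p_hat:.0%}, margin of error: {moe:.1%}")
print(f"approximate 95% interval: {low:.1%} to {high:.1%}")
```

A larger sample gives a smaller margin of error, so the same 64% is more trustworthy from 4,000 respondents than from 1,000, provided the sample is random in both cases.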
Conclusion: Learn to View Statistics Critically • Why? Because statistics can lie. • According to critics of statistics, there are three kinds of lies: – Lies – Damned lies – Statistics • Be positive!!! You need to make statistics work for you, not lie for you! 21
Making Reliable Statistics • Use software tools to help perform the procedures. • List of software: – MINITAB – SPSS – Microsoft Excel – Java applets 22