1 BAN 280 Chapter 1 Introduction to Statistics
2 Business Analytics Defined
• BAN stands for Business ANalytics.
• Scientific process of transforming data into insight for making better decisions (INFORMS).
• Used for data-driven or fact-based decision making.
3 Introduction
• Three developments spurred the recent explosive growth in the use of analytical methods in business applications:
o Technological advances in computing algorithms
o Data generation: personal electronic devices, web pages, POS systems, and wireless devices produce enormous amounts of data for businesses
o Technological advances in the speed and storage capacity of computers
4 A Categorization of Analytical Methods and Models
Descriptive analytics: Encompasses the set of techniques that describes what has happened in the past.
• Descriptive statistics
• Data summary
• Data visualization (data dashboards)
• Basic what-if spreadsheet models
5 A Categorization of Analytical Methods and Models
Predictive analytics: Consists of techniques that use models constructed from past data to predict the future or ascertain the impact of one variable on another based on probabilities.
• Regression
• Time series analysis
• Data mining
o Decision trees
o Artificial neural networks
• Simulation
6 A Categorization of Analytical Methods and Models
Prescriptive analytics: Indicates a best course of action to take based on known parameters and, to a lesser extent, probabilities.
• Optimization models
• Simulation models
• Decision models
The Spectrum of Business Analytics 7
8 Why have a class in Statistics?
• What is a definition for statistics? The field of Statistics is concerned with the collection, presentation, and analysis of data in order to assist a manager in the decision-making process.
• What is the “story” of the data?
9 Two Main Branches of Statistics
Descriptive
• Describe the data
o Central tendency
o Dispersion
o Distribution
Inferential
• Infer or make conclusions from an analysis of the data
10 Statistical Terminology
• Population: the collection of ALL entities possessing some characteristic we are interested in.
• Sample: some subset of a Population.
• Parameter: a summary measure of some characteristic we are interested in for all entities in a Population.
• Sample statistic: a summary measure computed from a sample and used to estimate a Parameter of the Population from which the sample was drawn.
11 TYPES OF Variables
QUALITATIVE
• Qualities
• Characteristics that are not measurable
QUANTITATIVE
• Data which is numerical in nature
• Can use mathematical functions like add, subtract, etc.
• Measured with an interval or ratio number scale
12 QUALITATIVE
NOMINAL
• Data classified into categories with no order implied
• What color are your eyes?
• What is your Occupation? Accountant, Economist, Teacher, Unemployed
ORDINAL
• Categorical data with ordering implied
• How was the movie last night?
• Rate your Professor / Manager (Student): Excellent, Very Good, Fair, Poor (1 2 3 4 5)
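The practical difference between nominal and ordinal data can be sketched in Python. This is a minimal illustration, not part of the original slides: the responses are made up, and the rating labels are taken from the slide's example. For nominal data only counting (the mode) makes sense; for ordinal data the ordering also allows a middle (median) response.

```python
import statistics

# Nominal: categories with no order, so the mode is the only sensible "center".
# Hypothetical answers to "What color are your eyes?"
eye_colors = ["brown", "blue", "brown", "green", "brown"]
most_common_color = statistics.mode(eye_colors)

# Ordinal: categories with an implied order, listed here from worst to best.
scale = ["Poor", "Fair", "Very Good", "Excellent"]
rank = {label: position for position, label in enumerate(scale)}

# Hypothetical answers to "Rate your Professor"
responses = ["Fair", "Excellent", "Very Good", "Very Good", "Poor"]

# Sorting by rank is meaningful only because the categories are ordered.
ordered = sorted(responses, key=rank.__getitem__)
median_response = ordered[len(ordered) // 2]

print("Most common eye color:", most_common_color)
print("Median rating:", median_response)
```

Note that the numeric ranks only encode order; the gap between "Fair" and "Very Good" is not assumed to equal the gap between "Very Good" and "Excellent".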
13 QUANTITATIVE (Continuous)
Interval Scale
• Numerical but with no meaningful zero (e.g., temperature, change in employment, etc.)
• The distance between consecutive values of the interval scale DOES have meaning
• You can add and subtract interval values, but ratios of them are not meaningful
Ratio Scale
• Numerical with a meaningful zero
• Examples: Weight, Age, Height, Time
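A short Python sketch can show why the zero point matters. The temperatures and weights below are made-up values, and the helper `c_to_f` and constant `KG_TO_LB` are illustrative names, not from the slides. On an interval scale a ratio changes when the unit changes, so it carries no meaning; on a ratio scale the ratio is the same in any unit.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval scale: temperature (zero is arbitrary).
t1_c, t2_c = 10.0, 20.0
diff_c = t2_c - t1_c                      # differences are meaningful
ratio_c = t2_c / t1_c                     # 2.0 in Celsius...
ratio_f = c_to_f(t2_c) / c_to_f(t1_c)     # ...but 68/50 = 1.36 in Fahrenheit

# Ratio scale: weight (zero means "no weight").
KG_TO_LB = 2.20462                        # approximate conversion factor
w1_kg, w2_kg = 40.0, 80.0
ratio_kg = w2_kg / w1_kg
ratio_lb = (w2_kg * KG_TO_LB) / (w1_kg * KG_TO_LB)

print("Temperature ratio (C):", ratio_c, " (F):", ratio_f)
print("Weight ratio (kg):", ratio_kg, " (lb):", ratio_lb)
```

Because the temperature ratio depends on the unit, "20 degrees is twice as hot as 10 degrees" is not a valid statement, while "80 kg is twice as heavy as 40 kg" is.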
14 Sources of Data
• Customer Surveys
• Historical Company Records
• Competitor Data
• Manufacturing and Sales Data (internal)
• MIS issues? IT issues? OPS issues?
15 Types of Data
1. Time Series Data is data collected through time.
• Stock prices are an example of time series data. Tomorrow’s starting price for a stock depends on the ending price of that stock today. Stock prices “move” over time, so it is important to factor in this effect.
2. Cross Sectional Data does not have a “time” component.
• Data collected on a variable at a single point in time.
• For example, you might be interested in doing a study of comparative housing prices for the 8 major cities in June 2000.
16 Examining the Data
First step in any analysis is to examine the data
• Arrays
o Listing the data in ascending or descending order
o Useful in identifying common or outlying values
• Tables and Frequency Distributions
o Summarizing the data into categories
• Graphical Representations
o Useful for visualizing important characteristics of the data
o Pie and Bar Charts
o Histograms
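The first two ways of examining data listed above can be sketched in a few lines of Python. The ratings are made-up values used only for illustration; the text "histogram" printed at the end is a rough stand-in for a real chart.

```python
from collections import Counter

# Hypothetical survey ratings on a 1-5 scale.
ratings = [3, 5, 4, 3, 2, 5, 3, 4, 1, 3]

# Array: the data listed in ascending order, which makes the smallest,
# largest, and most frequently repeated values easy to spot.
ordered = sorted(ratings)

# Frequency distribution: how many observations fall into each category.
freq = Counter(ratings)

print("Ordered array:", ordered)
for value in sorted(freq):
    # One '#' per observation gives a simple text histogram.
    print(value, "#" * freq[value])
```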
17 Descriptive Measures
Central Tendency
• Mean
• Median
• Mode
Dispersion
• Range
• Mean Absolute Deviation
• Standard Deviation
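As a minimal sketch, the six measures listed above can be computed with Python's standard `statistics` module. The data are made-up daily sales figures, and the standard deviation shown is the sample version (dividing by n - 1), which is an assumption since the slide does not say which version is meant.

```python
import statistics

# Hypothetical sample of daily sales (units); values are for illustration only.
data = [12, 15, 15, 18, 20, 22, 24]

# Central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Dispersion
data_range = max(data) - min(data)
# Mean absolute deviation: average distance of each value from the mean.
mad = sum(abs(x - mean) for x in data) / len(data)
# Sample standard deviation (divides by n - 1).
std_dev = statistics.stdev(data)

print("Mean:", mean, "Median:", median, "Mode:", mode)
print("Range:", data_range)
print("Mean absolute deviation:", round(mad, 3))
print("Standard deviation:", round(std_dev, 3))
```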
18 Selecting a Sample
Why sample?
• Cost and time advantages
• Population size: a census is too cumbersome
• Destructive sampling
19 Simple Random Sampling (SRS)
Definition
• Each member of the population has an equally likely chance of being selected.
• Sampling with replacement
• Basis of most statistical inference
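A simple random sample can be drawn with Python's standard `random` module. This is a sketch under stated assumptions: the population of 100 customer IDs is hypothetical, and the seed is fixed only so the result is repeatable. `sample` draws without replacement, while `choices` draws with replacement, so the same member can appear more than once.

```python
import random

# Hypothetical population: customer IDs 1 through 100.
population = list(range(1, 101))

# Fixed seed so the example gives the same draw every run.
rng = random.Random(280)

# Without replacement: every member has the same chance, and no member repeats.
srs_without = rng.sample(population, k=10)

# With replacement: every draw is made from the full population,
# so a member may be selected more than once.
srs_with = rng.choices(population, k=10)

print("Without replacement:", srs_without)
print("With replacement:   ", srs_with)
```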
20 Errors in Collecting Data
Sampling error
• Error caused because no sample is exactly representative of the population
• Chance differences that occur when a sample is selected
Nonsampling error
• Error caused by humans, for example mistakes in recording or processing the data
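Sampling error can be illustrated with a short simulation. This is a sketch with made-up data: the population of 500 values is generated randomly, and the sample size of 25 is an arbitrary choice. Each sample mean misses the population mean by a different chance amount, while a census (the whole population) has no sampling error.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed for repeatability

# Hypothetical population: 500 customer ages between 20 and 80.
population = [rng.randint(20, 80) for _ in range(500)]
mu = statistics.mean(population)

# Draw several samples; each sample mean differs from mu by chance alone.
errors = []
for _ in range(5):
    sample = rng.sample(population, k=25)
    errors.append(statistics.mean(sample) - mu)

# A census uses every member, so its mean equals the population mean.
census = rng.sample(population, k=len(population))
census_error = statistics.mean(census) - mu

print("Sampling errors:", [round(e, 2) for e in errors])
print("Census error:", census_error)
```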
21 Population
Parameters: µx, σx
• Population Parameters are computed from a census of the entire Population and are used to describe some characteristic about the Population you are interested in (X).
22 Population
Parameters: µx, σx
• Population Parameters are computed from a census of the entire Population and are used to describe some characteristic about the Population you are interested in (X).
Statistics: sx
• A sample is a subset of a larger Population.
• Sample statistics are computed from sample data and used to estimate Population Parameters.
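The distinction between parameters and statistics can be sketched in Python. The population here is simulated and purely hypothetical, and the sample size of 30 is an arbitrary choice. Note the two different standard deviation functions: `pstdev` treats the data as a full population (dividing by N), while `stdev` treats it as a sample (dividing by n - 1).

```python
import random
import statistics

rng = random.Random(1)  # fixed seed for repeatability

# Hypothetical population of 1,000 values of the characteristic X.
population = [rng.gauss(50, 10) for _ in range(1000)]

# Parameters: computed from a census of the entire population.
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

# Statistics: computed from a sample and used to estimate the parameters.
sample = rng.sample(population, k=30)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)

print("Population mean:", round(mu, 2), " Sample mean:", round(x_bar, 2))
print("Population std dev:", round(sigma, 2), " Sample std dev:", round(s, 2))
```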