S 519 Evaluation of Information Systems Social Statistics
- Slides: 23
S 519: Evaluation of Information Systems Social Statistics Ch 5: Correlation
This week
- What is correlation?
- How to compute?
- How to interpret?
Correlation Coefficients
- The relation between two variables
  - How the value of one variable changes when the value of another variable changes
- A correlation coefficient is a numerical index that reflects the relationship between two variables.
  - Range: -1 to +1
  - Bivariate correlation (for two variables)
Correlation Coefficients
- Parametric
  - Pearson product-moment correlation (named for its inventor, Karl Pearson)
- Non-parametric
  - Spearman's rank correlation
  - Kendall tau rank correlation coefficient
Pearson correlation coefficient
- For two variables that are continuous in nature
  - Height, age, test score, income
- But not for discrete or categorical variables
  - Race, political affiliation, social class, rank
- r_xy is the correlation between variable X and variable Y
Types of correlation coefficients
- Direct correlation (positive correlation):
  - Both variables change in the same direction
- Indirect correlation (negative correlation):
  - The variables change in opposite directions
- See Table 5.1 (S-p 112)
- -0.70 and +0.5: which is stronger?
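Pearson product-moment correlation coefficient

$$r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$

- r_xy: the correlation coefficient between X and Y
- n: the size of the sample
- X: the individual's score on the X variable
- Y: the individual's score on the Y variable
- XY: the product of each X score times its corresponding Y score
- X²: the individual X score, squared
- Y²: the individual Y score, squared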
Exercise
- Calculate the Pearson correlation coefficient

X    Y
2    3
4    2
5    6
6    5
4    3
7    6
8    5
5    4
6    4
7    5

1. Are variable X and variable Y correlated?
2. What does this correlation mean?
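The exercise can also be checked with a short script. The following is a minimal sketch, not part of the original slides: it assumes the raw-score form of the Pearson formula built from n, ΣX, ΣY, ΣXY, ΣX² and ΣY², and the function name `pearson_r` is chosen here for illustration only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from the raw-score sums (assumed form of the formula)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of X*Y products
    sum_x2 = sum(x * x for x in xs)               # sum of squared X scores
    sum_y2 = sum(y * y for y in ys)               # sum of squared Y scores
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Exercise data from the slide
X = [2, 4, 5, 6, 4, 7, 8, 5, 6, 7]
Y = [3, 2, 6, 5, 3, 6, 5, 4, 4, 5]

r = pearson_r(X, Y)
print(round(r, 3))  # 0.692
```

The intermediate sums are ΣX = 54, ΣY = 43, ΣXY = 247, ΣX² = 320 and ΣY² = 201, which gives 148 / √45724 ≈ 0.692.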
Using Excel to calculate
- CORREL function
- Or PEARSON function
Visualizing a correlation
- Scatterplot or scattergram
- [Figure: scatterplot of Y against X for the exercise data; X: 2, 4, 5, 6, 4, 7, 8, 5, 6, 7; Y: 3, 2, 6, 5, 3, 6, 5, 4, 4, 5]
Visualizing a correlation
Direct (positive) correlation
- [Figure: scatterplot, both axes running from 0 to 9]
- r = 1: a perfect direct (or positive) correlation
- In real-life cases, 0.7 or 0.8 may be the highest you will see
Indirect (or negative) correlation
- [Figure: scatterplot, both axes running from 0 to 9]
- Strength and direction are important
Excel Scatterplot
- [Figure: four sets of data with the same correlation of 0.816]
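The slide does not name its four data sets. The sketch below assumes they are Anscombe's quartet, a published example whose four sets all have a Pearson correlation of about 0.816 even though their scatterplots look very different. The helper `pearson_r` is an illustrative name, not something from the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from the raw-score sums."""
    n = len(xs)
    num = n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)
    den = sqrt((n * sum(x * x for x in xs) - sum(xs) ** 2)
               * (n * sum(y * y for y in ys) - sum(ys) ** 2))
    return num / den

# Anscombe's quartet (assumed to be the data behind the slide's figure)
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# One correlation per data set, keyed by the set's label
correlations = {name: pearson_r(xs, ys) for name, (xs, ys) in quartet.items()}
for name, r in correlations.items():
    print(name, round(r, 3))
```

Identical coefficients do not imply identical patterns, which is why the scatterplot is worth inspecting alongside the number.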
Linear correlation
- Linear correlation means that the X and Y values fall along a straight line
- Curvilinear correlation
  - Age and memory
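A small sketch of why the distinction matters: the Pearson coefficient measures only linear association, so a perfectly regular curved pattern can still produce a coefficient of zero. The data below are hypothetical (an inverted U, not real age and memory scores), and `pearson_r` is an illustrative helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from the raw-score sums."""
    n = len(xs)
    num = n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)
    den = sqrt((n * sum(x * x for x in xs) - sum(xs) ** 2)
               * (n * sum(y * y for y in ys) - sum(ys) ** 2))
    return num / den

x = list(range(11))                      # 0, 1, ..., 10

# Hypothetical linear relationship: every point lies on y = 2x + 1
y_linear = [2 * v + 1 for v in x]

# Hypothetical curvilinear relationship: rises, peaks at x = 5, then falls
y_curved = [-(v - 5) ** 2 for v in x]

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)
print(r_linear)   # perfect linear relationship
print(r_curved)   # symmetric curve: the linear coefficient vanishes
```

Here Y is completely determined by X in both cases, yet only the straight-line case yields a coefficient of 1.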
More than 2 variables?
- How to calculate the correlation coefficient?
  1. CORREL()
  2. Correlation in the Data Analysis toolset

Income: 74190, 80931, 81314, 73089, 62023, 61217, 84526, 87251, 62659, 76450, 70512, 78858, 78628, 86212, 74962, 58828, 61471, 78621, 60071
Education: 13, 12, 11, 11, 11, 10, 11, 11, 12, 10, 12, 9, 13, 14, 9, 11, 10, 12, 9
Attitude / Vote: 1 3 4 5 4 5 6 7 8 8 9 8 7 8 1 2 2 2 1 1 2 2 4 5 5 4
More than 2 variables?
- Correlation matrix

            Income   Education   Attitude   Vote
Income      1.00     0.35        -0.19      0.51
Education            1.00        -0.21      0.43
Attitude                         1.00       0.55
Vote                                        1.00
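A correlation matrix is simply the pairwise coefficient computed for every pair of variables. The sketch below shows one way to build it; the values are hypothetical and much smaller than the slide's data set, and the names `pearson_r` and `correlation_matrix` are illustrative only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from the raw-score sums."""
    n = len(xs)
    num = n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)
    den = sqrt((n * sum(x * x for x in xs) - sum(xs) ** 2)
               * (n * sum(y * y for y in ys) - sum(ys) ** 2))
    return num / den

def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical values for illustration, NOT the data set on the slide
data = {
    "income": [52, 61, 48, 75, 66, 58],
    "education": [12, 14, 11, 16, 15, 12],
    "attitude": [3, 2, 4, 1, 3, 5],
}

matrix = correlation_matrix(data)
for a in data:
    print(a.ljust(10), "  ".join(f"{matrix[a][b]:5.2f}" for b in data))
```

The diagonal is always 1.00 and the matrix is symmetric, which is why slides and textbooks usually print only the upper or lower triangle.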
Excel
- Data Analysis tool: Correlation
Meaning of the correlation coefficient
- Correlation value:
  - from a negative finite number to a positive finite number
- Correlation coefficient value:
  - -1.00 to +1.00

r_xy value    Interpretation
0.8 to 1.0    Very strong relationship (share most things in common)
0.6 to 0.8    Strong relationship (share many things in common)
0.4 to 0.6    Moderate relationship (share something in common)
0.2 to 0.4    Weak relationship (share a little in common)
0.0 to 0.2    Weak or no relationship (share very little or nothing in common)
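The interpretation table can be expressed as a small lookup. This is a sketch under two assumptions that the slide does not state: the bands apply to the absolute value of the coefficient (so the sign only gives direction), and a value sitting exactly on a boundary such as 0.8 is assigned to the stronger band. The function name `interpret` is illustrative.

```python
def interpret(r):
    """Map a correlation coefficient to the slide's verbal label.

    Assumptions: strength is judged on |r|, and boundary values
    (0.2, 0.4, 0.6, 0.8) fall into the stronger band.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    strength = abs(r)
    if strength >= 0.8:
        return "Very strong relationship"
    if strength >= 0.6:
        return "Strong relationship"
    if strength >= 0.4:
        return "Moderate relationship"
    if strength >= 0.2:
        return "Weak relationship"
    return "Weak or no relationship"

for value in (0.692, -0.70, 0.5, 0.1):
    print(value, "->", interpret(value))
```

Under these assumptions, -0.70 lands in a stronger band than +0.5, because strength depends on magnitude, not sign.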
Coefficient of determination
- Coefficient of determination: the percentage of variance in one variable that is accounted for by the variance in the other variable.
  - It equals the square of the correlation coefficient (r²).
- Example: a correlation of 0.70 between GPA and studying time gives 0.70² = 0.49, so 49% of the variance in GPA can be explained by the variance in studying time.
Coefficient of nondetermination
- The amount of unexplained variance is called the coefficient of nondetermination (or coefficient of alienation)

Correlation   Determination   Interpretation
0             0
0.5           0.25
0.9           0.81
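The two coefficients are complements of each other, as the following sketch illustrates. It follows the slide's definition of nondetermination as the unexplained share of variance, 1 - r²; the function names are illustrative, and 0.7 is added to the slide's values to match the GPA example.

```python
def determination(r):
    """Share of variance explained: the squared correlation coefficient."""
    return r ** 2

def nondetermination(r):
    """Share of variance left unexplained, taken here as 1 - r^2."""
    return 1 - r ** 2

for r in (0.0, 0.5, 0.7, 0.9):
    print(f"r = {r:.2f}   r^2 = {determination(r):.2f}   "
          f"1 - r^2 = {nondetermination(r):.2f}")
```

Even a correlation of 0.5 explains only a quarter of the variance, so three quarters remain unexplained.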
Ice cream and crime
- In a small town in Greece, the local police found a direct correlation between ice cream and crime
Correlation vs. causality
- A correlation represents the association between two or more variables
- By itself it says nothing about causality: two correlated variables need not stand in a cause-and-effect relation
  - Ice cream and crime are correlated, but
  - ice cream does not cause crime
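One way two variables can be correlated without either causing the other is a shared third variable. The sketch below is purely illustrative: the slides do not give any data or name a third variable, so the temperature column and all numbers here are invented assumptions, and `pearson_r` is an illustrative helper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from the raw-score sums."""
    n = len(xs)
    num = n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)
    den = sqrt((n * sum(x * x for x in xs) - sum(xs) ** 2)
               * (n * sum(y * y for y in ys) - sum(ys) ** 2))
    return num / den

# Hypothetical monthly figures, invented for illustration only
temperature = [10, 15, 20, 25, 30, 35]   # assumed common driver
ice_cream = [22, 28, 41, 49, 62, 68]     # sales rise in warm months
crime = [5, 9, 9, 14, 15, 19]            # incidents also rise in warm months

r_ice_crime = pearson_r(ice_cream, crime)
r_temp_ice = pearson_r(temperature, ice_cream)
r_temp_crime = pearson_r(temperature, crime)

print("ice cream vs crime:      ", round(r_ice_crime, 2))
print("temperature vs ice cream:", round(r_temp_ice, 2))
print("temperature vs crime:    ", round(r_temp_crime, 2))
```

In this invented data set ice cream and crime correlate strongly only because both follow temperature; the coefficient alone cannot tell the two explanations apart.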