Virtual COMSATS
Inferential Statistics
Lecture-25
Ossam Chohan
Assistant Professor
CIIT Abbottabad

Recap of previous lectures
• We are working on hypothesis testing, and we discussed the following topics in our last lectures:
  – Introduction to hypothesis testing.
  – Six steps of hypothesis testing.
  – Hypothesis testing for a single population, i.e. for mean and proportion.
  – Hypothesis testing for paired observations.
  – Chi-square distribution.
    – Test of independence.
    – Test for homogeneity.
    – Test for variances.
    – Goodness of fit test.
    – Fisher’s exact test.
  – F-distribution.
  – ANOVA.
    – One-way ANOVA.
    – Two-way ANOVA.
    – Multiple comparison using LSD.

Objective of lecture-25
• Introduction of Correlation and Regression.
  – Simple correlation and its significance in research.
  – Solution to problems.
  – Properties of correlation.
  – Scatter plot.
  – Coefficient of determination.
  – Regression analysis.
  – Simple regression model.
  – Probable error and standard error.
  – Hypothesis testing for correlation coefficient.

Correlation and Regression
• Is there a relationship between x and y?
• What is the strength of this relationship?
  – Pearson’s r
• Can we describe this relationship and use it to predict y from x?
  – Regression
• Is the relationship we have described statistically significant?
  – F- and t-tests

Discussion on Correlation

Correlation
• Correlation analysis is used to measure the strength of the association (linear relationship) between two variables, such as fertilizer and yield.
  – It is only concerned with the strength of the relationship.
  – No causal effect is implied.
  – The sample correlation coefficient is represented by r.

Scatter Plot
• A scatter plot is a graph of a collection of ordered pairs (x, y).
• A scatter plot (or scatter diagram) is used to show the relationship between two variables.
• Correlation can be represented using a scatter plot.
• Why use a scatter plot?

Scatter Plot Examples
[Figure: scatter plots of y against x showing linear relationships and curvilinear relationships]

Scatter Plot Examples (continued)
[Figure: scatter plots of y against x showing strong relationships and weak relationships]
• Can we show it numerically?

Scatter Plot Examples (continued)
[Figure: scatter plot of y against x showing no relationship]

Correlation Coefficient
• The population correlation coefficient is denoted by the Greek letter ρ (pronounced “rho”).
• It measures the strength of the association between the variables.
• The sample correlation coefficient r is an estimate of ρ.
• It is used to measure the strength of the linear relationship in the sample observations.

Properties of ρ and r
• Unit free.
• Values always lie between -1 and 1.
• The closer to -1, the stronger the negative linear relationship.
• The closer to 1, the stronger the positive linear relationship.
• The closer to 0, the weaker the linear relationship.
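
These properties can be checked numerically. The following is a minimal sketch, assuming the standard deviation-form definition of Pearson's r; the helper `pearson_r` and the small data set are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient, deviation form (assumed standard definition)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, not taken from the lecture.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_original = pearson_r(x, y)

# Unit free: re-expressing x in other units (a positive linear change) leaves r unchanged.
x_rescaled = [2.54 * a + 10 for a in x]
r_rescaled = pearson_r(x_rescaled, y)

# Points lying exactly on a straight line give the extreme values +1 and -1.
r_perfect_pos = pearson_r(x, [3 * a + 1 for a in x])
r_perfect_neg = pearson_r(x, [-2 * a for a in x])

print(r_original, r_rescaled, r_perfect_pos, r_perfect_neg)
```

Up to floating-point rounding, the first two values agree and the last two are 1 and -1.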

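Calculating the Correlation Coefficient
Sample correlation coefficient:
$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}} $$
Algebraic equivalent:
$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} $$
where:
r = Sample correlation coefficient
n = Sample size (number of pairs of values)
x = Value of the independent variable
y = Value of the dependent variable
$\bar{x}$, $\bar{y}$ = Sample means of x and y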
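
The two forms named on this slide can be sketched in code. This is a minimal illustration, assuming the standard Pearson definitions (a deviation form and a raw-sums form); the function names and the small data set are hypothetical.

```python
import math

def pearson_r_deviation(x, y):
    """r from deviations about the sample means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pearson_r_algebraic(x, y):
    """r from raw sums: n, sum x, sum y, sum xy, sum x^2, sum y^2."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data, not taken from the lecture.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_dev = pearson_r_deviation(x, y)
r_alg = pearson_r_algebraic(x, y)
print(round(r_dev, 4), round(r_alg, 4))
```

Both functions return the same value for the same data. The raw-sums form is convenient for hand calculation because it needs only column totals.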

Problem-27
• The test-retest method is one way of establishing the reliability of a test. The test is administered and then, at a later date, the same test is re-administered to the same individuals. Find the correlation coefficient between the two sets of scores.

First score:   75  87  60  75  98  80  68  84  47  72
Second score:  72  90  52  75  94  78  72  80  53  70

Problem-27 Solution
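
The calculation for Problem-27 can be reproduced as follows. This is a sketch assuming the standard raw-sums form of Pearson's r, not necessarily the lecturer's own working; the variable names are illustrative.

```python
import math

# Scores from Problem-27.
first = [75, 87, 60, 75, 98, 80, 68, 84, 47, 72]
second = [72, 90, 52, 75, 94, 78, 72, 80, 53, 70]

n = len(first)

# Column totals, as they would appear in a hand-calculation table.
sum_x = sum(first)
sum_y = sum(second)
sum_xy = sum(a * b for a, b in zip(first, second))
sum_x2 = sum(a * a for a in first)
sum_y2 = sum(b * b for b in second)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print("n =", n)
print("sum x =", sum_x, " sum y =", sum_y)
print("sum xy =", sum_xy, " sum x^2 =", sum_x2, " sum y^2 =", sum_y2)
print("r =", round(r, 4))
```

With these scores the totals are Σx = 746, Σy = 736, Σxy = 56574, Σx² = 57496 and Σy² = 55826, giving r of about 0.95, a strong positive linear relationship between the first and second scores.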

Coefficient of Determination, R²
• The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable.
• The coefficient of determination is also called R-squared and is denoted R², where
$$ R^2 = \frac{\text{variation in } y \text{ explained by } x}{\text{total variation in } y} $$
• Note: In the single independent variable case, the coefficient of determination is
$$ R^2 = r^2 $$
where:
R² = Coefficient of determination
r = Simple correlation coefficient
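
The link between R² and r in the single-variable case can be checked on the Problem-27 scores. This sketch assumes an ordinary least-squares line fitted to the data and uses the names `sse` (unexplained variation) and `syy` (total variation), which are not from the lecture.

```python
import math

# Scores from Problem-27: x = first score, y = second score.
first = [75, 87, 60, 75, 98, 80, 68, 84, 47, 72]
second = [72, 90, 52, 75, 94, 78, 72, 80, 53, 70]

n = len(first)
mean_x = sum(first) / n
mean_y = sum(second) / n

sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(first, second))
sxx = sum((a - mean_x) ** 2 for a in first)
syy = sum((b - mean_y) ** 2 for b in second)  # total variation in y

# Route 1: square the simple correlation coefficient.
r = sxy / math.sqrt(sxx * syy)
r_squared_from_r = r ** 2

# Route 2: explained share of variation from a least-squares line y = b0 + b1 * x.
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(first, second))  # unexplained variation
r_squared_from_fit = 1 - sse / syy

print(round(r_squared_from_r, 4), round(r_squared_from_fit, 4))
```

Both routes give about 0.91, so roughly 91% of the variation in the second scores is explained by the first scores under this linear fit.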