ANALYTICAL CHEMISTRY ERT 207 Lecture 5 Correlation coefficient

  • Slides: 42

ANALYTICAL CHEMISTRY ERT 207
Lecture 5
• Correlation coefficient
• Coefficient of determination
• Calibration curve (slope & intercept)
19 JAN 2011

Scatter Plots and Correlation
• A scatter plot (or scatter diagram) is used to show the relationship between two variables
• Correlation analysis is used to measure the strength of the association (linear relationship) between two variables
  ◦ Only concerned with the strength of the relationship
  ◦ No causal effect is implied

Scatter Plot Examples
[Figure: scatter plots of y against x showing linear relationships and curvilinear relationships]

Scatter Plot Examples (continued)
[Figure: scatter plots of y against x showing strong relationships and weak relationships]

Scatter Plot Examples (continued)
[Figure: scatter plot of y against x showing no relationship]

Correlation Coefficient (continued)
• The population correlation coefficient ρ (rho) measures the strength of the association between the variables
• The sample correlation coefficient r is an estimate of ρ and is used to measure the strength of the linear relationship in the sample observations

Features of ρ and r
• Unit free
• Range between -1 and 1
• The closer to -1, the stronger the negative linear relationship
• The closer to 1, the stronger the positive linear relationship
• The closer to 0, the weaker the linear relationship

Examples of Approximate r Values
[Figure: five scatter plots of y against x, labelled r = -1, r = -0.6, r = 0, r = +0.3 and r = +1]

Calculating the Correlation Coefficient
Sample correlation coefficient:

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}}$$

or the algebraic equivalent:

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

where:
r = Sample correlation coefficient
n = Sample size
x = Value of the independent variable
y = Value of the dependent variable
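
The algebraic form of the sample correlation coefficient can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lecture: the function name `pearson_r` and the demonstration data are assumptions made here for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample correlation coefficient r from the algebraic (sums) formula."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical illustrative data (not taken from the lecture)
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
r_demo = pearson_r(x_demo, y_demo)
print(f"r = {r_demo:.4f}")
```

Because r is unit free, rescaling either variable (for example, changing concentration units) should leave the result unchanged.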

Example:
• You are developing a new analytical method for the determination of blood urea nitrogen (BUN). You want to determine whether your method differs significantly from a standard one for analyzing a range of sample concentrations expected to be found in the routine laboratory. It has been ascertained that the two methods have comparable precisions. Following are two sets of results for a number of individual samples:

Sample    Your Method (mg/dL), x    Standard Method (mg/dL), y
A         10.2                      10.5
B         12.7                      11.9
C         8.6                       8.7
D         7.5                       16.9
E         11.2                      10.9
F         11.5                      11.1

Coefficient of Determination, R²
• The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable
• The coefficient of determination is also called R-squared and is denoted as R²

$$R^2 = \frac{SSR}{SST}, \qquad \text{where } 0 \le R^2 \le 1$$

Here SSR is the regression sum of squares and SST is the total sum of squares.

Coefficient of Determination, R² (continued)
Coefficient of determination:

$$R^2 = \frac{SSR}{SST} = \frac{\text{sum of squares explained by regression}}{\text{total sum of squares}}$$

Note: In the single independent variable case, the coefficient of determination is

$$R^2 = r^2$$

where:
R² = Coefficient of determination
r = Simple correlation coefficient

• Total variation is made up of two parts:

$$SST = SSE + SSR$$

(Total Sum of Squares = Sum of Squares Error + Sum of Squares Regression)

$$SST = \sum (y - \bar{y})^2, \qquad SSE = \sum (y - \hat{y})^2, \qquad SSR = \sum (\hat{y} - \bar{y})^2$$

where:
ȳ = Average value of the dependent variable
y = Observed values of the dependent variable
ŷ = Estimated value of y for the given x value

(continued)
• SST = total sum of squares
  ◦ Measures the variation of the yi values around their mean ȳ
• SSE = error sum of squares
  ◦ Variation attributable to factors other than the relationship between x and y
• SSR = regression sum of squares
  ◦ Explained variation attributable to the relationship between x and y
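
The decomposition of the total variation can be checked numerically. The sketch below fits a least-squares line and then computes SST, SSE and SSR directly from their definitions; the function name `sums_of_squares` and the data are hypothetical choices made for this illustration.

```python
def sums_of_squares(x, y):
    """Fit y = m*x + b by least squares and return (SST, SSE, SSR)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    m = s_xy / s_xx                 # slope of the fitted line
    b = y_bar - m * x_bar           # intercept of the fitted line
    y_hat = [m * xi + b for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
    return sst, sse, ssr


# Hypothetical illustrative data (not taken from the lecture)
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
sst, sse, ssr = sums_of_squares(x_demo, y_demo)
r_squared = ssr / sst
print(f"SST = {sst:.4f}, SSE = {sse:.4f}, SSR = {ssr:.4f}, R^2 = {r_squared:.4f}")
```

For a least-squares fit, SSE + SSR should reproduce SST up to floating-point rounding, which is what makes R² = SSR/SST a fraction between 0 and 1.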

Explained and Unexplained Variation (continued)
[Figure: plot of y against x with the fitted line, marking for an observation (xi, yi) the contributions SST = Σ(yi - ȳ)², SSE = Σ(yi - ŷi)² and SSR = Σ(ŷi - ȳ)²]

Examples of Approximate R² Values
R² = 1
Perfect linear relationship between x and y: 100% of the variation in y is explained by variation in x
[Figure: two plots of y against x, each labelled R² = 1]

Examples of Approximate R² Values
0 < R² < 1
Weaker linear relationship between x and y: some but not all of the variation in y is explained by variation in x
[Figure: two scatter plots of y against x]

Examples of Approximate R² Values
R² = 0
No linear relationship between x and y: the value of y does not depend on x (none of the variation in y is explained by variation in x)
[Figure: plot of y against x, labelled R² = 0]

Introduction to Regression Analysis
• Regression analysis is used to:
  ◦ Predict the value of a dependent variable based on the value of at least one independent variable
  ◦ Explain the impact of changes in an independent variable on the dependent variable
Dependent variable: the variable we wish to explain
Independent variable: the variable used to explain the dependent variable

Simple Linear Regression Model
• Only one independent variable, x
• Relationship between x and y is described by a linear function
• Changes in y are assumed to be caused by changes in x

Types of Regression Models
[Figure: four scatter plots: positive linear relationship, negative linear relationship, relationship NOT linear, no relationship]

Method of Least Squares
Find the “best” line by minimizing the sum of the squared vertical deviations between the points and the line.
Chemistry 215 Copyright D Sharma

Calculating the Residual

Linear Regression
Fitting a straight line to observations
Error = (Actual value) – (Predicted value)
[Figure: fitted line through scattered points, with examples of small and large residual errors marked]

Least Squares Parameters: SLOPE and INTERCEPT
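
The equations on this slide did not survive extraction. As a reminder, and assuming the usual unweighted straight-line fit y = mx + b to n points (xi, yi), the standard closed-form results are given below; the original slide may have written them in a different but equivalent form (for example, with determinants).

```latex
% Residual (vertical deviation) of point i from the line y = m x + b
d_i = y_i - (m x_i + b)

% Least squares chooses m and b to minimize the sum of squared residuals
\min_{m,\,b} \sum_{i=1}^{n} d_i^{2}

% Slope
m = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^{2} - \left(\sum x_i\right)^{2}}

% Intercept
b = \bar{y} - m\,\bar{x} = \frac{\sum y_i - m \sum x_i}{n}
```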

Calibration Curves
A calibration curve shows the response of an analytical method to known quantities of analyte. For example, a spectroscopic analysis of a protein sample…
Necessary solutions:
1. Standard solutions
2. Blank solution
3. Sample solution(s)
[Image: Protein from the cancer-causing oncogene called ras (Credit: Sung-Hou Kim/UC Berkeley)]

Constructing a Calibration Curve
Spectroscopic analysis of a protein sample…

Constructing a Calibration Curve
Spectroscopic analysis of a protein sample… (cont.)
Determination of an unknown value (x) based on its response (y)
Equation of linear response: y = m(x) + b
Abs = m(µg protein) + b
y = 0.0163(x) + 0.004, where y is the corrected absorbance
Determine the unknown concentration based on its absorbance
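
Determining the unknown amounts to rearranging the calibration equation to x = (y - b)/m. The sketch below uses the slope and intercept quoted on the slide; the function name `protein_from_absorbance` and the absorbance reading of 0.250 are hypothetical values chosen only to illustrate the arithmetic.

```python
SLOPE = 0.0163      # absorbance per microgram of protein (from the slide)
INTERCEPT = 0.004   # absorbance at zero protein (from the slide)


def protein_from_absorbance(corrected_abs, slope=SLOPE, intercept=INTERCEPT):
    """Invert y = m*x + b to obtain x (micrograms of protein) from corrected absorbance y."""
    return (corrected_abs - intercept) / slope


unknown_abs = 0.250  # hypothetical corrected absorbance of an unknown sample
unknown_ug = protein_from_absorbance(unknown_abs)
print(f"{unknown_ug:.1f} micrograms of protein")
```

The result is only meaningful if the absorbance falls within the range covered by the standards used to build the curve.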

Tips for Calibrating Instruments
• Know the limitations of your instrument
  ◦ Limits of detection (or LOD)
  ◦ Range of linearity
• Watch out for interferences
  ◦ Overlapping spectral responses (e.g., from impurities)
  ◦ Unwanted sample precipitation
  ◦ Matrix effects
  ◦ Internal standards can help determine if this is the case
• Use serial dilutions where possible
  ◦ Less error than preparing individual samples

Serial Dilution (A Review)
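
The slide's diagram is not reproduced here. As a rough reminder of the idea, each step of a serial dilution dilutes the previous solution by the same factor, so the concentration after k steps is the stock concentration divided by the factor raised to the power k. The sketch below assumes a hypothetical 100 ppm stock and 1:2 dilutions; the function name `serial_dilution` is also an assumption made for this example.

```python
def serial_dilution(stock_conc, dilution_factor, steps):
    """Concentrations after each step of a serial dilution.

    Step k (counting from 1) has concentration stock_conc / dilution_factor**k.
    """
    if dilution_factor <= 1:
        raise ValueError("dilution_factor must be greater than 1")
    return [stock_conc / dilution_factor ** k for k in range(1, steps + 1)]


# Hypothetical example: 100 ppm stock, four successive 1:2 dilutions
standards = serial_dilution(100.0, 2, 4)
print(standards)
```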

Using a spreadsheet for plotting calibration curves
• Some useful statistical functions:
1. AVERAGE = mean of a series of numbers
2. MEDIAN = median of a series of numbers
3. STDEV = standard deviation
4. VAR = variance
5. RSQ = R-squared
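
For readers working outside a spreadsheet, the first four functions have close counterparts in Python's standard `statistics` module. The sketch below applies them to the fluorescence intensities from the riboflavin calibration example in this lecture. Excel's STDEV and VAR are the sample (n - 1) versions, which is what `statistics.stdev` and `statistics.variance` compute. RSQ corresponds to the square of the correlation coefficient r.

```python
import statistics

# Fluorescence intensities from the riboflavin calibration example
intensity = [0.00, 5.80, 12.20, 22.30, 43.30]

mean_value = statistics.mean(intensity)          # spreadsheet AVERAGE
median_value = statistics.median(intensity)      # spreadsheet MEDIAN
stdev_value = statistics.stdev(intensity)        # spreadsheet STDEV (sample, n - 1)
variance_value = statistics.variance(intensity)  # spreadsheet VAR (sample, n - 1)

print(f"mean = {mean_value:.2f}, median = {median_value:.2f}")
print(f"stdev = {stdev_value:.3f}, variance = {variance_value:.3f}")
```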

Riboflavin (ppm)    Fluorescence intensity
0.000               0.00
0.100               5.80
0.200               12.20
0.400               22.30
0.800               43.30

[Figure: Calibration curve. Fluorescence intensity (0.00 to 50.00) plotted against riboflavin (ppm, 0.000 to 0.900) with a linear fit, R² = 0.9989]

Slope, intercept and coefficient of determination
• We can use the Excel statistical functions to calculate the slope and intercept for a series of data, and the R² value, without a plot
• Open a new spreadsheet and enter the calibration data from the previous example.
• In cell A9 type INTERCEPT, in cell A10 SLOPE, and in cell A11 R-Squared.
• Highlight cell B9
• Click on fx: Statistical
• Scroll down to INTERCEPT and click OK

• For known_x’s, enter the array A3:A7 and for known_y’s, enter B3:B7, then click OK.
• The INTERCEPT is displayed in cell B9.
• Now repeat, highlighting cell B10, scrolling to SLOPE, and entering the same arrays. The slope appears in cell B10.
• Follow the same steps for R-squared: highlight cell B11, scroll to RSQ, and enter the same arrays.
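
The same three quantities can be reproduced outside Excel. The sketch below is a plain-Python stand-in for INTERCEPT, SLOPE and RSQ applied to the riboflavin calibration data from the previous example; the function name `linear_fit` is an assumption made for this illustration and is not a spreadsheet or library function.

```python
# Riboflavin calibration data from the previous example
riboflavin_ppm = [0.000, 0.100, 0.200, 0.400, 0.800]
intensity = [0.00, 5.80, 12.20, 22.30, 43.30]


def linear_fit(x, y):
    """Return (slope, intercept, r_squared) of the least-squares line y = m*x + b."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r_squared = s_xy ** 2 / (s_xx * s_yy)   # equals r**2 for a single x variable
    return slope, intercept, r_squared


slope, intercept, r_squared = linear_fit(riboflavin_ppm, intensity)
print(f"slope = {slope:.2f}, intercept = {intercept:.3f}, R^2 = {r_squared:.4f}")
```

The R² obtained this way should agree with the value of 0.9989 shown on the calibration plot.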

Exercise
• The following data were obtained to construct a calibration curve for the determination of Zn in wastewater by atomic absorption spectrometry (AAS). Using a calculator or computer, plot the data and find the equation of the best straight line and the coefficient of determination.

Zn concentration (ppm)    Absorbance
0                         0
2                         0.095
4                         0.194
6                         0.290
8                         0.390
10                        0.466

• Solution:
• The equation of the straight line is: Y = 0.0473X + 0.0027
Coefficient of determination (R²) = 0.9987
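
For those working with a calculator, the arithmetic behind the solution can be laid out as follows, using the standard least-squares formulas for a straight line and the six data points in the table. The result agrees with the solution above to within rounding of the last digit quoted.

```latex
% Sums over the n = 6 calibration points (x = Zn in ppm, y = absorbance)
n = 6, \quad \sum x = 30, \quad \sum y = 1.435, \quad
\sum xy = 10.486, \quad \sum x^{2} = 220

% Slope
m = \frac{n \sum xy - \sum x \sum y}{n \sum x^{2} - \left(\sum x\right)^{2}}
  = \frac{6(10.486) - (30)(1.435)}{6(220) - (30)^{2}}
  = \frac{19.866}{420} \approx 0.0473

% Intercept
b = \frac{\sum y - m \sum x}{n}
  = \frac{1.435 - (0.0473)(30)}{6} \approx 0.0027
```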

[Figure: Calibration curve. Absorbance (0 to 0.5) plotted against Zn concentration (ppm, 0 to 12) with a linear fit, R² = 0.9987]