WHY WE USE EXPLORATORY DATA ANALYSIS

DATA: ARE THE SAMPLE DATA FROM A "NORMAL" POPULATION?
  YES: ESTIMATES BASED ON THE NORMAL DISTRIBUTION
  NO: WHY?
    OUTLIERS, EXTREMES: CAN WE REMOVE THEM? (YES / NO)
    KURTOSIS, SKEWNESS
  OPTIONS IN THE "NO" BRANCH: TRANSFORMATIONS, QUANTILE (ROBUST) ESTIMATES

METHODS OF EDA
Graphical: dot plot, box plot, notched box plot, Q-Q plot, histogram, density plots
Tests: tests of normality, minimal sample size

DOT PLOT

BOX PLOT
Figure labels: outer fences, inner fences, median, lower quartile, upper quartile, interquartile range (H), numerical axis
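The quantities labelled on the box plot can be computed directly. The following is a minimal sketch, assuming the common convention that inner fences lie 1.5 H and outer fences 3 H beyond the quartiles (the slide does not state the multipliers), and using made-up sample values. The function name `box_plot_summary` is hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Return quartiles, interquartile range H, and inner/outer fences.

    Assumption: fences at 1.5*H (inner) and 3*H (outer) from the quartiles.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    h = q3 - q1  # interquartile range, called H on the slide
    return {
        "lower_quartile": q1,
        "median": median,
        "upper_quartile": q3,
        "H": h,
        "inner_fences": (q1 - 1.5 * h, q3 + 1.5 * h),
        "outer_fences": (q1 - 3.0 * h, q3 + 3.0 * h),
    }


# Illustrative (made-up) sample with one suspicious value.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
summary = box_plot_summary(sample)
low, high = summary["inner_fences"]
outside_inner = [x for x in sample if x < low or x > high]
```

Values outside the inner fences are candidates for outliers; values outside the outer fences are candidates for extremes.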

NOTCHED BOX PLOT
Figure labels: RF, confidence interval of the median
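The transcript does not say how the notch is computed. One widely used approximation puts the notch at median ± 1.57 · H / √n, where H is the interquartile range. The sketch below assumes that formula; the function name `median_notch` and the sample values are illustrative only.

```python
import math
import statistics


def median_notch(values, factor=1.57):
    """Approximate confidence interval of the median (notch limits).

    Assumption: notch = median +/- factor * H / sqrt(n), with factor 1.57.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = factor * (q3 - q1) / math.sqrt(len(values))
    return median - half_width, median, median + half_width


# Illustrative (made-up) sample.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
notch_low, notch_median, notch_high = median_notch(sample)
```

When the notches of two box plots do not overlap, the medians are usually taken to differ.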

Q-Q PLOT
Y: sample quantiles (ordered measured values)
X: theoretical quantiles (ordered values) of the analysed distribution
Line a = 0, b = 1: ideal match between the sample values and the theoretical distribution
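A minimal sketch of how the points of a normal Q-Q plot can be built. Two things here are assumptions not taken from the slide: the plotting positions (i − 0.5)/n, and the standardisation of the sample so that a perfect match falls on the line with intercept 0 and slope 1. The function name `qq_points` and the data are illustrative.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Return (theoretical quantile, standardised sample quantile) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    m, s = mean(ordered), stdev(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, x in enumerate(ordered, start=1):
        p = (i - 0.5) / n  # assumed plotting position
        points.append((standard_normal.inv_cdf(p), (x - m) / s))
    return points


# Illustrative (made-up) measurements.
points = qq_points([2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8])
```

Plotting the second coordinate against the first gives the Q-Q plot; systematic departures from the line y = x indicate skewness or non-normal kurtosis.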

Q-Q PLOT
left-leaning: skewed to the right
right-leaning: skewed to the left
platykurtic ("flat, broad")
leptokurtic ("steep, slender")
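The shapes read off a Q-Q plot correspond to the sign of the sample skewness and excess kurtosis. The sketch below uses the plain moment estimators (one choice among several; bias-corrected versions exist), with a hypothetical function name and made-up data.

```python
import statistics


def shape_measures(values):
    """Return (skewness, excess kurtosis) from central moments."""
    n = len(values)
    m = statistics.fmean(values)
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    m4 = sum((x - m) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5          # > 0: skewed to the right
    excess_kurtosis = m4 / m2 ** 2 - 3  # < 0: platykurtic, > 0: leptokurtic
    return skewness, excess_kurtosis


# Illustrative (made-up) samples.
symmetric_skew, symmetric_kurt = shape_measures([1, 2, 3, 4, 5])
right_skew, _ = shape_measures([1, 1, 1, 2, 10])
```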

HISTOGRAM

HISTOGRAM – correct width of interval
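The rule used on this slide is not preserved in the transcript. As an illustration only, the sketch below shows two commonly used rules, which may differ from the one the author intended: Sturges' rule for the number of classes and the Freedman-Diaconis rule for the class width. Function names and data are hypothetical.

```python
import math
import statistics


def sturges_class_count(n):
    """Sturges' rule: number of classes k = ceil(1 + log2(n))."""
    return math.ceil(1 + math.log2(n))


def freedman_diaconis_width(values):
    """Freedman-Diaconis rule: class width = 2 * H * n**(-1/3)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return 2 * (q3 - q1) / len(values) ** (1 / 3)


# Illustrative (made-up) sample.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
classes = sturges_class_count(len(sample))
width = freedman_diaconis_width(sample)
```

Intervals that are too wide hide the shape of the distribution; intervals that are too narrow produce a noisy, gappy histogram.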

HISTOGRAM – kernel density function
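A kernel density estimate replaces the histogram bars with a smooth curve: each observation contributes a small bump and the bumps are averaged. The sketch below assumes a Gaussian kernel and the normal-reference rule of thumb for the bandwidth (1.06 · s · n^(−1/5)); neither choice is stated on the slide, and the data are made up.

```python
import math
import statistics


def gaussian_kde(values, bandwidth=None):
    """Return a function estimating the density with a Gaussian kernel.

    Assumption: default bandwidth = 1.06 * s * n**(-1/5).
    """
    n = len(values)
    if bandwidth is None:
        bandwidth = 1.06 * statistics.stdev(values) * n ** (-1 / 5)

    def density(x):
        total = sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in values)
        return total / (n * bandwidth * math.sqrt(2 * math.pi))

    return density


# Illustrative (made-up) sample.
density = gaussian_kde([1.0, 1.2, 1.9, 2.4, 3.1])
value_at_two = density(2.0)
```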

TRANSFORMATION
Aim of transformation: reduction of variance, better level of symmetry (normality) of data
Transformation function: non-linear, monotonic function

TRANSFORMATION – basic concept
Figure: transformed data (Y) against original data (X, tree-ring widths in mm), marking the mean of the original data and the transformed mean with its projection onto the original data set
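The idea in the figure can be sketched numerically: transform the data, take the mean on the transformed scale, and project it back with the inverse function. The transcript does not say which function the figure uses, so the natural logarithm is assumed here; the function name `retransformed_mean` and the ring widths are made up.

```python
import math
import statistics


def retransformed_mean(values, forward=math.log, inverse=math.exp):
    """Mean on the transformed scale, projected back to the original scale."""
    transformed = [forward(x) for x in values]
    return inverse(statistics.fmean(transformed))


# Illustrative (made-up) tree-ring widths in mm, skewed to the right.
ring_widths_mm = [0.4, 0.6, 0.7, 0.9, 1.1, 1.6, 3.2]
arithmetic_mean = statistics.fmean(ring_widths_mm)
projected_mean = retransformed_mean(ring_widths_mm)
```

For right-skewed data the projected mean lies below the arithmetic mean, closer to the bulk of the observations. With the logarithm as the transformation it equals the geometric mean.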

TRANSFORMATION – logarithmic transformation

TRANSFORMATION – power transformation

TRANSFORMATION – Box-Cox
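For reference, the Box-Cox transformation in its usual textbook form, for positive data x and transformation parameter λ. This is the standard definition; the notation on the original slide may differ.

```latex
y(\lambda) =
\begin{cases}
  \dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
  \ln x, & \lambda = 0,
\end{cases}
\qquad x > 0 .
```

The case λ = 1 leaves the shape of the data unchanged (only a shift by 1), and λ = 0 is the logarithmic transformation.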

TRANSFORMATION – estimate of optimal λ
λ = 1 is not included in the interval estimate of λ. This means that the transformation will probably be successful.
Figure: logarithm of the likelihood function for various values of λ, marking the optimal λ, λ = 1.00, the interval estimate of the parameter λ, and the level max. LF − 0.5 · quantile χ²
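The procedure in the figure can be sketched as a grid search over λ. The code below assumes the standard Box-Cox profile log-likelihood, a 95 % confidence level, and a χ² quantile with one degree of freedom (computed as the square of a normal quantile). The function names, the grid, and the synthetic data are all illustrative.

```python
import math
from statistics import NormalDist, fmean


def box_cox(x, lam):
    """Box-Cox transformation of a positive value x."""
    return math.log(x) if abs(lam) < 1e-12 else (x ** lam - 1.0) / lam


def log_likelihood(values, lam):
    """Profile log-likelihood of lambda (additive constants dropped)."""
    n = len(values)
    y = [box_cox(x, lam) for x in values]
    m = fmean(y)
    variance = sum((v - m) ** 2 for v in y) / n
    return -0.5 * n * math.log(variance) + (lam - 1.0) * sum(math.log(x) for x in values)


def lambda_interval(values, grid, level=0.95):
    """Return (optimal lambda, lower limit, upper limit) on the given grid."""
    ll = [log_likelihood(values, lam) for lam in grid]
    best = max(range(len(grid)), key=ll.__getitem__)
    # Chi-square quantile with 1 degree of freedom = squared normal quantile.
    chi2_quantile = NormalDist().inv_cdf(0.5 + level / 2) ** 2
    cutoff = ll[best] - 0.5 * chi2_quantile
    inside = [lam for lam, value in zip(grid, ll) if value >= cutoff]
    return grid[best], min(inside), max(inside)


# Synthetic data, symmetric on the log scale (so lambda near 0 is expected).
values = [math.exp(z) for z in (-2, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, 2)]
grid = [i / 100 for i in range(-200, 201)]
optimal, lower, upper = lambda_interval(values, grid)
transformation_worthwhile = not (lower <= 1.0 <= upper)
```

If λ = 1 falls outside the interval, as it does for these synthetic data, the untransformed data are not supported and the transformation is likely to help.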