Evaluating Hypotheses

Evaluating Hypotheses. Natural Language Processing Lab (자연언어처리연구실), 장정호 (Jang Jeong-ho)

Overview • Evaluating the accuracy of hypotheses is fundamental to ML - to decide whether to use a given hypothesis - an integral component of many learning systems • Difficulties arise from a limited set of data - bias in the estimate - variance in the estimate

1. Contents • Methods for evaluating learned hypotheses • Methods for comparing the accuracy of two learning algorithms when only a limited set of data is available

2. Estimating Hypothesis Accuracy • Two Questions 1. Given a hypothesis h and a data sample, what is the best estimate of the accuracy of h over unseen data? 2. What is the probable error in that accuracy estimate?

2. Evaluating… (Cont’d) • Two Definitions of Error 1. Sample error with respect to target function f and data sample S: error_S(h) = (1/n) Σ_{x∈S} δ(f(x) ≠ h(x)), where n = |S| and δ(·) is 1 if its argument is true and 0 otherwise 2. True error with respect to target function f and distribution D: error_D(h) = Pr_{x~D}[f(x) ≠ h(x)] How good an estimate of error_D(h) is provided by error_S(h)?
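As a minimal sketch of the sample-error definition (the target f, hypothesis h, and sample S below are made up for illustration), sample error is simply the disagreement rate on S:

```python
def sample_error(h, f, S):
    """Fraction of examples in S on which h disagrees with target f."""
    return sum(1 for x in S if h(x) != f(x)) / len(S)

# Hypothetical target concept and learned hypothesis over integers:
f = lambda x: x >= 5          # true concept
h = lambda x: x >= 6          # hypothesis that disagrees only at x == 5
S = list(range(10))           # sample of n = 10 points
print(sample_error(h, f, S))  # 0.1
```

The true error error_D(h) would instead weight each disagreement by the probability of drawing x from D, which is why it generally cannot be computed exactly from a finite sample.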

2. Evaluating… (Cont’d) • Problems in Estimating Error 1. Bias: if S is the training set, error_S(h) is an optimistically biased estimate; estimation bias = E[error_S(h)] - error_D(h). For an unbiased estimate, h and S must be chosen independently 2. Variance: even with an unbiased S, error_S(h) may vary from error_D(h)

2. Evaluating… (Cont’d) • Estimators Experiment: 1. Choose a sample S of size n according to distribution D 2. Measure error_S(h) error_S(h) is a random variable, and it is an unbiased estimator for error_D(h). Given an observed error_S(h), what can we conclude about error_D(h)?

2. Evaluating… (Cont’d) • Confidence Interval If 1. S contains n examples, drawn independently of h and of each other 2. n >= 30 then with approximately N% probability, error_D(h) lies in the interval error_S(h) ± z_N √(error_S(h)(1 - error_S(h))/n), where z_N is, e.g., 1.64 for N = 90, 1.96 for N = 95, 2.58 for N = 99

2. Evaluating… (Cont’d) • Normal Distribution Approximates Binomial Distribution error_S(h) follows a Binomial distribution with mean error_D(h) and standard deviation √(error_D(h)(1 - error_D(h))/n). Approximate this by a Normal distribution with the same mean and standard deviation.

2. Evaluating… (Cont’d) • More Correct Confidence Interval If 1. S contains n examples, drawn independently of h and of each other 2. n >= 30 then with approximately 95% probability, error_S(h) lies in the interval error_D(h) ± 1.96 √(error_D(h)(1 - error_D(h))/n); equivalently, error_D(h) lies in the interval error_S(h) ± 1.96 √(error_D(h)(1 - error_D(h))/n), which is approximately error_S(h) ± 1.96 √(error_S(h)(1 - error_S(h))/n)
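The approximate interval is easy to compute directly. A minimal sketch (the z-value table is the standard two-sided one; function and variable names are mine):

```python
import math

# z values for common two-sided confidence levels (standard Normal table)
Z = {0.90: 1.64, 0.95: 1.96, 0.99: 2.58}

def confidence_interval(error_s, n, level=0.95):
    """Approximate N% confidence interval for error_D(h); valid when n >= 30."""
    se = math.sqrt(error_s * (1 - error_s) / n)   # estimated std. dev.
    z = Z[level]
    return error_s - z * se, error_s + z * se

# e.g. error_S(h) = 0.30 measured on a sample of n = 100 examples
lo, hi = confidence_interval(0.30, 100)
print(f"95% CI: ({lo:.3f}, {hi:.3f})")
```

For error_S(h) = 0.30 and n = 100 this gives roughly (0.210, 0.390), illustrating how wide the uncertainty still is at n = 100.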

2. Evaluating… (Cont’d) • Two-sided and One-sided Bounds 1. Two-sided: what is the probability that error_D(h) is between L and U? 2. One-sided: what is the probability that error_D(h) is at most U? A 100(1 - α)% two-sided confidence interval implies a 100(1 - α/2)% one-sided confidence bound.

3. General Confidence Intervals • Consider a set of independent, identically distributed random variables Y1 … Yn, all governed by an arbitrary probability distribution with mean μ and variance σ². Define the sample mean Ȳ = (1/n) Σ_{i=1}^{n} Yi • Central Limit Theorem: as n → ∞, the distribution governing Ȳ approaches a Normal distribution with mean μ and variance σ²/n.
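A quick simulation illustrates the theorem (a sketch, not from the slides; the choice of a discrete uniform distribution is arbitrary): sample means of a non-Normal variable cluster around μ with variance close to σ²/n.

```python
import random
import statistics

random.seed(0)

# Y_i uniform on {0, 1, ..., 9}: mean mu = 4.5, variance sigma^2 = (10^2 - 1)/12 = 8.25
def sample_mean(n):
    return statistics.mean(random.randint(0, 9) for _ in range(n))

# Empirical distribution of the sample mean for n = 30, over 2000 repetitions
means = [sample_mean(30) for _ in range(2000)]
print(statistics.mean(means))      # close to mu = 4.5
print(statistics.variance(means))  # close to sigma^2 / n = 8.25 / 30 = 0.275
```

This is exactly the fact exploited in the previous slides: error_S(h) is a mean of n Bernoulli draws, so for n >= 30 its distribution is approximately Normal.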

3. General Confidence Intervals (Cont’d) 1. Pick the parameter p to estimate: error_D(h) 2. Choose an estimator: error_S(h) 3. Determine the probability distribution that governs the estimator: error_S(h) is governed by a Binomial distribution, approximated by a Normal distribution when n >= 30 4. Find the interval (L, U) such that N% of the probability mass falls in the interval

4. Difference in Error of Two Hypotheses • Assumptions - two hypotheses h1, h2 - h1 is tested on sample S1 containing n1 random examples; h2 is tested on sample S2 containing n2 random examples • Objective - estimate the difference between the two true errors, d = error_D(h1) - error_D(h2)

4. Difference in Error of Two Hypotheses (Cont’d) • Procedure 1. Choose an estimator for d: d̂ = error_S1(h1) - error_S2(h2) 2. Determine the probability distribution that governs the estimator: d̂ is approximately Normal with mean d and standard deviation σ_d̂ ≈ √(error_S1(h1)(1 - error_S1(h1))/n1 + error_S2(h2)(1 - error_S2(h2))/n2) 3. Find the interval (L, U) such that N% of the probability mass falls in the interval

4. Difference in Error of Two Hypotheses (Cont’d) • Hypothesis Test Ex) |S1| = |S2| = 100, error_S1(h1) = 0.30, error_S2(h2) = 0.20 What is the probability that error_D(h1) > error_D(h2)?

4. Difference in Error of Two Hypotheses (Cont’d) • Solution 1. The problem is equivalent to asking for the probability that d̂ does not overestimate d = error_D(h1) - error_D(h2) by more than 0.10 2. From the earlier expression, σ_d̂ = √(0.3(1 - 0.3)/100 + 0.2(1 - 0.2)/100) ≈ 0.061, so the observed d̂ = 0.10 is about 1.64 σ_d̂ 3. The Normal distribution table shows that the associated confidence level for a two-sided interval at z = 1.64 is 90%, so for a one-sided bound it is 95%
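The slide's arithmetic can be checked directly; a minimal sketch (variable names are mine, numbers are the slide's worked example):

```python
import math

# Worked example: n1 = n2 = 100, error_S1(h1) = 0.30, error_S2(h2) = 0.20
e1, n1 = 0.30, 100
e2, n2 = 0.20, 100

d_hat = e1 - e2                                         # estimator for d
sigma = math.sqrt(e1*(1 - e1)/n1 + e2*(1 - e2)/n2)      # approx. std. dev. of d_hat
z = d_hat / sigma
print(round(sigma, 3), round(z, 2))   # z of about 1.64 => one-sided 95% confidence
```

A z of about 1.64 corresponds to the one-sided 95% bound, matching the slide's conclusion that error_D(h1) > error_D(h2) with approximately 95% confidence.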

5. Comparing Two Learning Algorithms • What we'd like to estimate: E_{S⊂D}[error_D(L_A(S)) - error_D(L_B(S))] where L(S) is the hypothesis output by learner L using training set S. But given limited data D0, what is a good estimator? We could partition D0 into training set S0 and test set T0, and measure error_T0(L_A(S0)) - error_T0(L_B(S0)). Even better, repeat this many times and average the results.

5. Comparing Two Learning Algorithms (Cont’d) 1. Partition data D0 into k disjoint test sets T1, T2, …, Tk of equal size, where this size is at least 30 2. For 1 <= i <= k: use Ti as the test set and the remaining data as the training set Si Si = D0 - Ti, hA = L_A(Si), hB = L_B(Si), δi = error_Ti(hA) - error_Ti(hB) 3. Return the value δ̄ = (1/k) Σ_{i=1}^{k} δi

5. Comparing Two Learning Algorithms (Cont’d) 4. Now use a paired t test on the δi to obtain a confidence interval. The result is an N% confidence interval estimate for δ: δ̄ ± t_{N,k-1} s_δ̄, where s_δ̄ = √( (1/(k(k-1))) Σ_{i=1}^{k} (δi - δ̄)² ) and t_{N,k-1} is the value of the t distribution with k - 1 degrees of freedom.
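The paired-test computation above can be sketched as follows (the δi values are made up for illustration, and the t value is hard-coded from a standard table rather than computed):

```python
import math
import statistics

# delta_i = error_Ti(h_A) - error_Ti(h_B) from k = 10 hypothetical folds
deltas = [0.04, 0.01, 0.03, 0.05, 0.02, 0.03, 0.04, 0.00, 0.02, 0.03]
k = len(deltas)

d_bar = statistics.mean(deltas)
s_d   = math.sqrt(sum((d - d_bar) ** 2 for d in deltas) / (k * (k - 1)))

# t_{95, k-1} for k = 10: two-sided 95% point of t with 9 degrees of freedom
t_crit = 2.262
lo, hi = d_bar - t_crit * s_d, d_bar + t_crit * s_d
print(f"delta_bar = {d_bar:.3f}, 95% CI = ({lo:.4f}, {hi:.4f})")
```

Because every fold is tested on data the corresponding hypotheses never saw, and the same folds are used for both learners, the paired differences δi cancel much of the fold-to-fold variance, which is why the paired t test gives a tighter interval than comparing the two learners on independent samples.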