Length-Biased Sampling: A Review of Applications
Termeh Shafie
Department of Statistics, Umeå University
termeh.shafie@stat.umu.se
Outline
1. Length-Biased Sampling & the Estimation Problem
2. Applications & Suggested Solutions
3. Simulation under Misspecified Sampling Inclusion Probabilities
Length-Biased Sampling
- The probability of sample inclusion of a population unit is related to the value of the variable measured.
- Cox (1969): Textile fibre sampling
- A simple illustration of the problem when estimating the population mean
The Estimation Problem
- Assume there is a population with $N$ elements $y_1, \dots, y_N$. The mean of the population is $\mu = \frac{1}{N}\sum_{i=1}^{N} y_i$.
- Suppose $n$ observations form a sample with sample mean $\bar{y} = \frac{1}{n}\sum_{i=1}^{N} I_i y_i$, where $I_i = 1$ if individual $i$ is sampled and $I_i = 0$ otherwise.
- The expected value of the sample mean is $E(\bar{y}) = \frac{1}{n}\sum_{i=1}^{N} \pi_i y_i$, where $\pi_i = E(I_i)$ are the inclusion probabilities of the population units.
- Using simple random sampling, $\pi_i = n/N$ and thus $E(\bar{y}) = \frac{1}{N}\sum_{i=1}^{N} y_i = \mu$.
- However, in general $\pi_i$ is unknown and varies across units, and thus $E(\bar{y}) \neq \mu$. The sample mean becomes a biased estimator of the population mean.
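
The bias described above can be illustrated numerically. The following is a minimal sketch, not part of the original slides: the Gamma population, the sample sizes, and the function name `compare_sampling` are all assumptions chosen for illustration, and the length-biased draw is approximated by sampling with replacement with probability proportional to the value itself.

```python
import random

def compare_sampling(pop_size=10_000, n=500, reps=200, seed=1):
    """Average sample mean under SRS and under size-proportional sampling."""
    rng = random.Random(seed)
    # Hypothetical population of positive values (assumption): Gamma(shape=2, scale=1).
    population = [rng.gammavariate(2.0, 1.0) for _ in range(pop_size)]
    mu = sum(population) / pop_size

    srs_total = lb_total = 0.0
    for _ in range(reps):
        # Simple random sampling: every unit has the same inclusion probability.
        srs = rng.sample(population, n)
        # Length-biased sampling: draw probability proportional to the value y_i.
        lb = rng.choices(population, weights=population, k=n)
        srs_total += sum(srs) / n
        lb_total += sum(lb) / n
    return mu, srs_total / reps, lb_total / reps

mu, srs_mean, lb_mean = compare_sampling()
print(f"population mean {mu:.3f}, SRS {srs_mean:.3f}, length-biased {lb_mean:.3f}")
```

Under these assumptions the SRS average should sit close to the population mean, while the length-biased average should be clearly larger, because large values are over-represented in the sample.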
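Cox (1969)
- Derived the length-biased or weighted pdf, $g(x) = \frac{x f(x)}{\mu}$ with $\mu = \int x f(x)\,dx$, and looked at the estimation of the population mean from a length-biased sample.
- Assume $X_1, \dots, X_n$ is a random sample with pdf $g(x)$.
- It can be shown that $E_g(1/X) = 1/\mu$. An unbiased estimator of $1/\mu$ is $\frac{1}{n}\sum_{i=1}^{n} \frac{1}{X_i}$,
- with variance $\frac{1}{n}\left[\frac{1}{\mu}E_f(1/X) - \frac{1}{\mu^2}\right]$. Note: the estimator is approximately normally distributed (~N) for large $n$.
- Relation between the moments of g(x) and f(x): $E_g(X^r) = E_f(X^{r+1})/\mu$. The relative bias of the sample mean is thus $\frac{E_g(X) - \mu}{\mu} = \frac{\sigma^2}{\mu^2}$, where $\sigma^2$ is the variance under $f(x)$.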
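
A small numerical check of these relations, as a sketch under stated assumptions: the underlying density f is taken to be Gamma(3, 1) (a hypothetical choice, so that the mean is 3 and the variance is 3), in which case the length-biased density g is Gamma(4, 1). The harmonic-mean form `cox_estimator` is the reciprocal of the unbiased estimator of the inverse mean; it is consistent for the mean but not exactly unbiased in finite samples.

```python
import random

def cox_estimator(sample):
    """Harmonic-mean estimator of mu from a length-biased sample."""
    return len(sample) / sum(1.0 / x for x in sample)

rng = random.Random(0)
shape, scale = 3.0, 1.0          # hypothetical f: Gamma(3, 1), so mu = 3 and sigma^2 = 3
mu = shape * scale

# Length-biasing a Gamma(k, theta) density gives a Gamma(k + 1, theta) density,
# so a length-biased sample can be drawn directly from g.
sample = [rng.gammavariate(shape + 1.0, scale) for _ in range(50_000)]

naive = sum(sample) / len(sample)    # estimates E_g(X) = mu + sigma^2 / mu = 4
cox = cox_estimator(sample)          # estimates mu = 3
rel_bias = (naive - mu) / mu         # estimates sigma^2 / mu^2 = 1/3

print(f"naive mean {naive:.3f}, Cox estimator {cox:.3f}, relative bias {rel_bias:.3f}")
```

The Gamma(3, 1) choice is deliberate: for an exponential f the inverse observations have infinite variance under g, which makes the harmonic-mean estimator very unstable.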
2. APPLICATIONS
Technical/Industrial Sampling
- Cox (1969): Sampling textile fibres and the estimation of fibre length distribution.
Marketing
- Shopping Center Sampling & Mall Intercept Surveys:
  - Keillor et al. (2001): Global consumer tendencies.
  - Sudman (1980): Quota sampling techniques and weighting procedures to correct for frequency bias.
  - Nowell et al. (1991): Correction techniques for length-biased sampling in two situations: when total length of stay is known or estimated, and when only the recurrence time is known.
Epidemiology
- Sampling procedures for the collection of positive-valued or lifetime data are length-biased (Simon 1980, Zelen et al. 1969).
- Wang (1996): Statistical analysis of length-biased data under the proportional hazards model. A pseudo-likelihood approach for estimating the parameters from length-biased data is presented.
Resource Economics
- On-site sampling:
  - Deriving demand functions for a recreational site (Bockstael 1990, Ovaskainen et al. 2001)
  - Charting trip-taking behavior (Bowker 1998)
  - Travel cost models of recreational demand (Moons et al. 2001)
  - Contingent valuation surveys for the elicitation of non-market goods (Cameron et al. 1987, Nowell et al. 1988)
- Shaw (1988): Three problems with regression on on-site samples:
  1. Non-negative integers
  2. Truncation
  3. Endogenous stratification
- Shaw (1988): Recreational demand modeling under two assumptions about the dependent variable's distribution:
  1. Normal distribution
  2. Poisson distribution: y = 1, 2, …
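
For the Poisson case, the effect of on-site sampling can be checked directly. This is a minimal sketch assuming that the probability of being intercepted on site is proportional to the number of visits y; the rate `lam = 2.5` and the function names are hypothetical. Under that assumption the on-site distribution is the size-biased Poisson, which coincides with an ordinary Poisson shifted up by one visit.

```python
import math

def poisson_pmf(y, lam):
    """Ordinary Poisson probability of y visits."""
    return math.exp(-lam) * lam**y / math.factorial(y)

def onsite_pmf(y, lam):
    """Size-biased Poisson: weight each y by y and renormalise by E[y] = lam."""
    return y * poisson_pmf(y, lam) / lam

lam = 2.5                    # hypothetical visit rate
support = range(1, 60)       # on-site samples contain only y = 1, 2, ...

# The on-site pmf at y equals the ordinary Poisson pmf at y - 1.
shifted_ok = all(
    math.isclose(onsite_pmf(y, lam), poisson_pmf(y - 1, lam)) for y in support
)
# The on-site mean is therefore one visit higher than the population mean.
onsite_mean = sum(y * onsite_pmf(y, lam) for y in support)

print(shifted_ok, round(onsite_mean, 6))
```

This is why, under the Poisson assumption, subtracting one from the on-site sample mean corrects for both truncation and endogenous stratification.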
- Englin & Shonkwiler (1995): The Negative Binomial Model
  - The truncated, stratified model has support y = 1, 2, …
- Nunes (2003): Binary Choice Models. The count variable is described by a Poisson distribution with an unobservable heterogeneity term correlated with the error term in a probit binary choice model.
3. Misspecification of Sampling Probabilities: A Simulation
- Aim: To see whether the effect of misspecified sampling probabilities is large.
- What happens if time per visit is correlated with frequency of visits when estimating the expected number of visits?
- Time is modeled as a function of frequency of visits when estimating the population mean.
- The simulated variables are drawn from Poisson, Exponential, and Gamma distributions.
- The inclusion probabilities are proportional to the time spent at the site.
- The three estimators used for the simulation are:
  - The sample mean: $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
  - Shaw's estimator: $\bar{y} - 1$
  - Cox's estimator: $n \big/ \sum_{i=1}^{n} \frac{1}{y_i}$
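
A simulation of this kind can be sketched as follows. Everything in the design is an assumption made for illustration and is not the setup used in the slides: visits are Poisson, total time on site is visits times an exponential time per visit, and the hypothetical parameter `dep` controls how strongly time per visit grows with the number of visits (`dep = 0` means no dependence). Shaw's estimator is taken to be the sample mean minus one and Cox's estimator the harmonic mean.

```python
import math
import random

def poisson_draw(rng, lam):
    """Knuth's multiplication method for a single Poisson(lam) draw."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate(lam=3.0, dep=0.0, pop_size=20_000, n=2_000, seed=7):
    """Bias of three estimators of the mean number of visits (true mean = lam)."""
    rng = random.Random(seed)
    visits = [poisson_draw(rng, lam) for _ in range(pop_size)]
    # Hypothetical time model: total time = visits * time per visit, where the
    # mean time per visit scales like visits**dep. Units with zero visits get
    # zero time and can never be sampled on site.
    time_on_site = [y ** (1.0 + dep) * rng.expovariate(1.0) for y in visits]
    # On-site sample: inclusion probability proportional to time spent at the site.
    sample = rng.choices(visits, weights=time_on_site, k=n)
    mean = sum(sample) / n
    return {
        "sample_mean": mean - lam,
        "shaw": (mean - 1.0) - lam,
        "cox": n / sum(1.0 / y for y in sample) - lam,
    }

independent = simulate(dep=0.0)   # time per visit unrelated to frequency
dependent = simulate(dep=1.0)     # time per visit increases with frequency
print(independent)
print(dependent)
```

In this sketch, correcting for frequency alone works reasonably well when time per visit is independent of frequency, but the corrected estimators become biased again once the two are related, which is the kind of misspecification the simulation is designed to examine.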
Simulation Results
Sample mean        0.689 (0.481)    0.964 (0.939)    1.058 (1.131)    0.780 (0.656)    0.983 (1.016)    1.118 (1.301)
Shaw's estimator  -0.311 (0.103)   -0.036 (0.011)    0.058 (0.015)   -0.220 (0.096)   -0.017 (0.050)    0.118 (0.065)
Cox's estimator    0.398 (0.162)    0.567 (0.327)    0.642 (0.419)    0.155 (0.100)    0.036 (0.081)    0.176 (0.112)
Summary
- If the probabilities of sample inclusion of population units are related to the values of the variable measured, the parameter estimates will be biased and inconsistent.
- Thus correctly specified sampling inclusion mechanisms should not be neglected!
References
- Bockstael, N. E., Strand, I. E., McConnell, K. E., Arsanjani, F., 1990. Sample Selection Bias in the Estimation of Recreational Demand Functions: An Application to Sportfishing. Land Economics 66(1), 40-49.
- Bowker, J. M., Leeworthy, V. R., 1998. Accounting for Ethnicity in Recreation Demand: A Flexible Count Data Approach. Journal of Leisure Research 30(1), 64-78.
- Bush, A. J., Hair, J. F., 1985. An Assessment of the Mall Intercept as a Data Collection Method. Journal of Marketing Research 22, 158-67.
- Cameron, T. A., James, M. D., 1987. Efficient Estimation Methods for "Closed-Ended" Contingent Valuation Surveys. The Review of Economics and Statistics 69, 269-276.
- Cox, D. R., 1969. "Some Sampling Problems in Technology" in New Developments in Survey Sampling, N. L. Johnson and H. Smith, eds. New York: Wiley Interscience.
- Englin, J., Shonkwiler, J. S., 1995. Estimating Social Welfare Using Count Data Models: An Application to Long-Run Recreation Demand under Conditions of Endogenous Stratification and Truncation. Review of Economics and Statistics 77, 104-112.
- Keillor, B. D., D'Amico, M., Horton, V., 2001. Global Consumer Tendencies. Psychology and Marketing 18, 1-19.
- Laitila, T., 1998. Estimation of Combined Site-Choice and Trip-Frequency Models of Recreational Demand using Choice-based and On-Site Samples. Economics Letters 64, 17-23.
- Moons, E., Loomis, J., Proost, S., Eggermont, K., Hermy, M., 2001. Travel Cost and
And finally she stops…