Optimal Sampling Strategies for Multidomain, Multivariate Case with different amount of auxiliary information
Piero Demetrio Falorsi, Paolo Righi
falorsi@istat.it, parighi@istat.it
Italian National Statistical Institute
Seminar UNECE, 12 June 2012
Outline
- Aim of the talk
- Statement of the problem
- (The unified approach for) sampling design
- (Mgreg) Estimator
- Experimental results
- Conclusions
Aim of the talk: An overall strategy
Statement of the problem
Statement of the problem: Challenging informative context
Multiple sources of auxiliary information
Statement of the problem: Design
Statement of the problem: Estimation
- The standard solution for estimation (calibration estimators) may allow calibration at domain level only for the register variables; it does not calibrate on the existing domain totals derived from auxiliary data sources.
- Main drawbacks:
  - Too small a sample size for some domains: biased estimation for small domains.
  - Risk that estimates of variables that could be derived from administrative data sources differ significantly from the known totals: effect of nonresponse or measurement error.
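As an illustration of what calibrating on known totals means, the sketch below implements textbook linear calibration for two auxiliary variables (a constant and one register variable). It is not the Mgreg estimator of the talk; the function name `linear_calibration`, the design weights, the auxiliary values and the known totals are all hypothetical and chosen only to show that the calibrated weights reproduce the known totals exactly.

```python
def linear_calibration(d, x, totals):
    """Return calibrated weights w_k = d_k * (1 + x_k' lam).

    d      : design weights (hypothetical values in the example below)
    x      : auxiliary vectors, each of length 2: (1, register variable)
    totals : known population totals of the two auxiliary variables
    """
    # Weighted cross-product matrix T = sum_k d_k x_k x_k' (symmetric 2x2)
    t11 = sum(dk * xk[0] * xk[0] for dk, xk in zip(d, x))
    t12 = sum(dk * xk[0] * xk[1] for dk, xk in zip(d, x))
    t22 = sum(dk * xk[1] * xk[1] for dk, xk in zip(d, x))
    # Gap between the known totals and the design-weighted estimates
    g1 = totals[0] - sum(dk * xk[0] for dk, xk in zip(d, x))
    g2 = totals[1] - sum(dk * xk[1] for dk, xk in zip(d, x))
    # Solve T lam = g with the explicit 2x2 inverse
    det = t11 * t22 - t12 * t12
    lam1 = (t22 * g1 - t12 * g2) / det
    lam2 = (t11 * g2 - t12 * g1) / det
    return [dk * (1 + lam1 * xk[0] + lam2 * xk[1]) for dk, xk in zip(d, x)]


# Hypothetical sample of four enterprises: (constant, number of employees)
d = [10.0, 10.0, 20.0, 20.0]
x = [(1, 2), (1, 4), (1, 3), (1, 7)]
totals = (65.0, 300.0)  # assumed known: number of enterprises, total employees

w = linear_calibration(d, x, totals)
calibrated_count = sum(w)
calibrated_employees = sum(wk * xk[1] for wk, xk in zip(w, x))
print(calibrated_count, calibrated_employees)
```

When the calibration is run separately within a small domain, the matrix T is built from very few units, which is one way to see the small-sample drawback listed above.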
Sampling Design: Multiple sources of auxiliary information
Estimation: Multiple sources of auxiliary information
Estimation: The Working model
Estimation: The Mgreg Estimator
Estimation: Properties
Estimation: Properties - auxiliary=interest
Empirical Results: Population of simulation
1999 Italian enterprises from 1 to 99 employees, Computer and related economic activities (2-digit NACE Rev. 1)
Population size    Number of cross-classified strata    Cumulative (%) distribution
1                  68                                   18.89
2                  37                                   29.17
3-5                63                                   46.67
6-10               50                                   60.56
11-100             119                                  93.61
More than 100      23                                   100.00
The domains of interest (44):
(1) geographical region with 20 marginal domains (DOM 1);
(2) economic activity group by size class (24 domains)
ITACOSM 2011, 27-29 June 2011, Pisa, Italy
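The cumulative percentages reported for the strata can be checked from the strata counts alone. The short sketch below recomputes them; the variable names are illustrative, and the counts are the ones given on the slide.

```python
# Number of cross-classified strata by population size class (from the slide)
strata_counts = [
    ("1", 68),
    ("2", 37),
    ("3-5", 63),
    ("6-10", 50),
    ("11-100", 119),
    ("More than 100", 23),
]

total_strata = sum(n for _, n in strata_counts)

# Running share of strata, in percent, rounded to two decimals
running = 0
cumulative = []
for label, n in strata_counts:
    running += n
    cumulative.append((label, round(100 * running / total_strata, 2)))

for label, pct in cumulative:
    print(f"{label:<15}{pct:>8.2f}")
```

The counts sum to 360 cross-classified strata, and almost half of them (46.67%) contain five or fewer enterprises.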
Empirical Results: Simulation: allocation comparison between the one-way and multi-way design
- Prediction models: M1, M2
- Model (%), for Labour cost and Value added: 68.1, 64.1, 65.1, 61.0
Empirical Results: Multiple sources of auxiliary information: example – efficiency of the proposed strategy
Sampling distributions over the partition with different auxiliary information
Conclusions
- The latest result (the unified approach) of a research effort that has lasted almost 6 years
- Survey Methodology (2008)
- Statistics in Transition (2006)
- 2 books published by Franco Angeli illustrating the main findings of a research project of strategic interest financed by the Ministry of University and Research
- Presentations at NTTS (2011) and Neuchatel (2011)
- Invited talk at the next scientific conference of the Italian Society of Statistics
- Accepted talk for the ICES
References
- Bethel J. (1989) Sample Allocation in Multivariate Surveys, Survey Methodology, 15, 47-57.
- Chromy J. (1987) Design Optimization with Multiple Objectives, Proceedings of the Survey Research Methods Section, American Statistical Association, 194-199.
- Deville J.-C., Tillé Y. (2004) Efficient Balanced Sampling: the Cube Method, Biometrika, 91, 893-912.
- Deville J.-C., Tillé Y. (2005) Variance approximation under balanced sampling, Journal of Statistical Planning and Inference, 128, 569-591.
- Falorsi P. D., Righi P. (2008) A Balanced Sampling Approach for Multi-way Stratification Designs for Small Area Estimation, Survey Methodology, 34, 223-234.
- Falorsi P. D., Orsini D., Righi P. (2006) Balanced and Coordinated Sampling Designs for Small Domain Estimation, Statistics in Transition, 7, 1173-1198.
- Isaki C. T., Fuller W. A. (1982) Survey design under a regression superpopulation model, Journal of the American Statistical Association, 77, 89-96.