Optimal Sampling Strategies for Multidomain, Multivariate Case with different amount of auxiliary information

Piero Demetrio Falorsi, Paolo Righi
falorsi@istat.it, parighi@istat.it
Italian National Statistical Institute
Seminar UNECE, 12 June 2012

Outline

  • Aim of the talk
  • Statement of the problem
  • (The unified approach for) sampling design
  • (Mgreg) Estimator
  • Experimental results
  • Conclusions

Aim of the talk

  • An overall strategy

Statement of the problem

Statement of the problem: Challenging informative context

  • Multiple sources of auxiliary information

Statement of the problem: Design

Statement of the problem: Estimation

  • The standard solution for estimation (calibration estimators) may allow calibration at the domain level only for the register variables; it does not calibrate on the existing domain totals derived from the auxiliary data sources.
  • Main drawbacks:
    • Sample size too small for some domains: biased estimation for small domains
    • Risk that the estimates of variables that could be derived from an administrative data source differ significantly from the known totals: effect of nonresponse or measurement error
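To make the notion of a calibration estimator concrete, the following is a minimal sketch of textbook linear (GREG-type) calibration, not the Mgreg estimator of this talk. It adjusts design weights so that the weighted sample totals of the auxiliary variables match known totals. The function name, the toy design weights, the auxiliary matrix and the "known" totals are all hypothetical illustration values. Domain-level calibration would use the same computation with domain indicator columns added to the auxiliary matrix.

```python
import numpy as np

def linear_calibration_weights(d, X, totals):
    """Linear calibration: w_k = d_k * (1 + x_k' lam).

    d      : design weights, shape (n,)
    X      : auxiliary variables for the sampled units, shape (n, p)
    totals : known population totals of the auxiliaries, shape (p,)
    """
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float)
    totals = np.asarray(totals, dtype=float)
    # Horvitz-Thompson estimates of the auxiliary totals
    ht = X.T @ d
    # Solve (sum_k d_k x_k x_k') lam = totals - ht for the multiplier lam
    M = X.T @ (d[:, None] * X)
    lam = np.linalg.solve(M, totals - ht)
    return d * (1.0 + X @ lam)

# Hypothetical toy sample: an intercept column and one register variable
d = np.array([10.0, 10.0, 10.0, 10.0, 10.0])
X = np.array([[1, 2.0], [1, 3.0], [1, 5.0], [1, 4.0], [1, 6.0]])
totals = np.array([52.0, 210.0])  # assumed known population totals

w = linear_calibration_weights(d, X, totals)
# The calibrated weights reproduce the known totals (up to rounding)
print(X.T @ w)
```

With few sampled units in a domain, the matrix `M` restricted to that domain becomes unstable or singular, which is one way to see the small-domain drawback listed above.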

Sampling Design: Multiple sources of auxiliary information

Estimation: Multiple sources of auxiliary information

Estimation: The Working model

Estimation: The Mgreg Estimator

Estimation: Properties

Estimation: Properties - auxiliary=interest

Empirical Results: Population of simulation

1999 Italian enterprises from 1 to 99 employees, Computer and related economic activities (2-digit NACE Rev. 1)

Population size    Number of cross-classified strata    Cumulative (%) distribution
1                  68                                   18.89
2                  37                                   29.17
3-5                63                                   46.67
6-10               50                                   60.56
11-100             119                                  93.61
More than 100      23                                   100.00

The domains of interest (44):
  (1) geographical region, with 20 marginal domains (DOM1);
  (2) economic activity group by size class (24 domains)

ITACOSM 2011, 27-29 June 2011, Pisa, Italy
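As a quick check on the strata table above, the cumulative distribution can be recomputed from the strata counts. This is only an arithmetic sketch: the variable names are illustrative, and the counts are the ones reported on the slide.

```python
# Number of cross-classified strata by population size class (from the slide)
counts = {
    "1": 68,
    "2": 37,
    "3-5": 63,
    "6-10": 50,
    "11-100": 119,
    "More than 100": 23,
}

total = sum(counts.values())  # total number of cross-classified strata

running = 0
cumulative = {}
for label, n in counts.items():
    running += n
    # cumulative share of strata, in percent, rounded as on the slide
    cumulative[label] = round(100 * running / total, 2)

for label, pct in cumulative.items():
    print(f"{label:>14}  {pct:6.2f}")
```

The counts sum to 360 strata, and the recomputed percentages agree with the reported column; in particular, almost half of the strata (46.67%) contain five or fewer enterprises.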

Empirical Results: Simulation

Allocation comparison between the one-way and multi-way designs

  • Prediction models: M1, M2
  • Model % (Labour cost, Value added): 68.1, 64.1, 65.1, 61.0

Empirical Results: Multiple sources of auxiliary information

Example: efficiency of the proposed strategy

Sampling distributions over the partition with different auxiliary information

Conclusions

  • The unified approach is the latest result of a line of research that has lasted almost 6 years
  • Survey Methodology (2008)
  • Statistics in Transition (2006)
  • 2 books published by Franco Angeli illustrating the main findings of a research project of strategic interest financed by the Ministry of University and Research
  • Presentations at NTTS (2011) and Neuchatel (2011)
  • Invited talk at the next scientific conference of the Italian Society of Statistics
  • Accepted talk for the ICES

References

  • Bethel J. (1989) Sample Allocation in Multivariate Surveys, Survey Methodology, 15, 47-57.
  • Chromy J. (1987) Design Optimization with Multiple Objectives, Proceedings of the Survey Research Methods Section, American Statistical Association, 194-199.
  • Deville J.-C., Tillé Y. (2004) Efficient Balanced Sampling: the Cube Method, Biometrika, 91, 893-912.
  • Deville J.-C., Tillé Y. (2005) Variance approximation under balanced sampling, Journal of Statistical Planning and Inference, 128, 569-591.
  • Falorsi P. D., Righi P. (2008) A Balanced Sampling Approach for Multi-way Stratification Designs for Small Area Estimation, Survey Methodology, 34, 223-234.
  • Falorsi P. D., Orsini D., Righi P. (2006) Balanced and Coordinated Sampling Designs for Small Domain Estimation, Statistics in Transition, 7, 1173-1198.
  • Isaki C. T., Fuller W. A. (1982) Survey design under a regression superpopulation model, Journal of the American Statistical Association, 77, 89-96.