Quality Assessment of Register-Based Statistics: A Quality Framework
Manuela LENK, Directorate Population Statistics, Register-based census
© STATISTIK AUSTRIA, www.statistik.at, 14.4.2010
Introduction
- Austria switches from a traditional to a register-based census in 2011
- Compared to other countries, the transition time is relatively short
- A Census Test was carried out for 2006
- Seven base registers and several comparison registers
- Registers are combined using unique identifiers (bPK)
- Gathering experience for the register-based census
Quality Issues in Census Test
Quality is influenced by (according to the quality report requested by Eurostat):
- Quality of the administrative data source
- Existence of a unique key, or rather the ability to link a person
- Comparison between survey and administrative data
- Item imputation
Consequence: establishing a quality framework for the register-based census 2011, in cooperation with WU Vienna
Objectives
Quality assessment of statistics based on administrative data
- Quality indicator for Raw Data (data obtained from administrative sources)
  - Each attribute in each register
  - Whole register (data source)
- Census 2011
  - Attributes of the Final Data Pool
Hyperdimensions
Assessment of quality is derived from hyperdimensions (HD) for each attribute in each register:
- Documentation
- Pre-Processing
- External Sources
- Imputation
Hyperdimensions (1): Documentation
Focus is on processes taking place before the data is transferred to Statistics Austria:
- Data treatment at the source maintaining the register
- Reliability of the data source
Quality criteria:
- Relevance of attributes for the source (e.g. legal foundation)
- Availability for certain reference dates
- Compatible definitions
Registers are benchmarked using a perfect pseudo register.
Information is extracted from the meta database.
Hyperdimensions (2): Pre-Processing
Process of data editing from Raw Data to the Edited Data:
- Analysis of Raw Data
- Error corrections
- Recoding
- Plausibility checks
Quality criteria:
- Proportion of missing unique keys
- Proportion of values out of domain
- Item non-response
Automated generation out of the database.
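
The three pre-processing quality criteria listed above are simple proportions, so they lend themselves to automated computation. The following is a minimal sketch under stated assumptions: the record layout, the field names `bpk` and `sex`, and the allowed domain are hypothetical and are not taken from any actual register or from Statistics Austria's implementation.

```python
# Hypothetical raw records; field names and values are illustrative only.
records = [
    {"bpk": "A1", "sex": "m"},
    {"bpk": None, "sex": "f"},   # missing unique key
    {"bpk": "C3", "sex": "x"},   # value outside the assumed domain
    {"bpk": "D4", "sex": None},  # item non-response
]
SEX_DOMAIN = {"m", "f"}  # assumed domain for the attribute "sex"


def proportion(flags: list) -> float:
    """Share of True entries; 0.0 for an empty list."""
    return sum(flags) / len(flags) if flags else 0.0


# Proportion of records without a unique key.
missing_keys = proportion([r["bpk"] is None for r in records])

# Proportion of records where the attribute is missing (item non-response).
non_response = proportion([r["sex"] is None for r in records])

# Proportion of reported values that fall outside the assumed domain.
out_of_domain = proportion(
    [r["sex"] is not None and r["sex"] not in SEX_DOMAIN for r in records]
)

print(missing_keys, non_response, out_of_domain)
```

How these proportions are turned into the pre-processing indicator is not specified on the slide, so the sketch stops at the raw criteria.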
Hyperdimensions (3): External Source
Checks the accuracy of the data:
- Record linkage with survey data (e.g. Labour Force Survey)
Quality criteria:
- Consistency of attribute values
For some attributes no external source for comparison exists; in that case the comparison is replaced by expert interviews:
- Rating of accuracy: expert knowledge obtained from working experience
- Subjective, which should be considered when allocating the weight for this hyperdimension
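
One plausible way to quantify the consistency criterion is the share of linked persons whose register value agrees with the survey value. The sketch below assumes both sources are available as mappings from a person key to an attribute value; the keys, values, and the function name `consistency_rate` are hypothetical.

```python
from typing import Optional

# Hypothetical data: person key -> reported employment status.
register = {"A1": "employed", "B2": "unemployed", "C3": "employed"}
survey = {"A1": "employed", "B2": "employed", "D4": "employed"}


def consistency_rate(register: dict, survey: dict) -> Optional[float]:
    """Share of linked records with identical values; None if nothing links."""
    linked = register.keys() & survey.keys()  # persons found in both sources
    if not linked:
        return None
    agree = sum(register[key] == survey[key] for key in linked)
    return agree / len(linked)


print(consistency_rate(register, survey))
```

Only persons present in both sources enter the rate, so unlinked records (here `C3` and `D4`) do not affect it.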
Quality Indicators (1)
- Each hyperdimension receives a weight in accordance with its relevance for the project.
- Within these hyperdimensions, quality criteria are determined and quality indicators are derived for each hyperdimension (HDD, HDP, HDE).
- After weighting these dimensions (vD, vP, vE) we get an aggregated quality indicator q.
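
The slide does not spell out the aggregation formula. A minimal sketch, assuming q is a normalized weighted average of the three hyperdimension indicators, could look as follows; the function name `aggregate_quality` and all numbers are illustrative assumptions.

```python
def aggregate_quality(hd: dict, weights: dict) -> float:
    """Weighted average of hyperdimension indicators (assumed aggregation rule)."""
    if hd.keys() != weights.keys():
        raise ValueError("indicators and weights must cover the same hyperdimensions")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * hd[name] for name in hd) / total


# Hypothetical indicators HDD, HDP, HDE and weights vD, vP, vE.
hd = {"D": 0.9, "P": 0.8, "E": 0.6}
v = {"D": 1.0, "P": 2.0, "E": 1.0}

q = aggregate_quality(hd, v)
print(round(q, 3))
```

Dividing by the sum of the weights keeps q in the same range as the input indicators regardless of how the weights are scaled.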
Quality Indicators (2)
- Finally, we are able to derive a quality indicator qij for each attribute in each register.
- We could also derive an aggregated quality indicator for each register (weighted row sum).
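
The weighted row sum can be illustrated with a small matrix of indicators qij, with one row per register and one entry per attribute. This is a sketch only: the register names, attribute weights, and the choice to normalize by the weights of the attributes actually present in a register are assumptions.

```python
# Hypothetical indicators q[register][attribute].
q = {
    "register_A": {"sex": 0.95, "age": 0.90},
    "register_B": {"sex": 0.80},  # this register does not hold "age"
}
attribute_weights = {"sex": 1.0, "age": 1.0}  # assumed attribute weights


def register_quality(row: dict, weights: dict) -> float:
    """Weighted row sum, normalized by the weights of the attributes present."""
    total = sum(weights[attr] for attr in row)
    if total <= 0:
        raise ValueError("register has no weighted attributes")
    return sum(weights[attr] * value for attr, value in row.items()) / total


per_register = {
    name: register_quality(row, attribute_weights) for name, row in q.items()
}
print(per_register)
```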
Application on linked data sets
Register-based census
- Quality assessment within the Census Database: combination of existing registers into a new database
- Quality assessment of the Final Data Pool: Census Database including item imputations
Assessment of the Census Database (CD)
- Unique attributes
  - The attribute exists exactly once, in one register, and is transferred directly to the CD
  - qij = qΨj
- Multiple attributes
  - The attribute exists in more than one register
  - Information from different sources is combined in the CD using decision rules, e.g. the majority principle (Sex)
  - The quality indicators for this attribute from the different registers are combined into an overall quality indicator qj (e.g. Dempster-Shafer theory)
  - An additional check against an external data source (HDE) leads to a corrected quality indicator qΨj
- Derived attributes
  - Several attributes are linked to form a new attribute
  - The quality indicator qj is generated from the quality indicators of the input attributes
  - An additional check against an external data source (HDE) leads to a corrected quality indicator qΨj
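
For multiple attributes, the two steps named above (a decision rule for the value, and a combination of the quality indicators) can be sketched as follows. The slide does not say how Dempster-Shafer theory is applied, so the second function is only one possible reading: it treats each register's indicator as a simple support function for the statement "the reported value is correct" and applies Dempster's rule to registers that report the same value. Function names and numbers are hypothetical.

```python
from collections import Counter
from math import prod


def majority_value(values: list):
    """Most frequent non-missing value; None if there is no value or a tie."""
    counts = Counter(v for v in values if v is not None).most_common(2)
    if not counts:
        return None
    if len(counts) == 2 and counts[0][1] == counts[1][1]:
        return None  # tie: the majority principle gives no decision
    return counts[0][0]


def combine_support(indicators: list) -> float:
    """Dempster's rule for simple support functions on the same statement.

    Each source i puts mass q_i on "value is correct" and 1 - q_i on the whole
    frame, so there is no conflict and the combined mass is 1 - prod(1 - q_i).
    """
    return 1.0 - prod(1.0 - q for q in indicators)


# Hypothetical example: three registers report the attribute "sex".
value = majority_value(["m", "m", "f"])
q_j = combine_support([0.8, 0.5])  # indicators of the two agreeing registers
print(value, round(q_j, 3))
```

Under this reading, agreement between registers raises the combined indicator above that of any single register; how disagreeing registers should lower it is left open here.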
Quality assessment of the Final Data Pool
Census Database including item imputations
- Hyperdimension Imputation (HDI)
  - Quality indicators depend on the methods of imputation
  - They shall be generated with appropriate valuation methods
  - The weight for this hyperdimension is approximated by the proportion of imputation
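
The statement that the weight of HDI is approximated by the proportion of imputation can be illustrated with a convex combination: imputed values contribute the imputation indicator, all other values contribute the Census Database indicator. This exact formula is an assumption made for illustration, as are the function name `final_quality` and the numbers.

```python
def final_quality(q_cd: float, hd_i: float, n_imputed: int, n_total: int) -> float:
    """Blend the CD indicator with HDI, weighted by the share of imputed values."""
    if n_total <= 0 or not 0 <= n_imputed <= n_total:
        raise ValueError("need 0 <= n_imputed <= n_total and n_total > 0")
    share = n_imputed / n_total  # proportion of imputation = weight of HDI
    return (1.0 - share) * q_cd + share * hd_i


# Hypothetical attribute: 10 of 100 values were imputed.
q_final = final_quality(q_cd=0.9, hd_i=0.6, n_imputed=10, n_total=100)
print(round(q_final, 3))
```

With no imputed values the result equals the Census Database indicator, which matches the intuition that imputation quality only matters where imputation actually took place.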
Process-oriented Quality Framework
Conclusion
- The framework should act as a road map
- Next milestones are:
  - Establishing check lists (questionnaires)
  - Specifying how to derive the quality indicators for each hyperdimension and how to aggregate them
- General approach
  - Application to other projects is possible
Outlook
Additional information:
- Quality Framework: Fiedler/Schwerer/Berka/Humer/Moser: Quality Assessment for Register based Statistics in Austria
- Register-based Census: www.statistik.at