Coverage assessment and adjustment methodology Owen Abbott Methodology Directorate, ONS

Agenda • Introduction • 2001 One Number Census • 2011 Strategy • The Census Coverage Survey (CCS) • Estimation • Overcount • Adjustment • Summary

What is the problem? • Despite best efforts, the Census won’t count every household or person • It will also count some people twice • Why is that a problem? - In 2001, we estimated that 3 million persons (6%) were missed - Need robust census estimates; raw counts are not good enough • Further problem: - The undercount is not evenly spread - Inner cities, deprived areas, young persons

The problem • This is a problem that all census-taking countries face • We can try really hard to maximise coverage • But we will still miss households and people • So what do we do? - We must have a robust method for measuring coverage - It must provide accurate estimates at LA level - It must be an integral part of the census process

The 2001 Census experience • Estimated 1.5 million households missed • 3 million persons missed (most from the missing households but some from counted households) • Subsequent studies estimated a further 0.3 million missed

The One Number Census • In 2001, One Number Census (ONC) methodology was developed - Large Census Coverage Survey - Matching, capture-recapture, ratio estimation - Small area estimation to get LA totals - Imputation of missed households and persons • In 2011 we want to build on the ONC, as broadly it was very successful

2011 Aims and Objectives • Measure undercount • Measure overcount • Address lessons from 2001 • Take into account changes - In census design - In the population of interest • Accuracy to be as good as or better than in 2001 - 0.2 per cent confidence interval nationally

2011 Coverage Assessment Overview • 2011 Census • Census Coverage Survey • Matching • Estimation • Adjustment • Quality Assurance

The Census Coverage Survey • Key component • Similar to 2001 CCS: - Large sample survey - 320,000 households - Sample of small areas (postcodes) - Focus on counting the population - 6 weeks after census day - Short paper-based interview - Independent of Census

The Census Coverage Survey • Sample design similar to 2001 - Two-stage, stratified by geography and a ‘hard to count’ (HtC) index - First sample Output Areas (OAs) - Then select postcodes within each OA - Sample size determined by optimal allocation • Improvements for 2011 - Sample stratified by Local Authority - More refined HtC index - Better design variable
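
The slide says only that sample size is determined by optimal allocation. As an illustration, the sketch below assumes the classical Neyman form, in which a stratum's share of the sample is proportional to its size multiplied by the standard deviation of the design variable within it. The function name and the example figures are hypothetical and are not taken from the CCS design.

```python
def neyman_allocation(total_sample, stratum_sizes, stratum_sds):
    """Split total_sample across strata in proportion to N_h * S_h.

    stratum_sizes: number of units (e.g. Output Areas) in each stratum, N_h
    stratum_sds:   standard deviation of the design variable in each stratum, S_h
    Returns unrounded allocations; rounding rules are left to the user.
    """
    if len(stratum_sizes) != len(stratum_sds):
        raise ValueError("stratum_sizes and stratum_sds must have equal length")
    products = [n * s for n, s in zip(stratum_sizes, stratum_sds)]
    total = sum(products)
    if total <= 0:
        raise ValueError("sum of N_h * S_h must be positive")
    return [total_sample * p / total for p in products]


# Hypothetical example: three hard-to-count strata, from easiest to hardest.
# The hardest stratum is small but highly variable, so it is sampled at a
# much higher rate than the easiest one.
allocation = neyman_allocation(300, [1000, 500, 200], [2.0, 4.0, 10.0])
print(allocation)
```

Under this assumption a more variable stratum receives a larger sampling fraction, which is one reason a better design variable and a more refined HtC index would improve the allocation.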

Estimation • Estimation based on Dual System Estimation - Used mainly for wildlife applications - Requires two counts of the population • Assumptions - Independence - Homogeneity - No matching errors • Applied at very low level

Estimation • Use matched Census + CCS data • DSE estimates adjustment for those missed in both Census and CCS

                            Counted by CCS
                            Yes     No      Total
Counted by Census   Yes     n11     n10     n1+
                    No      n01     n00     n0+
                    Total   n+1     n+0     n++

DSE count for a postcode: $n_{++} = \dfrac{n_{1+} \, n_{+1}}{n_{11}}$
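
The DSE formula above can be sketched in a few lines. This is a minimal illustration, not the production method: the function names and the example counts are hypothetical, and the sketch relies on the assumptions listed earlier (independence, homogeneity, no matching errors).

```python
def dual_system_estimate(n11, n10, n01):
    """Dual system estimate of the total population n++ for one postcode.

    n11: counted by both Census and CCS (matched)
    n10: counted by Census only
    n01: counted by CCS only
    """
    if n11 <= 0:
        raise ValueError("DSE is undefined when there are no matched records")
    census_total = n11 + n10  # n1+
    ccs_total = n11 + n01     # n+1
    return census_total * ccs_total / n11


def missed_by_both(n11, n10, n01):
    """Implied estimate of n00, the cell that neither source observed."""
    if n11 <= 0:
        raise ValueError("DSE is undefined when there are no matched records")
    return n10 * n01 / n11


# Hypothetical postcode: 80 matched, 10 Census only, 15 CCS only.
estimate = dual_system_estimate(80, 10, 15)
unseen = missed_by_both(80, 10, 15)
print(estimate, unseen)
```

The estimate always equals the observed cells plus the implied n00, so the second function is only a rearrangement of the first.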

Estimation • Generalise sample DSE estimates - Use standard ratio type estimators • Problem – not enough sample in most LAs • Solution – post-stratify LAs into groups - 2011 equivalent of Estimation Areas - Group LAs by type, not geography • Then small area model to get LA estimates
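
One way to read "standard ratio type estimators" is sketched below. It assumes a simple combined ratio estimator: the ratio of summed DSE estimates to summed census counts over the sampled postcodes in a stratum, applied to the census count for the whole stratum. The exact estimator used is not stated on the slide, so the form, the function name and the figures here are all assumptions.

```python
def ratio_estimate(sample_dse, sample_census, stratum_census_total):
    """Scale a stratum's census count by the sample ratio of DSE to census.

    sample_dse:           DSE population estimates for the sampled postcodes
    sample_census:        census counts for the same sampled postcodes
    stratum_census_total: census count over every postcode in the stratum
    """
    if len(sample_dse) != len(sample_census):
        raise ValueError("sample_dse and sample_census must have equal length")
    census_sum = sum(sample_census)
    if census_sum <= 0:
        raise ValueError("sampled census counts must sum to a positive number")
    ratio = sum(sample_dse) / census_sum
    return ratio * stratum_census_total


# Hypothetical stratum: three sampled postcodes, 34,000 people counted overall.
total = ratio_estimate([106.875, 53.0, 210.5], [90, 50, 200], 34000)
print(total)
```

With only a handful of sampled postcodes per LA the ratio is unstable, which is the motivation for grouping similar LAs before estimating.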

Measuring Overcount • Estimate separately • Methodology not yet developed • Sources likely to be: - CCS - Matching Census data • Weight individuals in the DSEs to integrate into estimation methodology

Coverage adjustment • Imputation of households and persons estimated to have been missed • Planning on similar process to that in 2001 - Model coverage probabilities - Calibrate weights - Impute households (with people) - Impute persons into counted households • Looking at improvements in modelling steps
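
The first two adjustment steps can be illustrated with a toy calculation. The sketch assumes that each counted record receives a coverage weight equal to the inverse of its modelled coverage probability, and that calibration is a simple proportional scaling to an estimated total. Both are simplifying assumptions made for illustration; the slide does not specify the models, and all names and numbers below are hypothetical.

```python
def coverage_weights(coverage_probs):
    """Inverse-probability weights: a record counted with probability p
    stands for 1/p records in the true population."""
    for p in coverage_probs:
        if not 0.0 < p <= 1.0:
            raise ValueError("coverage probabilities must lie in (0, 1]")
    return [1.0 / p for p in coverage_probs]


def calibrate(weights, target_total):
    """Scale weights proportionally so that they sum to target_total."""
    current = sum(weights)
    if current <= 0:
        raise ValueError("weights must sum to a positive number")
    factor = target_total / current
    return [w * factor for w in weights]


def number_to_impute(weights):
    """Weighted total minus the number of records actually counted."""
    return sum(weights) - len(weights)


# Hypothetical example: three counted records and an estimated total of 4.5.
weights = coverage_weights([0.5, 0.8, 1.0])
calibrated = calibrate(weights, 4.5)
print(number_to_impute(calibrated))
```

The excess of the calibrated weights over the counted records gives the volume of households or persons to impute; choosing which records to impute, and where, is a separate step that this sketch does not cover.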

Summary • Measuring Coverage very important • Integral part of the UK Census process • In the UK looking to build on 2001: - Improvements across the board