Enumeration using frozen versions
(based on slides produced by Peter Stoltze, Chief Consultant, Statistical Methods, SD)
Kiev, 2nd – 5th October 2012
Mrs Vibeke Skov Møller (vsm@dst.dk)
Foundation of the Statistics
• The definition of the population is essential for interpreting the statistics
• If we do not have a firm grip on the population, everything else is unimportant!
• The SBR is central in this respect
  – Updated by administrative sources
  – Updated by information from the different Statistical Divisions, which is a benefit for all Statistical Divisions
Definition of the Population (1)
• Population of interest is the collection of objects in which we are interested
  – Example: All businesses in Ukraine
• Target population is the part of the population of interest that we, for practical reasons, must confine ourselves to observing
  – Example: All businesses with at least 10 employees
• Sampling frame is the data representation of the target population available to us; it is from here that the sample is drawn
  – Example: Extracts from the SBR
Definition of the Population (2)
[Figure: diagram relating Sample, Sampling frame, Target population and Population of interest]
Frame Imperfections
• The difference between the target population and the sampling frame arises because our registers are not perfect
• Over-coverage: Businesses which are included in the sampling frame, but ought not to be
  – Can be discovered during data collection
  – Example: The business went bankrupt long before the start of the reference period
• Under-coverage: Businesses which ought to be included in the sampling frame, but are not
  – Can be discovered if we have knowledge of the area from other sources
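The two kinds of frame imperfection can be illustrated as set differences. This is only a minimal sketch: the unit identifiers are hypothetical, and in practice the target population is never fully known, which is exactly why under-coverage is hard to detect.

```python
# Hypothetical unit identifiers; none of them come from a real register.
target_population = {"A", "B", "C", "D", "E"}     # units we ought to observe
sampling_frame = {"B", "C", "D", "E", "F", "G"}   # units present in the SBR extract

# Over-coverage: in the frame, but not in the target population.
over_coverage = sampling_frame - target_population

# Under-coverage: in the target population, but missing from the frame.
under_coverage = target_population - sampling_frame

# Share of the target population that the frame actually covers.
coverage_rate = len(sampling_frame & target_population) / len(target_population)

print(sorted(over_coverage), sorted(under_coverage), coverage_rate)
```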
Estimation based on an updated population
• Design weights are sacred
• Selection probabilities are sacred
• Stratum changes should be handled by calibration and domain estimation
• Estimation may account for cut-off sampling
Dynamic Frame Population
[Figure: current version and historic version of the frame population along a time axis t, t+1, t+2]
Frozen Frame Population
[Figure: current version, historic version and frozen version of the frame population along time axes with ticks from t to t+4]
Population at Estimation stage
[Figure: current version, historic version, frozen version and sample along a time axis t, t+1, t+2, with the estimation of the short-term survey and the estimation of the structural survey marked]
SBS statistics (all kinds) (1)
• Purpose:
  – Give information about the structure
  – Be able to compare across statistics
• When: Year t (a period) or ultimo t (a point in time)
• Based on: A survey
  – 100 % Big enterprises
  – 50 % ? Medium-sized enterprises
  – 25 % ? Small enterprises
  – 0 % ? Micro enterprises
• possibly divided into sub-strata
• The sample is drawn on the basis of a frozen SBR version of 15th Nov year t
• The survey is carried out during e.g. March-June t+1
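Drawing such a sample from a frozen frame could be sketched as follows. Everything here is an assumption made for illustration: the frame is synthetic, the size-class names are hypothetical, the sampling fractions are simply the tentative ones from the slide, and simple random sampling within each size class is assumed.

```python
import random

# Tentative sampling fractions taken from the slide (marked there with "?").
FRACTIONS = {"big": 1.0, "medium": 0.5, "small": 0.25, "micro": 0.0}

def draw_sample(frame, fractions, seed=2012):
    """Draw a simple random sample within each size class of a frozen frame.

    frame: list of (unit_id, size_class) pairs.
    Returns a dict mapping each sampled unit_id to its selection probability.
    """
    rng = random.Random(seed)
    by_class = {}
    for unit_id, size_class in frame:
        by_class.setdefault(size_class, []).append(unit_id)

    sample = {}
    for size_class, units in by_class.items():
        n = round(fractions[size_class] * len(units))
        for unit_id in rng.sample(units, n):
            sample[unit_id] = n / len(units)  # realised selection probability
    return sample

# Synthetic frozen frame: 4 big, 10 medium, 20 small and 50 micro enterprises.
frozen_frame = (
    [(f"big{i}", "big") for i in range(4)]
    + [(f"medium{i}", "medium") for i in range(10)]
    + [(f"small{i}", "small") for i in range(20)]
    + [(f"micro{i}", "micro") for i in range(50)]
)

sample = draw_sample(frozen_frame, FRACTIONS)
print(len(sample))
```

The selection probabilities are stored with the sample because they must stay fixed later on, whatever happens to the units in the register.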
SBS statistics (2)
• New information on year t requires updating of the frozen SBR
• What?
  – All enterprises and other units, e.g. LKAUs, that were active during year t have to be in the frozen version
  – All relevant changes/corrections (that is, changes related to year t and not t+1) have to be in the frozen version
• but be aware of possible bias
  – information from sources other than surveys also has to be taken in
  – what could be the sources for updating the SBR for year t?
• When?
  – Before the first SBS statistic is produced
  – Hopefully this is also when the information is available
SBS statistics (3)
• And now to the enumeration
• The sample was drawn 15th Nov t
• The new frozen version is formed ddmmyy year t+1?
• Principle: At the estimation stage we discover that a unit selected in stratum ha with π = 0.1 has moved to stratum hb
• We then have to assume that 9 other (unobserved) units from ha have made a similar move
• Instead of changing the selection probabilities, the combinations Activity*Size are regarded as domains, and calibration is conducted on the basis of these new domains
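The principle can be illustrated with a small domain estimate in which each unit keeps the weight 1/π from its selection stratum, even when it is counted in another domain at the estimation stage. The sample values and the turnover variable are hypothetical, and the sketch stops before the calibration step.

```python
from collections import defaultdict

# Hypothetical sample records:
# (selection stratum, selection probability, domain at estimation, turnover)
sample = [
    ("ha", 0.1, "ha", 120.0),
    ("ha", 0.1, "hb", 300.0),   # selected in ha, found in hb at estimation
    ("hb", 0.5, "hb", 900.0),
    ("hb", 0.5, "hb", 1100.0),
]

def domain_estimates(units):
    """Estimate unit counts and totals per domain with unchanged design weights."""
    counts = defaultdict(float)
    totals = defaultdict(float)
    for _stratum, pi, domain, y in units:
        weight = 1.0 / pi            # design weight from the selection stratum
        counts[domain] += weight
        totals[domain] += weight * y
    return dict(counts), dict(totals)

counts, totals = domain_estimates(sample)
print(counts)
print(totals)
```

The mover still carries the weight 10 from ha, so it represents itself and 9 unobserved movers in domain hb.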
SBS statistics (4)
• And what does that mean?
(The table has been removed because it is not as simple as it was shown
• Regression analysis has to be used
• What is important is to know the population at the time of enumeration! See theory!!)
SBS statistics (5)
A few names:
• Horvitz-Thompson estimator or π-expansion
  – the design weights summed over the sample within a stratum have to equal the size of the stratum
• Calibration can be implemented in the form of a regression estimator
  – SD uses SCB CLAN (a collection of Swedish macros for SAS - http://www.amstat.org/meetings/ices/2000/proceedings/S09.pdf)
  – but other possibilities exist, e.g. the R package survey by Thomas Lumley
• google: "regression estimator sampling", "model assisted survey sampling" or "SCB CLAN survey"
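A minimal sketch of both ideas follows. It is not how CLAN or the R package survey work internally. It assumes simple random sampling within strata, and it uses post-stratification, the simplest special case of a regression estimator, as the calibration step. All stratum sizes, domain sizes and turnover values are hypothetical.

```python
def design_weights(stratum_sizes, sample_strata):
    """Weight N_h / n_h per sampled unit (simple random sampling within strata)."""
    n = {}
    for h in sample_strata:
        n[h] = n.get(h, 0) + 1
    return [stratum_sizes[h] / n[h] for h in sample_strata]

def poststratify(weights, domains, domain_sizes):
    """Scale the weights so that they sum to the known size of each domain."""
    sums = {}
    for w, d in zip(weights, domains):
        sums[d] = sums.get(d, 0.0) + w
    return [w * domain_sizes[d] / sums[d] for w, d in zip(weights, domains)]

# Hypothetical frame sizes when the sample was drawn.
stratum_sizes = {"ha": 40, "hb": 6}
sample_strata = ["ha", "ha", "ha", "ha", "hb", "hb", "hb"]
# Hypothetical domains at estimation: the fourth unit has moved from ha to hb.
sample_domains = ["ha", "ha", "ha", "hb", "hb", "hb", "hb"]
# Hypothetical domain sizes according to the updated frozen version.
domain_sizes = {"ha": 33, "hb": 13}
turnover = [100.0, 120.0, 80.0, 400.0, 900.0, 1000.0, 1100.0]

design_w = design_weights(stratum_sizes, sample_strata)
calibrated_w = poststratify(design_w, sample_domains, domain_sizes)

# Horvitz-Thompson (pi-expansion) total and the calibrated total.
ht_total = sum(w * y for w, y in zip(design_w, turnover))
calibrated_total = sum(w * y for w, y in zip(calibrated_w, turnover))
print(ht_total, calibrated_total)
```

The design weights sum to the stratum sizes at the time of selection, while the calibrated weights sum to the domain sizes in the updated frozen version. The selection probabilities themselves are never changed.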
SBS statistics (6)
Problems
• How do you get to know the ‘correct’ population when frozen version 2 is formed?
• How do you distribute units between strata?
• But it is risky to include information only from surveys
• New units should not be included in the sample
• Deceased units have to be placed in the stratum for deceased units so that they get a weight, but it can be tricky to estimate the size of that stratum (and this depends on whether the information comes from the survey and not from the population (frozen version))
STS statistics (1)
• Purpose:
  – Give information about development
• When: Quarter x year t+1 (a period) or ultimo quarter x year t+1 (a point in time)
• Based on: A survey
  – 100 % Big enterprises, 50 % ? Medium-sized enterprises, 25 % ? Small enterprises and 0 % ? Micro enterprises
• divided into sub-strata
• The sample is drawn on the basis of a frozen SBR version of 15th Nov year t
• The survey is carried out during April t+1, July t+1, October t+1 and January t+2
STS statistics (2)
• What is the problem?
• What about new enterprises in year t (from 15th Nov to 31st Dec)?
• What about any change from 31st Dec year t to April, July, … t+1(2)?
STS statistics (3)
• What is the solution?
• Two possibilities
  – keep the frozen version and look at changes
    • Disadvantage: this does not take new enterprises into account
    • Advantage: easy
  – make new frozen versions for each quarter (or even use the current version of the SBR*) and continue as for SBS
    • Disadvantages:
      – time-consuming
      – what are the sources for producing new versions?
      – in addition, those mentioned for SBS
    • Advantage: a more correct description of the development
  – Either possibility makes it possible to compare SBS and STS
* But it is important to know the whole population and to be able to distribute it to strata
Frozen versions, Statistics Denmark
• SBS year t
  – Version 1 (Temporary: t+1 5th March (Turnover/Employees 15th March))
  – Version 2 (Temporary: t+1 5th Sept. (Turnover/Employees 15th Sept.))
  – Version 3 (Final: t+1 5th Dec. (Turnover/Employees 15th Dec.))
• STS 1st quarter year t+1
  – Version 1 (Temporary: t+1 5th May (Turnover/Employees 15th May))
  – Version 2 (Final: t+1 5th Aug. (Turnover/Employees 15th Aug.))
• Samples might be drawn from any version before enumeration