
Imputation and Editing of Income from the Administrative File in Israel’s Censuses
Prepared by Orly Furman and Dmitri Romanov
UNITED NATIONS ECONOMIC COMMISSION FOR EUROPE, CONFERENCE OF EUROPEAN STATISTICIANS
Work Session on Statistical Data Editing, Ljubljana, May 2011

Use of Administrative Income Files in Israel’s Censuses
Questionnaire
• 1995 Census: Employment; salary and self-employment income (net and gross)
• 2008 Census: Employment
Reference period
• 1995 Census: September 1995
• 2008 Census: December 2008; calendar year 2008
Administrative income files
• 1995 Census: Consistency checks of reportage on salary; imputation of 8.6% of records when data were missing
• 2008 Census: Imputation of salary and self-employment income for all records; consistency checks of reportage on employment

Administrative Income File Used in the 1995 Census
Source: National Insurance Institute (NII).
Coverage: 1.965 million employee posts, months of work and annual salary and wage of 1.815 million employees, as reported to the NII.
                                                        Average wage, NIS   Employees, 000
Adm. file, all employees                                4,706               1,815.1
Adm. file for the 20% census sample                     4,763               1,705.0
Reported by the 20% census sample                       3,978               1,566.5
Diff. census minus adm. file for the census sample, %   -16.5               -8.1
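The percentage differences on this slide can be reproduced with a short calculation. The sketch below assumes the difference is expressed relative to the administrative-file figure for the 20% census sample, which is the reading that matches the published values; the function name `pct_diff` is illustrative only.

```python
def pct_diff(census, admin):
    """Difference of the census figure from the administrative figure, in percent of the latter."""
    return (census - admin) / admin * 100


# Figures from the slide, both for the 20% census sample.
wage_diff = pct_diff(3978, 4763)            # average wage, NIS
employees_diff = pct_diff(1566.5, 1705.0)   # employees, thousands

print(round(wage_diff, 1), round(employees_diff, 1))
```

Under that assumption, the census understates both the average wage and the number of employees relative to the administrative file.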

Amendments to the Salary Data in the 1995 Census
Treatment/amendment                                  Percentage of total
Non-amended value                                    69.0
Gross salary imputed by regression from net salary   21.0
Imputation of data from adm. file                    8.6
Editing of irregularities (division by 100/10)       1.2
Other editing                                        0.2
Total                                                100.0
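The slide does not specify the regression used to impute gross salary from net salary. As a rough illustration only, the sketch below fits a one-variable least-squares line on records where both values are known and applies it to a record with net salary only. The data, the linear form and the names `fit_line` and `impute_gross` are all hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical donor records with both net and gross monthly salary (NIS).
net = [2000.0, 3000.0, 4000.0, 5000.0]
gross = [2300.0, 3600.0, 4900.0, 6200.0]

intercept, slope = fit_line(net, gross)


def impute_gross(net_salary):
    """Predict gross salary for a record that reported net salary only."""
    return intercept + slope * net_salary


print(impute_gross(3500.0))
```

A production model would likely condition on more than net salary (for example, tax-relevant characteristics); this sketch only shows the mechanics of regression imputation.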

Reporting Salary in Census: Rounding
[Figure: Distribution of September Salary Reported in the Census. Vertical axis: prevalence, percentage (0 to 8). Horizontal axis: reported salary, NIS (0 to 12,000).]

Reporting Salary in Census: Confounding Net and Gross
[Figure: Deviation of the September Salary Reported in the Census from the Gross and Net Monthly Salary Per Job in the Administrative File, by Annual Salary Percentiles, as a percentage of gross calculated salary. Vertical axis: -40 to 40. Horizontal axis: annual salary percentiles (0 to 100). Series: deviation from “gross” salary; deviation from “net” salary.]

Imputing Salary in Census: Challenge of Multiple Jobs
[Figure: Distribution of Values Imputed on the Basis of the Administrative File for Employees who Held One Job (Left) and More Than One Job (Right) in 1995.]

Administrative Income Files Used in the 2008 Census
Source: Tax Authority.
Coverage: employee posts, months of work and annual salary and wage of employees, and business income of self-employed individuals.
Usage: Imputation of earnings (salary and business income of the self-employed), conditional on workforce status as reported in the Census and on occurrence in the administrative income files.
Challenge: Treatment of inconsistencies between the two sources, due to misreporting in the Census and/or omissions in the administrative file.

Identification of Cases in Which the Census Data and the Administrative Data do Not Coincide

Discrepancies between the Census Data and the Administrative Data
• Group A: Individuals found to have work income in 2008 according to the administrative income data base, but who did not belong to the annual workforce according to the census.
• Group B: Individuals who reported in the census as belonging to the annual workforce, but who were not found to have work income according to the administrative data bases.
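The two discrepancy groups amount to a cross-classification of two yes/no indicators per person. The sketch below illustrates this on made-up records; the field names `in_census_workforce` and `in_admin_income` are assumptions, not the names used in the actual files.

```python
from collections import Counter


def classify(in_census_workforce, in_admin_income):
    """Assign a linked person record to discrepancy group A, group B, or 'consistent'."""
    if in_admin_income and not in_census_workforce:
        return "A"  # work income in the admin file, not in the workforce per census
    if in_census_workforce and not in_admin_income:
        return "B"  # in the workforce per census, no work income in the admin file
    return "consistent"


# Hypothetical linked records: one dict per person.
records = [
    {"id": 1, "in_census_workforce": True, "in_admin_income": True},
    {"id": 2, "in_census_workforce": False, "in_admin_income": True},
    {"id": 3, "in_census_workforce": True, "in_admin_income": False},
    {"id": 4, "in_census_workforce": False, "in_admin_income": False},
]

groups = Counter(
    classify(r["in_census_workforce"], r["in_admin_income"]) for r in records
)
print(groups)
```

Records where both sources agree, whether both positive or both negative, need no discrepancy treatment and fall into the "consistent" bucket.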

Analysis of Cases in Group A
• 67% of individuals in this group are in the primary working-age group (19 to 65). According to the income tax data, 51% worked less than half a year in 2008. This reinforces the hypothesis that under-reporting of employment in the Census is connected to irregular employment over the year.
• For 43% of this group, a record exists in the administrative income data base for December 2008.
• 74% of the individuals who had work income in December but did not report employment in the census were in the primary working-age group. For two thirds of them, the income data base shows ongoing employment in 2008 lasting over six months. This indicates a high probability of inaccurate reporting in the census with respect to labour market non-participation.

Analysis of Cases in Group B
Work status                    Distribution as reported in the census, %   Absent from the income data base, % of cell
Total                          100.0                                       12.3
Employees                      86.3                                        10.7
Self employed, not employing   8.3                                         16.5
Self employed, employing       4.4                                         12.7
Cooperative members            0.1                                         25.6
Kibbutz members                0.8                                         75.6
Unpaid family members          0.1                                         51.2

Analysis of Cases in Group B
• The working hypothesis is that the absence of information on employees and the self-employed is due to late or failed reporting by employers and self-employed individuals to the income tax authority.
• Accordingly, for an employee who is absent from the 2008 income data base, the preceding year should be examined to check whether the employee was active then and which employer reported them.
• The examination shows that more than 50% worked in 2007 and held employee jobs. 80% of these worked for employers that did not report in 2008 but did report in 2007.
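The check described above can be sketched as a set comparison: find employers that reported in 2007 but not in 2008, then flag absent employees whose 2007 job was with one of them. All identifiers and data below are hypothetical, and the real linkage is presumably more involved.

```python
def employers_that_stopped_reporting(reported_2007, reported_2008):
    """Employers present in the 2007 file but missing from the 2008 file."""
    return set(reported_2007) - set(reported_2008)


def likely_late_reporting(absent_ids, employer_2007, stopped):
    """Absent employees whose 2007 employer did not report in 2008."""
    return {pid for pid in absent_ids if employer_2007.get(pid) in stopped}


# Hypothetical data: employer IDs per year and each person's 2007 employer.
reported_2007 = {"E1", "E2", "E3"}
reported_2008 = {"E1"}
employer_2007 = {101: "E1", 102: "E2", 103: "E3"}  # person 104 had no 2007 job
absent_ids = [101, 102, 103, 104]                  # census workers absent from 2008 file

stopped = employers_that_stopped_reporting(reported_2007, reported_2008)
flagged = likely_late_reporting(absent_ids, employer_2007, stopped)
print(sorted(flagged))
```

Person 101 is not flagged because their 2007 employer did report in 2008, so the absence cannot be attributed to that employer's missing report.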

Algorithm of Income Imputation
Group: Found to have work income according to the income data base, but do not belong to the workforce according to the census.
Income recording method: Work months and salary imputed as per the income data base.
% of total cases imputed: 61.7
% of reported cases in census: 7.9

Algorithm of Income Imputation (cont.)
Group: Belong to the workforce according to the census but found not to have work income according to the income data base; found to be employed in 2007 by employers that did not report in 2008.
Income recording method: The individual’s salary for 2007 was imputed, adjusted for the average salary increase in the economic industry.
% of total cases imputed: 15.2
% of reported cases in census: 1.9
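The carry-forward rule on this slide can be illustrated as follows. This is a minimal sketch that assumes the adjustment is a multiplicative uplift by the industry's average salary growth between 2007 and 2008; the growth rates and industry labels are invented for the example.

```python
def carry_forward(salary_2007, industry, growth_by_industry):
    """Impute 2008 salary as the 2007 salary uplifted by the industry's average growth."""
    return salary_2007 * (1 + growth_by_industry[industry])


# Hypothetical average salary increases by economic industry, 2007 to 2008.
growth_by_industry = {"construction": 0.04, "education": 0.02}

imputed = carry_forward(100_000.0, "construction", growth_by_industry)
print(round(imputed, 2))
```

The same shape of rule would apply to the self-employed group, with income growth in place of salary growth.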

Algorithm of Income Imputation (cont.)
Group: Belong to the workforce according to the census but found not to have work income according to the income data base; found to be self-employed individuals who reported in 2007 but did not report in 2008.
Income recording method: Income was imputed for holders of active files in 2007, adjusted for the average income increase in the economic industry.
% of total cases imputed: 2.6
% of reported cases in census: 0.3

Algorithm of Income Imputation (cont.)
Group: Belong to the workforce according to the census but found not to have work income according to the income data base; military personnel, housekeepers, caretakers and individuals whose occupation is unknown.
Income recording method: Income was imputed from the ongoing survey, according to the average income as per defined estimation cells*.
% of total cases imputed: 3.6
% of reported cases in census: 0.5

Algorithm of Income Imputation (cont.)
Group: Individuals who reported having worked in the census but do not belong to the abovementioned groups.
Income recording method: Income was imputed based on average income in the defined estimation cells**, according to the number of months worked as reported in the census.
% of total cases imputed: 16.9
% of reported cases in census: 2.1
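Cell-mean imputation scaled by months worked can be sketched as below. The definition of the estimation cells is not given in the slides, so the cell labels here are placeholders, and computing a per-month average from donor records is an assumption about how "according to the number of months worked" is applied.

```python
from collections import defaultdict


def cell_monthly_means(donors):
    """Average income per month worked in each estimation cell, from donor records."""
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [total income, total months]
    for cell, annual_income, months in donors:
        totals[cell][0] += annual_income
        totals[cell][1] += months
    return {cell: income / months for cell, (income, months) in totals.items()}


def impute_income(cell, months_worked, means):
    """Imputed annual income: cell's average monthly income times months reported in the census."""
    return means[cell] * months_worked


# Hypothetical donors: (cell label, annual income in NIS, months worked).
donors = [("cell_1", 60000.0, 12), ("cell_1", 30000.0, 6), ("cell_2", 48000.0, 12)]

means = cell_monthly_means(donors)
print(impute_income("cell_1", 8, means))
```

Pooling income and months before dividing weights each donor by the time they worked, which is one of several reasonable choices for a cell average.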

The Bottom Line
• All in all, discrepancies between the reportage and the administrative source were treated, and income information from the administrative file was amended, in only 12.7% of the cases that reported employment in the 2008 census. In the other 87.3%, the data on earnings of employees and the self-employed were imputed from the administrative file.
• In contrast, income data as reported in the “traditional” 1995 census had to be amended or imputed in 29.6% of cases.
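The headline figures can be cross-checked against the per-group shares quoted on the imputation slides. The sketch below simply sums them; the group labels are shorthand introduced here. Reading the 29.6% for 1995 as the sum of the two imputation rows of the 1995 amendments table (21.0 and 8.6) is an inference from the arithmetic, not something the slides state.

```python
# Shares per imputation group in the 2008 census:
# (% of total cases imputed, % of reported cases in census).
groups_2008 = {
    "admin income, not in census workforce": (61.7, 7.9),
    "employer reported 2007 only": (15.2, 1.9),
    "self-employed reported 2007 only": (2.6, 0.3),
    "survey-based cell average": (3.6, 0.5),
    "census months times cell average": (16.9, 2.1),
}

total_imputed = sum(share for share, _ in groups_2008.values())
total_reported = sum(share for _, share in groups_2008.values())
untreated = 100 - total_reported

# 1995 census: regression imputation plus imputation from the adm. file.
amended_1995 = 21.0 + 8.6

print(round(total_imputed, 1), round(total_reported, 1),
      round(untreated, 1), round(amended_1995, 1))
```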

Thank you!