Business Register: Quality Practices
Eddie Salyers
Eddie.Joe.Salyers@census.gov
301-763-2638
An Assessment of Current Quality Assurance Practices and Ongoing Work to Develop a Comprehensive Quality Plan for the U.S. Census Bureau Business Register
Business Register: Quality Practices
• Introduction
– Database Redesign
– Quality Assurance Team
• Business Register Overview
• Quality Assurance
– Migration
– Administrative Records
– Census Bureau Data Collections
– Recommendations
• Conclusion
BR Database Redesign
• Complete redesign
• Old: Standard Statistical Establishment List (SSEL), VAX RDB
• New: Business Register (BR), Oracle
• All software rewritten
• New BR in production Fall 2002
Quality Assurance Team
Mission: Assure that the quality of the new BR is, at a minimum, commensurate with that of the old SSEL, which it replaces, and establish a complete quality framework.
Quality Assurance Team
Definitions:
• Quality: "The totality of features and characteristics of a product or service that bear on its ability to satisfy specified or implied needs." (ISO, 1986)
• Reliability: "The ability of a system or component to perform its required functions under stated conditions for a specified period of time." [IEEE 90]
• Integrity: Information in the system follows designated standards and is consistent both within an individual table and between associated tables.
Business Register Overview
• Primary Functions
– Economic Census enumeration list
– Survey sampling frames
– Central storage of administrative data
– Control file for data collection/processing
– Data for statistical products
– Data for economic research
Key Concepts and Definitions
The BR's Units
• Business/Statistical
– Establishment (standard statistical unit)
– Enterprise (standard statistical unit)
– Segment (variable, e.g., alternate reporting unit)
• Administrative (mainly for IRS tax reporting)
– EIN unit
– SSN unit
Business Organization
Basic Types
• Single-establishment enterprise:
– An enterprise that operates just one establishment (i.e., at one physical location): a single unit or SU
• Multi-establishment enterprise:
– An enterprise that operates two or more establishments (2-plus locations): a multiunit or MU
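The single-unit/multiunit distinction above can be sketched as a count of establishments per enterprise. This is only an illustration: the record layout, the identifiers, and the function name `classify_enterprises` are assumptions, not the BR's actual table structure.

```python
from collections import defaultdict

# Hypothetical establishment records: (enterprise_id, establishment_id).
establishments = [
    ("E1", "EST-001"),
    ("E2", "EST-002"),
    ("E2", "EST-003"),
    ("E2", "EST-004"),
]


def classify_enterprises(records):
    """Return {enterprise_id: 'SU' or 'MU'} based on establishment counts."""
    by_enterprise = defaultdict(set)
    for enterprise_id, establishment_id in records:
        by_enterprise[enterprise_id].add(establishment_id)
    # One establishment means single unit (SU); two or more means multiunit (MU).
    return {
        enterprise_id: "SU" if len(ests) == 1 else "MU"
        for enterprise_id, ests in by_enterprise.items()
    }


unit_types = classify_enterprises(establishments)
print(unit_types)
```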
Multiunit
A more complex MU may have:
• Multiple EIN units
• One or more subsidiary enterprises
Complex Multiunits
The largest U.S. multiunits may have:
• Several thousand EINs
• More than 10,000 establishments
System
• Oracle Database
• Many Related Tables
• Interactive Web-Based Interface built with Oracle Forms & PL/SQL
• Interface used for research and updates
• Software for interactive and batch updates and edits
Migration
• Complete Redesign
– New IDs
– New Table Structures
– All New Software
– Copy existing data: 2001
– Load "new" data: 2002
Migration
• Quality Checks
– Create SAS datasets from the old SSEL and the new BR for 2001 records
– Record-to-record match of 2001 SSEL and 2001 BR
  • After accounting for differences caused by the design, no significant differences were found
– Comparison of 2001 BR to 2002 BR
  • Checks both the migration and the software used to load 2002 records
  • Year-to-year changes were as expected
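The record-to-record match described above was done with SAS datasets. The following Python sketch shows the same idea under stated assumptions: records from both systems are keyed by some shared identifier (hypothetical here, since the new BR uses new IDs), and fields expected to differ by design are excluded from the comparison. The function name `compare_registers` and all sample data are invented for illustration.

```python
def compare_registers(old_records, new_records, fields, expected_changes=()):
    """Match records by key; report unmatched keys and unexpected field differences.

    old_records / new_records: dict mapping a shared key to a dict of fields.
    expected_changes: field names known to differ because of the redesign.
    """
    old_keys, new_keys = set(old_records), set(new_records)
    report = {
        "only_in_old": sorted(old_keys - new_keys),
        "only_in_new": sorted(new_keys - old_keys),
        "field_diffs": [],
    }
    for key in sorted(old_keys & new_keys):
        for field in fields:
            if field in expected_changes:
                continue  # difference caused by design, not a migration error
            old_val = old_records[key].get(field)
            new_val = new_records[key].get(field)
            if old_val != new_val:
                report["field_diffs"].append((key, field, old_val, new_val))
    return report


# Invented sample data standing in for 2001 SSEL and 2001 BR extracts.
ssel_2001 = {
    "111": {"name": "ACME", "state": "MD", "employment": 12, "record_id": "S-1"},
    "222": {"name": "BOLT", "state": "VA", "employment": 40, "record_id": "S-2"},
    "333": {"name": "CORE", "state": "DC", "employment": 5, "record_id": "S-3"},
}
br_2001 = {
    "111": {"name": "ACME", "state": "MD", "employment": 12, "record_id": "B-9"},
    "222": {"name": "BOLT", "state": "VA", "employment": 41, "record_id": "B-8"},
    "444": {"name": "DUNE", "state": "MD", "employment": 7, "record_id": "B-7"},
}

report = compare_registers(
    ssel_2001,
    br_2001,
    fields=["name", "state", "employment", "record_id"],
    expected_changes={"record_id"},
)
print(report)
```

The same function could be reused for the 2001 BR to 2002 BR comparison, although there the differences would be reviewed as year-to-year change rather than treated as errors.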
Administrative Records
• Internal Revenue Service (IRS):
– Business Master File (BMF)
– Payroll tax returns
– Business income tax returns
• Bureau of Labor Statistics (BLS):
– Description: Industrial classification assigned by State Employment Security Agencies as part of Covered Employment and Wages
• Social Security Administration:
– Applications for new Employer Identification Number (EIN)
Administrative Records
Over 100 million administrative records are received each year.
Administrative Records Quality Assurance
Current Practices:
• Stage 1:
– Tabulate distributions of variables on incoming files and compare to expected values
– Unchanged with the redesign; works on inputs
• Stage 2:
– Basic validity test: edits to assure each item has a valid form (valid states, data type, etc.)
– Ratio edits: examine consistency of correlated data, e.g., payroll per employee
– Data failing edits are replaced with imputed values and referred to an analyst for review
– Done as part of the load to the BR database
– Process is similar to the old one, but all software was rewritten for the new BR
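The Stage 2 edits above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for the example: the field names, the valid-state subset, the payroll-per-employee bounds, and the imputation rule (employment times a fixed ratio). The slides do not describe the BR's actual edit parameters or imputation method.

```python
VALID_STATES = {"MD", "VA", "DC"}  # illustrative subset, not the real list
PAY_PER_EMP_BOUNDS = (1_000.0, 500_000.0)  # hypothetical annual bounds


def edit_record(record, impute_ratio=40_000.0):
    """Apply a validity edit and a ratio edit; return (edited_record, flags)."""
    edited = dict(record)
    flags = []

    # Basic validity tests: valid state code and valid data types.
    if edited.get("state") not in VALID_STATES:
        flags.append("invalid_state")
    emp, pay = edited.get("employment"), edited.get("payroll")
    emp_ok = isinstance(emp, int) and emp >= 0
    pay_ok = isinstance(pay, (int, float)) and pay >= 0
    if not emp_ok:
        flags.append("invalid_employment")
    if not pay_ok:
        flags.append("invalid_payroll")

    # Ratio edit: payroll per employee must fall inside the assumed bounds.
    if emp_ok and pay_ok and emp > 0:
        low, high = PAY_PER_EMP_BOUNDS
        if not (low <= pay / emp <= high):
            flags.append("ratio_payroll_per_employee")
            # Hypothetical imputation: replace payroll with employment * ratio.
            edited["payroll"] = emp * impute_ratio
            edited["payroll_imputed"] = True

    # Any failed edit sends the record to an analyst for review.
    edited["refer_to_analyst"] = bool(flags)
    return edited, flags


clean, clean_flags = edit_record({"state": "MD", "employment": 10, "payroll": 400_000})
fixed, fixed_flags = edit_record({"state": "MD", "employment": 10, "payroll": 40_000_000})
print(clean_flags, fixed_flags, fixed["payroll"])
```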
Administrative Records Quality Assurance
• Current Practices:
– Strengths:
  • Identifies systematic file errors well
– Weaknesses:
  • Lack of macro-level post-processing quality assurance
  • Communication
  • Identifying significant problems with large cases
Administrative Records Quality Assurance
• Recommendations
– Perform a routine macro-level review using the SAS datasets that are created monthly from the BR
– Create a centralized administrative record tracking system
– Standardize and automate all current QA reports
– Increase the ability to identify important companies with missing or inaccurate administrative records
– Develop a systematic review of post-processing administrative record QA
– Monitor the cost of current administrative record quality assurance activities
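One possible shape for the recommended macro-level review is a comparison of aggregate totals between two monthly extracts, flagging cells that move more than a tolerance. This is a sketch only: the cell names, totals, 5% tolerance, and the function name `macro_review` are all assumptions, and the slides recommend SAS datasets rather than Python.

```python
def macro_review(previous, current, tolerance=0.05):
    """Flag aggregate cells whose relative change exceeds the tolerance.

    previous / current: dict mapping a cell label (e.g., an industry group)
    to an aggregate total such as payroll or employment.
    """
    flagged = {}
    for cell in sorted(set(previous) | set(current)):
        prev_total = previous.get(cell, 0.0)
        curr_total = current.get(cell, 0.0)
        if prev_total == 0:
            # A cell appearing from nothing is treated as an unbounded change.
            change = float("inf") if curr_total else 0.0
        else:
            change = (curr_total - prev_total) / prev_total
        if abs(change) > tolerance:
            flagged[cell] = change
    return flagged


# Invented monthly totals by industry group.
last_month = {"retail": 1000.0, "manufacturing": 2000.0}
this_month = {"retail": 1020.0, "manufacturing": 1500.0}

flagged = macro_review(last_month, this_month)
print(flagged)
```

A flagged cell would then be drilled into at the record level, which is where large cases with missing or inaccurate administrative records tend to show up.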
Census Bureau Data Collections
Company Organization Survey
Description: Register proving survey directed to selected multiunit enterprises
Content:
– Ownership or control by a United States parent
– Ownership or control by a foreign parent
– Inventory of establishments, verifying or collecting the following for each:
  • Primary and secondary name
  • Physical location
  • EIN used for payroll tax reporting
  • SIC
  • Employment for pay period including March 12
  • First quarter and annual payroll
  • Year-end operating status
Census Bureau Data Collections
Economic Census
Description: Enumeration of establishments in covered industries
Content for each establishment:
– Ownership or control by a parent enterprise
– Locations of operation
– Primary and secondary name
– Physical location address
– EIN used for payroll tax reporting
– SIC and Type of Operation
– Employment for pay period including March 12
– First quarter and annual payroll
– Dollar volume of business (value of shipments, sales, receipts, revenue)
– Year-end operating status
– Value of products and services by category (selectively)
– Other industry-specific content
Census Bureau Data Collections Quality Assurance
Current Practices:
• Data Entry
– Independent verification of samples
– Data are re-keyed and differences adjudicated
– Lots accepted or rejected based on error rates
• Batch Update Operations
– Basic validity test: edits to assure each item has a valid form (valid states, data type, etc.)
– Ratio edits: examine consistency of correlated data, e.g., payroll per employee
– Data failing edits are replaced with imputed values and referred to an analyst for review
– Done as part of the load to the BR database
– Process is similar to the old one, but all software was rewritten for the new BR
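The data entry check above (re-key a sample, compare, accept or reject the lot on its error rate) can be sketched as follows. The 2% threshold and the function name `inspect_lot` are assumptions, and the sketch simplifies adjudication by counting every keying difference as an error, whereas in practice an adjudicator decides which keying was correct.

```python
def inspect_lot(original_keying, verification_keying, max_error_rate=0.02):
    """Compare a sample keyed twice and decide whether to accept the lot."""
    if len(original_keying) != len(verification_keying):
        raise ValueError("both keyings must cover the same sampled fields")
    if not original_keying:
        raise ValueError("sample is empty")

    # Positions where the two independent keyings disagree.
    differences = [
        index
        for index, (first, second) in enumerate(
            zip(original_keying, verification_keying)
        )
        if first != second
    ]
    error_rate = len(differences) / len(original_keying)
    return {
        "differences": differences,  # sent to adjudication
        "error_rate": error_rate,
        "accepted": error_rate <= max_error_rate,
    }


# Invented sample of 100 keyed fields with one disagreement.
first_pass = [str(value) for value in range(100)]
second_pass = list(first_pass)
second_pass[7] = "x"

result = inspect_lot(first_pass, second_pass)
print(result["error_rate"], result["accepted"])
```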
Census Bureau Data Collections Quality Assurance
Current Practices:
• Clerical Operations
– A second person who is qualified as a verifier selects and inspects a sample of the referrals from each completed work unit (dependent verification)
– Rejected work units are subjected to 100% reinspection
– Note: the "old" SSEL had functionality to hold corrections until they passed inspection
Additional QA Team Recommendations
• Improve error tracking
• Improve imputation for missing employment and payroll values
• Evaluate Oracle DQI (Data Quality Inspector) as a way to identify problems
• Expand use of SAS datasets built from the BR to assess quality
• Review and document user needs and how the BR meets those needs
• Compare to the Bureau of Labor Statistics (BLS) Business Establishment List (BEL)
Conclusion
• No identifiable difference in quality between the new BR and the old SSEL
• Most procedures remain the same
• Migration completed accurately
• Concerns
– Clerical processing
– Dependence on staff expertise
• Several areas for potential improvement