Data Capture Process Stages

UNSD-UNESCAP Regional Workshop on Census Data Processing: Contemporary technologies for data capture, methodology and practice of data editing, documentation and archiving
Bangkok, Thailand, 15-19 September 2008

Overview
• Objective
• Major Process Stages
  - Document Scanning operations
  - Recognizing operations
  - Verifying operations
  - Coding Assistance
• Factors/Considerations

Objective
• To provide an overview of the major process stages associated with optical data capture and quality assurance considerations

Major Process Stages
• Document Scanning: scanner speeds are dependent on the process chosen
• Recognizing: recognition is dependent on the sophistication of the recognition engine
• Verifying
  - Automatic electronic verification
  - Non-successful electronic verification
• Coding Assistance: prepare data in a form suitable for entry into computer

Document Scanning Stage
• Key feature: scanning speed
• Scanning speed will be determined by:
  - Quality of the scanner machines
  - Size of non-drop-out color
  - Paper quality, cleanliness & weight

Recognizing Stage
• The recognizing process is to interpret images
• Accuracy of interpretation will be determined by:
  - Recognition engine/memory dictionary
  - Configuration threshold
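As a rough illustration of how a configuration threshold might be applied, the sketch below accepts a recognized field only when every character's confidence score reaches the threshold; otherwise the field is flagged for the verifying stage. The `RecognizedField` structure, the 0 to 1 confidence scale and the 0.90 default are assumptions made for this example, not details of any particular recognition engine.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecognizedField:
    """One questionnaire field as returned by a hypothetical recognition engine."""
    name: str
    text: str
    char_confidences: List[float]  # assumed scale: 0.0 (no confidence) to 1.0


def needs_verification(field: RecognizedField, threshold: float = 0.90) -> bool:
    """Return True when the field should be sent to an operator.

    A field is flagged if it has no recognized characters or if its weakest
    character falls below the configured threshold.
    """
    if not field.char_confidences:
        return True
    return min(field.char_confidences) < threshold


fields = [
    RecognizedField("age", "34", [0.99, 0.97]),
    RecognizedField("sex", "1", [0.62]),
]
flagged = [f.name for f in fields if needs_verification(f)]
print(flagged)
```

Raising the threshold sends more fields to operators, while lowering it lets more interpretation errors pass unchecked, so the value is a trade-off between workload and accuracy.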

Verifying Stage
• Processing can be in geographic order or in random order:
  - Automatic electronic verification
  - Non-successful electronic verification: need to compare the value of the interpreted image with the real image of the form
    - Image manipulation
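The two verification outcomes can be sketched as follows: a field that passed automatic electronic verification keeps its interpreted value, while a field that did not pass takes the value an operator keys in after comparing it with the image of the form. The `FieldResult` structure and its attribute names are hypothetical and only serve this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldResult:
    """Verification state of one field (hypothetical structure)."""
    interpreted: str                        # value produced by the recognizing stage
    auto_verified: bool                     # True if automatic verification succeeded
    keyed_from_image: Optional[str] = None  # operator's value after viewing the image


def final_value(result: FieldResult) -> str:
    """Return the value to store for the field."""
    if result.auto_verified:
        return result.interpreted
    if result.keyed_from_image is None:
        # The operator has not yet compared the value with the form image.
        raise ValueError("field is still awaiting operator verification")
    return result.keyed_from_image


print(final_value(FieldResult("34", auto_verified=True)))
print(final_value(FieldResult("3A", auto_verified=False, keyed_from_image="34")))
```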

Verifying Stage (cont.)
• Image Manipulation: electronic questionnaires can be sent to specialist operators, then back to the original operator if necessary (in some cases, the same questionnaire can be worked on simultaneously by two or more persons)

Coding Assistance Stage
• Process in which census questionnaire entries are assigned numerical and/or alphanumeric values
• Objective is to prepare data in a form suitable for entry into computer
• Done by setting up possible responses to each question in the census questionnaire
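A minimal sketch of the idea, assuming a simple lookup table: the possible responses to one question are set up in advance as a code list, and each questionnaire entry is assigned the matching code. The question, the codes and the `assign_code` helper are invented for illustration; a real census would take them from its own classifications.

```python
from typing import Dict, Optional

# Hypothetical code list for one question (marital status).
MARITAL_STATUS_CODES: Dict[str, str] = {
    "never married": "1",
    "married": "2",
    "widowed": "3",
    "divorced": "4",
}


def assign_code(entry: str, code_list: Dict[str, str]) -> Optional[str]:
    """Return the code for a questionnaire entry, or None if it is not listed."""
    key = " ".join(entry.lower().split())  # normalise case and spacing
    return code_list.get(key)


print(assign_code("  Married ", MARITAL_STATUS_CODES))
print(assign_code("separated", MARITAL_STATUS_CODES))
```

Entries that return `None` are the ones an operator would need to code by hand.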

Factors to be considered
• Questionnaire Design & Preparation
• Data Collection & Processing Considerations
  - Field Operation
  - Staff Training

Thank You

Additional material

Questionnaire Design & Preparation
• Form Design Advice
  - Consider the number of items to be included in a form
  - Pre-print codes near the place where the tick boxes are located
  - Considering the speed of the data capture process, it is advisable to use marks or “ticks” as much as possible
  - Define the drop-out color properly; use registration marks (allows for quicker recognition)

Questionnaire Design & Preparation
• Form Design Advice
  - Maintain a consistent pattern in which the information to be collected will be located
  - Do not disturb the visibility of the ticks and marks with titles, labels or instructions
  - Avoid placing the "answers" for a field on a different page from its question
  - Avoid using open-ended questions

Questionnaire Design & Preparation
• How to Obtain Good Scanning Results
  - Select adequate paper quality (for the questionnaires, paper heavier than 80 grams per square meter can help avoid paper jams in the scanner)
  - Select a reliable printing press
  - Use appropriate ink, considering the drop-out color

Data Collection & Processing Considerations
• Field Operation
  - Field operators should have basic knowledge of the data capture process chosen
• Staff Training
  - Setting up the required training for staff will help ensure the quality and effectiveness of the data captured

Field Operation Considerations
• Reasons for OCR reading errors:
  - Bad condition of the form (dirty, folded, crumpled, etc.)
  - Unnecessary lines on characters, such as points, decorative strokes, hooks, etc.
• Check the questionnaires for completeness and consistency

Training for Processing Staff
• Installation, set-up and break-down of equipment (e.g. hardware and software)
• Basic software knowledge
• Scanner operating procedures
• Troubleshooting (e.g. solutions to common problems/issues)

Control steps
• Control steps should be taken if the image information is partial or missing, to assure the quality of the generated files
  - Value Checking Steps
  - Control for Blank
  - Missing Questionnaire
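One possible reading of the three controls named above is sketched below: a value check against the allowed responses, a check for blank records, and a comparison of expected against captured questionnaire identifiers. The function names, record layout and identifiers are assumptions for this example; the slides do not specify how the controls are implemented.

```python
from typing import Dict, Iterable, List, Set


def check_value(value: str, allowed: Set[str]) -> bool:
    """Value check: is the captured value one of the allowed responses?"""
    return value in allowed


def is_blank(record: Dict[str, str]) -> bool:
    """Control for blank: True when every field in the record is empty."""
    return all(not value.strip() for value in record.values())


def missing_questionnaires(expected_ids: Iterable[str],
                           captured_ids: Iterable[str]) -> List[str]:
    """Missing questionnaire: expected IDs with no captured record."""
    return sorted(set(expected_ids) - set(captured_ids))


print(check_value("3", {"1", "2"}))
print(is_blank({"age": " ", "sex": ""}))
print(missing_questionnaires(["Q001", "Q002", "Q003"], ["Q001", "Q003"]))
```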