Data Processing
• The process of converting information from a questionnaire so that it can be read by a computer is referred to as data preparation or data processing.

Steps in Data Processing
Step 1: Data validation
Step 2: Data editing
Step 3: Data coding
Step 4: Data tabulation
Step 5: Reviewing tabulations

Data validation
• The process of determining, to the extent possible, whether a survey’s interviews or observations were conducted correctly and are free of fraud or bias.
• Areas of process validation:
1. Fraud: Was the person actually interviewed?
2. Screening: Many times an interview must be conducted only with qualified respondents to ensure the accuracy of the data collected.

• Procedure: It is critical that the data be collected according to a specified procedure.
• Completeness: In order to speed through the data collection process, an interviewer may ask the respondent only a few of the requisite questions.
• Courtesy: The normal assumption is that a respondent is treated with courtesy. At times, however, the interviewer may unconsciously inject a tone of negativity into the interviewing process.

Data editing
• Refers to inspecting, correcting and modifying the collected data.
• Interviewer errors: Interviewers sometimes mark incorrect responses.
• Respondent errors: Respondents sometimes provide inconsistent answers or make illegible or confusing marks.

Two major types of editing
1. Field editing
   1. Editing is done on the same day the data collection takes place.
   2. Time is important.
   3. A supervisor is appointed who may periodically examine the already completed surveys throughout the day.
2. Office editing
   1. Done at a central location by office staff after all data collection is finished.
   2. The editor may choose to recontact the respondents.

• Drawbacks:
  • Anonymous respondents
  • The interviewer cannot be given feedback that will improve the accuracy.
Response problems:
1. Wrong informant
2. Return to sender
3. Illegible writing and incomplete responses
4. Damaged measuring instrument
5. Apparently confused respondent
6. Lack of variance among responses
7. Lack of consistency among responses
8. Late responses
Editing is performed in two ways:
1. Personal editing
2. Computer editing
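As an illustration of computer editing, the check below flags questionnaires that show a lack of variance among responses, one of the response problems listed above. It is only a sketch: the questionnaire IDs, the block of five rating items, and the `lacks_variance` helper are all hypothetical and do not come from the slides.

```python
from typing import Dict, List

# Hypothetical coded answers: questionnaire ID -> codes for a block of 1-5 rating items.
answers: Dict[str, List[int]] = {
    "001": [4, 2, 5, 3, 4],
    "002": [3, 3, 3, 3, 3],  # the same code on every item
    "003": [1, 5, 2, 4, 3],
}


def lacks_variance(codes: List[int]) -> bool:
    """Return True when a block of two or more items all carry the same code."""
    return len(codes) > 1 and len(set(codes)) == 1


# Collect the IDs an editor should inspect by hand.
flagged = [qid for qid, codes in answers.items() if lacks_variance(codes)]
print(flagged)  # ['002']
```

A flagged questionnaire is not necessarily invalid; the check only narrows down which forms deserve a closer look during personal editing.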

CODING
• Coding is the process of assigning numbers or symbols to the answers to prepare them for tabulation.
• When computers are to be used for tabulation, it is necessary to replace the answers given on the returned printed questionnaires with code numbers that can be transferred to punch cards.

OBJECTIVES OF CLASSIFICATION
a) To identify similarity in the data collected
b) To maintain homogeneity
c) To facilitate effective comparison
d) To present complex, haphazard and scattered data in a concise, logical and intelligible form
e) To maintain clarity

f) To identify independent and dependent variables and establish their relations
g) To simplify the complex data
h) To specify diversity and unity of the data for the purpose of analysis
i) To achieve effective qualification and
j) To facilitate easy presentation and

DETERMINATION OF CLASSIFICATION
• Classification according to attributes
  a) Simple classification
  b) Manifold classification
• The arbitrary nature of classification
• Classification according to class intervals

Characteristics of effective classification
• Uniformity should be maintained
• Comparability should be ensured
• Unambiguity should be ensured
• Classification should be flexible as far as possible
• Consistency should be maintained in class intervals
• Stability should be secured

• Homogeneity is the backbone of classification
• Attributes should be appropriate and clear
• Class intervals, limits, and magnitudes must be appropriate, accurate and consistent as far as

CODING PROCESS
Coding involves grouping and assigning values to various responses from the survey instrument. Both parts of the coding process usually entail labeling the responses with some numeric meaning, such as a number from 0 to 9.
• A well-planned and well-constructed questionnaire can reduce the amount of time spent on coding while increasing the accuracy of the process.
• Best practices suggest that coding should be incorporated into the design of the questionnaire (rating scales have built-in codes).
• In questionnaires that do not use such simple coded responses, the researcher will establish a master code on which the assigned numeric values are shown for each response. For example, if the question is “On your last visit to a restaurant, what was the Rupee amount you spent on soft drinks?”:
  Under Rs. 50 [ ] 1
  Rs. 50 - Rs. 100 [ ] 2
  Rs. 100 - Rs. 150 [ ] 3
  More than Rs. 150 [ ] 4
  Don’t remember [ ] 5
• Such close-ended questions are normally precoded at the time of questionnaire design.
• The use of a master code is an additional safeguard to
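The master code for the soft-drink question can be sketched as a small lookup function. The function name and the use of `None` for "Don't remember" are hypothetical. The printed brackets overlap at Rs. 100 and Rs. 150, so this sketch assumes those boundary amounts fall in the lower bracket; a real study would need to settle that rule in the master code itself.

```python
from typing import Optional

DONT_REMEMBER = 5  # code printed next to "Don't remember" on the questionnaire


def code_soft_drink_spend(amount_rs: Optional[float]) -> int:
    """Map a rupee amount to its precoded bracket; None means 'Don't remember'."""
    if amount_rs is None:
        return DONT_REMEMBER
    if amount_rs < 0:
        raise ValueError("amount cannot be negative")
    if amount_rs < 50:
        return 1  # Under Rs. 50
    if amount_rs <= 100:
        return 2  # Rs. 50 - Rs. 100 (assumption: exactly 100 is coded here)
    if amount_rs <= 150:
        return 3  # Rs. 100 - Rs. 150 (assumption: exactly 150 is coded here)
    return 4      # More than Rs. 150


print([code_soft_drink_spend(a) for a in (30, 50, 120, 400, None)])  # [1, 2, 3, 4, 5]
```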

Four-step process to develop codes
• Step 1: Generate a list of as many potential responses as possible. These responses can then be assigned values within a range determined by the actual number of separate responses identified. For responses that do not appear on the list, the researcher can simply add a new response and corresponding value to the list or consolidate the response into one of the existing categories.

• Step 2: Consolidation of responses. For example, for the question “Why has your use of this restaurant decreased?”, the responses might be:
  • I do not like the food here.
  • Food there affects my health.
  • I work longer hours, and don’t think about food.
  • I started cooking at home.
  • The location of my work moved, so I am not near the restaurant.
Two of these, which relate to not liking the food, can be consolidated into a single response category because they share a common meaning. The establishment of consolidated categories is, of course, a subjective decision that should be made only by an experienced research analyst.
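Once an analyst has decided on the consolidated categories, applying them is a simple lookup. The sketch below uses the five responses from the example; the category labels, the code numbers, and the sample of collected responses are hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical consolidated categories and their codes.
CATEGORY_LABELS = {
    1: "Does not like the food",
    2: "Works longer hours",
    3: "Cooks at home",
    4: "Workplace moved",
}

# Each verbatim response is mapped to one consolidated category.
CONSOLIDATION = {
    "I do not like the food here": 1,
    "Food there affects my health": 1,  # consolidated with the response above
    "I work longer hours, and don't think about food": 2,
    "I started cooking at home": 3,
    "The location of my work moved so I am not near the restaurant": 4,
}

# Hypothetical set of collected responses.
responses = [
    "Food there affects my health",
    "I started cooking at home",
    "I do not like the food here",
    "I do not like the food here",
]

coded = [CONSOLIDATION[r] for r in responses]
counts = Counter(coded)
print(coded)      # [1, 3, 1, 1]
print(counts[1])  # 3
```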

• Step 3: Assign a numerical value as code. Though this may appear to be a simple task, the structure of the questionnaire and the number of responses per question need to be taken into consideration.
• For example, if a question has more than 10 responses, then double-digit codes need to be used, such as “01”, “02”, …, “12”.
• Another good practice is to assign higher-value codes to positive responses than to negative responses. For example, “no” responses are coded 1 and “yes” responses coded 2; “dislike” responses are coded 1 and “like” responses coded 5. Coding of this nature makes subsequent analysis easier. For example, the researcher will find it easier to interpret means or averages if higher values occur as the average moves from “dislike” to “like”.
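The benefit of giving positive responses higher codes shows up as soon as an average is computed. In the sketch below only the end points (dislike = 1, like = 5) come from the text; the intermediate labels and both groups of respondents are hypothetical.

```python
from typing import List

# Dislike = 1 and like = 5 follow the text; the middle labels are assumed.
LIKE_SCALE = {
    "dislike": 1,
    "somewhat dislike": 2,
    "neutral": 3,
    "somewhat like": 4,
    "like": 5,
}


def mean_code(responses: List[str]) -> float:
    """Average the numeric codes of a list of verbal responses."""
    codes = [LIKE_SCALE[r] for r in responses]
    return sum(codes) / len(codes)


group_a = ["like", "like", "somewhat like", "neutral"]
group_b = ["dislike", "somewhat dislike", "neutral", "dislike"]

print(mean_code(group_a))  # 4.25
print(mean_code(group_b))  # 1.75
```

Because the scale rises from "dislike" to "like", the higher mean can be read directly as the more favourable group.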

• Coding of unanswered questions
1. Consider how the response, or lack of one, is going to be used in the analysis phase. If the analysis is affected by the lack of response, then the best way is to delete the questionnaire and the individual question.
• Also check how the data analysis software will handle data or code omissions. Allow this to be the guide for determining whether omissions should be coded or left blank.
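Why the handling of omissions matters can be seen with a small calculation. This sketch assumes a 1-to-5 rating item, uses `None` for a blank answer, and uses 9 as a hypothetical "no response" code; none of these values come from the slides.

```python
from typing import List, Optional

# Hypothetical answers to one 1-5 rating item; None marks an unanswered question.
ratings: List[Optional[int]] = [4, None, 5, 3, None, 2]

# Option 1: leave omissions blank and exclude them from the calculation.
answered = [r for r in ratings if r is not None]
mean_answered = sum(answered) / len(answered)

# Option 2: code omissions with a number. If the analysis treats that number
# as an ordinary rating, the average is distorted.
NO_RESPONSE = 9  # hypothetical omission code
coded = [NO_RESPONSE if r is None else r for r in ratings]
mean_with_omission_code = sum(coded) / len(coded)

print(mean_answered)                # 3.5
print(mean_with_omission_code > 5)  # True: above the top of the 1-5 scale
```

If omissions are coded, the analysis software has to be told which value means "missing" so that it is excluded in the same way as in the first option.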

• Step 4: Assign a coded value to each response. First, each questionnaire needs to be assigned a numerical value. For example, if there are up to 1,000 questionnaires, a three-digit code (000 to 999) is sufficient; if there are more than 1,000, a four-digit code should be used. Codes can also be given for open-ended questions.
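The number of digits needed for the questionnaire code follows from the number of questionnaires. The sketch below assumes numbering starts at zero, which is what lets 1,000 questionnaires fit in three digits; the function name is hypothetical.

```python
from typing import List


def questionnaire_ids(count: int) -> List[str]:
    """Return zero-padded ID codes for `count` questionnaires, numbered from 0."""
    if count < 1:
        raise ValueError("count must be at least 1")
    width = len(str(count - 1))  # digits needed for the largest ID
    return [str(i).zfill(width) for i in range(count)]


ids = questionnaire_ids(1000)
print(ids[0], ids[-1])  # 000 999

ids = questionnaire_ids(1001)
print(ids[0], ids[-1])  # 0000 1000
```

If numbering starts at 1 instead, the 1,000th questionnaire already needs four digits.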

DATA ENTRY
• Those tasks involved with the direct input of the coded data into some specified software package that ultimately allows the research analyst to manipulate and transform the raw data into useful information.

FOUR PRINCIPAL WAYS OF ENTERING CODED DATA INTO A COMPUTER
1. Keyboard terminal and personal computer keyboard. Both are key-driven devices connected directly to a computer processor. The PC keyboard is connected directly to the computer, whereas the keyboard terminal is connected to the computer by a data communications link, such as a phone line or satellite, that may span thousands of miles.

• Touch screen capabilities that allow the analyst to simply touch an area of the terminal screen to enter a data element.
• Use of a light pen, which is a handheld electronic pointer used to enter data through the terminal screen.
• Optical scanning procedure. Questionnaires prepared in any Microsoft Windows software package and printed on a laser printer can be readily scanned through an optical scanning procedure.