Data Discovery: The reference interview

The reference interview
• Always begin by clarifying the distinction between statistics and data with your patron. Never assume that the patron clearly knows this distinction.
• Ask a question that will help you understand what they might be seeking, using our frameworks from yesterday.
• Asking them if they want statistics or data isn’t a good starting question, though.

Frameworks
Table Dimensions:
• Geography
• Time
• Subject content

The reference interview
• What does the patron intend or need to do with the numbers? What is their objective?
  – Does the patron need them for a report or for data analysis?
• What geographic area is needed?
  – Smallest geographic area to be described
• What time period is needed?
• What subject matter (variables) expressed in numbers is needed?

The reference interview
If you determine the patron does need data:
• Population (unit of observation) to be described
• Do they need aggregate data, microdata, or spatial data?
• What software does the patron intend to use?
• How would the patron like the data delivered?
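
The interview questions on the last two slides can be collected into a simple intake record. The following is a minimal Python sketch and not part of the original slides; the class name `DataRequest`, its field names, the `missing_details` helper, and the sample answers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DataRequest:
    """Hypothetical intake record for one reference interview."""
    objective: Optional[str] = None            # report, data analysis, ...
    geography: Optional[str] = None            # smallest area to be described
    time_period: Optional[str] = None
    subject: Optional[str] = None              # variables expressed in numbers
    # The remaining fields matter only once the patron is known to need data.
    unit_of_observation: Optional[str] = None
    data_type: Optional[str] = None            # aggregate, microdata, spatial
    software: Optional[str] = None
    delivery: Optional[str] = None

    def missing_details(self) -> List[str]:
        """Return the names of the questions that are still unanswered."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Illustrative answers only; they do not come from the slides.
request = DataRequest(
    objective="data analysis",
    geography="census tract",
    time_period="1996",
    subject="household income",
)
print(request.missing_details())
```

Listing the unanswered fields gives a quick checklist of what still needs to be asked before searching for a source.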

level of service
• How much you do depends on the level of service you are offering.
  – Finding a resource
  – Retrieving a resource from an online service
  – Tailoring a product for the patron
  – Creating a product for a patron (e.g., postal code conversion linkage)

Does the person want one number? Are they pursuing a fact or figure? Want to know “how many?”
• YES: Statistics in print or ready-ref. electronic source?
  – YES: Go to print or ready-ref. electronic source.
  – NO: Are the data accessible in computer-readable form?
    • YES: Go to computer-readable source. Extract relevant data from computer-readable source and compile statistics using appropriate software.
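
One reading of this decision flow can be written as a small function. This is a hedged Python sketch: the function name `route_request` and its parameter names are hypothetical, the order of the branches is an interpretation of the flattened flowchart, and any answer the slide does not show a path for is reported as not covered rather than guessed.

```python
NOT_COVERED = "not covered by the slide"


def route_request(wants_single_figure: bool,
                  in_ready_reference: bool,
                  data_computer_readable: bool) -> str:
    """Follow the slide's YES/NO questions (branch order is an assumption)."""
    if not wants_single_figure:
        # The slide shows no path for a NO answer to the first question.
        return NOT_COVERED
    if in_ready_reference:
        return "go to print or ready-ref. electronic source"
    if data_computer_readable:
        return ("go to computer-readable source; extract relevant data and "
                "compile statistics using appropriate software")
    # The slide shows no path when the data are not computer-readable.
    return NOT_COVERED


print(route_request(True, True, False))
print(route_request(True, False, True))
```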

To Use Data You Need 3 Things
• Datafile (the raw numbers)
• “Codebook” (where the numbers are and what they mean)
• Statistical Software (for reading the datafile and analyzing the data)

The Statistics
Field California Poll (newsletter), September 24, 1996, as reproduced on microfiche in the collection American Public Opinion Data.

The Data
3001101 1999503 1
3001102122322288181818 11299999911111199999911111999993311182818
3001103182818 892148882111111119999999 122883 2299821948
30011046601893249242331 111 212190100 9000311
300110500000100000000000
3001106 1.1951 1.1345 1.1474 1.1585
3001107 1.1559 1.0007 1.0461 1.1416
3001201 2329503 2
3001202238543388881288 11299999911881199999111113231282882
3001203222882 1882882222999999911231221221212 322814 8103011942
30012043209492892242314 221 282071000 9470711
3001205100100000000000000
3001206 1.0056 0.8949 0.9050 0.8557
3001207 1.0988 0.9358 0.8786 0.8586
3001301 5349503 1
3001302358332888111888 11799999988881199999933333999992221181822
3001303181822 188482231121211124149999999 212884 3399811948
30013046405399393111511 212121000 9550311
300130510000000000000000
3001306 1.1951 0.8094 0.6256 0.8518
3001307 1.1559 0.5942 0.4393 0.8840
3001401 1029503 2
300140234234221811111128888888122 100199999922888299999822882212121828
3001403118821 1112222311999999912112182221122 212213 2202538148
30014044805399119381311 211 131491000 9540311
3001405000001000000000010
3001406 0.7594 0.6758 0.7376 0.7498
3001407 0.7829 0.6668 0.7040 0.7600

The Codebook
From the codebook for the data:
The Field (California) Poll #96-04
THE FIELD INSTITUTE
INTERVIEWING PERIODS: AUGUST 29 - SEPTEMBER 7, 1996
NUMBER OF CASES: 1023
VARIABLE Q7.15   RATE PERFORMANCE-BARBARA BOXER   DECK 2/17
WHAT KIND OF JOB DO YOU THINK BARBARA BOXER IS DOING AS U.S. SENATOR - A VERY GOOD, GOOD, FAIR, POOR OR VERY POOR JOB?
N OF CASES   VALUE   VALUE LABEL
        33   1       VERY GOOD
       130   2       GOOD
       134   3       FAIR
        63   4       POOR
        43   5       VERY POOR
       107   8       NO OPINION
       513   9       NOT APPLICABLE (NOT FORM B)
      ____
      1023   TOTAL
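
The N OF CASES column is enough to turn this codebook entry into a statistic. The Python sketch below is an illustration added here, not part of the slides: it copies the seven counts from the codebook and computes each code's share of the respondents who were actually asked the question. The variable names are hypothetical, and treating code 9 as "not asked" follows the codebook's "NOT APPLICABLE (NOT FORM B)" label.

```python
# N of cases per value code, copied from the codebook entry for Q7.15.
counts = {1: 33, 2: 130, 3: 134, 4: 63, 5: 43, 8: 107, 9: 513}

NOT_APPLICABLE = 9  # respondents who did not receive form B

total = sum(counts.values())            # should match NUMBER OF CASES
asked = total - counts[NOT_APPLICABLE]  # respondents who saw the question

# Percentage of asked respondents giving each answer, to one decimal place.
percent_of_asked = {
    code: round(100 * n / asked, 1)
    for code, n in counts.items()
    if code != NOT_APPLICABLE
}

print(total, asked)
for code, pct in percent_of_asked.items():
    print(code, pct)
```

The counts sum to the 1023 cases reported in the codebook header, which is a useful check that the table was read correctly.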

Statistical Software
• Designed to read large files of raw numeric data
• Not a spreadsheet!
  – Can handle many more variables and cases.
  – Can do more elaborate and accurate statistics.
  – Designed to handle data (cases, observations, variables, weights), not unstructured “cells.”

GAUSS, JMP, Minitab, S-Plus, SAS, SPSS, Stata, Systat

[Diagram: the codebook is used to describe the data layout and to write commands to analyze the data; the commands and the datafile (data) are read by SPSS.]
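
The workflow on this slide (use the codebook to describe the layout, then write commands against the datafile) can be sketched in a few lines. The slide uses SPSS; the example below is Python and is only an illustration under stated assumptions. It assumes that columns 1-5 of each record identify the case and columns 6-7 give the deck number, and it reads the codebook's "DECK 2/17" as deck 2, column 17. The function name `read_variable` is hypothetical. The records are copied from the slide, but spacing may have been altered when the slide text was extracted, so the codes it returns should be treated as illustrative rather than as real poll responses.

```python
# Records copied from the slide. Spacing may have been altered in extraction,
# so the values read here are illustrative only.
RAW_RECORDS = [
    "3001101 1999503 1",
    "3001102122322288181818 11299999911111199999911111999993311182818",
    "3001201 2329503 2",
    "3001202238543388881288 11299999911881199999111113231282882",
]

# Assumed layout: columns 1-5 identify the case, columns 6-7 the deck.
CASE_COLS = slice(0, 5)
DECK_COLS = slice(5, 7)

# "DECK 2/17" in the codebook, read here as deck 2, column 17 (1-based).
Q7_15 = {"deck": 2, "column": 17}


def read_variable(records, deck, column):
    """Return {case_id: one-character code} for a single-column variable."""
    values = {}
    for line in records:
        if int(line[DECK_COLS]) == deck:
            # Codebook columns are 1-based; Python string indexes are 0-based.
            values[line[CASE_COLS]] = line[column - 1]
    return values


print(read_variable(RAW_RECORDS, **Q7_15))
```

Without the codebook there is no way to know which deck and column hold a given question, which is why the datafile alone is not usable.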

reference strategies
• Government publications approach
  – What agency would produce such a statistic?
    • Does the mandate or goals include the scope of content?
    • Who are the members of the agency, if the agency is a membership organization?
  – What jurisdiction is responsible for this content?
  – Is this likely an official or non-official statistic?
  – What publication titles are related to this content?
  – What is the availability of statistics from the agency?
• Data librarian approach
  – What data source would be used to produce such a statistic?
  – Who would collect such data?
  – What unit of observation would be needed to produce such a statistic?
  – What would the structure of the table look like given time, geography and attributes of the unit of observation?
  – Would the source be in the realm of official or non-official statistics?
  – Use the literature trail and its indexes (non-official vs. official publications)

the data reference interview process
• The information-seeking context is as important in statistics and data reference interviews as it is in other reference interviews.
• How is the data reference interview similar to general reference interviews?
• How is the data reference interview different?

research on the data reference interview process
• A colleague is developing a model from which comparisons can be made between the general and data reference interviews.
• One aspect of the model, namely the discovery and clarification of concepts and language, is being investigated using items from a specialist discussion list and a blog.
http://blogs.library.ualberta.ca/digrs/