STATA Boot Camp Day 1: Introduction to STATA
Sarah Fletcher Mercaldo, Nate Mercaldo, Andrew Wiese

Objectives of Day 1
• Review of basic STATA features
• Introduce basic STATA functions
  – Importing data
  – Perform simple data checks
  – Tabulate and summarize variables
  – Perform simple analyses
  – Save and Export Data
• Introduce STATA help and other resources

Materials and Datasets for Day 1
• Create a folder on your desktop or wherever you find most appropriate
• Download the following files into that folder from the class email:
  – Stata.Example.Dataset.dta
  – Deaths.xlsx
  – Bootcamp Epi Analysis.csv

Exploring the STATA Interface
• Command Window
• Output Window
• Variable Window
• Property Window
• Review Window

Getting started in STATA
• Create a log file
  – File > Log > Begin > Save to folder of interest
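The same log can be started from the Command window. The following is a minimal sketch, assuming a hypothetical folder path and log name (neither comes from the slides); replace both with your own.

```stata
* Hypothetical folder path: point this at the folder you created for the boot camp
cd "C:/Users/yourname/Desktop/Bootcamp"

* Start a plain-text log; replace overwrites an existing log with the same name
log using "day1_log.log", text replace

* ... run your commands here; input and output are written to the log ...

* Close the log when you are finished
log close
```

Changing the working directory first means later commands can refer to the data files by name alone.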

Opening/Importing data
• STATA can open several types of datasets:
  – STATA datasets
  – Excel files
  – CSV files
  – Files directly from the internet

Opening/Importing data
• Opening a STATA dataset:
  – File > Open
• Opening another file type:
  – File > Import > Select File Type
    • Excel
    • CSV (comma, tab delimited)
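The menu steps above have command equivalents. This sketch assumes the three class files sit in the current working directory under the names shown (the exact spelling of the file names is an assumption) and that the first row of the Excel sheet holds variable names.

```stata
* Open a STATA dataset; clear discards any data currently in memory
use "Stata.Example.Dataset.dta", clear

* Import the first sheet of an Excel workbook, reading row 1 as variable names
import excel using "Deaths.xlsx", firstrow clear

* Import a comma-delimited file; case(preserve) keeps variable names such as
* LUNGCANCER in upper case (without it, names are converted to lower case)
import delimited using "Bootcamp Epi Analysis.csv", case(preserve) clear

* Load an example dataset from the internet (served by StataCorp)
webuse auto, clear
```

Only one dataset is held in memory at a time in this workflow, so each command replaces the data loaded by the previous one.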

Performing simple data checks
• View the dataset
  – “browse” command
    • Data Editor > Browse
  – “list” command
    • Data > Describe Data > List Data
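As a rough illustration of these checks, the sketch below uses the auto dataset that ships with STATA rather than the class files, so it should run on any installation. The `describe` step is an addition not mentioned on the slide.

```stata
* Load the example dataset installed with STATA
sysuse auto, clear

* Show variable names, storage types, and labels
describe

* Open the Data Editor in read-only (browse) mode
browse

* Print the first 10 observations in the output window
list in 1/10

* Print selected variables only, for the first 5 observations
list make price mpg in 1/5
```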

Exercise 1
• Import the CSV file titled “Bootcamp Epi Analysis.csv”
• Explore the dataset using the “list” and “browse” commands

Tabulate and Summarize
• Re-open “Example.STATAdataset.dta” or “Deaths.xlsx”
• Tabulate command
  – Primarily for use in describing categorical variables
  – Menu: Statistics > Summaries, Tables, and Tests > Frequency Tables
  – Code: tabulate variable-name
• Summarize command
  – Explore very simple descriptive statistics for continuous variables
  – Menu: Data > Describe Data > Summary Statistics
  – Code: summarize variable-name
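A short sketch of both commands, again using the auto dataset installed with STATA so that the variable names are known (they are not variables from the class files). Note that command names are typed in lower case.

```stata
sysuse auto, clear

* Frequency table for a categorical variable
tabulate foreign

* Two-way table of two categorical variables
tabulate foreign rep78

* Observations, mean, standard deviation, minimum, and maximum
summarize price

* The detail option adds percentiles, variance, skewness, and kurtosis
summarize price, detail
```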

Keep and Remove
• Keep command
  – Changes the dataset in memory so that it includes only the observations or variables specified in the “keep” command (the file on disk is unchanged until you save)
  – Menu: Data Editor > Edit > Right-Click Variable > Data > Keep only selected data
  – Code: keep variable-name1 variable-name2 …
• Drop command
  – Changes the dataset in memory so that it excludes the observations or variables specified in the “drop” command (the file on disk is unchanged until you save)
  – Menu: Data Editor > Edit > Right-Click Variable > Data > Drop selected data
  – Code: drop variable-name1 variable-name2 …
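The sketch below shows keep and drop applied to both variables and observations. It uses the auto dataset installed with STATA, so the variable names and cutoffs are illustrative only.

```stata
sysuse auto, clear

* Keep only the listed variables
keep make price mpg foreign

* Keep only the observations that satisfy a condition
keep if foreign == 1

* Drop a variable
drop mpg

* Drop the observations that satisfy a condition
drop if price > 10000

* Check what remains in memory
describe
count
```

These commands change only the data in memory; the file on disk is untouched until it is saved.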

Exercise 2
• Use the summarize and tabulate commands to answer the following questions from the Bootcamp Epi Analysis.csv dataset:
  1. How many Lung Cancer cases are there (LUNGCANCER=1)?
  2. What is the mean value of the AMOUNT variable?
  3. What were the highest and lowest values of the AMOUNT variable?

Saving and Exporting Data
• Re-open “Example.STATAdataset.dta” or “Deaths.xlsx”
• Save command
  – Will save the current version of the dataset as a STATA dataset in your current working directory
  – Menu: File > Save As > Select folder and create dataset name
  – Code: save "dataset-name", replace
• Export command
  – Limited STATA ability to export the current STATA dataset as an Excel or CSV file
  – Menu: File > Export > Select file type and save name
  – Code: export excel using "file-name", replace (Excel) or export delimited using "file-name", replace (CSV)
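A minimal sketch of saving and exporting. The output file names are hypothetical, and the auto dataset installed with STATA stands in for the class data. The export command needs a file type after it (excel or delimited).

```stata
sysuse auto, clear

* Save the data in memory as a STATA dataset in the working directory;
* replace allows an existing file with the same name to be overwritten
save "my_auto.dta", replace

* Export to Excel, writing variable names in the first row
export excel using "my_auto.xlsx", firstrow(variables) replace

* Export to a comma-delimited text file
export delimited using "my_auto.csv", replace
```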

Exercise 3
• Open the Bootcamp Epi Analysis.csv file, and save it as a STATA dataset in your Bootcamp folder (give it any name you prefer)
• Remove the “LUNGCANCER” variable
• Browse or list the dataset to confirm the variable was removed
• Save as a .dta (use a different name)
• Clear, open the .dta, and confirm the variable is gone
• Re-open the dataset you saved in the first step (original file) and confirm that the “LUNGCANCER” variable is present again

Simple Epidemiological Analysis
• Import the “Bootcamp Epi Analysis.csv” file
• This was a case-control study, so we want to calculate a simple 2 x 2 table and an odds ratio of lung cancer associated with smoking exposure.
• How would we go about finding the appropriate STATA code for getting an odds ratio?

STATA Help
• Help file structure
  – Title
  – Syntax – basic and options
  – How to access via pulldown menu
  – Description
  – Options
  – Examples
  – Stored results
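Help files can also be opened from the Command window. This is a small sketch; the topics chosen here are just examples.

```stata
* Open the help file for a command whose name you already know
help tabulate

* Search the help system and other resources by keyword
* when you do not know the command name
search odds ratio

* Open the help file for the case-control table command
help cc
```

The keyword search is one way to approach the question of which command produces an odds ratio.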

Simple Epidemiological Analysis
• 2 x 2 Analysis of Case-Control Data
  – Code: cc outcome-variable exposure-variable, options
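A sketch of a case-control 2 x 2 analysis using `cc`, STATA's case-control table command, which reports the odds ratio by default. Its cohort-study counterpart `cs` reports risk measures and shows an odds ratio only when the `or` option is added. The example uses a StataCorp example dataset rather than the class file, because the name of the smoking exposure variable is not given in the slides; `SMOKER` in the commented line is a hypothetical placeholder.

```stata
* Example case-control data served by StataCorp; each row is one cell of
* the 2 x 2 table and pop holds the number of people in that cell
webuse ccxmpl, clear

* 2 x 2 table and odds ratio: outcome (case) variable first, exposure second
cc case exposed [fweight = pop]

* With the class data the call might look like the line below, where
* SMOKER is a hypothetical name for the smoking exposure variable:
* cc LUNGCANCER SMOKER
```

Both variables must be coded 0/1, with 1 marking cases and exposed subjects.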

Other Resources
• UCLA Website
  – http://statistics.ats.ucla.edu/stata/
• STATA Website
  – http://www.stata.com/links/resources-for-learning-stata/
• STATA Customer Service

Other Resources
• STATA Tutorials (FREE!)
  – http://www.cpc.unc.edu/research/tools/data_analysis/statatutorial/
  – http://data.princeton.edu/stata/
  – http://www.stata.com/links/video-tutorials/

Questions?
sarah.fletcher@Vanderbilt.edu