- Slides: 23
Stata and logit recap
Topics
• Introduction to Stata
  – Files / directories
  – Stata syntax
  – Useful commands / functions
• Logistic regression analysis with Stata
  – Estimation
  – Goodness of fit
  – Coefficients
  – Checking assumptions
Overview of Stata commands
• Note: we did this interactively for the larger part …
Stata file types
• .ado – Programs that add commands to Stata
• .do – Batch files that execute a set of Stata commands
• .dta – Data file in Stata's format
• .log – Output saved as plain text by the log using command (you could add .txt as well)
The working directory
• The working directory is the default directory for any file operations, such as using and saving data or logging output
    cd "d:\my work"
Saving output to log files
• Syntax for the log command
    log using [filename], replace text
• To close a log file
    log close
Using and saving datasets
• Load a Stata dataset
    use d:\myproject\data.dta, clear
• Save
    save d:\myproject\data, replace
• Using change directory
    cd d:\myproject
    use data, clear
    save data, replace
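The load and save cycle above can be tried without touching any project folder. The following is only a sketch meant to be run from a do-file: it assumes the auto example dataset that ships with Stata, and mycopy is a hypothetical temporary file name chosen for illustration.

```stata
sysuse auto, clear      // load an example dataset that ships with Stata
tempfile mycopy         // hypothetical temporary file, removed when the do-file ends
save `mycopy'           // save the data in memory to that file
clear                   // empty the data in memory
use `mycopy', clear     // load the saved copy again
describe, short         // check that observations and variables are back
```

In real work you would replace the temporary file with a named file in your working directory, as on the slide.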
Entering data
• Data in other formats
  – You can use SPSS to convert data that can be read with Stata. Unfortunately, not the other way around (anymore)
  – You can use the infile and insheet commands to import data in ASCII format
  – Direct import and export of Excel files in Stata is possible too
• Entering data by hand (don't do this …)
  – Type edit or just click on the data-editor button
Do-files
• You can create a text file that contains a series of commands. It is the equivalent of SPSS syntax (but way easier to memorize)
• Use the do-file editor to work with do-files
Adding comments in do-files
• // or * denote comments that Stata should ignore (* only at the start of a line)
• Stata ignores whatever follows /// and treats the next line as a continuation of the current command
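A small illustration of the three comment styles, as a sketch to be run from a do-file. It assumes the auto example dataset that ships with Stata; the variables are just the ones in that dataset.

```stata
* A full-line comment starts with an asterisk
sysuse auto, clear             // a comment at the end of a command line
summarize price mpg weight ///    the command continues on the next line
    if foreign == 1
```

The /// form is mainly useful for breaking long commands over several lines so they stay readable.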
A recommended template for do-files
    capture log close                //if a log file is open, close it, otherwise disregard
    set more off                     //don't pause when output scrolls off the page
    cd d:\myproject                  //change directory to your working directory
    log using myfile, replace text   //log results to file myfile.log
    … here you put the rest of your Stata commands …
    log close                        //close the log file
Serious data analysis
• Ensure replicability: use do-files and log files
• Document your do-files
  – What is obvious today is baffling in six months
• Keep a research log
  – A diary that includes a description of every program you run
• Develop a system for naming files
Serious data analysis (continued)
• New variables should be given new names
• Use variable labels and notes (I don't like value labels though)
• Double check every new variable
• ARCHIVE
Stata syntax examples
Stata syntax example
    regress y x1 x2 if x3<20, cluster(x4)
1. regress = Command
   – What action do you want to perform
2. y x1 x2 = Names of variables, files or other objects
   – On what things is the command performed
3. if x3<20 = Qualifier on observations
   – On which observations should the command be performed
4. , cluster(x4) = Options appear behind the comma
   – What special things should be done in executing the command
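The same four-part structure can be seen in a command that actually runs. This sketch assumes the nlsw88 example dataset that ships with Stata; the choice of variables is purely illustrative, and vce(cluster ...) is used as the current spelling of the cluster() option.

```stata
sysuse nlsw88, clear
// command | variables       | qualifier   | options
regress wage tenure age if age < 45, vce(cluster industry)
```

Here wage plays the role of y, tenure and age are x1 and x2, age < 45 restricts the observations, and the option asks for standard errors clustered by industry.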
More examples
    tabulate smoking race if agemother>30, row
More elaborate if-statements:
    sum agemother if smoking==1 & weightmother<100
Elements used for logical statements
    Operator  Definition                Example
    ==        is equal in value to      if male == 1
    !=        not equal in value to     if male != 1
    >         greater than              if age > 20
    >=        greater than or equal to  if age >= 21
    <         less than                 if age < 66
    <=        less than or equal to     if age <= 65
    &         and                       if age==21 & male==1
    |         or                        if age<=21 | age>=65
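To see the operators at work, here is a sketch for a do-file that counts observations under different conditions. It assumes the auto example dataset that ships with Stata; the cut-off values are arbitrary.

```stata
sysuse auto, clear
count if foreign == 1                  // equal to
count if foreign != 1                  // not equal to
count if price > 10000 & foreign == 0  // both conditions must hold
count if mpg <= 15 | mpg >= 30         // at least one condition must hold
```

Note the double equals sign for comparisons; a single = is reserved for assignment, as in generate.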
Missing values
• Automatically excluded when Stata fits models (same as in SPSS); they are stored as values larger than any real number
• Beware!!
  – The expression age>65 therefore also selects observations with a missing age (missing counts as larger than 65)
  – To be sure, type: age>65 & age!=.
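The pitfall is easy to reproduce with a few made-up observations. This is a sketch for a do-file; the four ages are invented for illustration.

```stata
clear
input age
70
40
.
66
end
count if age > 65              // 3: two real ages plus the missing value
count if age > 65 & age != .   // 2: the missing value is excluded
```

Writing age < . or !missing(age) instead of age != . also excludes the extended missing values (.a to .z).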
Selecting observations
    drop [variable list]
    keep [variable list]
    drop if age<65
Note: the dropped variables or observations are then gone from the data in memory (and gone for good once you save over the file). This is not SPSS's [filter] command.
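One cautious pattern, not on the slide, is to wrap a temporary selection in preserve and restore so the full data come back afterwards. A sketch for a do-file, assuming the auto example dataset that ships with Stata:

```stata
sysuse auto, clear
preserve                 // take a snapshot of the data in memory
drop if price < 5000     // remove observations
keep make price mpg      // keep only these variables
describe, short          // the reduced dataset
restore                  // bring back the snapshot
describe, short          // all observations and variables are back
```

For a one-off analysis, an if qualifier on the command itself often avoids the need to drop anything.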
Creating new variables
Generating new variables
    generate age2 = age*age
(for more complicated functions, there is also a command egen, as we will see later)
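A minimal sketch of generate next to egen, for a do-file. The three ages are invented, and meanage and agedev are hypothetical variable names.

```stata
clear
input age
20
30
40
end
generate age2 = age*age           // row-by-row arithmetic
egen meanage = mean(age)          // egen: a statistic computed over all observations
generate agedev = age - meanage   // deviation from the mean
list
```

The difference is that generate works within each observation, while egen functions such as mean() look across observations.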
Useful functions
    Function  Definition       Example
    +         Addition         gen y = a+b
    -         Subtraction      gen y = a-b
    /         Division         gen density = population/area
    *         Multiplication   gen y = a*b
    ^         Take to a power  gen y = a^3
    ln        Natural log      gen lnwage = ln(wage)
    exp       Exponential      gen y = exp(b)
    sqrt      Square root      gen agesqrt = sqrt(age)
Replace command
• replace has the same syntax as generate but is used to change values of a variable that already exists
    gen age_dum5 = .
    replace age_dum5 = 0 if age < 5
    replace age_dum5 = 1 if age >= 5 & age != .
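The following sketch for a do-file builds such a dummy on invented data and checks the result. It guards the second replace against missing ages, since a missing age would otherwise satisfy age >= 5.

```stata
clear
input age
3
5
12
.
end
generate age_dum5 = .
replace age_dum5 = 0 if age < 5
replace age_dum5 = 1 if age >= 5 & age != .
tabulate age_dum5, missing   // the missing age stays missing in the dummy
```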
Recode
• Change values of existing variables
  – Change 1 to 2 and 3 to 4 in origvar, and call the new variable myvar1:
      recode origvar (1=2)(3=4), gen(myvar1)
  – Change 1's to missings in origvar, and call the new variable myvar2:
      recode origvar (1=.), gen(myvar2)
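Both recodes can be checked on a tiny invented variable. A sketch for a do-file, with the values 1 to 4 made up for illustration:

```stata
clear
input origvar
1
2
3
4
end
recode origvar (1=2) (3=4), generate(myvar1)   // myvar1 becomes 2, 2, 4, 4
recode origvar (1=.), generate(myvar2)         // myvar2 becomes ., 2, 3, 4
list
```

Values not mentioned in a rule are copied unchanged, and origvar itself is left as it was because the result goes into a new variable.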