Social Research Methods, Lecture 9: Introduction to Stata. Fudan University

1 Why learn Stata (a comparison of the three major statistical packages: SAS, SPSS, Stata)
• SAS
  – Has a comparative advantage in data management.
  – Is often seen as relatively difficult to learn.
• SPSS
  – Its comparative advantage is ease of use,
  – but it has weaker capabilities for data management and statistical analysis.
• Stata
  – Is usually viewed as somewhere in between SAS and SPSS on these fronts: easier to learn and use than SAS, but with stronger data management and statistical analysis capabilities than SPSS.
  – Also stays at the forefront of statistical analysis; it is updated continually.
  – Scholars regularly share user-written code, which is easy to install from the Internet directly into the software.

3 Stata windows and basic operations

Stata windows
• Stata commands are case-sensitive: if a variable name includes a capital letter, we must capitalize it when we type it.
• Do-files: write your program in a do-file. The direct benefit is that we can easily re-run commands we wrote earlier and keep a record of the commands we need for later use and analysis. In a complex analysis, typing commands one at a time into the Command window becomes very difficult, so we must program in a do-file.
• Comments: in a do-file, a line that begins with * is a comment, and so is anything between /* and */. Stata skips comments when it runs the do-file. Adding comments makes a do-file easier to read.
  – Open the Do-file Editor: type "doedit" in the Command window. When it first opens, the main screen is blank. Type text into this main white screen to write the batch program.
  – To run: select the lines and choose "execute selection", or type do followed by the file name.
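As a minimal sketch of what such a do-file might look like, the lines below combine both comment styles with a few commands. The sketch assumes the auto.dta example dataset that ships with Stata (loaded with sysuse); the file name example.do and the variable name Price2 are made up for illustration.

```stata
* example.do: a short do-file (the file name is hypothetical)
* A line that starts with an asterisk is a comment.

/* Anything between these two markers is also a comment,
   and it may span several lines. */

* Load the example data that ships with Stata
sysuse auto, clear

* Price2 is an illustrative name with a capital P
generate Price2 = price

* This works, because the name matches exactly
summarize Price2

* Names are case-sensitive, so the lowercase spelling is not found.
* capture suppresses the error so that the do-file keeps running.
capture summarize price2
display "return code for price2: " _rc
```

Saved under that name in the working directory, the file could be run by typing do example in the Command window.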

5 Stata syntax and commands
• General syntax: [by varlist:] command [varlist] [=exp] [if exp] [in range] [weight] [, options]
• The core of it is: command <varlist>
  – Example: tab policy sex, col
  – This command tells Stata to cross-tabulate policy (the row variable) by sex (the column variable) and to report column percentages.
• Commonly used commands (Stata expects them in lowercase):
  – codebook
  – describe
  – display (works as a calculator)
  – drop or keep
  – generate
  – label define
  – label variable
  – label values
  – recode
  – replace
  – summarize
  – table
  – tabulate
  – use <variable names> using <file name>
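To make the syntax diagram concrete, here is a small sketch of its pieces in use. It relies on the auto.dta example data that ships with Stata rather than on the policy and sex variables above, whose dataset is not part of these slides. The variable name price_k is an illustrative choice.

```stata
sysuse auto, clear

* command varlist
summarize price mpg

* command varlist if exp
summarize price if foreign == 1

* command varlist in range
list make price in 1/5

* by varlist: command varlist   (bysort sorts the data first)
bysort foreign: summarize price

* command varlist, options      (column percentages, as in "tab policy sex, col")
tabulate rep78 foreign, column

* command newvar = exp
generate price_k = price / 1000
```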

How data are stored in the computer: a matrix in which rows are cases (sample units) and columns are variables

How data are stored in the computer
• Quantitative data are stored in a matrix of rows and columns, in which each cell represents a specific data point.
• Each row represents a study participant.
• Each column represents a variable. Typically, the first column contains a variable that identifies the participant (or case), often called a case ID.
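The row-and-column layout can be inspected directly. The sketch below assumes the auto.dta example data that ships with Stata, where each row is a car and the first variable, make, plays the role of the case identifier.

```stata
sysuse auto, clear

* Overview: number of observations (rows) and variables (columns)
describe, short

* The first five rows, with a few of the columns
list make price mpg foreign in 1/5

* A single cell: the value of price in the third row
display price[3]
```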

Data editor

Open a data file
• In Stata, we must first open a data file before we can analyze its contents, often referred to as "using the data file".
• The command that reads an existing Stata data file into memory is:
    use <filename>, clear
• To save a Stata data file:
    save <filename>, replace
• Data can also be copied from Excel and pasted directly into Stata.
• The Stat/Transfer software is another way to bring in data stored in other formats.
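A minimal sketch of the use and save round trip follows. It assumes the auto.dta example data that ships with Stata; the file name auto_copy.dta is hypothetical and the file is written to the current working directory.

```stata
* Load the example data, then save a copy in the current working directory.
sysuse auto, clear
save "auto_copy.dta", replace

* Empty memory, then read the saved file back in.
clear
use "auto_copy.dta", clear
describe, short
```

The clear option on use discards whatever data are in memory, so any unsaved changes should be saved first.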

Log file
• Results appear only in the Stata Results window. Typically we would like to save them to use when we write about our results in a paper. In Stata, we do this with a log, which records the output of the session:
    log using <file name>, replace text    (or append in place of replace)
    log close
• replace: overwrite the existing log. append: continue writing to the existing log.
• text: save the log as plain text, which any text editor or Word can open. Without text, Stata stores the log in its default .smcl format, which is designed to be read in Stata.
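A typical logging pattern might look like the sketch below. The file name session1.log is hypothetical, and the commands between log using and log close are placeholders based on the auto.dta example data that ships with Stata.

```stata
* Close any log that is already open; capture ignores the error if none is.
capture log close

* Start a plain-text log, overwriting an earlier file of the same name.
log using "session1.log", replace text

sysuse auto, clear
summarize price mpg

* Stop logging; the file now holds the commands and their output.
log close
```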

Situations that call for recoding
• Collapse the categories of a variable into a smaller number of categories.
• Redefine a variable by creating a new set of categories representing a new dimension.
• Assign scale scores to the categories of a variable.
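The three situations above can each be sketched with recode. The example assumes the auto.dta example data that ships with Stata and its five-category repair record rep78; the new variable names, the groupings, and the scale scores are all made up for illustration.

```stata
sysuse auto, clear

* 1. Collapse five repair categories into three
recode rep78 (1/2 = 1) (3 = 2) (4/5 = 3), generate(rep3)

* 2. Redefine along a new dimension: good repair record (4 or 5) versus not
recode rep78 (1/3 = 0) (4/5 = 1), generate(good_rep)

* 3. Assign scale scores to the categories (the scores are illustrative)
recode rep78 (1 = 0) (2 = 25) (3 = 50) (4 = 75) (5 = 100), generate(rep_score)
```

Because each recode uses generate(), rep78 itself is left untouched, and cases that are missing on rep78 stay missing on the new variables.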

Create new variables
• It is worth reiterating here that we highly recommend not altering the variables in the raw data file. Instead, create a new variable for your analytic data file. Not only does this allow you to give your analytic variable a name that is meaningful to you, but it also preserves the original variables so that you can double-check your work and easily trace back to the raw data.
• The basic commands for creating and modifying variables in Stata are:
  – generate
  – replace
  – recode
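A short sketch of generate and replace working together, leaving the original variable alone. It assumes the auto.dta example data that ships with Stata; the names price_k and expensive and the 6000 cutoff are arbitrary choices for illustration.

```stata
sysuse auto, clear

* generate creates a new variable; price itself is not changed
generate price_k = price / 1000

* Start from missing, then fill in values with replace
generate byte expensive = .
replace expensive = 0 if price < 6000
replace expensive = 1 if price >= 6000 & !missing(price)

* Give the new variable a meaningful label
label variable expensive "Price is 6000 or more"
```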

Create new variables: logical operators
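As a brief sketch, Stata's standard relational and logical operators can be tried out in if conditions. The example assumes the auto.dta example data that ships with Stata; the cutoffs are arbitrary.

```stata
sysuse auto, clear

* Relational operators: ==  !=  >  <  >=  <=
count if foreign == 1
count if price >= 10000

* Logical operators: & (and), | (or), ! (not)
count if foreign == 1 & price >= 10000
count if mpg < 15 | mpg > 30
count if !missing(rep78)

* Missing values (.) compare as larger than any number, so exclude them explicitly
count if rep78 > 3 & !missing(rep78)
```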

Tabulate
• One-way frequency distribution: tabulate <var1>
• Two-way frequency distribution: tabulate <var1> <var2>, missing
• We typically include the missing option. By default, Stata omits cases that are coded '.' (missing) on either variable from the cross-tabulation. But we want to verify that cases with missing-value codes on the original variable are correctly converted to '.' (missing) on the new variable, so we include them in the cross-tabulation.
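The check described above might look like the following sketch. It assumes the auto.dta example data that ships with Stata, in which rep78 has a few missing values; the new variable rep_high and its grouping are made up for illustration.

```stata
sysuse auto, clear

* One-way frequency distribution
tabulate rep78

* Create a new variable from rep78
recode rep78 (1/3 = 0) (4/5 = 1), generate(rep_high)

* Cross-tabulate old against new, keeping the missing cases in the table
tabulate rep78 rep_high, missing
```

In the second table, every case that is missing on rep78 should fall in the missing column of rep_high and nowhere else.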

Example: auto.dta

Conclusion: foreign cars do better than domestic cars on repair record.
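One way to arrive at a comparison like this is sketched below; it is not necessarily the analysis shown on the original slides. It assumes the auto.dta example data that ships with Stata, and it treats rep78 as a numeric score when taking means, which is a simplification. The scalar names are illustrative.

```stata
sysuse auto, clear

* Repair record by origin, with column percentages
tabulate rep78 foreign, column

* Average repair record for domestic cars (foreign == 0)
summarize rep78 if foreign == 0
scalar mean_domestic = r(mean)

* Average repair record for foreign cars (foreign == 1)
summarize rep78 if foreign == 1
scalar mean_foreign = r(mean)

display "domestic: " mean_domestic "   foreign: " mean_foreign
```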

Alternatively: recode price (0/4000=1) (4001/8000=2) (8001/16000=3), gen(price3)
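For comparison, here is a sketch of the recode command next to a generate and replace version that should produce the same three price groups. The generate and replace version is an assumption about what the alternative looks like, and the name price3_alt is made up. The example uses the auto.dta example data that ships with Stata, where price is a whole number.

```stata
sysuse auto, clear

* Three price groups with recode
recode price (0/4000 = 1) (4001/8000 = 2) (8001/16000 = 3), generate(price3)

* The same grouping built step by step
generate byte price3_alt = .
replace price3_alt = 1 if price <= 4000
replace price3_alt = 2 if price > 4000 & price <= 8000
replace price3_alt = 3 if price > 8000 & !missing(price)

* Check that the two versions agree, and look at the result
assert price3 == price3_alt
tabulate price3
```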
