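Empirical Research White Paper
Tags (space-separated): Econometrics Empirical_Research
Introduction
Regression Analysis
The linear regression model is the foundation of the empirical analysis framework. Under the classical assumptions, the ordinary least squares (OLS) estimator of the linear regression model is the best linear unbiased estimator (BLUE). For this reason, regression analysis in empirical research typically starts from least squares.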
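Panel Data Analysis
If the endogeneity problem in a model is caused by unobserved effects, panel data offer an excellent way to remove the resulting bias. Because this advantage is so clear, empirical work should make full use of panel data whenever they are available.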
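Policy Evaluation Methods
Policy evaluation methods have become very popular in recent years because their analytical framework matches the essence of empirical research: identifying and estimating causal relationships. The basic analytical framework is the potential outcome model, also known as the Rubin causal framework. Popular methods include matching and propensity score matching (PSM), difference-in-differences (DID), and regression discontinuity design (RDD).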
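The Linear Regression Model and Least Squares
Software Implementation and Examples
Stata
R
Fixed Effects and Random Effects Methods for Panel Data
Software Implementation and Examples
Stata
R
Potential Outcome Estimation and the Rubin Causal Framework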
Since the early 1990s, the potential outcome approach, also known as the Neyman-Rubin Causal Model, has gained substantial acceptance as the basic framework for analyzing causal problems.
Causal effects are comparisons of pairs of potential outcomes for the same unit, e.g. the difference $Y_{i}(\omega^{'}) - Y_{i}(\omega)$ between that unit's outcomes under two treatment levels, such as treatment and control.
Because the two potential outcomes of a unit can never be observed at the same time, the causal effect can never be observed directly. Holland (1986) calls this the "fundamental problem of causal inference".
Estimates of causal effects are therefore ultimately based on comparisons of different units with different levels of the treatment.
Terminology and Assumptions
Potential outcome model(POM)
$$Y_{i} = Y_{0i} + D_{i}(Y_{1i}-Y_{0i})$$
Treatment effect(TE)
$$TE_{i} = Y_{1i} - Y_{0i}$$
Average Treatment Effect(ATE)
$$ATE = E(Y_{1i} - Y_{0i})$$
Average Treatment Effect on the treated(ATET)
$$ATET = E(Y_{1i} - Y_{0i}|D_{i}=1)$$
Average Treatment Effect on the untreated(ATENT)
$$ATENT = E(Y_{1i} - Y_{0i}|D_{i}=0)$$
Unconfoundedness Assumption
$$(Y_{0},Y_{1})\perp D|X$$
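To make the definitions above concrete, here is a minimal sketch on a made-up four-unit table. The numbers are purely hypothetical, and listing both potential outcomes for every unit is only possible in a toy example, never with real data.

```python
# Hypothetical toy data: both potential outcomes are known for every unit,
# which is never the case with real data.
y0 = [1.0, 2.0, 3.0, 4.0]   # outcome without treatment, Y_0i
y1 = [2.0, 4.0, 3.0, 7.0]   # outcome with treatment, Y_1i
d = [1, 0, 1, 0]            # treatment indicator, D_i

# Observed outcome from the potential outcome model:
# Y_i = Y_0i + D_i * (Y_1i - Y_0i)
y_obs = [a + t * (b - a) for a, b, t in zip(y0, y1, d)]

# Unit-level treatment effects, TE_i = Y_1i - Y_0i
te = [b - a for a, b in zip(y0, y1)]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


ate = mean(te)                                        # average over all units
atet = mean([e for e, t in zip(te, d) if t == 1])     # average over treated units
atent = mean([e for e, t in zip(te, d) if t == 0])    # average over untreated units

print(y_obs, ate, atet, atent)
```

In this toy table the three averages differ because the unit-level effects differ between the treated and untreated units, and the ATE equals the share-weighted average of ATET and ATENT.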
Identification under Random Assignment
If the sample is drawn under random assignment, the ATE can be estimated as the difference between the sample mean of the treated units and the sample mean of the untreated units. This is the well-known difference-in-means (DIM) estimator of classical statistics.
Random assignment corresponds to the independence assumption (IA), formally stated as:
$$(Y_{0},Y_{1})\perp D$$
Under IA, assignment to the treated or control group is unrelated to the potential outcomes.
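A short derivation shows why the difference in observed group means identifies the ATE under IA. The first equality uses the potential outcome model ($Y=Y_{1}$ when $D=1$ and $Y=Y_{0}$ when $D=0$), and the second uses IA:

$$E(Y|D=1) - E(Y|D=0) = E(Y_{1}|D=1) - E(Y_{0}|D=0) = E(Y_{1}) - E(Y_{0}) = ATE$$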
DIM
$$\hat{DIM} = \frac{1}{N_{1}} \sum_{i=1}^{N} D_{i}Y_{i} - \frac{1}{N_{0}} \sum_{i=1}^{N} (1-D_{i})Y_{i}$$
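The DIM estimator can be sketched in a few lines of plain Python. The simulated data below are an assumption made for illustration only: a constant treatment effect of 2.0, standard normal baseline outcomes, and a coin-flip assignment, none of which come from the text.

```python
import random


def difference_in_means(y, d):
    """Mean observed outcome of treated units minus that of untreated units."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    control = [yi for yi, di in zip(y, d) if di == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


random.seed(0)                       # fixed seed, for reproducibility only
n, true_ate = 10_000, 2.0            # hypothetical sample size and effect
y0 = [random.gauss(0.0, 1.0) for _ in range(n)]
y1 = [v + true_ate for v in y0]      # constant effect, assumed for simplicity
d = [1 if random.random() < 0.5 else 0 for _ in range(n)]    # random assignment
y = [b if t == 1 else a for a, b, t in zip(y0, y1, d)]       # observed outcome

dim_hat = difference_in_means(y, d)
print(f"DIM estimate: {dim_hat:.3f} (true ATE: {true_ate})")
```

Because assignment is independent of the potential outcomes in this simulation, the estimate should land close to the assumed effect, up to sampling noise.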
Regression-Adjustment
RA is suitable only when the conditional independence assumption (CIA) holds, which is written as:
$$(Y_{0},Y_{1})\perp D|X$$
In fact, only a less restrictive assumption is required, one that limits independence to the conditional mean. It is known as conditional mean independence (CMI), is weaker than CIA, and implies that:
$$E(Y_{1}|x,D)=E(Y_{1}|x)$$
and
$$E(Y_{0}|x,D)=E(Y_{0}|x)$$
Therefore, conditional on $x$, CMI yields two identification conditions for the unobservable counterfactual mean potential outcomes:
$$E(Y_{0}|x,D=1)=E(Y_{0}|x,D=0)$$
and
$$E(Y_{1}|x,D=1)=E(Y_{1}|x,D=0)$$
Under CMI, it follows that:
$$ATE(x)=E(Y|x,D=1) - E(Y|x,D=0)$$
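One way to see this step is to take $ATE(x)$ to mean $E(Y_{1}-Y_{0}|x)$, the usual conditional ATE (this reading of the notation is an assumption, since it is not defined explicitly here). The second line below uses CMI, and the third uses the potential outcome model, under which $Y=Y_{1}$ when $D=1$ and $Y=Y_{0}$ when $D=0$:

$$\begin{aligned}
ATE(x) &= E(Y_{1}|x) - E(Y_{0}|x) \\
&= E(Y_{1}|x,D=1) - E(Y_{0}|x,D=0) \\
&= E(Y|x,D=1) - E(Y|x,D=0)
\end{aligned}$$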
By simply denoting:
$$m_{1}(x)=E(Y|x,D=1)$$
$$m_{0}(x)=E(Y|x,D=0)$$
We have that:
$$ATE(x)=m_{1}(x) - m_{0}(x)$$
Once consistent estimators of $m_{1}(x)$ and $m_{0}(x)$ are available, the causal parameters can be estimated as:
$$\hat{ATE} = \frac{1}{N}\sum_{i=1}^{N}[\hat{m}_{1}(x_{i}) - \hat{m}_{0}(x_{i})]$$
and
$$\hat{ATET} = \frac{1}{N_{1}}\sum_{i=1}^{N}D_{i}[\hat{m}_{1}(x_{i}) - \hat{m}_{0}(x_{i})]$$
and
$$\hat{ATENT} = \frac{1}{N_{0}}\sum_{i=1}^{N}(1-D_{i})[\hat{m}_{1}(x_{i}) - \hat{m}_{0}(x_{i})]$$
This is the regression-adjustment estimator, where $m_{1}(x)$ and $m_{0}(x)$ can be estimated by parametric, semiparametric, or nonparametric methods. Note that the regression-adjustment approach uses only the potential outcome means to recover the ATEs and does not use the propensity score.
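The three regression-adjustment formulas can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it uses a single covariate, takes the parametric route by fitting $m_{1}(x)$ and $m_{0}(x)$ as separate straight lines by OLS within each group, and runs on simulated data whose design (treatment probability $0.2+0.6x$, effect $1+x$) is hypothetical. The helper names `fit_line` and `regression_adjustment` are also made up for this sketch.

```python
import random


def fit_line(x, y):
    """OLS fit of y = a + b*x; returns a function that predicts y at a point."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return lambda v: a + b * v


def regression_adjustment(y, d, x):
    """Return (ATE, ATET, ATENT) estimates from group-specific linear fits."""
    m1 = fit_line([xi for xi, di in zip(x, d) if di == 1],
                  [yi for yi, di in zip(y, d) if di == 1])
    m0 = fit_line([xi for xi, di in zip(x, d) if di == 0],
                  [yi for yi, di in zip(y, d) if di == 0])
    # Predicted effect m1(x_i) - m0(x_i) for every unit, treated or not
    diff = [m1(xi) - m0(xi) for xi in x]
    n1 = sum(d)
    n0 = len(d) - n1
    ate = sum(diff) / len(diff)
    atet = sum(di * g for di, g in zip(d, diff)) / n1
    atent = sum((1 - di) * g for di, g in zip(d, diff)) / n0
    return ate, atet, atent


random.seed(1)                                   # fixed seed, for reproducibility only
n = 20_000                                       # hypothetical sample size
x = [random.random() for _ in range(n)]          # covariate, uniform on [0, 1)
# Treatment depends on x only, so independence holds conditional on x
d = [1 if random.random() < 0.2 + 0.6 * xi else 0 for xi in x]
y0 = [1.0 + 2.0 * xi + random.gauss(0.0, 1.0) for xi in x]
y1 = [a + 1.0 + xi for a, xi in zip(y0, x)]      # assumed effect: 1 + x
y = [b if t == 1 else a for a, b, t in zip(y0, y1, d)]

ate_hat, atet_hat, atent_hat = regression_adjustment(y, d, x)
print(f"ATE {ate_hat:.3f}, ATET {atet_hat:.3f}, ATENT {atent_hat:.3f}")
```

Under this simulated design the population values are ATE = 1.5, ATET = 1.6, and ATENT = 1.4, because treated units tend to have larger $x$ and the assumed effect grows with $x$. A linear fit is only one choice for $\hat{m}_{1}$ and $\hat{m}_{0}$, and it is appropriate here only because the simulated conditional means are linear by construction.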