Correcting a Misconception: Fixed-Effects Models


Tianying (天鹰), PhD candidate, Zhongnan University of Economics and Law


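While running regressions in Stata recently, I noticed something puzzling: for a one-way fixed-effects model, reg and xtreg returned the same results. My understanding had been that reg with dummy variables can implement individual, time, or two-way (individual and time) fixed effects, whereas xtreg, fe implements two-way fixed effects. Under that mistaken belief, reg with i.id and xtreg, fe should not have produced identical estimates, yet they did. That prompted me to go back to the source and work out, step by step, where my understanding went wrong.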


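  • Below, I use data from my own paper to demonstrate the issue, and I also summarize the commands for one-way, two-way, and multi-way fixed effects to give a fuller picture.
  • Papers often report OLS, random-effects, individual fixed-effects, time fixed-effects, and two-way fixed-effects results side by side. For panel data, the commands used most often are reg, xtreg, and a few close relatives.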

1. xtreg (official command)

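xtreg, fe is Stata's official command for fixed-effects models, and the coefficients it produces are the textbook fixed-effects (within) estimator. xtreg requires panel data: before calling it, declare the panel with xtset, specifying the cross-sectional (individual) dimension and the time dimension.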
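For reference, the within estimator can be written out as below. This is the standard textbook formulation; the notation ($y_{it}$, $x_{it}$, $u_i$, $\varepsilon_{it}$) is chosen here for illustration and does not come from the original post.

```latex
\begin{align*}
y_{it} &= \alpha + x_{it}'\beta + u_i + \varepsilon_{it}, \\
\bar{y}_i &= \alpha + \bar{x}_i'\beta + u_i + \bar{\varepsilon}_i, \\
y_{it} - \bar{y}_i &= (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i).
\end{align*}
```

Subtracting the individual means removes $u_i$, so only the individual effect is swept out. By the Frisch-Waugh-Lovell theorem, OLS on the demeaned data gives the same $\beta$ as OLS on the original data with one dummy per individual.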

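Adding the fe option tells xtreg to use the within estimator. By default this is an individual fixed effect, defined on the cross-sectional dimension set by xtset. Time fixed effects are not included automatically; they must be added explicitly as dummy variables via i.year.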
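The workflow just described can be sketched on a public dataset. The example below is only an illustration: it assumes Stata's Grunfeld example panel (loaded with webuse, which needs internet access) and its variables company, year, invest, mvalue, and kstock, none of which appear in the original post.

```stata
* Illustrative do-file sketch on the Grunfeld example data (not the author's data)
webuse grunfeld, clear

* Declare the panel: company is the individual dimension, year the time dimension
xtset company year

* fe alone absorbs only the individual (company) effect
xtreg invest mvalue kstock, fe

* Adding i.year on top of fe gives individual + time (two-way) fixed effects
xtreg invest mvalue kstock i.year, fe
```

The first xtreg call is the one-way model; only the second one controls for both dimensions.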

Demonstration:
xtreg   rca_gvc   l.ai   lncd   lnpi   lnsize   lnimr   ,fe    //  individual fixed effects
Fixed-effects (within) regression               Number of obs     =        238
Group variable: id                              Number of groups  =         17

R-sq:                                           Obs per group:
     within  = 0.1593                                         min =         14
     between = 0.0173                                         avg =       14.0
     overall = 0.0205                                         max =         14

                                                F(5,216)          =       8.18
corr(u_i, Xb)  = -0.3699                        Prob > F          =     0.0000

------------------------------------------------------------------------------
     rca_gvc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          ai |
         L1. |   .3066382   .0959536     3.20   0.002     .1175129    .4957634
             |
        lncd |  -.1003965   .0677319    -1.48   0.140    -.2338966    .0331035
        lnpi |  -.1923152   .0942642    -2.04   0.043    -.3781107   -.0065197
      lnsize |   .1256957   .0444703     2.83   0.005     .0380445     .213347
       lnimr |   .1070733   .0641571     1.67   0.097    -.0193809    .2335275
       _cons |   1.741834   .4940647     3.53   0.001     .7680291     2.71564
-------------+----------------------------------------------------------------
     sigma_u |  .67217532
     sigma_e |  .14649365
         rho |  .95465604   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(16, 216) = 174.47                   Prob > F = 0.0000
  • For two-way fixed effects, the command is:
xtreg   rca_gvc   l.ai   lncd   lnpi   lnsize   lnimr  i.year ,fe    //  individual and time two-way fixed effects
The output is:
Fixed-effects (within) regression               Number of obs     =        238
Group variable: id                              Number of groups  =         17

R-sq:                                           Obs per group:
     within  = 0.2555                                         min =         14
     between = 0.0006                                         avg =       14.0
     overall = 0.0000                                         max =         14

                                                F(18,203)         =       3.87
corr(u_i, Xb)  = -0.6814                        Prob > F          =     0.0000

------------------------------------------------------------------------------
     rca_gvc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          ai |
         L1. |    .479147    .106974     4.48   0.000     .2682243    .6900697
             |
        lncd |    .229894   .1027937     2.24   0.026     .0272137    .4325743
        lnpi |  -.1892991   .1000526    -1.89   0.060    -.3865747    .0079765
      lnsize |   .3399658   .0713975     4.76   0.000       .19919    .4807415
       lnimr |   .0390783   .0704173     0.55   0.580    -.0997648    .1779214
             |
        year |
       2002  |  -.0395395    .050029    -0.79   0.430    -.1381826    .0591036
       2003  |   -.059694   .0540411    -1.10   0.271     -.166248    .0468599
       2004  |  -.1355244   .0630919    -2.15   0.033    -.2599239    -.011125
       2005  |  -.1629442   .0714925    -2.28   0.024    -.3039073   -.0219811
       2006  |  -.2361056   .0885851    -2.67   0.008    -.4107705   -.0614407
       2007  |  -.3275978   .1047054    -3.13   0.002    -.5340475   -.1211481
       2008  |  -.3937222    .123663    -3.18   0.002    -.6375509   -.1498935
       2009  |  -.4627217   .1311296    -3.53   0.001    -.7212724   -.2041711
       2010  |  -.5822361   .1501323    -3.88   0.000    -.8782549   -.2862174
       2011  |  -.6646753   .1765024    -3.77   0.000    -1.012688   -.3166623
       2012  |  -.7010857   .1884788    -3.72   0.000    -1.072713   -.3294585
       2013  |  -.7910881   .2010942    -3.93   0.000    -1.187589   -.3945869
       2014  |   -.894121   .2109027    -4.24   0.000    -1.309962   -.4782801
             |
       _cons |  -1.021565   .9437412    -1.08   0.280    -2.882358    .8392272
-------------+----------------------------------------------------------------
     sigma_u |   .8650854
     sigma_e |  .14220283
         rho |   .9736901   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(16, 203) = 182.56                   Prob > F = 0.0000

2. reg

  • In fact, all of the results above can be reproduced with reg by adding dummy variables.
  • Individual fixed effects with reg; the command and output are as follows:
 reg   rca_gvc   l.ai   lncd    lnpi   lnsize   lnimr   i.id    //  individual fixed effects

      Source |       SS           df       MS      Number of obs   =       238
-------------+----------------------------------   F(21, 216)      =    198.11
       Model |  89.2834348        21  4.25159213   Prob > F        =    0.0000
    Residual |  4.63544423       216   .02146039   R-squared       =    0.9506
-------------+----------------------------------   Adj R-squared   =    0.9458
       Total |   93.918879       237   .39628219   Root MSE        =    .14649

------------------------------------------------------------------------------
     rca_gvc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          ai |
         L1. |   .3066382   .0959536     3.20   0.002     .1175129    .4957634
             |
        lncd |  -.1003965   .0677319    -1.48   0.140    -.2338966    .0331035
        lnpi |  -.1923152   .0942642    -2.04   0.043    -.3781107   -.0065197
      lnsize |   .1256957   .0444703     2.83   0.005     .0380445     .213347
       lnimr |   .1070733   .0641571     1.67   0.097    -.0193809    .2335275
             |
          id |
          2  |   1.882314   .1139589    16.52   0.000       1.6577    2.106928
          3  |   .9406015   .1224498     7.68   0.000     .6992519    1.181951
          4  |   .8582898    .127059     6.76   0.000     .6078555    1.108724
          5  |  -.0231274   .0606444    -0.38   0.703    -.1426579    .0964031
          6  |   .3334653   .1081376     3.08   0.002     .1203253    .5466053
          7  |  -.1342764   .1270614    -1.06   0.292    -.3847154    .1161625
          8  |   .0188374   .0861588     0.22   0.827    -.1509821     .188657
          9  |  -.7181154   .0702175   -10.23   0.000    -.8565147   -.5797161
         10  |   .3462213   .0788628     4.39   0.000     .1907822    .5016604
         11  |   .3725729   .0869125     4.29   0.000     .2012678     .543878
         12  |   .5364166   .0660762     8.12   0.000     .4061799    .6666532
         13  |  -.1958921   .0961684    -2.04   0.043    -.3854407   -.0063436
         14  |  -.8969968   .1289829    -6.95   0.000    -1.151223   -.6427706
         15  |    .020054   .1922435     0.10   0.917    -.3588594    .3989674
         16  |  -.5532008    .266913    -2.07   0.039    -1.079288   -.0271133
         17  |  -.2964652   .0828788    -3.58   0.000      -.45982   -.1331104
             |
       _cons |   1.595323   .5000321     3.19   0.002     .6097556     2.58089
------------------------------------------------------------------------------

  • Individual and time two-way fixed effects with reg; the command and output are as follows:
. reg   rca_gvc   l.ai   lncd   lnpi   lnsize   lnimr   i.id   i.year  //  individual and time two-way fixed effects

      Source |       SS           df       MS      Number of obs   =       238
-------------+----------------------------------   F(34, 203)      =    130.63
       Model |  89.8138851        34  2.64158486   Prob > F        =    0.0000
    Residual |  4.10499394       203  .020221645   R-squared       =    0.9563
-------------+----------------------------------   Adj R-squared   =    0.9490
       Total |   93.918879       237   .39628219   Root MSE        =     .1422

------------------------------------------------------------------------------
     rca_gvc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          ai |
         L1. |    .479147    .106974     4.48   0.000     .2682243    .6900697
             |
        lncd |    .229894   .1027937     2.24   0.026     .0272137    .4325743
        lnpi |  -.1892991   .1000526    -1.89   0.060    -.3865747    .0079765
      lnsize |   .3399658   .0713975     4.76   0.000       .19919    .4807415
       lnimr |   .0390783   .0704173     0.55   0.580    -.0997648    .1779214
             |
          id |
          2  |   2.306169   .1521725    15.15   0.000     2.006128    2.606211
          3  |   1.299284   .1397215     9.30   0.000     1.023792    1.574775
          4  |   1.412836   .1691219     8.35   0.000     1.079375    1.746297
          5  |   .0287146   .0607942     0.47   0.637    -.0911544    .1485837
          6  |   .8257237   .1478991     5.58   0.000     .5341082    1.117339
          7  |  -.4835427   .1427926    -3.39   0.001    -.7650895   -.2019959
          8  |  -.1414247   .0902781    -1.57   0.119    -.3194277    .0365783
          9  |  -.6196797   .0711185    -8.71   0.000    -.7599054   -.4794541
         10  |   .4309111    .081576     5.28   0.000     .2700661    .5917561
         11  |   .3707501   .0894323     4.15   0.000     .1944148    .5470854
         12  |    .344849   .0791605     4.36   0.000     .1887668    .5009312
         13  |   -.080348   .1029481    -0.78   0.436    -.2833327    .1226366
         14  |  -1.006378   .1339116    -7.52   0.000    -1.270414   -.7423421
         15  |   .1569042   .2019668     0.78   0.438    -.2413176     .555126
         16  |   -1.03071   .2875253    -3.58   0.000    -1.597629   -.4637909
         17  |     .35005   .1753643     2.00   0.047     .0042808    .6958191
             |
        year |
       2002  |  -.0395395    .050029    -0.79   0.430    -.1381826    .0591036
       2003  |   -.059694   .0540411    -1.10   0.271     -.166248    .0468599
       2004  |  -.1355244   .0630919    -2.15   0.033    -.2599239    -.011125
       2005  |  -.1629442   .0714925    -2.28   0.024    -.3039073   -.0219811
       2006  |  -.2361056   .0885851    -2.67   0.008    -.4107705   -.0614407
       2007  |  -.3275978   .1047054    -3.13   0.002    -.5340475   -.1211481
       2008  |  -.3937222    .123663    -3.18   0.002    -.6375509   -.1498935
       2009  |  -.4627217   .1311296    -3.53   0.001    -.7212724   -.2041711
       2010  |  -.5822361   .1501323    -3.88   0.000    -.8782549   -.2862174
       2011  |  -.6646753   .1765024    -3.77   0.000    -1.012688   -.3166623
       2012  |  -.7010857   .1884788    -3.72   0.000    -1.072713   -.3294585
       2013  |  -.7910881   .2010942    -3.93   0.000    -1.187589   -.3945869
       2014  |   -.894121   .2109027    -4.24   0.000    -1.309962   -.4782801
             |
       _cons |  -1.266513   .9626042    -1.32   0.190    -3.164498    .6314721
------------------------------------------------------------------------------
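The match between the xtreg, fe and reg i.id coefficients shown above is what the Frisch-Waugh-Lovell theorem predicts, and it can be checked numerically. The sketch below assumes Stata's Grunfeld example data (variables company, year, invest, mvalue, kstock), which is not the author's dataset, and the scalar names b_fe and b_lsdv are made up for illustration.

```stata
* Illustrative check that the within estimator and LSDV give the same slope
webuse grunfeld, clear
xtset company year

* Within estimator: individual effects are swept out by demeaning
xtreg invest mvalue kstock, fe
scalar b_fe = _b[mvalue]

* LSDV: individual effects are estimated as dummy coefficients
reg invest mvalue kstock i.company
scalar b_lsdv = _b[mvalue]

* The two slopes should agree up to numerical precision
display "within: " b_fe "   LSDV: " b_lsdv
assert reldif(b_fe, b_lsdv) < 1e-8
```

If the assert passes silently, the two commands have estimated the same one-way individual fixed-effects model.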
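  • As the output above shows, reg also reports every individual and time dummy, which clutters the table. Another command, areg with absorb(), handles this well: put the categorical variable whose dummies you do not want displayed inside absorb(). In the run below, absorb() takes a single variable (year), so the other dimension still enters as i.id and its dummies are still listed.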

3. areg

The command and output are as follows:
. areg  rca_gvc  l.ai   lncd  lnpi   lnsize  lnimr  i.id , absorb(year) 
Linear regression, absorbing indicators         Number of obs     =        238
                                                F(  21,    203)   =     210.60
                                                Prob > F          =     0.0000
                                                R-squared         =     0.9563
                                                Adj R-squared     =     0.9490
                                                Root MSE          =     0.1422

------------------------------------------------------------------------------
     rca_gvc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          ai |
         L1. |    .479147    .106974     4.48   0.000     .2682243    .6900697
             |
        lncd |    .229894   .1027937     2.24   0.026     .0272137    .4325743
        lnpi |  -.1892991   .1000526    -1.89   0.060    -.3865747    .0079765
      lnsize |   .3399658   .0713975     4.76   0.000       .19919    .4807415
       lnimr |   .0390783   .0704173     0.55   0.580    -.0997648    .1779214
             |
          id |
          2  |   2.306169   .1521725    15.15   0.000     2.006128    2.606211
          3  |   1.299284   .1397215     9.30   0.000     1.023792    1.574775
          4  |   1.412836   .1691219     8.35   0.000     1.079375    1.746297
          5  |   .0287146   .0607942     0.47   0.637    -.0911544    .1485837
          6  |   .8257237   .1478991     5.58   0.000     .5341082    1.117339
          7  |  -.4835427   .1427926    -3.39   0.001    -.7650895   -.2019959
          8  |  -.1414247   .0902781    -1.57   0.119    -.3194277    .0365783
          9  |  -.6196797   .0711185    -8.71   0.000    -.7599054   -.4794541
         10  |   .4309111    .081576     5.28   0.000     .2700661    .5917561
         11  |   .3707501   .0894323     4.15   0.000     .1944148    .5470854
         12  |    .344849   .0791605     4.36   0.000     .1887668    .5009312
         13  |   -.080348   .1029481    -0.78   0.436    -.2833327    .1226366
         14  |  -1.006378   .1339116    -7.52   0.000    -1.270414   -.7423421
         15  |   .1569042   .2019668     0.78   0.438    -.2413176     .555126
         16  |   -1.03071   .2875253    -3.58   0.000    -1.597629   -.4637909
         17  |     .35005   .1753643     2.00   0.047     .0042808    .6958191
             |
       _cons |  -1.655874   1.052143    -1.57   0.117    -3.730405    .4186569
-------------+----------------------------------------------------------------
        year |        F(13, 203) =      2.018   0.021          (14 categories)
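Which dimension goes inside absorb() only changes what is displayed, not the slope estimates. A minimal sketch, again assuming the Grunfeld example data rather than the author's data, with hypothetical scalar names b_abs_id and b_abs_year:

```stata
* Illustrative sketch: two-way fixed effects with areg, absorbing either dimension
webuse grunfeld, clear

* Absorb the individual dimension, keep year dummies in the model
areg invest mvalue kstock i.year, absorb(company)
scalar b_abs_id = _b[mvalue]

* Absorb the time dimension, keep individual dummies in the model
areg invest mvalue kstock i.company, absorb(year)
scalar b_abs_year = _b[mvalue]

* Both runs fit the same two-way model, so the slopes should coincide
assert reldif(b_abs_id, b_abs_year) < 1e-8
```

Absorbing the dimension with more categories (usually the individual one) keeps the output table shortest.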
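This is manageable with two categorical fixed effects, but with more dimensions areg, absorb() becomes inconvenient. The community-contributed command reghdfe, absorb() was written to solve exactly this problem.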

4. reghdfe

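reghdfe is designed for linear regression with multi-way fixed effects. Sometimes we need to control for fixed effects in several dimensions (for example city, industry, and year). xtreg and similar commands can do this with dummy variables, but they run slowly; reghdfe addresses that pain point and is far faster. reghdfe is a community-contributed command written by Sergio Correia and must be installed before use (ssc install reghdfe).
reghdfe accepts multiple fixed effects directly: list them, separated by spaces, as absorb(var1 var2 ...). There is no need to add dummies with i.var, which makes it much more convenient than xtreg and similar commands, and it does not report a long list of dummy coefficients. It is the command I personally recommend most.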
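A minimal sketch of installing and calling reghdfe follows. It assumes the Grunfeld example data (not the author's data) and that the machine can reach SSC. reghdfe depends on the ftools package, so both install lines are shown, commented out so they are not re-run every time.

```stata
* One-time installation (uncomment on first use)
* ssc install ftools
* ssc install reghdfe

* Illustrative two-way fixed-effects fit on the Grunfeld example panel
webuse grunfeld, clear

* Fixed effects are listed inside absorb(), separated by spaces, without i. prefixes
reghdfe invest mvalue kstock, absorb(company year)
```

Adding a third dimension only requires one more variable name inside absorb().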

  • Individual and time two-way fixed effects with reghdfe; the command and output are as follows:
. reghdfe  rca_gvc   l.ai    lncd    lnpi   lnsize   lnimr ,absorb(year   id)     //  individual and time two-way fixed effects
(converged in 3 iterations)
HDFE Linear regression                            Number of obs   =        238
Absorbing 2 HDFE groups                           F(   5,    203) =      10.16
                                                  Prob > F        =     0.0000
                                                  R-squared       =     0.9563
                                                  Adj R-squared   =     0.9490
                                                  Within R-sq.    =     0.2002
                                                  Root MSE        =     0.1422

------------------------------------------------------------------------------
     rca_gvc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          ai |
         L1. |    .479147    .106974     4.48   0.000     .2682243    .6900697
             |
        lncd |    .229894   .1027937     2.24   0.026     .0272137    .4325743
        lnpi |  -.1892991   .1000526    -1.89   0.060    -.3865747    .0079765
      lnsize |   .3399658   .0713975     4.76   0.000       .19919    .4807415
       lnimr |   .0390783   .0704173     0.55   0.580    -.0997648    .1779214
-------------+----------------------------------------------------------------
    Absorbed |        F(29, 203) =    103.061   0.000             (Joint test)
------------------------------------------------------------------------------

Absorbed degrees of freedom:
---------------------------------------------------------------+
 Absorbed FE |  Num. Coefs.  =   Categories  -   Redundant     | 
-------------+-------------------------------------------------|
        year |           14              14              0     | 
          id |           16              17              1     | 
---------------------------------------------------------------+
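The table below summarizes how xtreg, reg, areg, and reghdfe specify each type of fixed effect.

Command   Individual FE    Time FE          Two-way (individual + time) FE
xtreg     , fe             i.year *         i.year , fe
reg       i.id             i.year           i.id i.year
areg      , absorb(id)     , absorb(year)   i.year , absorb(id)
reghdfe   , absorb(id)     , absorb(year)   , absorb(id year)

* xtreg defaults to the random-effects estimator when fe is omitted, so i.year alone gives year dummies inside a random-effects model. For a pure time fixed effect, reg, areg, or reghdfe is the cleaner choice. areg always requires absorb(), so its time-only form is absorb(year).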
  • Now compare the two-way fixed-effects estimates from xtreg, reg, areg, and reghdfe.
 esttab FE_xtreg FE_reg FE_areg FE_reghdfe ,b(%6.3f) se scalars(N r2) star(* 0.1 ** 0.05 *** 0.01) ///
   keep( L.ai  lncd  lnpi lnsize lnimr) nogaps mtitles("FE_xtreg" "FE_reg" "FE_areg" "FE_reghdfe")

----------------------------------------------------------------------------
                      (1)             (2)             (3)             (4)   
                 FE_xtreg          FE_reg         FE_areg      FE_reghdfe   
----------------------------------------------------------------------------
L.ai                0.479***        0.479***        0.479***        0.479***
                  (0.107)         (0.107)         (0.107)         (0.107)   
lncd                0.230**         0.230**         0.230**         0.230** 
                  (0.103)         (0.103)         (0.103)         (0.103)   
lnpi               -0.189*         -0.189*         -0.189*         -0.189*  
                  (0.100)         (0.100)         (0.100)         (0.100)   
lnsize              0.340***        0.340***        0.340***        0.340***
                  (0.071)         (0.071)         (0.071)         (0.071)   
lnimr               0.039           0.039           0.039           0.039   
                  (0.070)         (0.070)         (0.070)         (0.070)   
----------------------------------------------------------------------------
N                     238             238             238             238   
r2                  0.255           0.956           0.956           0.956   
----------------------------------------------------------------------------
Standard errors in parentheses
* p<0.1, ** p<0.05, *** p<0.01
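The esttab call above refers to four stored results whose creation is not shown. One plausible way to produce them, reusing the two-way commands from the earlier sections, is sketched below. It assumes the author's panel is in memory with xtset id year already declared, and that the community-contributed estout package (which provides esttab) is installed. Only the names FE_xtreg, FE_reg, FE_areg, and FE_reghdfe come from the original command.

```stata
* Sketch only: requires the author's data in memory and "xtset id year"
* esttab is part of the estout package: ssc install estout

xtreg rca_gvc l.ai lncd lnpi lnsize lnimr i.year, fe
estimates store FE_xtreg

reg rca_gvc l.ai lncd lnpi lnsize lnimr i.id i.year
estimates store FE_reg

areg rca_gvc l.ai lncd lnpi lnsize lnimr i.id, absorb(year)
estimates store FE_areg

reghdfe rca_gvc l.ai lncd lnpi lnsize lnimr, absorb(year id)
estimates store FE_reghdfe
```

Each estimates store call must directly follow the regression it saves, because it captures whatever result is currently active.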

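The summary table shows that xtreg, reg, areg, and reghdfe produce identical coefficients. Standard errors can differ slightly in other settings, but for these data they are the same. The r2 row differs only because xtreg, fe reports the within R-squared (0.255), whereas the other three commands report the R-squared of the full model including the fixed effects (0.956).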

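  • xtreg and reghdfe form one group: both sweep the fixed effects out of the data (the within approach) rather than estimating a coefficient for each dummy, so their standard errors agree with each other.
  • reg and areg form the other group: both are a special pooled OLS with dummy variables (the LSDV method), so their standard errors agree with each other. With the default standard errors used here all four commands coincide; the two groups can diverge when cluster-robust standard errors are requested, because they apply different degrees-of-freedom corrections.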
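The case where the two groups' standard errors differ can be sketched with clustered standard errors. The example assumes the Grunfeld example data rather than the author's data, and the scalar names se_xt and se_ar are hypothetical. The behavior described is what I understand the small-sample corrections to do and may vary across Stata versions.

```stata
* Illustrative comparison of cluster-robust standard errors across the two groups
webuse grunfeld, clear
xtset company year

* Within approach: absorbed individual effects are not counted in the correction
xtreg invest mvalue kstock i.year, fe vce(cluster company)
scalar se_xt = _se[mvalue]

* LSDV approach: absorbed individual effects are counted as estimated parameters
areg invest mvalue kstock i.year, absorb(company) vce(cluster company)
scalar se_ar = _se[mvalue]

* Same coefficient, but areg's clustered standard error is expected to be larger
display "xtreg SE: " se_xt "   areg SE: " se_ar
```

With few time periods per individual the gap can be noticeable, so it is worth stating which command produced the reported standard errors.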
