Mixed-Frequency Data Analysis in R: A Worked Example of MIDAS Regression Forecasting

Contents

Original blog post: https://blog.csdn.net/s1164548515/article/details/101021959

Background

Loading the package

Loading the data

Data preview

Data preprocessing

MIDAS regression



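Background:

The goal is to forecast next quarter's GDP growth rate from quarterly GDP and monthly total non-farm payroll employment.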

Loading the package:

    library(midasr)

Loading the data:

    data("USqgdp")
    data("USpayems")

Data preview:

    USqgdp

Quarterly GDP, 1947 to 2013


   
   
   
   
             Qtr1     Qtr2     Qtr3     Qtr4
    1947    243.1    246.3    250.1    260.3
    1948    266.2    272.9    279.5    280.7
    1949    275.4    271.7    273.3    271.0
    1950    281.2    290.7    308.5    320.3
    1951    336.4    344.5    351.8    356.6
    1952    360.2    361.4    368.1    381.2
    1953    388.5    392.3    391.7    386.5
    1954    385.9    386.7    391.6    400.3
    1955    413.8    422.2    430.9    437.8
    1956    440.5    446.8    452.0    461.3
    1957    470.6    472.8    480.3    475.7
    1958    468.4    472.8    486.7    500.4
    1959    511.1    524.2    525.2    529.3
    1960    543.3    542.7    546.0    541.1
    1961    545.9    557.4    568.2    581.6
    1962    595.2    602.6    609.6    613.1
    1963    622.7    631.8    645.0    654.8
    1964    671.2    680.8    692.8    698.4
    1965    719.2    732.4    750.2    773.1
    1966    797.3    807.2    820.8    834.9
    1967    846.0    851.1    866.6    883.2
    1968    911.1    936.3    952.3    970.1
    1969    995.4   1011.4   1032.0   1040.7
    1970   1053.5   1070.1   1088.5   1091.5
    1971   1137.8   1159.4   1180.3   1193.6
    1972   1233.8   1270.1   1293.8   1332.0
    1973   1380.7   1417.6   1436.8   1479.1
    1974   1494.7   1534.2   1563.4   1603.0
    1975   1619.6   1656.4   1713.8   1765.9
    1976   1824.5   1856.9   1890.5   1938.4
    1977   1992.5   2060.2   2122.4   2168.7
    1978   2208.7   2336.6   2398.9   2482.2
    1979   2531.6   2595.9   2670.4   2730.7
    1980   2796.5   2799.9   2860.0   2993.5
    1981   3131.8   3167.2   3261.2   3283.5
    1982   3273.8   3331.3   3367.1   3407.8
    1983   3480.3   3583.8   3692.3   3796.1
    1984   3912.8   4015.0   4087.4   4147.6
    1985   4237.0   4302.3   4394.6   4453.1
    1986   4516.3   4555.2   4619.6   4669.4
    1987   4736.2   4821.4   4900.5   5022.7
    1988   5090.6   5207.7   5299.5   5412.7
    1989   5527.3   5628.4   5711.5   5763.4
    1990   5890.8   5974.6   6029.5   6023.3
    1991   6054.8   6143.6   6218.4   6279.3
    1992   6380.8   6492.3   6586.5   6697.5
    1993   6748.2   6829.6   6904.2   7032.8
    1994   7136.2   7269.8   7352.2   7476.6
    1995   7545.3   7604.9   7706.5   7799.5
    1996   7893.1   8061.5   8159.0   8287.0
    1997   8402.0   8551.9   8691.7   8788.3
    1998   8889.7   8994.7   9146.5   9325.6
    1999   9450.3   9561.5   9718.7   9932.3
    2000  10036.1  10283.7  10363.8  10475.3
    2001  10512.5  10641.6  10644.3  10702.7
    2002  10837.3  10938.0  11039.8  11105.7
    2003  11230.8  11371.4  11628.4  11818.5
    2004  11991.4  12183.5  12369.4  12563.8
    2005  12816.2  12975.7  13206.5  13383.3
    2006  13649.8  13802.9  13910.5  14068.4
    2007  14235.0  14424.5  14571.9  14690.0
    2008  14672.9  14817.1  14844.3  14546.7
    2009  14381.2  14342.1  14384.4  14564.1
    2010  14672.5  14879.2  15049.8  15231.7
    2011  15242.9  15461.9  15611.8  15818.7
    2012  16041.6  16160.4  16356.0  16420.3
    2013  16535.3  16661.0  16912.9  17089.6
    USpayems

Monthly total non-farm payroll employment, January 1939 to March 2014


   
   
   
   
            Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec
    1939  29923  30101  30280  30094  30300  30502  30419  30663  31032  31408  31469  31539
    1940  31603  31715  31826  31700  31880  31978  31942  32352  32810  33265  33668  34172
    1941  34480  34844  35094  35469  36182  36651  37137  37544  37835  37948  38024  38104
    1942  38347  38513  38936  39352  39772  40028  40471  40988  41255  41515  41673  41915
    1943  42172  42395  42553  42647  42596  42781  42701  42546  42485  42675  42820  42746
    1944  42655  42544  42292  42063  41985  41947  41905  41850  41672  41709  41712  41861
    1945  41897  41904  41796  41443  41304  41149  40873  40467  38500  38599  38997  39112
    1946  39832  39251  40193  40909  41349  41733  42153  42643  42909  43094  43397  43379
    1947  43539  43563  43606  43492  43638  43808  43743  43959  44201  44415  44487  44579
    1948  44682  44537  44681  44370  44795  45033  45160  45176  45295  45251  45194  45029
    1949  44671  44500  44238  44230  43982  43739  43529  43622  43784  42950  43245  43517
    1950  43528  43298  43952  44376  44718  45084  45454  46188  46442  46712  46778  46855
    1951  47288  47577  47871  47856  47953  48068  48062  48009  47955  48009  48148  48309
    1952  48298  48522  48504  48616  48645  48286  48144  48923  49319  49598  49816  50164
    1953  50145  50339  50475  50432  50491  50522  50536  50487  50365  50242  49907  49702
    1954  49468  49382  49158  49178  48965  48896  48835  48825  48882  48944  49178  49331
    1955  49497  49644  49963  50247  50512  50790  50985  51111  51262  51431  51592  51805
    1956  51975  52167  52295  52375  52506  52584  51954  52630  52601  52781  52822  52930
    1957  52888  53097  53157  53238  53149  53066  53123  53126  52932  52765  52559  52385
    1958  52077  51576  51300  51027  50913  50912  51037  51231  51506  51486  51944  52088
    1959  52480  52687  53016  53320  53549  53679  53803  53334  53429  53359  53635  54175
    1960  54274  54513  54458  54812  54473  54347  54304  54271  54228  54144  53962  53744
    1961  53683  53556  53662  53626  53785  53977  54123  54298  54388  54522  54743  54871
    1962  54891  55187  55276  55602  55627  55644  55746  55838  55977  56041  56056  56028
    1963  56116  56230  56322  56580  56616  56658  56794  56910  57077  57284  57255  57360
    1964  57487  57751  57898  57922  58089  58221  58413  58619  58903  58794  59217  59421
    1965  59583  59800  60003  60259  60492  60690  60963  61228  61490  61718  61997  62321
    1966  62528  62796  63192  63436  63711  64110  64301  64507  64644  64854  65019  65200
    1967  65407  65428  65530  65467  65619  65750  65887  66142  66164  66225  66703  66900
    1968  66805  67215  67295  67555  67653  67904  68125  68328  68487  68720  68985  69246
    1969  69438  69700  69905  70072  70328  70636  70729  71006  70917  71120  71087  71240
    1970  71176  71304  71452  71348  71123  71029  71053  70933  70948  70519  70409  70790
    1971  70866  70806  70859  71037  71247  71253  71315  71370  71617  71642  71846  72108
    1972  72445  72652  72945  73163  73467  73760  73708  74138  74263  74673  74967  75270
    1973  75621  76017  76285  76455  76646  76887  76911  77166  77276  77606  77912  78035
    1974  78104  78254  78296  78382  78547  78602  78635  78619  78611  78629  78261  77657
    1975  77297  76919  76649  76461  76623  76520  76769  77155  77230  77535  77680  78018
    1976  78506  78817  79049  79292  79311  79376  79547  79704  79892  79905  80237  80448
    1977  80692  80988  81391  81729  82089  82488  82836  83074  83532  83794  84173  84408
    1978  84595  84948  85461  86163  86509  86951  87205  87481  87618  87954  88391  88673
    1979  88810  89054  89480  89418  89791  90109  90215  90297  90325  90482  90576  90673
    1980  90802  90882  90994  90850  90419  90099  89837  90097  90210  90491  90748  90943
    1981  91037  91105  91210  91283  91293  91490  91602  91566  91479  91380  91171  90893
    1982  90567  90562  90432  90152  90107  89864  89522  89364  89183  88906  88783  88769
    1983  88993  88918  89090  89366  89643  90022  90440  90132  91247  91518  91871  92227
    1984  92673  93154  93429  93792  94100  94479  94792  95034  95344  95630  95979  96107
    1985  96373  96497  96843  97039  97313  97459  97649  97842  98045  98233  98442  98609
    1986  98734  98841  98935  99122  99249  99155  99473  99587  99934 100120 100306 100511
    1987 100683 100915 101164 101502 101728 101900 102247 102418 102646 103138 103370 103664
    1988 103758 104211 104487 104732 104961 105324 105546 105670 106009 106277 106616 106906
    1989 107168 107426 107619 107792 107910 108026 108066 108115 108365 108476 108753 108849
    1990 109183 109432 109647 109688 109838 109863 109833 109613 109525 109366 109216 109160
    1991 109039 108735 108577 108367 108240 108338 108302 108308 108340 108356 108299 108324
    1992 108378 108313 108368 108527 108654 108721 108790 108930 108966 109145 109284 109494
    1993 109805 110047 109998 110306 110573 110754 111053 111212 111451 111737 111999 112311
    1994 112583 112783 113248 113597 113931 114247 114624 114902 115253 115468 115887 116162
    1995 116487 116691 116913 117075 117059 117294 117395 117644 117885 118041 118189 118321
    1996 118303 118735 119001 119165 119485 119774 120029 120202 120427 120677 120976 121146
    1997 121382 121684 122000 122293 122551 122818 123131 123092 123604 123945 124251 124554
    1998 124830 125026 125177 125456 125862 126080 126204 126551 126775 126971 127254 127601
    1999 127726 128137 128244 128619 128831 129092 129411 129578 129791 130192 130483 130778
    2000 131008 131138 131606 131893 132119 132074 132251 132237 132371 132357 132582 132724
    2001 132694 132766 132741 132460 132422 132293 132178 132020 131778 131454 131160 130989
    2002 130847 130714 130695 130615 130607 130664 130579 130564 130504 130629 130639 130481
    2003 130575 130422 130212 130167 130156 130166 130189 130148 130250 130446 130462 130586
    2004 130747 130791 131123 131372 131679 131753 131785 131917 132079 132425 132490 132619
    2005 132753 132992 133126 133489 133664 133909 134282 134478 134545 134629 134966 135125
    2006 135402 135717 135997 136179 136202 136279 136486 136670 136827 136829 137039 137210
    2007 137448 137536 137724 137802 137946 138017 137984 137968 138053 138135 138253 138350
    2008 138365 138279 138199 137985 137803 137631 137421 137162 136710 136236 135471 134774
    2009 133976 133275 132449 131765 131411 130944 130617 130401 130174 129976 129970 129687
    2010 129705 129655 129811 130062 130578 130456 130395 130353 130296 130537 130674 130745
    2011 130815 130983 131195 131517 131619 131836 131942 132064 132285 132468 132632 132828
    2012 133188 133414 133657 133753 133863 133951 134111 134261 134422 134647 134850 135064
    2013 135261 135541 135682 135885 136084 136285 136434 136636 136800 137037 137311 137395
    2014 137539 137736 137928

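Data preprocessing:

1. Cut the data down to the sample used for training and evaluation:

y: GDP, 1947 Q1 to 2011 Q2

x: total non-farm payroll employment, January 1939 to July 2011

    y <- window(USqgdp, end = c(2011, 2))
    x <- window(USpayems, end = c(2011, 7))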

2. Compute log differences

Growth rates in percent are computed as 100 * (log(y_t) - log(y_{t-1})), and likewise for x.

    yg <- diff(log(y)) * 100
    xg <- diff(log(x)) * 100
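
As a quick sanity check on this transformation, the sketch below applies it to the first four GDP values from the preview above (1947 Q1 to Q4) and compares the result with the ordinary percentage change. The variable names are illustrative only and are not part of the original analysis.

```r
# First four quarterly GDP values from the USqgdp preview (1947 Q1 to Q4).
gdp_1947 <- ts(c(243.1, 246.3, 250.1, 260.3), start = c(1947, 1), frequency = 4)

# Log difference in percent: 100 * (log(y_t) - log(y_{t-1})).
log_growth <- diff(log(gdp_1947)) * 100

# Ordinary percentage change, for comparison.
pct_growth <- (gdp_1947[-1] / gdp_1947[-4] - 1) * 100

# Differencing drops one observation: 4 levels give 3 growth rates.
print(length(log_growth))
print(cbind(log_growth = as.numeric(log_growth), pct_growth = pct_growth))
```

For growth rates of a few percent the two measures are close, which is why the log difference is commonly read as a percentage growth rate.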

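3. Align the data

Two things need fixing. First, taking log differences drops the first observation of each series. Second, the two series start and end on different dates, so both are padded with NA until they cover the same span (January 1939 to September 2011, that is 1939 Q1 to 2011 Q3).

    nx <- ts(c(NA, xg, NA, NA), start = start(x), frequency = 12)       # monthly series
    ny <- ts(c(rep(NA, 33), yg, NA), start = start(x), frequency = 4)   # quarterly series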
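
The padding counts can be checked without the real data. The sketch below uses placeholder series with the same date spans as x and y (the values are dummies, only the dates matter) and confirms that the padded monthly series is exactly three times as long as the padded quarterly one.

```r
# Placeholder series with the same spans as x and y above (dummy values).
x <- ts(seq_len(871), start = c(1939, 1), frequency = 12)  # Jan 1939 to Jul 2011
y <- ts(seq_len(258), start = c(1947, 1), frequency = 4)   # 1947 Q1 to 2011 Q2
xg <- diff(log(x)) * 100
yg <- diff(log(y)) * 100

nx <- ts(c(NA, xg, NA, NA), start = start(x), frequency = 12)
ny <- ts(c(rep(NA, 33), yg, NA), start = start(x), frequency = 4)

# 33 = quarters from 1939 Q1 through 1947 Q1 inclusive:
# 32 quarters with no GDP data plus 1 quarter lost to differencing.
lead_na <- (1947 - 1939) * 4 + 1

print(end(nx))                       # last month of the padded monthly series
print(end(ny))                       # last quarter of the padded quarterly series
print(length(nx) == 3 * length(ny))  # three months per quarter
```

Keeping the monthly length an exact multiple of the quarterly length matters because the frequency alignment used later assumes a fixed number of high-frequency observations per low-frequency period.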

4. Plot the aligned series

    plot.ts(nx, xlab = "Time", ylab = "Percentages", col = 4, ylim = c(-5, 6))
    lines(ny, col = 2)

[Figure 1: monthly employment growth nx (blue) and quarterly GDP growth ny (red), in percent, plotted against time]

MIDAS regression:

1. Select the estimation sample: non-farm payroll employment growth from January 1985 to March 2009, and GDP growth from 1985 Q1 to 2009 Q1

    xx <- window(nx, start = c(1985, 1), end = c(2009, 3))
    yy <- window(ny, start = c(1985, 1), end = c(2009, 1))

2. Fit the models

mod_1:

    mod_1 <- midas_r(yy ~ mls(yy, 1, 1) + mls(xx, 3:11, 3, nbeta), start = list(xx = c(1.7, 1, 5)))  # lag weights restricted by nbeta
    coef(mod_1)

Coefficients:

    (Intercept)          yy         xx1         xx2         xx3
      0.8311331   0.1056152   2.5924606   1.0149478  12.4761333
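
The term mls(xx, 3:11, 3) lines each quarter up with monthly lags 3 to 11 of xx, counted back from the last month of that quarter, so the regressors only use employment data up to the end of the previous quarter. The sketch below mimics that alignment in base R on a toy series. Note that stack_lags is a hypothetical helper written for illustration; it is not the midasr implementation.

```r
# Hypothetical helper: for each low-frequency period t, pick the
# high-frequency observations at positions m * t - k for each lag k.
stack_lags <- function(x, lags, m) {
  n_low <- length(x) %/% m
  stopifnot(length(x) == n_low * m)
  out <- matrix(NA_real_, nrow = n_low, ncol = length(lags),
                dimnames = list(NULL, paste0("lag", lags)))
  for (j in seq_along(lags)) {
    idx <- m * seq_len(n_low) - lags[j]
    ok <- idx >= 1                 # lags reaching before the sample stay NA
    out[ok, j] <- x[idx[ok]]
  }
  out
}

# Toy monthly series 1..12 (four quarters), lags 3 to 5, three months per quarter.
toy <- stack_lags(1:12, lags = 3:5, m = 3)
print(toy)
```

In the toy output the second row holds months 3, 2 and 1, that is the three months of the first quarter, which is the information available when forecasting the second quarter.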

​​mod_2: 


   
   
   
   
    mod_2 <- midas_r(yy ~ mls(yy, 1, 1) + mls(xx, 3:11, 3, nbetaMT), start = list(xx = c(2, 1, 5, 0)))  # lag weights restricted by nbetaMT
    coef(mod_2)

Coefficients:

    (Intercept)          yy         xx1         xx2         xx3         xx4
     0.93868569  0.06618413  2.27682100  0.98674913  1.50874150 -0.09164647

​​mod_3: 


   
   
   
   
    mod_3 <- midas_r(yy ~ mls(yy, 1, 1) + mls(xx, 3:11, 3), start = NULL)  # unrestricted lag coefficients
    coef(mod_3)

Coefficients:

    (Intercept)          yy         xx1         xx2         xx3         xx4         xx5         xx6         xx7         xx8         xx9
     0.92989757  0.08358393  2.00047205  0.88134597  0.42964662 -0.17596814  0.28351010  1.16285271 -0.53081967 -0.73391876 -1.18732001
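
To see what the nbeta restriction in mod_1 does, the sketch below rebuilds the nine lag coefficients implied by the mod_1 estimates and sets them beside the nine unrestricted coefficients of mod_3. It assumes the normalized beta form, with weights proportional to xi^(theta2 - 1) * (1 - xi)^(theta3 - 1) on an evenly spaced grid xi and scaled to sum to theta1. That form, and the helper nbeta_sketch itself, are assumptions made for illustration; the package function may differ in detail.

```r
# Hypothetical stand-in for a normalized beta lag polynomial (an assumption,
# not the midasr source): p = c(theta1, theta2, theta3), d = number of lags.
nbeta_sketch <- function(p, d) {
  eps <- .Machine$double.eps
  xi <- (seq_len(d) - 1) / (d - 1)
  xi[1] <- xi[1] + eps   # keep the endpoints strictly inside (0, 1)
  xi[d] <- xi[d] - eps
  shape <- xi^(p[2] - 1) * (1 - xi)^(p[3] - 1)
  p[1] * shape / sum(shape)
}

# xx1, xx2, xx3 from coef(mod_1) above.
restricted <- nbeta_sketch(c(2.5924606, 1.0149478, 12.4761333), d = 9)

# xx1 ... xx9 from coef(mod_3) above.
unrestricted <- c(2.00047205, 0.88134597, 0.42964662, -0.17596814, 0.28351010,
                  1.16285271, -0.53081967, -0.73391876, -1.18732001)

print(round(cbind(lag = 3:11, restricted, unrestricted), 4))
```

Under this assumed form three parameters generate a smooth, declining profile across all nine lags, whereas mod_3 estimates each lag coefficient freely and lets them change sign.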

3. Split into training and test sets

    fulldata <- list(xx = window(nx, start = c(1985, 1), end = c(2011, 6)), yy = window(ny, start = c(1985, 1), end = c(2011, 2)))
    insample <- 1:length(yy)                             # training set
    outsample <- (1:length(fulldata$yy))[-insample]      # test set
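
The index arithmetic behind this split can be checked on a dummy quarterly series with the same dates as ny. The names yy_full and the dummy values below are illustrative only.

```r
# Dummy quarterly series spanning the same dates as ny (1939 Q1 to 2011 Q3).
ny <- ts(seq_len(291), start = c(1939, 1), frequency = 4)

yy <- window(ny, start = c(1985, 1), end = c(2009, 1))       # estimation sample
yy_full <- window(ny, start = c(1985, 1), end = c(2011, 2))  # estimation plus evaluation

insample <- 1:length(yy)
outsample <- (1:length(yy_full))[-insample]

print(length(insample))          # number of quarters from 1985 Q1 to 2009 Q1
print(range(outsample))          # positions of the held-out quarters in yy_full
print(time(yy_full)[outsample])  # their dates, 2009 Q2 to 2011 Q2
```

The held-out block is the nine quarters after the estimation sample, and the monthly window in fulldata (January 1985 to June 2011) covers exactly three months for each of the 106 quarters.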

4. Model averaging over mod_1 to mod_3

    avgf <- average_forecast(list(mod_1, mod_2, mod_3), data = fulldata, insample = insample, outsample = outsample)
    sqrt(avgf$accuracy$individual$MSE.out.of.sample)  # out-of-sample RMSE of each model

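Out-of-sample root mean squared error (the square root of the MSE) for the three models, in the order mod_1, mod_2, mod_3. The unrestricted mod_3 has the lowest error, by a small margin.

    [1] 0.5383774 0.4770977 0.4457144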
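
Because of the sqrt() call, the numbers reported here are root mean squared errors rather than raw MSEs. As a reminder of what that measures, here is a minimal sketch with made-up actual and forecast values; the numbers are hypothetical and unrelated to the GDP data.

```r
# Root mean squared error between realised values and forecasts.
rmse <- function(actual, forecast) sqrt(mean((actual - forecast)^2))

# Hypothetical growth rates and forecasts, for illustration only.
actual   <- c(0.5, 1.0, -0.2, 0.8)
forecast <- c(0.7, 0.6,  0.1, 0.8)

print(rmse(actual, forecast))
```

The RMSE is in the same units as the target, here percentage points of quarterly growth, which makes the three values directly comparable.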
   
   
   
   

 
