VAR model with Python + statsmodels

I am an avid user of R, but recently switched to Python for a few different reasons. However, I am struggling a little to run the vector autoregression (VAR) model from statsmodels in Python.

Q#1. I get an error when I run this, and I have a suspicion it has something to do with the type of my vector.

import numpy as np
import statsmodels.tsa.api
from statsmodels import datasets
import datetime as dt
import pandas as pd
from pandas import Series
from pandas import DataFrame
import os

df = pd.read_csv('myfile.csv')
speedonly = DataFrame(df['speed'])
results = statsmodels.tsa.api.VAR(speedonly)

Traceback (most recent call last):
  File "", line 1, in <module>
    results = statsmodels.tsa.api.VAR(speedonly)
  File "C:\Python27\lib\site-packages\statsmodels\tsa\vector_ar\var_model.py", line 336, in __init__
    super(VAR, self).__init__(endog, None, dates, freq)
  File "C:\Python27\lib\site-packages\statsmodels\tsa\base\tsa_model.py", line 40, in __init__
    self._init_dates(dates, freq)
  File "C:\Python27\lib\site-packages\statsmodels\tsa\base\tsa_model.py", line 54, in _init_dates
    raise ValueError("dates must be of type datetime")
ValueError: dates must be of type datetime

I also tried the VAR model with a third, shorter vector, ts, from page 293 of Wes McKinney's "Python for Data Analysis", and it doesn't work either.

Okay, so now I'm thinking it's because the vectors are different types:

>>> speedonly.head()
     speed
0  559.984
1  559.984
2  559.984
3  559.984
4  559.984

>>> type(speedonly)
<class 'pandas.core.frame.DataFrame'>  #DOESN'T WORK
>>> type(data)
<type 'numpy.ndarray'>  #WORKS

>>> ts
2011-01-02   -0.682317
2011-01-05    1.121983
2011-01-07    0.507047
2011-01-08   -0.038240
2011-01-10   -0.890730
2011-01-12   -0.388685
>>> type(ts)
#DOESN'T WORK

So I convert speedonly to an ndarray... and it still doesn't work. But this time I get another error:

>>> nda_speedonly = np.array(speedonly)
>>> results = statsmodels.tsa.api.VAR(nda_speedonly)
Traceback (most recent call last):
  File "", line 1, in <module>
    results = statsmodels.tsa.api.VAR(nda_speedonly)
  File "C:\Python27\lib\site-packages\statsmodels\tsa\vector_ar\var_model.py", line 345, in __init__
    self.neqs = self.endog.shape[1]
IndexError: tuple index out of range

Any suggestions?

Q#2. I have exogenous feature variables in my data set that appear to be useful for predictions. Is the above model from statsmodels even the best one to use?

Solution

When you give a pandas object to a time-series model, it expects the index to contain dates. The error message is improved in the current source (to be released soon):

ValueError: Given a pandas object and the index does not contain dates
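One way to satisfy this requirement is to give the DataFrame a date index before passing it to the model. The following is only a minimal sketch: the values, the start date, and the daily frequency are assumptions made for illustration, since the question does not say how the observations in myfile.csv are timestamped.

```python
import pandas as pd

# Hypothetical stand-in for the 'speed' column read from myfile.csv.
speedonly = pd.DataFrame({"speed": [559.984, 559.984, 560.2, 559.7, 560.1]})

# Assumption: one observation per day, starting on an arbitrary date.
speedonly.index = pd.date_range("2011-01-01", periods=len(speedonly), freq="D")

# The index is now a DatetimeIndex with a known frequency.
has_dates = isinstance(speedonly.index, pd.DatetimeIndex)
print(has_dates, speedonly.index.freqstr)
```

If the CSV already contains a timestamp column, passing index_col and parse_dates=True to pd.read_csv would be a more direct route.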

In the second case, you're giving a single 1-D series to a VAR. VARs are used when you have more than one series. That's why you get the shape error: it expects a second dimension in your array. We could probably improve the error message here. For a single-series AR model with exogenous variables, you probably want to use sm.tsa.ARMA. Note that there is a known bug in ARMA.predict for models with exogenous variables, to be fixed soon. If you could provide a test case for this, it would be helpful.
