arch.univariate.HARX

class arch.univariate.HARX(y: ndarray | DataFrame | Series | None = None, x: ndarray | DataFrame | None = None, lags: None | int | Sequence[int] | Sequence[Sequence[int]] | ndarray = None, constant: bool = True, use_rotated: bool = False, hold_back: int | None = None, volatility: VolatilityProcess | None = None, distribution: Distribution | None = None, rescale: bool | None = None)[source]

Heterogeneous Autoregression (HAR), with optional exogenous regressors, model estimation and simulation

Parameters:
y: ndarray | DataFrame | Series | None = None

nobs element vector containing the dependent variable

x: ndarray | DataFrame | None = None

nobs by k element array containing exogenous regressors

lags: None | int | Sequence[int] | Sequence[Sequence[int]] | ndarray = None

Description of lag structure of the HAR.

  • A scalar includes all lags between 1 and the value.

  • A 1-d n-element array includes the HAR lags 1:lags[0]+1, 1:lags[1]+1, …, 1:lags[n-1]+1.

  • A 2-d (2,n)-element array includes the HAR lags of the form lags[0,j]:lags[1,j]+1 for all columns of lags.
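The 1-d and 2-d forms above can be sketched as inclusive (first, last) lag windows. This is only an illustration of the description, not the library's internal code: `har_windows` is a hypothetical helper, and the rotated branch assumes that each non-overlapping window starts right after the previous one ends.

```python
def har_windows(lags, use_rotated=False):
    """Return inclusive (first, last) lag windows for a 1-d or 2-d lag spec.

    Hypothetical helper for illustration; not part of arch.
    """
    if lags and isinstance(lags[0], (list, tuple)):
        # 2-d form: the first row holds window starts, the second row window ends
        starts, ends = lags
        return list(zip(starts, ends))
    ends = sorted(set(lags))
    if use_rotated:
        # Assumed rotated form: each window begins after the previous one ends
        starts = [1] + [end + 1 for end in ends[:-1]]
    else:
        # Overlapping form: every window begins at lag 1
        starts = [1] * len(ends)
    return list(zip(starts, ends))


print(har_windows([1, 5, 22]))                    # [(1, 1), (1, 5), (1, 22)]
print(har_windows([1, 5, 22], use_rotated=True))  # [(1, 1), (2, 5), (6, 22)]
print(har_windows([[1, 2, 6], [1, 5, 22]]))       # [(1, 1), (2, 5), (6, 22)]
```

Under these assumptions, the rotated 1-d specification and the 2-d specification `[[1, 2, 6], [1, 5, 22]]` describe the same windows.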

constant: bool = True

Flag indicating whether the model should include a constant

use_rotated: bool = False

Flag indicating whether to use the alternative rotated form of the HAR, in which the HAR lags do not overlap

hold_back: int | None = None

Number of observations at the start of the sample to exclude when estimating model parameters. Used when comparing models with different lag lengths to estimate on the common sample.

volatility: VolatilityProcess | None = None

Volatility process to use in the model

distribution: Distribution | None = None

Error distribution to use in the model

rescale: bool | None = None

Flag indicating whether to automatically rescale data if the scale of the data is likely to produce convergence issues when estimating model parameters. If False, the model is estimated on the data without transformation. If True, then y is rescaled and the new scale is reported in the estimation results.

Examples

Standard HAR with average lags 1, 5 and 22

>>> import numpy as np
>>> from arch.univariate import HARX
>>> y = np.random.RandomState(1234).randn(100)
>>> harx = HARX(y, lags=[1, 5, 22])
>>> res = harx.fit()

A standard HAR with average lags 1 and 6 but holding back 10 observations

>>> from pandas import Series, date_range
>>> index = date_range('2000-01-01', freq='M', periods=y.shape[0])
>>> y = Series(y, name='y', index=index)
>>> har = HARX(y, lags=[1, 6], hold_back=10)

Models with equivalent parametrizations of lags. The first uses overlapping lags.

>>> harx_1 = HARX(y, lags=[1,5,22])

The next uses rotated lags so that they do not overlap.

>>> harx_2 = HARX(y, lags=[1,5,22], use_rotated=True)

The third manually specifies overlapping lags.

>>> harx_3 = HARX(y, lags=[[1, 1, 1], [1, 5, 22]])

The final model manually specifies non-overlapping lags.

>>> harx_4 = HARX(y, lags=[[1, 2, 6], [1, 5, 22]])

It is simple to verify that these are equivalent by inspecting the R2.

>>> models = [harx_1, harx_2, harx_3, harx_4]
>>> print([mod.fit().rsquared for mod in models])
0.085, 0.085, 0.085, 0.085
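The equivalence can also be checked without the library. The following is a minimal sketch using plain NumPy least squares rather than the HARX estimator; `har_design` and `r_squared` are hypothetical helpers, and the window tuples are inclusive (first, last) lags chosen to mirror the overlapping and non-overlapping specifications above.

```python
import numpy as np


def har_design(y, windows):
    """Build a constant plus one lagged-average column per (first, last) window."""
    max_lag = max(last for _, last in windows)
    rows = range(max_lag, len(y))
    # For time t, average y[t-last], ..., y[t-first]
    cols = [
        [y[t - last : t - first + 1].mean() for t in rows]
        for first, last in windows
    ]
    X = np.column_stack([np.ones(len(rows))] + cols)
    return X, y[max_lag:]


def r_squared(X, target):
    """Centered R-squared from an ordinary least squares fit."""
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    centered = target - target.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)


y = np.random.RandomState(1234).randn(100)
overlapping = [(1, 1), (1, 5), (1, 22)]
rotated = [(1, 1), (2, 5), (6, 22)]

r2_overlapping = r_squared(*har_design(y, overlapping))
r2_rotated = r_squared(*har_design(y, rotated))
print(np.isclose(r2_overlapping, r2_rotated))
```

The two sets of regressors are linear combinations of one another, so they span the same column space and the fitted values, and hence the R2, coincide.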

Notes

The HAR-X model is described by

\[y_t = \mu + \sum_{i=1}^p \phi_{L_{i}} \bar{y}_{t-L_{i,0}:L_{i,1}} + \gamma' x_t + \epsilon_t\]

where \(\bar{y}_{t-L_{i,0}:L_{i,1}}\) is the average value of \(y_t\) between \(t-L_{i,0}\) and \(t - L_{i,1}\).
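As a worked instance, consider lags=[1, 5, 22] with no exogenous regressors, so that the windows are \((L_{i,0}, L_{i,1}) = (1,1), (1,5), (1,22)\). Assuming the averages are equally weighted, the model reads

\[y_t = \mu + \phi_{1} \bar{y}_{t-1:1} + \phi_{5} \bar{y}_{t-1:5} + \phi_{22} \bar{y}_{t-1:22} + \epsilon_t, \qquad \bar{y}_{t-1:h} = \frac{1}{h} \sum_{j=1}^{h} y_{t-j}\]

so \(\bar{y}_{t-1:1}\) is simply \(y_{t-1}\).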

Methods

bounds()

Construct bounds for parameters to use in non-linear optimization

compute_param_cov(params[, backcast, robust])

Computes parameter covariances using numerical derivatives.

constraints()

Construct linear constraint arrays for use in non-linear optimization

fit([update_freq, disp, starting_values, ...])

Estimate model parameters

fix(params[, first_obs, last_obs])

Allows an ARCHModelFixedResult to be constructed from fixed parameters.

forecast(params[, horizon, start, align, ...])

Construct forecasts from estimated model

parameter_names()

List of parameter names

resids(params[, y, regressors])

Compute model residuals

simulate(params, nobs[, burn, ...])

Simulates data from a linear regression, AR or HAR model

starting_values()

Returns starting values for the mean model, often the same as the values returned from fit

Properties

distribution

Sets or gets the error distribution

name

The name of the model.

num_params

Returns the number of parameters

volatility

Sets or gets the volatility process

x

Gets the value of the exogenous regressors in the model

y

Returns the dependent variable