Mean Models

All ARCH models start by specifying a mean model.

No Mean

class arch.univariate.ZeroMean(y=None, hold_back=None, volatility=None, distribution=None, rescale=None)[source]

Model with zero conditional mean estimation and simulation

Parameters:
  • y ({ndarray, Series}) – nobs element vector containing the dependent variable
  • hold_back (int) – Number of observations at the start of the sample to exclude when estimating model parameters. Used when comparing models with different lag lengths to estimate on the common sample.
  • volatility (VolatilityProcess, optional) – Volatility process to use in the model
  • distribution (Distribution, optional) – Error distribution to use in the model
  • rescale (bool, optional) – Flag indicating whether to automatically rescale data if the scale of the data is likely to produce convergence issues when estimating model parameters. If False, the model is estimated on the data without transformation. If True, then y is rescaled and the new scale is reported in the estimation results.

Examples

>>> import numpy as np
>>> from arch.univariate import ZeroMean
>>> y = np.random.randn(100)
>>> zm = ZeroMean(y)
>>> res = zm.fit()

Notes

The zero mean model is described by

\[y_t = \epsilon_t\]

fit(update_freq=1, disp='final', starting_values=None, cov_type='robust', show_warning=True, first_obs=None, last_obs=None, tol=None, options=None, backcast=None)

Fits the model given a nobs by 1 vector of sigma2 values

Parameters:
  • update_freq (int, optional) – Frequency of iteration updates. Output is generated every update_freq iterations. Set to 0 to disable iterative output.
  • disp (str) – Either ‘final’ to print optimization result or ‘off’ to display nothing
  • starting_values (ndarray, optional) – Array of starting values to use. If not provided, starting values are constructed by the model components.
  • cov_type (str, optional) – Estimation method of parameter covariance. Supported options are ‘robust’, which does not assume the Information Matrix Equality holds, and ‘classic’, which does. In the ARCH literature, ‘robust’ corresponds to the Bollerslev-Wooldridge covariance estimator.
  • show_warning (bool, optional) – Flag indicating whether convergence warnings should be shown.
  • first_obs ({int, str, datetime, Timestamp}) – First observation to use when estimating model
  • last_obs ({int, str, datetime, Timestamp}) – Last observation to use when estimating model
  • tol (float, optional) – Tolerance for termination.
  • options (dict, optional) – Options to pass to scipy.optimize.minimize. Valid entries include ‘ftol’, ‘eps’, ‘disp’, and ‘maxiter’.
  • backcast (float, optional) – Value to use as backcast. Should be a measure of \(\sigma^2_0\), since model-specific non-linear transformations are applied to the value before computing the variance recursions.
Returns:

results – Object containing model results

Return type:

ARCHModelResult

Notes

A ConvergenceWarning is raised if SciPy’s optimizer indicates difficulty finding the optimum.

Parameters are optimized using SLSQP.

fix(params, first_obs=None, last_obs=None)

Allows an ARCHModelFixedResult to be constructed from fixed parameters.

Parameters:
  • params ({ndarray, Series}) – User specified parameters to use when generating the result. Must have the correct number of parameters for a given choice of mean model, volatility model and distribution.
  • first_obs ({int, str, datetime, Timestamp}) – First observation to use when fixing model
  • last_obs ({int, str, datetime, Timestamp}) – Last observation to use when fixing model
Returns:

results – Object containing model results

Return type:

ARCHModelFixedResult

Notes

Parameters are not checked against model-specific constraints.

forecast(params, horizon=1, start=None, align='origin', method='analytic', simulations=1000, rng=None, random_state=None)

Construct forecasts from estimated model

Parameters:
  • params ({ndarray, Series}, optional) – Alternative parameters to use. If not provided, the parameters estimated when fitting the model are used. Must be identical in shape to the parameters computed by fitting the model.
  • horizon (int, optional) – Number of steps to forecast
  • start ({int, datetime, Timestamp, str}, optional) – An integer, datetime or str indicating the first observation to produce the forecast for. Datetimes can only be used with pandas inputs that have a datetime index. Strings must be convertible to a date time, such as in ‘1945-01-01’.
  • align (str, optional) – Either ‘origin’ or ‘target’. When set to ‘origin’, the t-th row of forecasts contains the forecasts for t+1, t+2, …, t+h. When set to ‘target’, the t-th row contains the 1-step ahead forecast from time t-1, the 2-step from time t-2, …, and the h-step from time t-h. ‘target’ simplifies computing forecast errors since the realization and h-step forecast are aligned.
  • method ({'analytic', 'simulation', 'bootstrap'}) – Method to use when producing the forecast. The default is analytic. The method only affects the variance forecast generation. Not all volatility models support all methods. In particular, volatility models that do not evolve in squares such as EGARCH or TARCH do not support the ‘analytic’ method for horizons > 1.
  • simulations (int) – Number of simulations to run when computing the forecast using either simulation or bootstrap.
  • rng (callable, optional) – Custom random number generator to use in simulation-based forecasts. Must produce random samples using the syntax rng(size), where size is the 2-element tuple (simulations, horizon).
  • random_state (RandomState, optional) – NumPy RandomState instance to use when method is ‘bootstrap’
Returns:

forecasts – t by h data frame containing the forecasts. The alignment of the forecasts is controlled by align.

Return type:

ARCHModelForecast

Examples

>>> import pandas as pd
>>> from arch import arch_model
>>> am = arch_model(None,mean='HAR',lags=[1,5,22],vol='Constant')
>>> sim_data = am.simulate([0.1,0.4,0.3,0.2,1.0], 250)
>>> sim_data.index = pd.date_range('2000-01-01',periods=250)
>>> am = arch_model(sim_data['data'],mean='HAR',lags=[1,5,22],  vol='Constant')
>>> res = am.fit()
>>> fig = res.hedgehog_plot()

Notes

The most basic 1-step ahead forecast will return a vector with the same length as the original data, where the t-th value will be the time-t forecast for time t + 1. When the horizon is > 1, and when using the default value for align, the forecast value in position [t, h] is the time-t, h+1 step ahead forecast.

If the model contains exogenous variables (model.x is not None), then only 1-step ahead forecasts are available. Using horizon > 1 will produce a warning and all columns, except the first, will be nan-filled.

If align is ‘origin’, forecast[t,h] contains the forecast made using y[:t] (that is, up to but not including t) for horizon h + 1. For example, forecast[100,2] contains the 3-step ahead forecast using the first 100 data points, which will correspond to the realization y[100 + 2]. If align is ‘target’, then the same forecast is in location [102, 2], so that it is aligned with the observation to use when evaluating, but still in the same column.
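
The relationship between the two alignments can be illustrated with plain NumPy. This is a minimal sketch and not the library's implementation: origin_to_target is a hypothetical helper, and the array of placeholder forecasts simply stores the index of the realization each entry targets, so the shift is easy to verify.

```python
import numpy as np


def origin_to_target(origin):
    """Shift column h down by h rows so each row lines up with its realization.

    Hypothetical helper for illustration only; rows without a forecast are NaN.
    """
    nobs, horizon = origin.shape
    target = np.full_like(origin, np.nan, dtype=float)
    for h in range(horizon):
        target[h:, h] = origin[: nobs - h, h]
    return target


nobs, horizon = 105, 3
# Placeholder "forecasts": entry [t, h] holds t + h, the index of the
# realization that the (h + 1)-step forecast in row t is aimed at.
origin = np.add.outer(np.arange(nobs), np.arange(horizon)).astype(float)
target = origin_to_target(origin)

# The 3-step forecast stored at origin[100, 2] now sits at target[102, 2].
print(origin[100, 2], target[102, 2])
```

After the shift, every non-missing entry in row t of target refers to the same realization, which is what makes forecast errors easy to compute under ‘target’ alignment.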

resids(params, y=None, regressors=None)[source]

Compute model residuals

Parameters:
  • params (ndarray) – Model parameters
  • y (ndarray, optional) – Alternative values to use when computing model residuals
  • regressors (ndarray, optional) – Alternative regressor values to use when computing model residuals
Returns:

resids – Model residuals

Return type:

ndarray

simulate(params, nobs, burn=500, initial_value=None, x=None, initial_value_vol=None)[source]

Simulates data from a zero mean model

Parameters:
  • params ({ndarray, DataFrame}) – Parameters to use when simulating the model. Parameter order is [volatility distribution]. There are no mean parameters.
  • nobs (int) – Length of series to simulate
  • burn (int, optional) – Number of values to simulate to initialize the model and remove dependence on initial values.
  • initial_value (None) – This value is not used.
  • x (None) – This value is not used.
  • initial_value_vol ({ndarray, float}, optional) – An array or scalar to use when initializing the volatility process.
Returns:

simulated_data – DataFrame with columns data, containing the simulated values; volatility, containing the conditional volatility; and errors, containing the errors used in the simulation

Return type:

DataFrame

Examples

Basic data simulation with no mean and constant volatility

>>> from arch.univariate import ZeroMean
>>> zm = ZeroMean()
>>> sim_data = zm.simulate([1.0], 1000)

Simulating data with a non-trivial volatility process

>>> from arch.univariate import GARCH
>>> zm.volatility = GARCH(p=1, o=1, q=1)
>>> sim_data = zm.simulate([0.05, 0.1, 0.1, 0.8], 300)

Constant Mean

class arch.univariate.ConstantMean(y=None, hold_back=None, volatility=None, distribution=None, rescale=None)[source]

Constant mean model estimation and simulation.

Parameters:
  • y ({ndarray, Series}) – nobs element vector containing the dependent variable
  • hold_back (int) – Number of observations at the start of the sample to exclude when estimating model parameters. Used when comparing models with different lag lengths to estimate on the common sample.
  • volatility (VolatilityProcess, optional) – Volatility process to use in the model
  • distribution (Distribution, optional) – Error distribution to use in the model
  • rescale (bool, optional) – Flag indicating whether to automatically rescale data if the scale of the data is likely to produce convergence issues when estimating model parameters. If False, the model is estimated on the data without transformation. If True, then y is rescaled and the new scale is reported in the estimation results.

Examples

>>> import numpy as np
>>> from arch.univariate import ConstantMean
>>> y = np.random.randn(100)
>>> cm = ConstantMean(y)
>>> res = cm.fit()

Notes

The constant mean model is described by

\[y_t = \mu + \epsilon_t\]

fit(update_freq=1, disp='final', starting_values=None, cov_type='robust', show_warning=True, first_obs=None, last_obs=None, tol=None, options=None, backcast=None)

Fits the model given a nobs by 1 vector of sigma2 values

Parameters:
  • update_freq (int, optional) – Frequency of iteration updates. Output is generated every update_freq iterations. Set to 0 to disable iterative output.
  • disp (str) – Either ‘final’ to print optimization result or ‘off’ to display nothing
  • starting_values (ndarray, optional) – Array of starting values to use. If not provided, starting values are constructed by the model components.
  • cov_type (str, optional) – Estimation method of parameter covariance. Supported options are ‘robust’, which does not assume the Information Matrix Equality holds, and ‘classic’, which does. In the ARCH literature, ‘robust’ corresponds to the Bollerslev-Wooldridge covariance estimator.
  • show_warning (bool, optional) – Flag indicating whether convergence warnings should be shown.
  • first_obs ({int, str, datetime, Timestamp}) – First observation to use when estimating model
  • last_obs ({int, str, datetime, Timestamp}) – Last observation to use when estimating model
  • tol (float, optional) – Tolerance for termination.
  • options (dict, optional) – Options to pass to scipy.optimize.minimize. Valid entries include ‘ftol’, ‘eps’, ‘disp’, and ‘maxiter’.
  • backcast (float, optional) – Value to use as backcast. Should be a measure of \(\sigma^2_0\), since model-specific non-linear transformations are applied to the value before computing the variance recursions.
Returns:

results – Object containing model results

Return type:

ARCHModelResult

Notes

A ConvergenceWarning is raised if SciPy’s optimizer indicates difficulty finding the optimum.

Parameters are optimized using SLSQP.

forecast(params, horizon=1, start=None, align='origin', method='analytic', simulations=1000, rng=None, random_state=None)

Construct forecasts from estimated model

Parameters:
  • params ({ndarray, Series}, optional) – Alternative parameters to use. If not provided, the parameters estimated when fitting the model are used. Must be identical in shape to the parameters computed by fitting the model.
  • horizon (int, optional) – Number of steps to forecast
  • start ({int, datetime, Timestamp, str}, optional) – An integer, datetime or str indicating the first observation to produce the forecast for. Datetimes can only be used with pandas inputs that have a datetime index. Strings must be convertible to a date time, such as in ‘1945-01-01’.
  • align (str, optional) – Either ‘origin’ or ‘target’. When set to ‘origin’, the t-th row of forecasts contains the forecasts for t+1, t+2, …, t+h. When set to ‘target’, the t-th row contains the 1-step ahead forecast from time t-1, the 2-step from time t-2, …, and the h-step from time t-h. ‘target’ simplifies computing forecast errors since the realization and h-step forecast are aligned.
  • method ({'analytic', 'simulation', 'bootstrap'}) – Method to use when producing the forecast. The default is analytic. The method only affects the variance forecast generation. Not all volatility models support all methods. In particular, volatility models that do not evolve in squares such as EGARCH or TARCH do not support the ‘analytic’ method for horizons > 1.
  • simulations (int) – Number of simulations to run when computing the forecast using either simulation or bootstrap.
  • rng (callable, optional) – Custom random number generator to use in simulation-based forecasts. Must produce random samples using the syntax rng(size), where size is the 2-element tuple (simulations, horizon).
  • random_state (RandomState, optional) – NumPy RandomState instance to use when method is ‘bootstrap’
Returns:

forecasts – t by h data frame containing the forecasts. The alignment of the forecasts is controlled by align.

Return type:

ARCHModelForecast

Examples

>>> import pandas as pd
>>> from arch import arch_model
>>> am = arch_model(None,mean='HAR',lags=[1,5,22],vol='Constant')
>>> sim_data = am.simulate([0.1,0.4,0.3,0.2,1.0], 250)
>>> sim_data.index = pd.date_range('2000-01-01',periods=250)
>>> am = arch_model(sim_data['data'],mean='HAR',lags=[1,5,22],  vol='Constant')
>>> res = am.fit()
>>> fig = res.hedgehog_plot()

Notes

The most basic 1-step ahead forecast will return a vector with the same length as the original data, where the t-th value will be the time-t forecast for time t + 1. When the horizon is > 1, and when using the default value for align, the forecast value in position [t, h] is the time-t, h+1 step ahead forecast.

If the model contains exogenous variables (model.x is not None), then only 1-step ahead forecasts are available. Using horizon > 1 will produce a warning and all columns, except the first, will be nan-filled.

If align is ‘origin’, forecast[t,h] contains the forecast made using y[:t] (that is, up to but not including t) for horizon h + 1. For example, forecast[100,2] contains the 3-step ahead forecast using the first 100 data points, which will correspond to the realization y[100 + 2]. If align is ‘target’, then the same forecast is in location [102, 2], so that it is aligned with the observation to use when evaluating, but still in the same column.
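
For intuition about what an ‘analytic’ multi-step forecast looks like with a constant mean, the sketch below works through the textbook case of GARCH(1,1) errors in plain NumPy. It is an illustration under stated assumptions, not the library's code: garch11_variance_forecast is a hypothetical helper, the parameter values are borrowed from the simulation example for this class, and sigma2_next (the 1-step ahead variance) is an arbitrary number chosen for the demonstration.

```python
import numpy as np


def garch11_variance_forecast(omega, alpha, beta, sigma2_next, horizon):
    """Analytic GARCH(1,1) variance forecasts for steps 1, ..., horizon.

    Hypothetical helper: uses the standard result that the forecast decays
    geometrically from the 1-step variance toward the long-run variance.
    """
    persistence = alpha + beta
    long_run = omega / (1.0 - persistence)
    steps = np.arange(horizon)  # 0 for the 1-step forecast, 1 for 2-step, ...
    return long_run + persistence**steps * (sigma2_next - long_run)


mu = 1.0  # constant mean: the mean forecast is mu at every horizon
horizon = 5
mean_path = np.full(horizon, mu)
var_path = garch11_variance_forecast(0.01, 0.07, 0.92, sigma2_next=2.0, horizon=horizon)

print(mean_path)
print(var_path)
```

Because the mean is constant, only the variance path changes with the horizon, which is consistent with the note that method only affects the variance forecasts.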

resids(params, y=None, regressors=None)[source]

Compute model residuals

Parameters:
  • params (ndarray) – Model parameters
  • y (ndarray, optional) – Alternative values to use when computing model residuals
  • regressors (ndarray, optional) – Alternative regressor values to use when computing model residuals
Returns:

resids – Model residuals

Return type:

ndarray

simulate(params, nobs, burn=500, initial_value=None, x=None, initial_value_vol=None)[source]

Simulates data from a constant mean model

Parameters:
  • params (ndarray) – Parameters to use when simulating the model. Parameter order is [mean volatility distribution]. There is one parameter in the mean model, mu.
  • nobs (int) – Length of series to simulate
  • burn (int, optional) – Number of values to simulate to initialize the model and remove dependence on initial values.
  • initial_value (None) – This value is not used.
  • x (None) – This value is not used.
  • initial_value_vol ({ndarray, float}, optional) – An array or scalar to use when initializing the volatility process.
Returns:

simulated_data – DataFrame with columns data, containing the simulated values; volatility, containing the conditional volatility; and errors, containing the errors used in the simulation

Return type:

DataFrame

Examples

Basic data simulation with a constant mean and volatility

>>> import numpy as np
>>> from arch.univariate import ConstantMean, GARCH
>>> cm = ConstantMean()
>>> cm.volatility = GARCH()
>>> cm_params = np.array([1])
>>> garch_params = np.array([0.01, 0.07, 0.92])
>>> params = np.concatenate((cm_params, garch_params))
>>> sim_data = cm.simulate(params, 1000)
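
The recursion that a simulation of this kind performs can be sketched directly in NumPy. The code below is a simplified illustration and not the library's implementation: it assumes standard normal shocks, starts the variance at its unconditional level (an assumption made here for convenience), and reuses the parameter values from the example above. The variable names data and volatility only mirror the column names described in the Returns section.

```python
import numpy as np

rng = np.random.default_rng(12345)

mu = 1.0                                # mean parameter
omega, alpha, beta = 0.01, 0.07, 0.92   # GARCH(1,1) parameters
nobs, burn = 1000, 500
total = nobs + burn

z = rng.standard_normal(total)          # standardized shocks (assumed normal)
sigma2 = np.empty(total)                # conditional variances
eps = np.empty(total)                   # errors, eps_t = sigma_t * z_t

# Assumption: initialize at the unconditional variance omega / (1 - alpha - beta).
sigma2[0] = omega / (1.0 - alpha - beta)
eps[0] = np.sqrt(sigma2[0]) * z[0]
for t in range(1, total):
    sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    eps[t] = np.sqrt(sigma2[t]) * z[t]

# Drop the burn-in so the retained sample depends less on the initial values.
data = (mu + eps)[burn:]
volatility = np.sqrt(sigma2[burn:])
print(data.shape, volatility.shape)
```

Discarding the first burn observations is what the burn argument controls: the longer the burn-in, the weaker the influence of the starting variance on the retained sample.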

Autoregressions

class arch.univariate.ARX(y=None, x=None, lags=None, constant=True, hold_back=None, volatility=None, distribution=None, rescale=None)[source]

Autoregressive model with optional exogenous regressors estimation and simulation

Parameters:
  • y ({ndarray, Series}) – nobs element vector containing the dependent variable
  • x ({ndarray, DataFrame}, optional) – nobs by k element array containing exogenous regressors
  • lags (scalar, 1-d array, optional) – Description of lag structure of the AR. A scalar includes all lags between 1 and the value. A 1-d array includes the AR lags lags[0], lags[1], …
  • constant (bool, optional) – Flag whether the model should include a constant
  • hold_back (int) – Number of observations at the start of the sample to exclude when estimating model parameters. Used when comparing models with different lag lengths to estimate on the common sample.
  • rescale (bool, optional) – Flag indicating whether to automatically rescale data if the scale of the data is likely to produce convergence issues when estimating model parameters. If False, the model is estimated on the data without transformation. If True, then y is rescaled and the new scale is reported in the estimation results.

Examples

>>> import numpy as np
>>> from arch.univariate import ARX
>>> y = np.random.randn(100)
>>> arx = ARX(y, lags=[1, 5, 22])
>>> res = arx.fit()

Estimating an AR with GARCH(1,1) errors

>>> from arch.univariate import GARCH
>>> arx.volatility = GARCH()
>>> res = arx.fit(update_freq=0, disp='off')

Notes

The AR-X model is described by

\[y_t = \mu + \sum_{i=1}^p \phi_{L_{i}} y_{t-L_{i}} + \gamma' x_t + \epsilon_t\]

fit(update_freq=1, disp='final', starting_values=None, cov_type='robust', show_warning=True, first_obs=None, last_obs=None, tol=None, options=None, backcast=None)

Fits the model given a nobs by 1 vector of sigma2 values

Parameters:
  • update_freq (int, optional) – Frequency of iteration updates. Output is generated every update_freq iterations. Set to 0 to disable iterative output.
  • disp (str) – Either ‘final’ to print optimization result or ‘off’ to display nothing
  • starting_values (ndarray, optional) – Array of starting values to use. If not provided, starting values are constructed by the model components.
  • cov_type (str, optional) – Estimation method of parameter covariance. Supported options are ‘robust’, which does not assume the Information Matrix Equality holds, and ‘classic’, which does. In the ARCH literature, ‘robust’ corresponds to the Bollerslev-Wooldridge covariance estimator.
  • show_warning (bool, optional) – Flag indicating whether convergence warnings should be shown.
  • first_obs ({int, str, datetime, Timestamp}) – First observation to use when estimating model
  • last_obs ({int, str, datetime, Timestamp}) – Last observation to use when estimating model
  • tol (float, optional) – Tolerance for termination.
  • options (dict, optional) – Options to pass to scipy.optimize.minimize. Valid entries include ‘ftol’, ‘eps’, ‘disp’, and ‘maxiter’.
  • backcast (float, optional) – Value to use as backcast. Should be a measure of \(\sigma^2_0\), since model-specific non-linear transformations are applied to the value before computing the variance recursions.
Returns:

results – Object containing model results

Return type:

ARCHModelResult

Notes

A ConvergenceWarning is raised if SciPy’s optimizer indicates difficulty finding the optimum.

Parameters are optimized using SLSQP.

fix(params, first_obs=None, last_obs=None)

Allows an ARCHModelFixedResult to be constructed from fixed parameters.

Parameters:
  • params ({ndarray, Series}) – User specified parameters to use when generating the result. Must have the correct number of parameters for a given choice of mean model, volatility model and distribution.
  • first_obs ({int, str, datetime, Timestamp}) – First observation to use when fixing model
  • last_obs ({int, str, datetime, Timestamp}) – Last observation to use when fixing model
Returns:

results – Object containing model results

Return type:

ARCHModelFixedResult

Notes

Parameters are not checked against model-specific constraints.

forecast(params, horizon=1, start=None, align='origin', method='analytic', simulations=1000, rng=None, random_state=None)

Construct forecasts from estimated model

Parameters:
  • params ({ndarray, Series}, optional) – Alternative parameters to use. If not provided, the parameters estimated when fitting the model are used. Must be identical in shape to the parameters computed by fitting the model.
  • horizon (int, optional) – Number of steps to forecast
  • start ({int, datetime, Timestamp, str}, optional) – An integer, datetime or str indicating the first observation to produce the forecast for. Datetimes can only be used with pandas inputs that have a datetime index. Strings must be convertible to a date time, such as in ‘1945-01-01’.
  • align (str, optional) – Either ‘origin’ or ‘target’. When set to ‘origin’, the t-th row of forecasts contains the forecasts for t+1, t+2, …, t+h. When set to ‘target’, the t-th row contains the 1-step ahead forecast from time t-1, the 2-step from time t-2, …, and the h-step from time t-h. ‘target’ simplifies computing forecast errors since the realization and h-step forecast are aligned.
  • method ({'analytic', 'simulation', 'bootstrap'}) – Method to use when producing the forecast. The default is analytic. The method only affects the variance forecast generation. Not all volatility models support all methods. In particular, volatility models that do not evolve in squares such as EGARCH or TARCH do not support the ‘analytic’ method for horizons > 1.
  • simulations (int) – Number of simulations to run when computing the forecast using either simulation or bootstrap.
  • rng (callable, optional) – Custom random number generator to use in simulation-based forecasts. Must produce random samples using the syntax rng(size), where size is the 2-element tuple (simulations, horizon).
  • random_state (RandomState, optional) – NumPy RandomState instance to use when method is ‘bootstrap’
Returns:

forecasts – t by h data frame containing the forecasts. The alignment of the forecasts is controlled by align.

Return type:

ARCHModelForecast

Examples

>>> import pandas as pd
>>> from arch import arch_model
>>> am = arch_model(None,mean='HAR',lags=[1,5,22],vol='Constant')
>>> sim_data = am.simulate([0.1,0.4,0.3,0.2,1.0], 250)
>>> sim_data.index = pd.date_range('2000-01-01',periods=250)
>>> am = arch_model(sim_data['data'],mean='HAR',lags=[1,5,22],  vol='Constant')
>>> res = am.fit()
>>> fig = res.hedgehog_plot()

Notes

The most basic 1-step ahead forecast will return a vector with the same length as the original data, where the t-th value will be the time-t forecast for time t + 1. When the horizon is > 1, and when using the default value for align, the forecast value in position [t, h] is the time-t, h+1 step ahead forecast.

If the model contains exogenous variables (model.x is not None), then only 1-step ahead forecasts are available. Using horizon > 1 will produce a warning and all columns, except the first, will be nan-filled.

If align is ‘origin’, forecast[t,h] contains the forecast made using y[:t] (that is, up to but not including t) for horizon h + 1. For example, forecast[100,2] contains the 3-step ahead forecast using the first 100 data points, which will correspond to the realization y[100 + 2]. If align is ‘target’, then the same forecast is in location [102, 2], so that it is aligned with the observation to use when evaluating, but still in the same column.

resids(params, y=None, regressors=None)

Compute model residuals

Parameters:
  • params (ndarray) – Model parameters
  • y (ndarray, optional) – Alternative values to use when computing model residuals
  • regressors (ndarray, optional) – Alternative regressor values to use when computing model residuals
Returns:

resids – Model residuals

Return type:

ndarray

simulate(params, nobs, burn=500, initial_value=None, x=None, initial_value_vol=None)

Simulates data from a linear regression, AR or HAR model

Parameters:
  • params (ndarray) – Parameters to use when simulating the model. Parameter order is [mean volatility distribution] where the parameters of the mean model are ordered [constant lag[0] lag[1] … lag[p] ex[0] … ex[k-1]] where lag[j] indicates the coefficient on the jth lag in the model and ex[j] is the coefficient on the jth exogenous variable.
  • nobs (int) – Length of series to simulate
  • burn (int, optional) – Number of values to simulate to initialize the model and remove dependence on initial values.
  • initial_value ({ndarray, float}, optional) – Either a scalar value or an array of max(lags) initial values to use when initializing the model. If omitted, 0.0 is used.
  • x ({ndarray, DataFrame}, optional) – nobs + burn by k array of exogenous variables to include in the simulation.
  • initial_value_vol ({ndarray, float}, optional) – An array or scalar to use when initializing the volatility process.
Returns:

simulated_data – DataFrame with columns data, containing the simulated values; volatility, containing the conditional volatility; and errors, containing the errors used in the simulation

Return type:

DataFrame

Examples

>>> import numpy as np
>>> from arch.univariate import HARX, GARCH
>>> harx = HARX(lags=[1, 5, 22])
>>> harx.volatility = GARCH()
>>> harx_params = np.array([1, 0.2, 0.3, 0.4])
>>> garch_params = np.array([0.01, 0.07, 0.92])
>>> params = np.concatenate((harx_params, garch_params))
>>> sim_data = harx.simulate(params, 1000)

Simulating models with exogenous regressors requires the regressors to have nobs plus burn data points

>>> nobs = 100
>>> burn = 200
>>> x = np.random.randn(nobs + burn, 2)
>>> x_params = np.array([1.0, 2.0])
>>> params = np.concatenate((harx_params, x_params, garch_params))
>>> sim_data = harx.simulate(params, nobs=nobs, burn=burn, x=x)
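
To make the roles of the lags, the initial values and the burn-in concrete, the following NumPy sketch simulates the pure AR case of the AR-X equation given in the class Notes. It is a simplified illustration rather than the library's implementation: it assumes unit-variance normal errors, no exogenous regressors, and initial values of 0.0, and all variable names are chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

lags = [1, 5, 22]                  # AR lags L_1, L_2, L_3
mu = 0.1                           # constant
phi = np.array([0.4, 0.3, 0.2])    # one coefficient per lag
nobs, burn = 250, 500
max_lag = max(lags)
total = max_lag + burn + nobs

eps = rng.standard_normal(total)   # errors with constant unit variance (assumed)
full = np.zeros(total)             # first max_lag entries act as initial values 0.0

for t in range(max_lag, total):
    # y_t = mu + sum_i phi_i * y_{t - L_i} + eps_t
    lagged = np.array([full[t - lag] for lag in lags])
    full[t] = mu + phi @ lagged + eps[t]

# Keep only the last nobs values; the initial values and burn-in are discarded.
y = full[-nobs:]
print(y.shape)
```

With exogenous regressors, the term for x_t would be added inside the loop at every step, including the burn-in, which is why x must have nobs plus burn rows.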

Heterogeneous Autoregressions

class arch.univariate.HARX(y=None, x=None, lags=None, constant=True, use_rotated=False, hold_back=None, volatility=None, distribution=None, rescale=None)[source]

Heterogeneous Autoregression (HAR), with optional exogenous regressors, model estimation and simulation

Parameters:
  • y ({ndarray, Series}) – nobs element vector containing the dependent variable
  • x ({ndarray, DataFrame}, optional) – nobs by k element array containing exogenous regressors
  • lags ({scalar, ndarray}, optional) – Description of lag structure of the HAR. A scalar includes all lags between 1 and the value. A 1-d array includes the HAR lags 1:lags[0], 1:lags[1], … A 2-d array includes the HAR lags of the form lags[0,j]:lags[1,j] for all columns of lags.
  • constant (bool, optional) – Flag whether the model should include a constant
  • use_rotated (bool, optional) – Flag indicating to use the alternative rotated form of the HAR where HAR lags do not overlap
  • hold_back (int) – Number of observations at the start of the sample to exclude when estimating model parameters. Used when comparing models with different lag lengths to estimate on the common sample.
  • volatility (VolatilityProcess, optional) – Volatility process to use in the model
  • distribution (Distribution, optional) – Error distribution to use in the model
  • rescale (bool, optional) – Flag indicating whether to automatically rescale data if the scale of the data is likely to produce convergence issues when estimating model parameters. If False, the model is estimated on the data without transformation. If True, then y is rescaled and the new scale is reported in the estimation results.

Examples

>>> import numpy as np
>>> from arch.univariate import HARX
>>> y = np.random.randn(100)
>>> harx = HARX(y, lags=[1, 5, 22])
>>> res = harx.fit()
>>> from pandas import Series, date_range
>>> index = date_range('2000-01-01', freq='M', periods=y.shape[0])
>>> y = Series(y, name='y', index=index)
>>> har = HARX(y, lags=[1, 6], hold_back=10)

Notes

The HAR-X model is described by

\[y_t = \mu + \sum_{i=1}^p \phi_{L_{i}} \bar{y}_{t-L_{i,0}:L_{i,1}} + \gamma' x_t + \epsilon_t\]

where \(\bar{y}_{t-L_{i,0}:L_{i,1}}\) is the average value of \(y_t\) between \(t-L_{i,0}\) and \(t - L_{i,1}\).
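
The averaged regressors in this equation can be built by hand to see what each HAR term contains. The sketch below is a plain NumPy illustration, not the library's code: har_regressors is a hypothetical helper that handles only the 1-d lags case described above, where each entry L produces the average of the L most recent values before time t.

```python
import numpy as np


def har_regressors(y, lags):
    """Return one row per usable t with the averages of y[t-L], ..., y[t-1].

    Hypothetical helper for illustration; the first max(lags) observations
    are dropped because the longest average is not available for them.
    """
    y = np.asarray(y, dtype=float)
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, y.shape[0]):
        rows.append([y[t - lag:t].mean() for lag in lags])
    return np.array(rows)


y = np.arange(30, dtype=float)      # simple data so the averages are easy to check
X = har_regressors(y, [1, 5, 22])

# First usable row is t = 22: mean of y[21], of y[17..21], and of y[0..21].
print(X[0])
```

Regressing y[t] on a constant and these columns gives the non-rotated form of the HAR, in which the averaging windows overlap.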

fit(update_freq=1, disp='final', starting_values=None, cov_type='robust', show_warning=True, first_obs=None, last_obs=None, tol=None, options=None, backcast=None)

Fits the model given a nobs by 1 vector of sigma2 values

Parameters:
  • update_freq (int, optional) – Frequency of iteration updates. Output is generated every update_freq iterations. Set to 0 to disable iterative output.
  • disp (str) – Either ‘final’ to print optimization result or ‘off’ to display nothing
  • starting_values (ndarray, optional) – Array of starting values to use. If not provided, starting values are constructed by the model components.
  • cov_type (str, optional) – Estimation method of parameter covariance. Supported options are ‘robust’, which does not assume the Information Matrix Equality holds, and ‘classic’, which does. In the ARCH literature, ‘robust’ corresponds to the Bollerslev-Wooldridge covariance estimator.
  • show_warning (bool, optional) – Flag indicating whether convergence warnings should be shown.
  • first_obs ({int, str, datetime, Timestamp}) – First observation to use when estimating model
  • last_obs ({int, str, datetime, Timestamp}) – Last observation to use when estimating model
  • tol (float, optional) – Tolerance for termination.
  • options (dict, optional) – Options to pass to scipy.optimize.minimize. Valid entries include ‘ftol’, ‘eps’, ‘disp’, and ‘maxiter’.
  • backcast (float, optional) – Value to use as backcast. Should be a measure of \(\sigma^2_0\), since model-specific non-linear transformations are applied to the value before computing the variance recursions.
Returns:

results – Object containing model results

Return type:

ARCHModelResult

Notes

A ConvergenceWarning is raised if SciPy’s optimizer indicates difficulty finding the optimum.

Parameters are optimized using SLSQP.

fix(params, first_obs=None, last_obs=None)

Allows an ARCHModelFixedResult to be constructed from fixed parameters.

Parameters:
  • params ({ndarray, Series}) – User specified parameters to use when generating the result. Must have the correct number of parameters for a given choice of mean model, volatility model and distribution.
  • first_obs ({int, str, datetime, Timestamp}) – First observation to use when fixing model
  • last_obs ({int, str, datetime, Timestamp}) – Last observation to use when fixing model
Returns:

results – Object containing model results

Return type:

ARCHModelFixedResult

Notes

Parameters are not checked against model-specific constraints.

forecast(params, horizon=1, start=None, align='origin', method='analytic', simulations=1000, rng=None, random_state=None)[source]

Construct forecasts from estimated model

Parameters:
  • params ({ndarray, Series}, optional) – Alternative parameters to use. If not provided, the parameters estimated when fitting the model are used. Must be identical in shape to the parameters computed by fitting the model.
  • horizon (int, optional) – Number of steps to forecast
  • start ({int, datetime, Timestamp, str}, optional) – An integer, datetime or str indicating the first observation to produce the forecast for. Datetimes can only be used with pandas inputs that have a datetime index. Strings must be convertible to a date time, such as in ‘1945-01-01’.
  • align (str, optional) – Either ‘origin’ or ‘target’. When set to ‘origin’, the t-th row of forecasts contains the forecasts for t+1, t+2, …, t+h. When set to ‘target’, the t-th row contains the 1-step ahead forecast from time t-1, the 2-step from time t-2, …, and the h-step from time t-h. ‘target’ simplifies computing forecast errors since the realization and h-step forecast are aligned.
  • method ({'analytic', 'simulation', 'bootstrap'}) – Method to use when producing the forecast. The default is analytic. The method only affects the variance forecast generation. Not all volatility models support all methods. In particular, volatility models that do not evolve in squares such as EGARCH or TARCH do not support the ‘analytic’ method for horizons > 1.
  • simulations (int) – Number of simulations to run when computing the forecast using either simulation or bootstrap.
  • rng (callable, optional) – Custom random number generator to use in simulation-based forecasts. Must produce random samples using the syntax rng(size), where size is the 2-element tuple (simulations, horizon).
  • random_state (RandomState, optional) – NumPy RandomState instance to use when method is ‘bootstrap’
Returns:

forecasts – t by h data frame containing the forecasts. The alignment of the forecasts is controlled by align.

Return type:

ARCHModelForecast

Examples

>>> import pandas as pd
>>> from arch import arch_model
>>> am = arch_model(None, mean='HAR', lags=[1, 5, 22], vol='Constant')
>>> sim_data = am.simulate([0.1, 0.4, 0.3, 0.2, 1.0], 250)
>>> sim_data.index = pd.date_range('2000-01-01', periods=250)
>>> am = arch_model(sim_data['data'], mean='HAR', lags=[1, 5, 22], vol='Constant')
>>> res = am.fit()
>>> fig = res.hedgehog_plot()

Notes

The most basic 1-step ahead forecast will return a vector with the same length as the original data, where the t-th value will be the time-t forecast for time t + 1. When the horizon is > 1, and when using the default value for align, the forecast value in position [t, h] is the time-t, h+1 step ahead forecast.

If the model contains exogenous variables (model.x is not None), then only 1-step ahead forecasts are available. Using horizon > 1 will produce a warning and all columns, except the first, will be nan-filled.

If align is ‘origin’, forecast[t,h] contains the forecast made using y[:t] (that is, up to but not including t) for horizon h + 1. For example, forecast[100,2] contains the 3-step ahead forecast using the first 100 data points, which will correspond to the realization y[100 + 2]. If align is ‘target’, then the same forecast is in location [102, 2], so that it is aligned with the observation to use when evaluating, but still in the same column.
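
The relationship between the two alignments described in the note above can be sketched with plain NumPy. The helper origin_to_target is hypothetical and follows the indexing convention of the note (the entry at [t, h] moves to [t + h, h]); it is not the library's own alignment routine, which may index rows differently.

```python
import numpy as np


def origin_to_target(origin):
    """Re-align an origin-aligned forecast array, following the note above.

    Hypothetical helper for illustration only.
    """
    nobs, horizon = origin.shape
    target = np.full((nobs, horizon), np.nan)
    for h in range(horizon):
        # The value stored at [t, h] moves down h rows to [t + h, h],
        # staying in the same column.
        target[h:, h] = origin[: nobs - h, h]
    return target


# Toy forecasts: 4 forecast origins, 3 horizons
origin = np.arange(12, dtype=float).reshape(4, 3)
target = origin_to_target(origin)
```

Rows at the top of the target-aligned array that have no corresponding forecast are left as nan, so forecast errors can be computed by subtracting the array from the realizations row by row.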

resids(params, y=None, regressors=None)[source]

Compute model residuals

Parameters:
  • params (ndarray) – Model parameters
  • y (ndarray, optional) – Alternative values to use when computing model residuals
  • regressors (ndarray, optional) – Alternative regressor values to use when computing model residuals
Returns:

resids – Model residuals

Return type:

ndarray

simulate(params, nobs, burn=500, initial_value=None, x=None, initial_value_vol=None)[source]

Simulates data from a linear regression, AR or HAR model

Parameters:
  • params (ndarray) – Parameters to use when simulating the model. Parameter order is [mean volatility distribution] where the parameters of the mean model are ordered [constant lag[0] lag[1] … lag[p] ex[0] … ex[k-1]] where lag[j] indicates the coefficient on the jth lag in the model and ex[j] is the coefficient on the jth exogenous variable.
  • nobs (int) – Length of series to simulate
  • burn (int, optional) – Number of values to simulate to initialize the model and remove dependence on initial values.
  • initial_value ({ndarray, float}, optional) – Either a scalar value or an array of max(lags) initial values to use when initializing the model. If omitted, 0.0 is used.
  • x ({ndarray, DataFrame}, optional) – nobs + burn by k array of exogenous variables to include in the simulation.
  • initial_value_vol ({ndarray, float}, optional) – An array or scalar to use when initializing the volatility process.
Returns:

simulated_data – DataFrame with columns data, containing the simulated values; volatility, containing the conditional volatility; and errors, containing the errors used in the simulation

Return type:

DataFrame

Examples

>>> import numpy as np
>>> from arch.univariate import HARX, GARCH
>>> harx = HARX(lags=[1, 5, 22])
>>> harx.volatility = GARCH()
>>> harx_params = np.array([1, 0.2, 0.3, 0.4])
>>> garch_params = np.array([0.01, 0.07, 0.92])
>>> params = np.concatenate((harx_params, garch_params))
>>> sim_data = harx.simulate(params, 1000)

Simulating models with exogenous regressors requires the regressors to have nobs plus burn data points

>>> nobs = 100
>>> burn = 200
>>> x = np.random.randn(nobs + burn, 2)
>>> x_params = np.array([1.0, 2.0])
>>> params = np.concatenate((harx_params, x_params, garch_params))
>>> sim_data = harx.simulate(params, nobs=nobs, burn=burn, x=x)
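
Why the regressors need nobs plus burn rows can be seen in a minimal sketch of a burn-in simulation. It assumes a simple AR(1) mean with exogenous regressors and unit-variance normal errors. The function simulate_ar1_with_x and its arguments are hypothetical and are not part of the library.

```python
import numpy as np


def simulate_ar1_with_x(const, phi, x_params, x, nobs, burn, initial_value=0.0, seed=0):
    """Simulate y_t = const + phi * y_{t-1} + x_t' x_params + e_t with a burn-in.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    total = nobs + burn
    # Every simulated period, including the burn-in, needs a row of regressors
    if x.shape[0] != total:
        raise ValueError("x must have nobs + burn rows")
    errors = rng.standard_normal(total)
    y = np.empty(total)
    prev = initial_value
    for t in range(total):
        y[t] = const + phi * prev + x[t] @ x_params + errors[t]
        prev = y[t]
    # Discard the burn-in to reduce dependence on initial_value
    return y[burn:]


nobs, burn = 100, 200
x = np.random.default_rng(1).standard_normal((nobs + burn, 2))
sim = simulate_ar1_with_x(1.0, 0.5, np.array([1.0, 2.0]), x, nobs, burn)
```

Only the last nobs values are returned, which matches the length requested by the caller.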

Least Squares

class arch.univariate.LS(y=None, x=None, constant=True, hold_back=None, rescale=None)[source]

Least squares model estimation and simulation

Parameters:
  • y ({ndarray, DataFrame}, optional) – nobs element vector containing the dependent variable
  • x ({ndarray, DataFrame}, optional) – nobs by k element array containing exogenous regressors
  • constant (bool, optional) – Flag whether the model should include a constant
  • hold_back (int) – Number of observations at the start of the sample to exclude when estimating model parameters. Used when comparing models with different lag lengths to estimate on the common sample.
  • rescale (bool, optional) – Flag indicating whether to automatically rescale data if the scale of the data is likely to produce convergence issues when estimating model parameters. If False, the model is estimated on the data without transformation. If True, then y is rescaled and the new scale is reported in the estimation results.

Examples

>>> import numpy as np
>>> from arch.univariate import LS
>>> y = np.random.randn(100)
>>> x = np.random.randn(100,2)
>>> ls = LS(y, x)
>>> res = ls.fit()

Notes

The LS model is described by

\[y_t = \mu + \gamma' x_t + \epsilon_t\]
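
As a rough illustration of the mean equation above, the following sketch estimates \(\mu\) and \(\gamma\) by ordinary least squares using only NumPy. The simulated data, the true coefficient values and the variable names are assumptions made for the example; the LS class itself estimates the mean jointly with the variance parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
nobs = 500
x = rng.standard_normal((nobs, 2))
mu, gamma = 0.5, np.array([1.0, -2.0])
# Generate data from y_t = mu + gamma' x_t + e_t with small noise
y = mu + x @ gamma + 0.1 * rng.standard_normal(nobs)

# Prepend a column of ones so the first coefficient estimates mu
design = np.column_stack((np.ones(nobs), x))
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
mu_hat, gamma_hat = coef[0], coef[1:]

# Residuals are the estimates of e_t
resids = y - design @ coef
```
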
fit(update_freq=1, disp='final', starting_values=None, cov_type='robust', show_warning=True, first_obs=None, last_obs=None, tol=None, options=None, backcast=None)

Fits the model given a nobs by 1 vector of sigma2 values

Parameters:
  • update_freq (int, optional) – Frequency of iteration updates. Output is generated every update_freq iterations. Set to 0 to disable iterative output.
  • disp (str) – Either ‘final’ to print optimization result or ‘off’ to display nothing
  • starting_values (ndarray, optional) – Array of starting values to use. If not provided, starting values are constructed by the model components.
  • cov_type (str, optional) – Estimation method of parameter covariance. Supported options are ‘robust’, which does not assume the Information Matrix Equality holds, and ‘classic’, which does. In the ARCH literature, ‘robust’ corresponds to the Bollerslev-Wooldridge covariance estimator.
  • show_warning (bool, optional) – Flag indicating whether convergence warnings should be shown.
  • first_obs ({int, str, datetime, Timestamp}) – First observation to use when estimating model
  • last_obs ({int, str, datetime, Timestamp}) – Last observation to use when estimating model
  • tol (float, optional) – Tolerance for termination.
  • options (dict, optional) – Options to pass to scipy.optimize.minimize. Valid entries include ‘ftol’, ‘eps’, ‘disp’, and ‘maxiter’.
  • backcast (float, optional) – Value to use as backcast. Should be a measure of \(\sigma^2_0\), since model-specific non-linear transformations are applied to the value before computing the variance recursions.
Returns:

results – Object containing model results

Return type:

ARCHModelResult

Notes

A ConvergenceWarning is raised if SciPy’s optimizer indicates difficulty finding the optimum.

Parameters are optimized using SLSQP.

fix(params, first_obs=None, last_obs=None)

Allows an ARCHModelFixedResult to be constructed from fixed parameters.

Parameters:
  • params ({ndarray, Series}) – User specified parameters to use when generating the result. Must have the correct number of parameters for a given choice of mean model, volatility model and distribution.
  • first_obs ({int, str, datetime, Timestamp}) – First observation to use when fixing model
  • last_obs ({int, str, datetime, Timestamp}) – Last observation to use when fixing model
Returns:

results – Object containing model results

Return type:

ARCHModelFixedResult

Notes

Parameters are not checked against model-specific constraints.

resids(params, y=None, regressors=None)

Compute model residuals

Parameters:
  • params (ndarray) – Model parameters
  • y (ndarray, optional) – Alternative values to use when computing model residuals
  • regressors (ndarray, optional) – Alternative regressor values to use when computing model residuals
Returns:

resids – Model residuals

Return type:

ndarray

simulate(params, nobs, burn=500, initial_value=None, x=None, initial_value_vol=None)

Simulates data from a linear regression, AR or HAR model

Parameters:
  • params (ndarray) – Parameters to use when simulating the model. Parameter order is [mean volatility distribution] where the parameters of the mean model are ordered [constant lag[0] lag[1] … lag[p] ex[0] … ex[k-1]] where lag[j] indicates the coefficient on the jth lag in the model and ex[j] is the coefficient on the jth exogenous variable.
  • nobs (int) – Length of series to simulate
  • burn (int, optional) – Number of values to simulate to initialize the model and remove dependence on initial values.
  • initial_value ({ndarray, float}, optional) – Either a scalar value or an array of max(lags) initial values to use when initializing the model. If omitted, 0.0 is used.
  • x ({ndarray, DataFrame}, optional) – nobs + burn by k array of exogenous variables to include in the simulation.
  • initial_value_vol ({ndarray, float}, optional) – An array or scalar to use when initializing the volatility process.
Returns:

simulated_data – DataFrame with columns data, containing the simulated values; volatility, containing the conditional volatility; and errors, containing the errors used in the simulation

Return type:

DataFrame

Examples

>>> import numpy as np
>>> from arch.univariate import HARX, GARCH
>>> harx = HARX(lags=[1, 5, 22])
>>> harx.volatility = GARCH()
>>> harx_params = np.array([1, 0.2, 0.3, 0.4])
>>> garch_params = np.array([0.01, 0.07, 0.92])
>>> params = np.concatenate((harx_params, garch_params))
>>> sim_data = harx.simulate(params, 1000)

Simulating models with exogenous regressors requires the regressors to have nobs plus burn data points

>>> nobs = 100
>>> burn = 200
>>> x = np.random.randn(nobs + burn, 2)
>>> x_params = np.array([1.0, 2.0])
>>> params = np.concatenate((harx_params, x_params, garch_params))
>>> sim_data = harx.simulate(params, nobs=nobs, burn=burn, x=x)

Writing New Mean Models

All mean models must inherit from ARCHModel and provide all public methods. There are two optional private methods that should be provided if applicable.

class arch.univariate.base.ARCHModel(y=None, volatility=None, distribution=None, hold_back=None, rescale=None)[source]

Abstract base class for mean models in ARCH processes. Specifies the conditional mean process.

All public methods that raise NotImplementedError should be overridden by any subclass. Private methods that raise NotImplementedError are optional to override but recommended where applicable.
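
The override contract described above can be sketched with a toy base class. MeanModelBase and ConstantOnly are hypothetical names invented for this illustration; they do not reproduce the actual ARCHModel interface, which has more methods and different signatures.

```python
import numpy as np


class MeanModelBase:
    """Hypothetical stand-in for an abstract mean-model base class."""

    def __init__(self, y=None):
        self._y = None if y is None else np.asarray(y, dtype=float)

    def resids(self, params, y=None):
        # Public method: every subclass is expected to override this
        raise NotImplementedError("Subclasses must implement resids")

    def simulate(self, params, nobs):
        # Public method: every subclass is expected to override this
        raise NotImplementedError("Subclasses must implement simulate")


class ConstantOnly(MeanModelBase):
    """Toy mean model y_t = mu + e_t that overrides both public methods."""

    def resids(self, params, y=None):
        data = self._y if y is None else np.asarray(y, dtype=float)
        return data - params[0]

    def simulate(self, params, nobs, seed=0):
        rng = np.random.default_rng(seed)
        return params[0] + rng.standard_normal(nobs)


model = ConstantOnly(np.array([1.0, 2.0, 3.0]))
residuals = model.resids(np.array([2.0]))
```

Calling a method that a subclass has not overridden fails immediately with NotImplementedError, which makes an incomplete implementation easy to detect.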