The ARIMA tool estimates a time series forecasting model, either as a univariate model or one with covariates (predictors), using an autoregressive integrated moving average (ARIMA) method. ARIMA is one of the most commonly used forecasting approaches and is considered a very general class of models for forecasting a time series field. The ARIMA methods implemented in this tool can use an automated approach to develop a model based on statistical criteria, or the user can directly specify the underlying parameters of an ARIMA model. A detailed discussion of the ARIMA model, along with a description of the automated methods used in this tool, can be found in Chapter 8 of Hyndman and Athanasopoulos's online book Forecasting: Principles and Practice.
Connect an input
An Alteryx data stream containing historical data on the time series to be forecast and (optionally) a set of covariates. Fields that will not be used in model creation can also be present in the data stream.
Configure the tool
Use the Required parameters tab to set the basic controls required for an ARIMA model to be created.
- Model name: Each model needs to be given a name so it can later be identified. Model names must start with a letter and may contain letters, numbers, and the special characters period (".") and underscore ("_"). No other special characters are allowed. Because R is case sensitive, names that differ only in capitalization are treated as different names.
- Select the target field: Select the field from the data stream you want to forecast. Measurements for this field need to be made at regular time intervals (e.g., daily, monthly, quarterly, etc.).
- Use covariates in model estimation?: If this option is checked, then the user will be presented with a checkbox list to select the fields to use as covariates in the ARIMA model.
- Target field frequency: Choose the time interval for the observations of the target field.
Columns containing unique identifiers, such as surrogate primary keys and natural primary keys, should not be used in statistical analyses. They have no predictive value and can cause runtime exceptions.
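The model naming rule described under Model name can be checked before a workflow runs. The following is a minimal sketch in Python, not part of the tool itself; the function name `is_valid_model_name` is hypothetical, and the pattern assumes that "letters" means the ASCII letters A to Z in either case.

```python
import re

# Assumed pattern derived from the stated rule: the first character is a
# letter, and the rest are letters, digits, periods, or underscores.
MODEL_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9._]*")


def is_valid_model_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rule described above."""
    return MODEL_NAME_PATTERN.fullmatch(name) is not None


# Illustrative names only; none of these come from the tool.
for candidate in ["Sales_ARIMA.v2", "2024_model", "sales-model", "sales model"]:
    print(candidate, is_valid_model_name(candidate))
```

Only the first name passes: the others start with a digit, contain a hyphen, or contain a space.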
Use the Model customization (optional) tab to set controls that adjust how the model processes data.
- Customize the parameters used for automatic model creation...: Clicking on this option exposes a set of parameters that influence automatic model creation. The options include the ability to:
- Adjust the non-seasonal components including the level of first differencing, the maximum order of the autoregressive component, and the maximum order of the moving average component.
- Adjust the seasonal components including the level of seasonal differencing, the maximum order of the seasonal autoregressive component, and the maximum order of the seasonal moving average component.
- Select the information criterion used to choose between candidate models. By default the corrected Akaike information criterion (AICc) is used, but the uncorrected Akaike information criterion (AIC) or the Bayesian information criterion (BIC) can be selected instead.
- The user can also determine whether all possible models are estimated and compared (full enumeration) instead of using the default stepwise algorithm. The stepwise algorithm has been shown to have good performance characteristics and is much less computationally intensive; however, it is not guaranteed to find the single best model. Estimating all possible ARIMA models will find the single best model within the space searched, but with a significantly greater runtime. If full enumeration is selected, the user can place some limits on the space searched by setting the maximum allowed order of the model. In addition, the user has the option of using multiple cores of the machine on which Alteryx is being run.
- Options can also be set that allow for "drift" in the model and whether a Box-Cox transformation (including setting the value of lambda) is applied to the target field.
- Completely user specified model...: Clicking on this option allows the user to manually specify an ARIMA model. The required parameters include:
- The non-seasonal components: the order of the autoregressive component (p), the degree of first differencing (d), and the order of the moving average component (q).
- The seasonal components: the order of the seasonal autoregressive component (P), the degree of seasonal differencing (D), and the order of the seasonal moving average component (Q).
- Options can also be set that allow for "drift" within the model and whether a Box-Cox transformation (including setting the value of lambda) is applied to the target field.
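Two of the settings in the list above reduce to short formulas: the Box-Cox transformation with a given lambda, and the information criteria used to rank candidate models. The sketch below uses the standard textbook definitions. It is not the tool's own code, the function names are hypothetical, and how the tool counts the number of estimated parameters `k` is an assumption here. The log-likelihood values are made up for illustration.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # The lambda = 0 case is defined as the natural logarithm.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def information_criteria(log_likelihood, k, n):
    """Return (AIC, AICc, BIC) for k estimated parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    aic = -2.0 * log_likelihood + 2.0 * k
    aicc = aic + (2.0 * k * (k + 1)) / (n - k - 1)  # small-sample correction
    bic = -2.0 * log_likelihood + k * math.log(n)
    return aic, aicc, bic


# Hypothetical candidates: (label, log-likelihood, number of parameters).
candidates = [("ARIMA(1,1,1)", -120.0, 3), ("ARIMA(2,1,2)", -119.0, 5)]
n_obs = 60

# Pick the candidate with the lowest AICc (index 1 of the returned tuple).
best = min(candidates, key=lambda c: information_criteria(c[1], c[2], n_obs)[1])
print(best[0])
```

With these made-up numbers the larger model fits slightly better, but the penalty for its two extra parameters outweighs the gain, so the smaller model has the lower AICc.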
Use the Other options tab to set additional parameters for periods.
- Series starting period (optional): This option allows the user to specify the starting period of the time series, which is reflected in the forecast plot.
- The number of periods to include in the forecast plot: The forecast plot contains the original data and a number of forecast future points (along with 80% and 95% confidence intervals around the forecast points). The user can specify the number of periods that should be forecast into the future for the plot.
- Select Week Format: This allows the user to choose a method to specify work weeks. These options relate to what constitutes the first week of the year, and what day of the week a week begins on.
- US – Sunday is the first day of the week
- UK – Monday is the first day of the week
- ISO8601 – Monday is the first day of the week
If Target Field Frequency is set to Hourly, Daily (all days), or Daily (weekdays only), this option is not available.
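The three week formats can assign different week numbers to the same date. The sketch below illustrates this with Python's standard library. Mapping US to the `%U` directive and UK to `%W` is an assumption made for illustration, not a statement of how the tool computes weeks internally; the function name `week_numbers` is hypothetical.

```python
from datetime import date


def week_numbers(d: date) -> dict:
    """Week number of `d` under three common conventions."""
    iso_year, iso_week, _ = d.isocalendar()
    return {
        # Sunday starts the week; days before the first Sunday are week 0.
        "US": int(d.strftime("%U")),
        # Monday starts the week; days before the first Monday are week 0.
        "UK": int(d.strftime("%W")),
        # Monday starts the week; week 1 is the week with the first Thursday.
        "ISO8601": iso_week,
        # The ISO week can belong to the previous or next calendar year.
        "ISO8601_year": iso_year,
    }


# Sunday, 3 January 2021: a date where all three conventions disagree.
print(week_numbers(date(2021, 1, 3)))
```

For this date the Sunday-based count gives week 1, the Monday-based count gives week 0, and ISO 8601 places the day in week 53 of 2020.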
Use the Graphics Options tab to set the controls for the graphical output.
- Plot size: Select inches or centimeters for the size of the graph.
- Graph resolution: Select the resolution of the graph in dots per inch: 1x (96 dpi); 2x (192 dpi); or 3x (288 dpi). Lower resolution creates a smaller file and is best for viewing on a monitor. Higher resolution creates a larger file with better print quality.
- Base font size (points): Select the size of the font in the graph.
View the output
- O anchor: Consists of an output stream containing the ARIMA model object that can be used for both point forecasts and a user specified percentile confidence interval surrounding those forecasts.
- R anchor: Consists of the report snippets generated by the ARIMA tool: a statistical summary, autocorrelation diagnostic plots and forecast plots.
- I anchor: An interactive HTML dashboard consisting of plots and metrics. You can interact with the visualizations by clicking on the different graphical elements to reveal more information, values, metrics, and analytics.
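To show how a percentile interval relates to a point forecast, the sketch below builds symmetric intervals at the 80% and 95% levels. It assumes normally distributed forecast errors with a known standard error, which is a common simplification and not necessarily what the tool does internally. The function name and the numeric inputs are hypothetical.

```python
from statistics import NormalDist


def forecast_interval(point, std_error, level):
    """Symmetric interval around a point forecast for a level in (0, 1)."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must be between 0 and 1")
    # Two-sided interval: put (1 - level) / 2 of the probability in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return point - z * std_error, point + z * std_error


# Made-up point forecast and standard error, for illustration only.
lo80, hi80 = forecast_interval(100.0, 5.0, 0.80)
lo95, hi95 = forecast_interval(100.0, 5.0, 0.95)
print(round(lo80, 2), round(hi80, 2))
print(round(lo95, 2), round(hi95, 2))
```

The 95% interval is always wider than the 80% interval for the same forecast, because a higher coverage level requires a larger multiple of the standard error.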
Expected behavior: plot calculations
The forecast plot uses a default date for calculations if any of the following configuration settings are used:
- Target Field Frequency is set to Hourly, Daily (all days), or Daily (weekdays only).
- Target Field Frequency is set to Weekly, Monthly, Quarterly, or Annually and the Series starting period is not set.
The default date used may vary, making the calculation appear random.