The ARIMA tool estimates a time series forecasting model, either as a univariate model or as one with covariates (predictors), using an autoregressive integrated moving average (ARIMA) method. ARIMA is one of the most commonly used forecasting approaches and is considered to be a very general class of models for forecasting a time series field. The ARIMA methods implemented in this tool can use an automated approach to develop a model based on statistical criteria, or you can directly specify the underlying parameters of an ARIMA model. A detailed discussion of the ARIMA model, along with a description of the automated methods used in this tool, can be found in Chapter 8 of Hyndman and Athanasopoulos's online book Forecasting: Principles and Practice.
This tool uses the R tool. Go to Options > Download Predictive Tools and sign in to the Alteryx Downloads and Licenses portal to install R and the packages used by the R Tool. See Download and Use Predictive Tools.
Connect an Input
Connect an Alteryx data stream that contains historical data on the time series to be forecast and, optionally, a set of covariates. The data stream can also contain fields that are not used in model creation.
Configure the Tool
Use the Required parameters tab to set the basic controls required for an ARIMA model to be created.
- Model name: Each model needs to be given a name so it can later be identified. Model names must start with a letter and can contain letters, numbers, and the special characters period (".") and underscore ("_"). No other special characters are allowed, and R is case sensitive.
- Select the target field: Select the field from the data stream you want to forecast. Measurements for this field need to be made at regular time intervals (for example, daily, monthly, quarterly, etc.). Columns containing unique identifiers, like surrogate primary keys and natural primary keys, should not be used in statistical analyses. They have no predictive value and can cause runtime exceptions.
- Use covariates in model estimation?: If this option is checked, you are presented with a checkbox list to select the fields to use as covariates in the ARIMA model.
- Target field frequency: Choose the time interval for the observations of the target field.
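The Model name rule described above can be checked before a workflow runs. The following Python sketch is illustrative only and is not part of the tool. The function name `is_valid_model_name` is hypothetical, and the pattern assumes that "letters" means ASCII letters.

```python
import re

# Assumption: "letters" means ASCII A-Z and a-z; the tool's own check may differ.
_MODEL_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9._]*")


def is_valid_model_name(name: str) -> bool:
    """Return True if the name starts with a letter and otherwise contains
    only letters, digits, periods, and underscores."""
    return _MODEL_NAME_PATTERN.fullmatch(name) is not None
```

For example, `Sales_ARIMA.v2` passes this check, while `2024_model` (starts with a digit) and `my-model` (contains a hyphen) do not.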
Use the Model customization (optional) tab to set controls that adjust how the model processes data.
- Customize the parameters used for automatic model creation...: Select this option to expose a set of parameters that influence automatic model creation. The options include the ability to...
- Adjust the non-seasonal components including the level of first differencing, the maximum order of the autoregressive component, and the maximum order of the moving average component.
- Adjust the seasonal components including the level of seasonal differencing, the maximum order of the seasonal autoregressive component, and the maximum order of the seasonal moving average component.
- Select the information criterion used to choose among candidate models. By default, the corrected Akaike information criterion (AICc) is used, but the uncorrected Akaike information criterion (AIC) or the Bayesian information criterion (BIC) can be selected instead.
- You can also determine whether all possible models are estimated and compared (full enumeration) instead of using the default stepwise algorithm. The stepwise algorithm has been shown to have good performance characteristics and is much less computationally intensive, but it is not guaranteed to find the single best model. Full enumeration finds the single best model within the space searched, but with a significantly greater runtime. If full enumeration is selected, you can limit the space searched by setting the maximum allowed order of the model. You can also use multiple cores of the machine on which Alteryx is running.
- Options can also be set that allow for "drift" in the model and whether a Box-Cox transformation (including setting the value of lambda) is applied to the target field.
- Completely user specified model...: Select this option to manually specify an ARIMA model. The required parameters include...
- The non-seasonal components: the order of the autoregressive component (p), the degree of first differencing (d), and the order of the moving average component (q).
- The seasonal components: the order of the seasonal autoregressive component (P), the degree of seasonal differencing (D), and the order of the seasonal moving average component (Q).
- Options can also be set that allow for "drift" within the model and whether a Box-Cox transformation (including setting the value of lambda) is applied to the target field.
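The differencing and Box-Cox options listed above correspond to standard transformations of the target series. The Python sketch below shows the textbook definitions only. The helper names `difference` and `box_cox` are hypothetical, and the R code behind the tool may differ in detail (for example, in how it treats non-positive values).

```python
import math
from typing import List, Sequence


def difference(series: Sequence[float], lag: int = 1) -> List[float]:
    """Return series[t] - series[t - lag] for every t >= lag.

    lag=1 is a first difference; lag=m (for example, 12 for monthly data)
    is a seasonal difference with period m.
    """
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def box_cox(value: float, lam: float) -> float:
    """Textbook Box-Cox transform of a single positive value.

    lam == 0 is defined as the natural log; otherwise (value**lam - 1) / lam.
    """
    if value <= 0:
        raise ValueError("this sketch only handles positive values")
    if lam == 0:
        return math.log(value)
    return (value ** lam - 1.0) / lam
```

Applying `difference` once with `lag=1` corresponds to d = 1, and applying it once with the seasonal period as the lag corresponds to D = 1. A lambda of 1 leaves the shape of the series unchanged (it only shifts it by one), while a lambda of 0 is a log transformation.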
Use the Other options tab to set additional parameters for periods.
- Series starting period (optional): This option allows you to specify the starting period of the time series, which is reflected in the forecast plot. If Target Field Frequency is set to Hourly, Daily (all days), or Daily (weekdays only), this option is not available.
- The number of periods to include in the forecast plot: The forecast plot contains the original data and a number of forecast points (along with 80% and 95% confidence intervals around the forecast points). Specify the number of future periods to forecast for the plot.
- Select Week Format: This allows you to choose a method to specify work weeks. These options relate to what constitutes the first week of the year, and what day of the week a week begins on.
- US: Sunday is the first day of the week.
- UK: Monday is the first day of the week.
- ISO8601: Monday is the first day of the week.
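The week formats above differ both in the day that starts a week and in which week counts as the first week of the year. As a rough illustration, the Python sketch below computes week numbers under three common conventions. It assumes, without confirmation from this page, that US, UK, and ISO8601 correspond to the `%U` and `%W` strftime conventions and to ISO 8601 week numbering; the function name `week_numbers` is hypothetical.

```python
from datetime import date
from typing import Dict


def week_numbers(d: date) -> Dict[str, int]:
    """Week of the year for d under three conventions.

    Assumed mapping (not confirmed by the tool documentation):
      US      -> %U: weeks start on Sunday; days before the first Sunday are week 0
      UK      -> %W: weeks start on Monday; days before the first Monday are week 0
      ISO8601 -> weeks start on Monday; week 1 contains the first Thursday
    """
    return {
        "US": int(d.strftime("%U")),
        "UK": int(d.strftime("%W")),
        "ISO8601": d.isocalendar()[1],
    }


# Sunday, January 3, 2021 falls in a different week under each convention.
print(week_numbers(date(2021, 1, 3)))  # {'US': 1, 'UK': 0, 'ISO8601': 53}
```

Under ISO 8601, the first days of January can belong to the last week of the previous year, which is why the same date can be reported as week 53.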
Use the Graphics Options tab to set the controls for the graphical output.
- Plot size: Select inches or centimeters for the size of the graph.
- Graph resolution: Select the resolution of the graph in dots per inch: 1x (96 dpi), 2x (192 dpi), or 3x (288 dpi). Lower resolution creates a smaller file and is best for viewing on a monitor. Higher resolution creates a larger file with better print quality.
- Base font size (points): Select the size of the font in the graph.
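The plot size and graph resolution settings together determine how large the rendered image is in pixels. The following Python sketch shows the arithmetic, assuming the image dimensions are simply the physical size multiplied by the dots per inch. The function name `plot_pixels` and its parameters are hypothetical and are not part of the tool.

```python
from typing import Tuple

CM_PER_INCH = 2.54
BASE_DPI = 96  # 1x resolution, as listed in the Graph resolution option


def plot_pixels(width: float, height: float,
                unit: str = "inches", scale: int = 1) -> Tuple[int, int]:
    """Approximate pixel dimensions of a plot.

    unit is "inches" or "centimeters"; scale is 1, 2, or 3 (96, 192, or 288 dpi).
    """
    if unit not in ("inches", "centimeters"):
        raise ValueError("unit must be 'inches' or 'centimeters'")
    if scale not in (1, 2, 3):
        raise ValueError("scale must be 1, 2, or 3")
    dpi = BASE_DPI * scale
    if unit == "centimeters":
        # Convert centimeters to inches before applying dpi.
        width, height = width / CM_PER_INCH, height / CM_PER_INCH
    return round(width * dpi), round(height * dpi)
```

For example, a 5.5 by 5.5 inch plot at 2x (192 dpi) works out to 1056 by 1056 pixels under this assumption, four times the pixel count of the same plot at 1x.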
View the Output
- O anchor: Consists of an output stream that contains the ARIMA model object. The model object can be used to produce both point forecasts and user-specified percentile confidence intervals around those forecasts.
- R anchor: Consists of the report snippets generated by the ARIMA tool: a statistical summary, autocorrelation diagnostic plots, and forecast plots.
- I anchor: Consists of an interactive HTML dashboard of plots and metrics. Select the graphical elements in the visualizations to reveal more information, values, metrics, and analytics.
Expected Behavior: Plot Calculations
The forecast plot uses a default date for calculations if any of these configuration settings are used:
- Target Field Frequency is set to Hourly, Daily (all days), or Daily (weekdays only).
- Target Field Frequency is set to Weekly, Monthly, Quarterly, or Annually and the Series starting period is not set.
The default date used might vary, making the calculation appear random.