Cross-Validation Tool
The Cross-Validation tool compares the performance of one or more Alteryx-generated predictive models using the process of cross-validation. It supports all classification and regression models.
This tool uses the R tool. Go to Options > Download Predictive Tools and sign in to the Alteryx Downloads and Licenses portal to install R and the packages used by the R tool.
Gallery tool
This tool is not automatically installed with Alteryx Designer or the R tools. To use this tool, download it from the Alteryx Analytics Gallery.
Among predictive modelers, cross-validation is frequently preferred over other model evaluation methods because it does not require the use of a separate test set and generates more robust estimates of model quality.
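The basic procedure is straightforward to sketch. The snippet below is a minimal illustration of one round of k-fold cross-validation in base R; it is not the tool's internal implementation, and the data frame df, the target column y, and the lm model are placeholders assumed for illustration:

```r
# Minimal k-fold cross-validation sketch in base R (illustrative only;
# not the tool's internal code). Assumes a data frame df with a
# numeric target column y.
k <- 5
set.seed(1)
folds <- sample(rep(1:k, length.out = nrow(df)))  # random fold assignment

rmse_per_fold <- sapply(1:k, function(i) {
  train <- df[folds != i, ]          # fit on k - 1 folds
  test  <- df[folds == i, ]          # score on the held-out fold
  fit   <- lm(y ~ ., data = train)   # placeholder model
  pred  <- predict(fit, newdata = test)
  sqrt(mean((test$y - pred)^2))      # RMSE on the held-out fold
})
mean(rmse_per_fold)  # cross-validated error estimate; no separate test set needed
```

Because every record is held out exactly once per round, the whole dataset contributes to the error estimate without reserving a separate test set.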
For all classification models, the tool reports the overall accuracy, the accuracy by class, and a confusion matrix for each model. For binary classification models, it additionally reports the F1 score and a set of performance diagnostic plots (lift curve, gain chart, precision-recall curve, and ROC curve).

For regression models, the tool provides the correlation between predicted and actual values, the root mean square error (RMSE), the mean absolute error (MAE), the mean percentage error (MPE), and the mean absolute percentage error (MAPE) of each model's predictions. When at least one actual target value is at or near 0, however, the MPE and the MAPE are undefined or unstable. In that case, the tool replaces the MPE with the sum of the errors divided by the sum of the actual values, and replaces the MAPE with the sum of the absolute errors divided by the sum of the actual values (the Weighted Absolute Percentage Error, or WAPE). The tool also always provides a plot of actual versus predicted values for regression models.
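As a rough illustration of how these regression measures relate, the following base R function computes them from vectors of actual and predicted values. This is a sketch, not the tool's actual code; in particular, the near-zero cutoff of 1e-8 is an assumption made for illustration, not a documented threshold:

```r
# Illustrative computation of the regression fit measures (a sketch,
# not the tool's exact implementation).
regression_measures <- function(actual, predicted) {
  err  <- actual - predicted
  rmse <- sqrt(mean(err^2))
  mae  <- mean(abs(err))
  if (any(abs(actual) < 1e-8)) {
    # Near-zero actual values make MPE/MAPE undefined or unstable, so
    # substitute the aggregate ratios described above. The MAPE
    # substitute is the Weighted Absolute Percentage Error (WAPE).
    # The 1e-8 cutoff is an assumption for illustration.
    mpe  <- sum(err) / sum(actual)
    wape <- sum(abs(err)) / sum(actual)
    c(RMSE = rmse, MAE = mae, MPE = mpe, WAPE = wape)
  } else {
    mpe  <- mean(err / actual)
    mape <- mean(abs(err / actual))
    c(RMSE = rmse, MAE = mae, MPE = mpe, MAPE = mape)
  }
}
```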
Connect inputs
The Cross-Validation tool requires two inputs:
- M anchor: Either a single Alteryx-generated predictive model or the union of two or more such models. All of the models should have been generated from the same dataset.
- D anchor: The dataset used to generate the above models.
Configure the tool
- Number of trials: Enter the number of times to repeat the cross-validation procedure. A smaller number of trials runs faster, but a larger number gives a more robust estimate of your models' quality.
- Number of folds: Enter the number of subsets to split the data into. The same tradeoff applies: fewer folds run faster, but more folds produce more robust estimates.
- Type of model: Select the type of model being evaluated.
  - Classification: These models predict categories, such as yes/no.
  - Regression: These models predict numerical quantities, such as sales totals.
- Should stratified cross-validation be used?: Stratified cross-validation creates folds with roughly the same distribution of target values as the full dataset. For example, in a dataset where 80% of the target values are “No” and 20% are “Yes,” each fold would contain roughly 80% “No” responses and 20% “Yes” responses. Stratified cross-validation is frequently recommended when the target variable is imbalanced (see the sketch after this list).
- Name of the positive class: (Optional) This option is relevant only for binary (two-class) classification. Some of the measures reported for binary classification, such as the F1 score, distinguish between a positive class (such as “Yes”) and a negative class (such as “No”). If you leave this field blank when using binary classification models, the tool chooses one of the classes as the positive class.
- Value of seed: To create reproducible results, set the seed used by the random number generator that assigns records to folds. Changing the seed changes the composition of the folds.
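To make the fold-related options concrete, here is a minimal base R sketch of how stratified folds might be built with a fixed seed. It is illustrative only, not the tool's internal implementation, and make_stratified_folds is a hypothetical helper:

```r
# Sketch of stratified fold assignment with a fixed seed (illustrative;
# make_stratified_folds is a hypothetical helper, not part of the tool).
make_stratified_folds <- function(target, k = 5, seed = 1) {
  set.seed(seed)                  # "Value of seed": fixes the random split
  folds <- integer(length(target))
  for (cls in unique(target)) {
    idx <- which(target == cls)
    # Spread each class evenly across the k folds so every fold keeps
    # roughly the same class proportions as the full dataset.
    folds[idx] <- sample(rep(1:k, length.out = length(idx)))
  }
  folds
}

# Example: an 80%/20% imbalanced target stays roughly 80/20 in each fold.
target <- c(rep("No", 80), rep("Yes", 20))
folds  <- make_stratified_folds(target, k = 5, seed = 42)
table(folds, target)  # each fold holds 16 "No" and 4 "Yes" records

# "Number of trials" repeats the whole procedure with fresh fold
# assignments and averages the resulting fit measures.
```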
View the output
- D anchor: This output provides the actual data values as well as their predictions.
- F anchor: This output reports various model fit measures, depending on model type.
- R anchor: A summary report containing the average fit measures for each trial, along with plots that show one curve per model.