Cross-Validation Tool

The Cross-Validation tool uses cross-validation to compare the performance of one or more Alteryx-generated predictive models. It supports all classification and regression models.

This tool uses the R tool. Install R and the necessary packages by going to Options > Download Predictive Tools.

This tool is not automatically installed with Alteryx Designer. To use this tool, download it from the Alteryx Analytics Gallery.

Overview

Predictive modelers often prefer cross-validation to other model evaluation methods because it does not require a separate test set and yields more robust estimates of model quality.
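The idea can be illustrated with a minimal k-fold sketch in Python. This is not the tool's implementation (the tool runs in R); the function names, the number of folds, the toy mean-only model, and the scoring function are all assumptions chosen for illustration. Every record is held out exactly once, so each record is scored by a model that never saw it, and no separate test set is needed.

```python
import random
from statistics import mean


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and deal them into k nearly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, fit, predict, score, k=5, seed=0):
    """Return one held-out score per fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the other k-1 folds only.
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        # Score on the fold the model has not seen.
        preds = [predict(model, xs[i]) for i in held_out]
        scores.append(score([ys[i] for i in held_out], preds))
    return scores


# Hypothetical toy model: always predict the mean of the training targets.
def fit_mean(train_xs, train_ys):
    return mean(train_ys)


def predict_mean(model, x):
    return model


def mean_absolute_error(actual, predicted):
    return mean([abs(a - p) for a, p in zip(actual, predicted)])


xs = list(range(20))
ys = [2.0 * x + 1.0 for x in xs]
fold_scores = cross_validate(xs, ys, fit_mean, predict_mean, mean_absolute_error, k=5)
print("per-fold MAE:", fold_scores)
print("cross-validated MAE:", mean(fold_scores))
```

Averaging the per-fold scores gives a single estimate per model, which is what makes models comparable on the same data.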

For all classification models, the tool provides the overall accuracy, the accuracy by class, and a set of confusion matrices (one for each model). For binary classification models, the tool also reports the F1 score and a collection of Performance Diagnostic Plots (lift curve, gain chart, precision versus recall curve, and ROC curve).

For regression models, the tool generally provides the correlation between predicted and actual values, the root mean square error (RMSE), the mean absolute error (MAE), the mean percentage error (MPE), and the mean absolute percentage error (MAPE) of each model's predictions. When at least one target value is near 0, however, the MPE and the MAPE are undefined. In that case, the tool replaces the MPE with the sum of the errors divided by the sum of the actual values, and replaces the MAPE with the sum of the absolute errors divided by the sum of the actual values (that is, the Weighted Absolute Percentage Error). In the regression case, the tool also always provides a plot of actual versus predicted values.
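The regression error measures and their near-zero fallback can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's code: the function name, the `near_zero` threshold, the sign convention (error = actual minus predicted), the key names, and reporting ratios as fractions rather than percentages are all choices made here.

```python
import math


def regression_metrics(actual, predicted, near_zero=1e-3):
    """Compute RMSE and MAE, plus MPE/MAPE or their weighted replacements.

    Assumptions: error = actual - predicted, and "near 0" means
    abs(value) < near_zero. Neither detail is specified by the source.
    """
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    metrics = {
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
    }
    if any(abs(a) < near_zero for a in actual):
        # Per-record ratios are undefined, so divide totals instead.
        # This still fails if the actual values sum to zero.
        total_actual = sum(actual)
        metrics["weighted_pe"] = sum(errors) / total_actual
        metrics["wape"] = sum(abs(e) for e in errors) / total_actual
    else:
        metrics["mpe"] = sum(e / a for e, a in zip(errors, actual)) / n
        metrics["mape"] = sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return metrics


# No target near zero: MPE and MAPE are reported.
print(regression_metrics([100.0, 200.0, 400.0], [110.0, 190.0, 380.0]))

# One target equal to zero: the weighted replacements are reported instead.
print(regression_metrics([0.0, 10.0, 30.0], [1.0, 8.0, 33.0]))
```

Dividing totals rather than averaging per-record ratios keeps a single near-zero target from dominating the result.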

Inputs

The Cross-Validation tool requires two inputs:

Configuration Properties

Outputs