The Importance Weight tool provides methods for selecting the variables to use in a predictive model, based on how strongly each possible predictor is related to the model's target variable.
The final set can be selected either by taking the N predictors most strongly related to the target, or by choosing a cutoff importance weight level and including only the variables whose weight exceeds that cutoff.
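The two selection rules can be sketched in a few lines of Python. This is a minimal illustration, not the macro's implementation: the function names `select_top_n` and `select_above_cutoff` are hypothetical, and the weights are invented values rather than output from the tool.

```python
def select_top_n(weights, n):
    """Return the n predictor names with the largest importance weights."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]


def select_above_cutoff(weights, cutoff):
    """Return predictor names whose importance weight exceeds the cutoff."""
    return [name for name, weight in weights.items() if weight > cutoff]


# Hypothetical importance weights, invented for illustration only.
example_weights = {"age": 0.42, "income": 0.31, "region": 0.05, "tenure": 0.18}

top_two = select_top_n(example_weights, 2)
above_cutoff = select_above_cutoff(example_weights, 0.15)
```

The top-N rule always returns a fixed number of predictors, while the cutoff rule returns however many clear the threshold, which may be none.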
One drawback to this approach is that it looks at the relationship between each possible predictor and the target in isolation, ignoring possible interaction effects and correlation between predictors. Despite this limitation, this type of variable filtering method is frequently used in practice.
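A small constructed example shows how scoring predictors in isolation can miss an interaction. The sketch below uses Pearson correlation as a stand-in for a univariate importance measure (an assumption made for illustration; it is not claimed to be one of the tool's measures), and the data are synthetic.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Synthetic data: the target is 1 exactly when the two predictors differ.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
target = [a ^ b for a, b in zip(x1, x2)]

# Each predictor on its own shows no linear relationship with the target.
r_x1 = pearson(x1, target)
r_x2 = pearson(x2, target)

# A feature built from both predictors reproduces the target exactly.
combined = [abs(a - b) for a, b in zip(x1, x2)]
r_combined = pearson(combined, target)
```

Here both predictors would be ranked as unimportant by the univariate score, even though together they determine the target completely.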
There are a number of different importance weight measures, and the applicability of a particular measure typically depends on the type (numeric or categorical) of both the target and the predictor. One consequence is that the measures used to determine the relative importance of possible predictors differ for numeric and categorical variables, so weights for the two kinds of predictor are not directly comparable. The exception is the Relief method, but its performance is not as robust as that of methods specific to a particular combination of target type and predictor type.
Most of the measures are provided by the FSelector R package. This package makes use of some methods written in Java, so to use this macro, you will need to have a Java 7 runtime environment on the machine where Alteryx is installed.
This macro is not automatically installed with Alteryx Designer. To use this macro, download it from the Alteryx Analytics Gallery.
An Alteryx data stream containing both the desired target variable and a set of potential predictor variables that will be used to estimate a predictive model.
Continuous target: Select this option if the target variable you want to predict is numeric. You are then asked to select the target variable field from the data and to choose whether to examine continuous (numeric) or categorical (string variables with category labels) predictors. After making this choice, select the set of predictors (of the chosen type) you want to examine and one or more comparison measures. For a continuous target and continuous predictors, the available measures are:
The importance weight measures available for a continuous target and categorical predictors are:
Columns containing unique identifiers, such as surrogate primary keys and natural primary keys, should not be used in statistical analyses. They have no predictive value and can cause runtime exceptions.
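One way to catch identifier-like columns before running the analysis is to flag any column in which every value is distinct. The following is a rough heuristic sketch, not part of the macro: the helper `identifier_like_columns` and the sample rows are hypothetical.

```python
def identifier_like_columns(rows, columns):
    """Return the columns in which every value is distinct across all rows."""
    flagged = []
    for column in columns:
        values = [row[column] for row in rows]
        # A column with no repeated values behaves like a unique key.
        if len(values) > 1 and len(set(values)) == len(values):
            flagged.append(column)
    return flagged


# Hypothetical sample data, invented for illustration only.
rows = [
    {"customer_id": 101, "region": "north", "spend": 20.0},
    {"customer_id": 102, "region": "south", "spend": 35.5},
    {"customer_id": 103, "region": "north", "spend": 20.0},
]

flagged = identifier_like_columns(rows, ["customer_id", "region", "spend"])
```

A genuinely continuous numeric field can also have all-distinct values, so flagged columns should be reviewed by hand rather than dropped automatically.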
Categorical target: Select this option if the target variable you want to predict is categorical. You are then asked to select the target variable field from the data and to choose whether to examine continuous (numeric) or categorical (string variables with category labels) predictors. After making this choice, select the set of predictors (of the chosen type) you want to examine and one or more comparison measures. For a categorical target and continuous predictors, the available measures are:
The importance weight measures available for a categorical target and categorical predictors are:
D: Consists of a table that provides the selected importance weight value for each potential predictor.
©2018 Alteryx, Inc., all rights reserved. Allocate®, Alteryx®, Guzzler®, and Solocast® are registered trademarks of Alteryx, Inc.