The Association Analysis tool lets a user determine which fields in a database have a bivariate association with one another. The assessment can be based on Pearson product-moment ("regular") correlation coefficients, Spearman rank-order correlation coefficients, or Hoeffding's D statistics (a non-parametric measure that can detect non-monotonic relationships such as inverted U-shapes). The tool also reports the statistical significance of each association measure.
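The difference between the measures can be illustrated with a small sketch. The tool itself runs on R; the Python below is only an illustration written for this page, not the tool's implementation, and all function names and data values are hypothetical. It computes Pearson and Spearman coefficients for a monotonic relationship and for an inverted U-shape, and computes the usual t statistic for testing a Pearson coefficient against zero. Hoeffding's D is not implemented here.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank-order correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom.
    Undefined when |r| is exactly 1."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical data: one monotonic relationship, one inverted U-shape.
x = [-3, -2, -1, 0, 1, 2, 3]
y_cubic = [v ** 3 for v in x]        # monotonic but not linear
y_inverted_u = [-(v ** 2) for v in x]  # rises, then falls

r_cubic = pearson(x, y_cubic)
rho_cubic = spearman(x, y_cubic)
r_u = pearson(x, y_inverted_u)
rho_u = spearman(x, y_inverted_u)
t_cubic = t_statistic(r_cubic, len(x))

print("cubic:      Pearson", round(r_cubic, 3), "Spearman", round(rho_cubic, 3))
print("inverted U: Pearson", round(r_u, 3), "Spearman", round(rho_u, 3))
print("t statistic for the cubic Pearson coefficient:", round(t_cubic, 2))
```

On the monotonic data, the Spearman coefficient is higher than the Pearson coefficient because it depends only on rank order. On the inverted U-shape, both coefficients are zero even though the two fields are clearly related, which is the kind of relationship Hoeffding's D is meant to pick up.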
The tool always provides the full set of relationships, and can optionally provide an in-depth analysis of a target field of interest and its relationship to the other numeric variables. The target field can be either a numeric variable or a binary categorical variable. If a binary categorical variable is used as the target field, it is converted to a zero-one numeric field: records whose value matches the target level are assigned a one, and all other records are assigned a zero.
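The zero-one conversion described above can be sketched as follows. This is an illustration only, not the tool's code; the function names, field names, and values are hypothetical.

```python
import math

def encode_binary_target(values, target_level):
    """Return 1 where the value equals the target level, 0 otherwise."""
    return [1 if v == target_level else 0 for v in values]

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical binary categorical target and one numeric field.
responded = ["Yes", "No", "Yes", "Yes", "No", "No"]
spend = [120.0, 35.0, 90.0, 150.0, 40.0, 60.0]

target = encode_binary_target(responded, "Yes")
r = pearson(target, spend)

print(target)
print(round(r, 3))
```

Once the target is encoded this way, its association with each numeric field can be measured like any other numeric pair.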
This tool uses the R programming language. Go to Options > Download Predictive Tools to install R and the packages used by the R Tool.
The Association Analysis tool accepts input from an Alteryx data stream.
Columns containing unique identifiers, such as surrogate primary keys and natural primary keys, should not be used in statistical analyses. They have no predictive value and can cause runtime exceptions.
R Output: The report includes three tables that make up a Pearson Correlation Analysis: Focused Analysis of Field Trans, Full Correlation Matrix, and Matrix of Corresponding p-values.
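To make the shape of these tables concrete, here is a minimal sketch of a full correlation matrix and a focused analysis of one target field. It is not the tool's implementation: the function names and the data are hypothetical (only the field name Trans is borrowed from the report title above), the ordering by absolute correlation is an assumption made for readability, and the p-value matrix is omitted.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Nested dict holding the correlation of every field with every field."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

def focused_analysis(columns, target):
    """Correlation of each other field with the target, strongest first."""
    rows = [(name, pearson(columns[name], columns[target]))
            for name in columns if name != target]
    return sorted(rows, key=lambda row: abs(row[1]), reverse=True)

# Hypothetical numeric fields; the values are made up for illustration.
data = {
    "Trans": [2, 4, 5, 7, 9],
    "Visits": [1, 2, 3, 4, 5],
    "Discount": [10, 9, 12, 8, 11],
}

matrix = correlation_matrix(data)
focused = focused_analysis(data, "Trans")

for name, r in focused:
    print(name, round(r, 3))
```

The matrix is symmetric with ones on the diagonal, and the focused analysis is the target's row of that matrix with the target itself left out.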
I Output: Interactive report includes a Correlation Matrix with Scatterplot that changes based on your mouse position.
Table of Critical Values for Pearson's r
©2018 Alteryx, Inc., all rights reserved. Allocate®, Alteryx®, Guzzler®, and Solocast® are registered trademarks of Alteryx, Inc.