The Gamma Regression tool relates a Gamma-distributed, strictly positive variable of interest (the target variable) to one or more variables (predictor variables) that are expected to influence the target variable.
In a number of applications, the values of the target variable are always strictly positive (that is, never zero or negative) and tend to cluster toward the lower range of the observed values, while a small minority of cases take on large values. Target variables of this nature represent a data generation process that is not consistent with the Normality assumptions underlying the traditional linear regression model. Because the values are not always integers, they also do not follow a process based on a Poisson or Negative Binomial distribution. They are consistent with a process based on a Gamma distribution, and the model can be estimated using methods similar to linear regression, via the generalized linear model framework.*
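To make the generalized linear model idea concrete, the following is a minimal sketch of a Gamma regression with a single predictor, fitted by iteratively reweighted least squares. It is not the tool's implementation (the tool relies on the R functions described below). The function name fit_gamma_log_link, the choice of a log link, and the simulated data are all assumptions made for illustration.

```python
import math
import random


def fit_gamma_log_link(x, y, max_iter=25, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x for a Gamma-distributed target.

    Hypothetical helper for illustration only. With a Gamma family and a
    log link, the IRLS weights are all equal to 1, so every iteration is an
    ordinary least-squares fit of a "working response" on x.
    """
    if len(x) != len(y) or len(y) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    if any(v <= 0 for v in y):
        raise ValueError("Gamma regression needs a strictly positive target")

    n = len(y)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("the predictor must take at least two distinct values")

    # Start from the intercept-only model: constant mean, zero slope.
    b0, b1 = math.log(sum(y) / n), 0.0

    for _ in range(max_iter):
        eta = [b0 + b1 * xi for xi in x]   # linear predictor
        mu = [math.exp(e) for e in eta]    # fitted means (inverse log link)
        # Working response: eta + (y - mu) / (d mu / d eta), and d mu / d eta = mu.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]

        # Unweighted least squares of z on x.
        z_bar = sum(z) / n
        new_b1 = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
        new_b0 = z_bar - new_b1 * x_bar

        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break

    return b0, b1


if __name__ == "__main__":
    # Simulated data (an assumption, not from the tool): strictly positive,
    # right-skewed values whose mean grows with the predictor.
    random.seed(0)
    shape = 2.0
    xs = [random.uniform(0.0, 3.0) for _ in range(200)]
    ys = [random.gammavariate(shape, math.exp(0.5 + 0.3 * xi) / shape) for xi in xs]

    intercept, slope = fit_gamma_log_link(xs, ys)
    print("intercept:", round(intercept, 3), "slope:", round(slope, 3))
```

Under the log link, a coefficient is read multiplicatively: a one-unit increase in the predictor multiplies the expected target value by exp(slope). Because the fitted mean is an exponential of the linear predictor, predictions are always positive, which matches the nature of the target variable.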
With this tool, if the input data is from a regular Alteryx data stream, then the open source R glm function is used for model estimation. If the input comes from either an XDF Output or XDF Input tool, then the Revo ScaleR rxGlm function is used for model estimation. The advantage of the Revo ScaleR-based function is that it allows much larger (out-of-memory) datasets to be analyzed. The cost is the additional overhead of creating an XDF file and the loss of some of the model diagnostic output that is available with the open source R functions.
This tool uses the R programming language. Go to Options > Download Predictive Tools to install R and the packages used by the R Tool.
An Alteryx data stream or XDF metadata stream that includes a target field of interest along with one or more possible predictor fields.
Columns containing unique identifiers, such as surrogate primary keys and natural primary keys, should not be used in statistical analyses. They have no predictive value and can cause runtime exceptions.
Graph resolution: Select the resolution of the graph in dots per inch: 1x (96 dpi); 2x (192 dpi); or 3x (288 dpi). Lower resolution creates a smaller file and is best for viewing on a monitor. Higher resolution creates a larger file with better print quality.
*en.wikipedia.org/wiki/Generalized_linear_model
©2018 Alteryx, Inc., all rights reserved. Allocate®, Alteryx®, Guzzler®, and Solocast® are registered trademarks of Alteryx, Inc.