The Logistic Regression tool creates a model that relates a binary target variable (such as yes/no or pass/fail) to one or more predictor variables to obtain the estimated probability of each of the two possible responses for the target variable. Common logistic regression models include logit, probit, and complementary log-log. See Logistic Regression.
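As an illustration of how the three model types named above differ, the following minimal Python sketch maps a linear predictor value to an estimated probability under each link. The function names are hypothetical and chosen for this example only; the tool itself performs this work in R.

```python
import math

def inv_logit(eta):
    """Logit link: p = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    """Probit link: p is the standard normal CDF evaluated at eta."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def inv_cloglog(eta):
    """Complementary log-log link: p = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))

def estimated_probabilities(eta, inverse_link=inv_logit):
    """Return (P(first response), P(second response)) for one linear predictor value."""
    p = inverse_link(eta)
    return p, 1.0 - p
```

The logit and probit links are symmetric around a linear predictor of zero, where both give a probability of 0.5. The complementary log-log link is asymmetric, which is why it can suit targets where one response is rare.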
This tool uses the R programming language. Go to Options > Download Predictive Tools to install R and the packages used by the R Tool.
Connect an Alteryx data stream or XDF metadata stream that includes a target field of interest along with one or more possible predictor fields.
If the input data is from an Alteryx data stream, then the open source R glm function and the glmnet and cv.glmnet functions (from the glmnet package) are used for model estimation.
If the input data comes from either an XDF Output Tool or XDF Input Tool, then the RevoScaleR rxLogit function is used for model estimation. The advantage of the RevoScaleR-based function is that it can analyze much larger (out-of-memory) datasets. The trade-offs are the additional overhead of creating an XDF file, the inability to create some of the model diagnostic output that is available with the open source R functions, and support for the logit link function only.
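To give a sense of what model estimation involves, the sketch below fits a one-predictor logit model by Newton-Raphson iteration in plain Python. It is an illustration only and not the tool's implementation: R's glm uses iteratively reweighted least squares, which takes the same steps as Newton-Raphson for the logit link. The function name fit_logit and the sample data are made up for this example.

```python
import math

def fit_logit(x, y, iterations=25, tol=1e-10):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson.

    x is a list of numeric predictor values, y a list of 0/1 targets.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Accumulate the score (gradient) and the 2x2 information matrix.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Solve the 2x2 linear system for the Newton step.
        det = h00 * h11 - h01 * h01
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        b0 += s0
        b1 += s1
        if abs(s0) + abs(s1) < tol:
            break
    return b0, b1

# Made-up data in which the two classes overlap, so the estimates are finite.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 0, 1, 1]
intercept, slope = fit_logit(x, y)
```

If the predictor separates the two responses perfectly, the estimates grow without bound and this simple loop fails. That is one reason identifier-like fields should be kept out of the predictor list.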
Columns containing unique identifiers, such as surrogate primary keys and natural primary keys, should not be used in statistical analyses. They have no predictive value and can cause runtime exceptions.
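One way to screen for such columns before selecting predictors is to flag every field whose values are all distinct. The sketch below assumes the data is held as a list of dictionaries, one per record. That layout, the function name, and the sample field names are hypothetical.

```python
def identifier_like_columns(rows):
    """Return the names of columns whose values are all distinct across rows.

    rows is a list of dicts, one per record (an assumed in-memory layout).
    """
    if len(rows) < 2:
        # With fewer than two records every column is trivially unique.
        return []
    flagged = []
    for name in rows[0]:
        values = [row[name] for row in rows]
        if len(set(values)) == len(values):
            flagged.append(name)
    return flagged

# Made-up sample records.
records = [
    {"customer_id": 101, "age": 34, "churned": "yes"},
    {"customer_id": 102, "age": 34, "churned": "no"},
    {"customer_id": 103, "age": 51, "churned": "no"},
]
suspects = identifier_like_columns(records)
```

A continuous numeric predictor can also have all-distinct values, so treat the result as a list of fields to review rather than an automatic exclusion list.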
Click Customize to modify the Model, Cross-validation, and Plots settings.
Graph resolution: Select the resolution of the graph in dots per inch: 1x (96 dpi); 2x (192 dpi); or 3x (288 dpi). Lower resolution creates a smaller file and is best for viewing on a monitor. Higher resolution creates a larger file with better print quality.
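The effect of the resolution setting on file size follows from simple arithmetic: pixel dimensions are the plot size in inches multiplied by the dots per inch. The sketch below assumes a 5 x 5 inch plot purely for illustration; the actual plot size used by the tool is not stated here.

```python
# Resolution options listed above, in dots per inch.
RESOLUTIONS_DPI = {"1x": 96, "2x": 192, "3x": 288}

def pixel_dimensions(width_in, height_in, dpi):
    """Return (width, height) in pixels for a plot of the given size in inches."""
    return round(width_in * dpi), round(height_in * dpi)

# Assumed 5 x 5 inch plot, for illustration only.
sizes = {label: pixel_dimensions(5, 5, dpi) for label, dpi in RESOLUTIONS_DPI.items()}
```

Because both dimensions scale with the resolution, the 2x setting produces four times as many pixels as 1x, and the 3x setting nine times as many.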
Connect a Browse tool to each output anchor to view results.
The Logistic Regression tool supports Oracle, Microsoft SQL Server 2016, and Teradata in-database processing. See In-Database Overview for more information about in-database support and tools.
When a Logistic Regression tool is placed on the canvas with another In-DB tool, the tool automatically changes to the In-DB version. To change the version of the tool, right-click the tool, point to Choose Tool Version, and click a different version of the tool. See Predictive Analytics for more about predictive in-database support.
Connect an in-database data stream that includes a target field of interest along with one or more possible predictor fields.
If the input is from a SQL Server or Teradata in-database data stream, then the Microsoft R Server rxLogit function (from the RevoScaleR package) is used for model estimation. This allows the processing to be done on the database server, as long as both the local machine and the server have been configured with Microsoft R Server, and can result in a significant improvement in performance.
If the input is from an Oracle in-database data stream, then the Oracle R Enterprise ore.glm function (from the OREmodels package) is used for model estimation. This allows the processing to be done on the database server, as long as both the local machine and the server have been configured with Oracle R Enterprise, and can result in a significant improvement in performance.
For an in-database workflow in an Oracle database, full functionality of the resulting model object downstream only occurs if the Logistic Regression tool is connected directly from a Connect In-DB tool with a single full table selected, or if a Write Data In-DB tool is used immediately before the Logistic Regression tool to save the estimation data table to the database. Oracle R Enterprise makes use of the estimation data table to provide full model object functionality, such as calculating prediction intervals.
Columns containing unique identifiers, such as surrogate primary keys and natural primary keys, should not be used in statistical analyses. They have no predictive value and can cause runtime exceptions.
Connect a Browse tool to each output anchor to view results.