Score Tool
The Score tool creates an estimate of a target variable by applying an R model to a set of supplied predictor variables. If the target variable is categorical, it provides the probability that a record, based on its predictor variables, belongs to each category. If the target variable is continuous, it estimates the target variable’s value. Although its output can be used to assess model performance, the tool does not assess performance on its own.
This tool uses the R tool. Go to Options > Download Predictive Tools and sign in to the Alteryx Downloads and Licenses portal to install R and the packages used by the R Tool. See Download and Use Predictive Tools.
Model type
The Score tool can evaluate models from a number of locations:
- Local Model: The model is pulled into the workflow from a local machine or is accessed within a database.
- Promote Model: The model is stored in the Promote model management system.
The Score tool can be configured for models accessed by a standard workflow or for models accessed using the In-DB suite.
The Score tool requires two inputs:
- The model object produced in an R-based predictive tool.
- A data stream that contains the predictor fields selected in the model configuration. This can be a standard Alteryx data stream or an XDF metadata stream.
Connect these inputs to the Score tool input to begin configuration.
Supported models
The Score tool can use a data stream from a predictive model, even if it was estimated using a RevoScaleR function. The Score tool can only use an XDF metadata stream if the input to the modeling tool was from either an XDF Output Tool or XDF Input Tool and the model was estimated using a RevoScaleR function.
Models estimated by Oracle R Enterprise (ORE) using an In-DB predictive tool connected to an Oracle data source cannot be used to score a standard Alteryx data stream, although models estimated with a standard Alteryx data stream can be used to score Oracle data sources.
- The new field name (continuous target) or prefix (categorical target): The field name or prefix must start with a letter and may contain letters, numbers, and the special characters period (".") and underscore ("_"). R is case sensitive.
- The target field has an oversampled value: These fields are used to adjust the fitted probabilities to match the true sample percentages. Select to provide:
- The value of the target field that was oversampled: The target value (category) that was oversampled.
- The percentage of the oversampled value in the original data prior to oversampling: The percentage of records in the original data that had the oversampled value before oversampling was applied.
- Non-regularized linear regression only options:
- The target field has been natural log transformed: Select to apply a transformation that fits the values back to the original scale and to use a Smearing estimator to account for the subsequent transformation bias.
- Include a prediction confidence interval: Select to specify the value used to calculate confidence intervals.
- XDF input specific options:
- Append scores to the input XDF file: Select to append scores to the input XDF file instead of placing them into an Alteryx data stream.
- The number of records to score at a time: Select the number of records in a group. Input data is scored one group at a time to avoid the in-memory processing limitation of R.
The adjustments made through the oversampled value option are only valid if the target is a binary categorical variable.
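To illustrate the kind of adjustment the oversampling option describes, the sketch below applies a standard prior-correction formula to a fitted probability for a binary target. This is a common textbook approach and an assumption here, not necessarily the exact calculation the Score tool performs. The function name `adjust_for_oversampling` and its parameters are hypothetical, and the rates are expressed as fractions rather than percentages.

```python
def adjust_for_oversampling(p_fitted, sample_rate, true_rate):
    """Rescale a fitted probability for the oversampled class (binary target).

    p_fitted:    probability from the model trained on oversampled data
    sample_rate: share of the oversampled value in the training data (0-1)
    true_rate:   share of that value in the original data (0-1)
    """
    for name, value in (("sample_rate", sample_rate), ("true_rate", true_rate)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    # Weight each class by how much oversampling changed its prevalence.
    pos = p_fitted * (true_rate / sample_rate)
    neg = (1.0 - p_fitted) * ((1.0 - true_rate) / (1.0 - sample_rate))
    return pos / (pos + neg)
```

If the training and original rates are equal, the probability is returned unchanged, which is a quick sanity check on the formula.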
ORE-created models
If using an ORE-created model, the original estimation table must exist in the database to calculate confidence intervals.
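The natural log option above mentions a Smearing estimator. A minimal sketch of the idea follows, assuming this refers to Duan's smearing estimator: the prediction on the log scale is exponentiated and then multiplied by the mean of the exponentiated training residuals. The function names are hypothetical and this is not the tool's own code.

```python
import math

def smearing_factor(log_residuals):
    """Mean of exp(residual) over the log-scale residuals of the fitted model."""
    if not log_residuals:
        raise ValueError("at least one residual is required")
    return sum(math.exp(r) for r in log_residuals) / len(log_residuals)

def back_transform(log_prediction, factor):
    """Return a prediction on the original scale, corrected for transformation bias."""
    return math.exp(log_prediction) * factor
```

Exponentiating the log-scale prediction alone tends to underestimate the mean on the original scale; the smearing factor corrects for that without assuming normally distributed errors.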
The Score tool supports Oracle, Microsoft SQL Server 2016, and Teradata in-database processing. See In-Database Overview for more information about in-database support and tools.
To access the In-DB version of the Score tool, do one of the following:
- Place an In-DB tool on the canvas. The Score tool automatically changes to the In-DB version.
- Right-click the Score tool, point to Choose Tool Version, and select the In-DB version.
See Predictive Analytics for more information about predictive in-database support.
The Score tool requires two inputs:
- The model object produced in an R-based predictive tool.
- A data stream that contains the predictor fields selected in the model configuration. This can be a standard Alteryx data stream or an XDF metadata stream.
Connect these inputs to the Score tool input to begin configuration.
Supported models
The Score tool can use a data stream from a predictive model, even if it was estimated using a RevoScaleR function. The Score tool can only use an XDF metadata stream if the input to the modeling tool was from either an XDF Output Tool or XDF Input Tool and the model was estimated using a RevoScaleR function.
Models estimated by ORE using an In-DB predictive tool connected to an Oracle data source cannot be used to score a standard Alteryx data stream, although models estimated with a standard Alteryx data stream can be used to score Oracle data sources.
- Output table name: Type the name of the table that the results are saved to in the database.
- The new field name (continuous target) or prefix (categorical target): The field name or prefix must start with a letter and may contain letters, numbers, and the special characters period (".") and underscore ("_").
- The target field has an oversampled value: These fields are used to adjust the fitted probabilities to match the true sample percentages. The adjustments made through this option are only valid if the target is a binary categorical variable. Select to provide:
- The value of the target field that was oversampled: The target value (category) that was oversampled.
- The percentage of the oversampled value in the original data prior to oversampling: The percentage of records in the original data that had the oversampled value before oversampling was applied.
- Linear regression only options:
- The target field has been natural log transformed: Select to apply a transformation that fits the values back to the original scale and to use a Smearing estimator to account for the subsequent transformation bias.
- Include a prediction confidence interval: Select to specify the value used to calculate confidence intervals.
- Teradata specific configuration: Microsoft Machine Learning Server needs additional configuration information about the specific Teradata platform to be used. This information is typically provided by a local Teradata administrator.
- The Teradata server paths to R's binary executables
- The temporary file write location that is used by Microsoft Machine Learning Server.
In the new field name or prefix, no other special characters are allowed, and R is case sensitive.
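The naming rule for the new field name or prefix can be checked before running a workflow. The sketch below is one way to express it as a regular expression; the helper `is_valid_field_name` is hypothetical, and it assumes "letters" means ASCII letters.

```python
import re

# Starts with a letter, then any mix of letters, digits, periods, and underscores.
FIELD_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9._]*")

def is_valid_field_name(name):
    """Return True if the whole name matches the naming rule."""
    return FIELD_NAME_PATTERN.fullmatch(name) is not None
```

Because R is case sensitive, `Score` and `score` both pass this check but refer to different fields.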
ORE-created models
If using an ORE-created model, the original estimation table must exist in the database to calculate confidence intervals.
- Model Source: Select the source of the model object that is passed into the (M) input of the Score tool. This can be either:
- In the database, identified by the value in the "Name" field of the data stream.
- Contained in the "Object" field of the data stream.
The output includes the original data stream with the predicted values of the model appended. For a model that uses a categorical target, a predicted probability is provided for each level of the target variable. Each probability is placed in a field whose name is composed of the user-provided prefix followed by a suffix that identifies the corresponding level of the target variable.
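As an illustration of how the output fields for a categorical target are named, the sketch below combines a prefix with each target level. The underscore separator and the function name `probability_field_names` are assumptions made for this example; check the actual output of your workflow for the exact field names.

```python
def probability_field_names(prefix, levels, separator="_"):
    """Build one output field name per target level: prefix + separator + level."""
    return [f"{prefix}{separator}{level}" for level in levels]
```

For example, a prefix of `Score` and the levels `Yes` and `No` would give two probability fields under these assumptions, one per level.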
Promote is a platform for deploying, managing, and scaling predictive models. Alteryx can connect to the Promote platform to access stored models and score against them.
Establish an Alteryx Promote Connection.
Alteryx Promote Connection: A drop-down list used to select from saved Promote connections.
Add Connection: An option to add to the list of available Promote connections. The Promote connection manager operates independently of workflows.
- Click Add Connection.
- In the Add Connection window, enter the Alteryx Promote URL, which points to the location where your model is stored.
- Click Next.
- In the Alteryx Promote Credentials window, type your Username and API Key.
- Click Connect.
- If successful, in the Connection Established window, select Finish. The new connection is selected and visible in the drop-down.
To remove a connection:
- Select an available connection.
- Click Remove Connection.
- In the Confirmation window, verify the URL and Username are associated with the connection you want to remove.
- Click OK. The connection is no longer available in the drop-down.
Promote access
If you are unsure if you have access to the Promote feature or need assistance finding your required credentials, contact your local administrator or your support representative.
A list of the models you have access to is generated. Scroll through the list or use the search function to find the model you want to score and select the model path.
Once a model path is selected, information about the model is displayed.
- Name: The model name.
- Owner: The model owner.
- Status: The current state of the model, reflecting its accessibility.
- Online: Model is up-to-date and ready to process data.
- Building: Model is currently being updated and cannot process data.
- Failed Unit Test: Model finished building, but components failed to build correctly. The model cannot process data.
- Failed: Model failed to build correctly and cannot process data.
- Offline: Model has not been built and cannot process data.
- Last Updated: The timestamp of the last model build.
Verify that the model is available for data processing and select Done.
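The status list above reduces to a simple rule: only an Online model can process data. The sketch below encodes that rule for illustration. The `ModelStatus` enum and `can_process_data` helper are hypothetical and are not part of any Promote API.

```python
from enum import Enum

class ModelStatus(Enum):
    """Model statuses as they are listed in the model information."""
    ONLINE = "Online"
    BUILDING = "Building"
    FAILED_UNIT_TEST = "Failed Unit Test"
    FAILED = "Failed"
    OFFLINE = "Offline"

def can_process_data(status):
    """Only a model that is Online is ready to score records."""
    return status is ModelStatus.ONLINE
```

A check like this could be applied to the status text shown for a model, for example `can_process_data(ModelStatus("Building"))`, which evaluates to `False`.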
The Configuration Summary provides a summary of the Credentials used and the Model Summary of the selected model.