Data Typing Transformer

Use of the Assisted Modeling tool requires participation in the Alteryx Analytics Beta program. Visit the Alteryx beta program, also known as the Alteryx Customer Feedback Program, to find out more. All Alteryx Beta Program notifications and disclaimers apply to this content. Functionality described here may or may not be available as part of the Beta Program. To give feedback about the documentation, send an email message to helpfeedback@alteryx.com.

Data typing: Data typing is a type of transformer. Use data typing to set your columns to the most appropriate data type. In assisted mode, for optimal performance, keep the recommended data type shown or select one from the drop-down list.

Use a transformer to pre-process the data for a model and improve the performance of the model by reducing bias, defining relationships, removing outliers, and more. Apply the transformation using the Transformation tool in the Machine Learning tool palette. Transformations include setting data types, cleaning up missing values, and selecting columns. Alteryx Machine Learning Transformers generate a new dataset using one or more rows or columns of your existing dataset.
Selecting a different transformer clears any changes you might have made.

Before using the tool

Start with an existing workflow. You should first clean and prep your dataset. Once your dataset contains only the relevant data you need for your business use case, then start building a pipeline using the Machine Learning tools.

Add the tool

  1. Click the Transformation tool in the Machine Learning tool palette. Drag it to the workflow canvas, and connect it to your workflow.
    A start pipeline tool is required for the transformation to function. Your workflow should contain a start pipeline tool such as the Start Pipeline tool or the Assisted Modeling tool prior to starting a data transformation.
  2. In Transformer, select the transformation type you want to configure.
  3. Configure the tool.

Configure the tool

Configure the parameters. Understand the parameters before changing them. As a best practice, avoid making assumptions, and use a test dataset to assess the performance of your model, whether or not your objective is prediction.

To find out more about a parameter, click the parameter's tooltip.

For optimal model performance, each feature in your dataset should be set to the most appropriate data type.

Analyze the features in your data to determine which data type best fits the values in each column. To view your dataset, you can click the anchor on the input tool.
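If you prefer to inspect a column outside the tool before choosing its data type, the kind of check described above can be sketched in a few lines of Python. This is only an illustration, not how Alteryx determines its recommended data types: the function name profile_column and the summary keys it returns are hypothetical, and the sketch assumes that missing values appear as None or empty strings.

```python
def profile_column(values):
    """Summarize a column so you can judge which data type fits it.

    Hypothetical helper for illustration only. Missing values
    (None or empty strings) are ignored.
    """
    present = [v for v in values if v is not None and v != ""]

    def is_number(value):
        # A value counts as numeric if Python can parse it as a float.
        try:
            float(value)
            return True
        except (TypeError, ValueError):
            return False

    return {
        "count": len(present),
        "distinct": len(set(present)),
        "all_numeric": all(is_number(v) for v in present),
    }


# A column of 0s and 1s parses as numeric but has only two distinct values,
# which suggests that it may really hold categories.
flags = profile_column(["0", "1", "1", "", "0"])

# A column of free-text labels is not numeric at all.
colors = profile_column(["red", "blue", "red"])
```

A column that is entirely numeric but has very few distinct values is a candidate for a closer look, because the numbers may be labels rather than quantities.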

Set the data type for each feature

Review the assumed data type for each feature in the dataset.

For each feature, accept the recommended data type, or select a different data type. Changing the data type may cause Alteryx to drop the feature. To manually drop a feature, set the data type to ID.

Click the search icon and enter search criteria to filter the view.

You may want to override the assumed data type when you want to use the values in a numeric column as categories. For example, your column may consist entirely of 0s and 1s that indicate categories. The numbers have no mathematical meaning and are purely for categorization purposes. In this case, the data type is categorical. See Machine Learning Tools - Definitions.
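To make the effect of these choices concrete, here is a minimal Python sketch of what a per-feature data type setting amounts to conceptually. It is not the Transformation tool's implementation: the function apply_types, the type labels "numeric", "categorical", and "id", and the sample column names are all assumptions made for illustration.

```python
def apply_types(rows, types):
    """Convert each row (a dict) according to a per-column type choice.

    Hypothetical helper for illustration only.
    types maps a column name to "numeric", "categorical", or "id".
    Columns marked "id" are dropped from the output.
    """
    converted = []
    for row in rows:
        out = {}
        for column, value in row.items():
            kind = types[column]
            if kind == "id":
                continue  # an ID column carries no signal, so drop it
            if kind == "numeric":
                out[column] = float(value)  # keep mathematical meaning
            else:
                out[column] = str(value)  # treat the value as a label
        converted.append(out)
    return converted


rows = [
    {"customer_id": 101, "age": 34, "is_member": 1},
    {"customer_id": 102, "age": 51, "is_member": 0},
]

# is_member holds only 0s and 1s, so it is set to categorical
# instead of the numeric type its values might suggest.
types = {"customer_id": "id", "age": "numeric", "is_member": "categorical"}

typed_rows = apply_types(rows, types)
```

In this sketch the customer_id column disappears, age stays a number, and is_member becomes the labels "1" and "0", so a downstream model would not treat it as a quantity.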

Run the workflow to apply the configuration.
