Missing Value Imputation Transformer

Use of the Assisted Modeling tool requires participation in the Alteryx Analytics Beta program. Visit the Alteryx beta program, also known as the Alteryx Customer Feedback Program, to find out more. All Alteryx Beta Program notifications and disclaimers apply to this content. Functionality described here may or may not be available as part of the Beta Program. To send feedback about the documentation, send an email message to helpfeedback@alteryx.com.

Missing value imputation: Use the missing value imputation transformation tool to clean up (impute) missing values. The tool recommends a clean-up method when there are missing values in your dataset. You can choose your own clean-up method and override any recommendations.

Use a transformer to pre-process the data for a model and improve the performance of the model by reducing bias, defining relationships, removing outliers, and more. Apply the transformation using the Transformation tool in the Machine Learning tool palette. Transformations include setting data types, cleaning up missing values, and selecting columns. Alteryx Machine Learning Transformers generate a new dataset using one or more rows or columns of your existing dataset.
Selecting a different transformer clears any changes you might have made.

Before using the tool

Start with an existing workflow. You should first clean and prep your dataset. Once your dataset contains only the relevant data you need for your business use case, then start building a pipeline using the Machine Learning tools.

Add the tool

  1. Click the Transformation tool in the Machine Learning tool palette. Drag it to the workflow canvas, and connect it to your workflow.
    A start pipeline tool is required for the transformation to function. Before you start a data transformation, your workflow should contain a start pipeline tool, such as the Start Pipeline tool or the Assisted Modeling tool.
  2. In Transformer, select the transformation type you want to configure.
  3. Configure the tool.

Configure the tool

Configure the parameters. Understand the parameters before changing them. As a best practice, avoid making assumptions, and use a test dataset to assess the performance of your model, whether or not your objective is prediction.

To find out more about a parameter, click the parameter's tooltip.

The tool suggests an imputation method for each column in the dataset that has rows with missing or null values. Replacing missing values with substitute values in this way is known as imputation.

For more information about how datasets with missing values are incompatible with scikit-learn, visit Imputation of missing values.

This step applies to numeric columns only. If there are categorical columns with missing values, Alteryx applies a "missing" category.
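The categorical behavior described above can be pictured with a minimal Python sketch. It is only an illustration, not Alteryx's implementation: the function name `fill_missing_category`, the use of `None` to represent a missing value, and the exact label text are all assumptions.

```python
def fill_missing_category(column, label="missing"):
    """Replace missing (None) entries in a categorical column with one label.

    The label text and the use of None for missing values are assumptions
    made for this illustration.
    """
    return [label if value is None else value for value in column]


# Hypothetical categorical column with two missing entries.
colors = ["red", None, "blue", None]
print(fill_missing_category(colors))
```

Treating "missing" as its own category keeps every row in the dataset and lets a model learn whether the absence of a value is itself informative.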

1. Review each column with null values

2. Select an imputation method

Accept the recommendation or select a different clean-up method from the drop-down list. Clean-up methods include the following:

  • Drop column - Alteryx will drop the column.
  • Impute with mean - Alteryx will replace missing values with the mean value of all values in the column. The mean is calculated as the sum of all non-missing values in the column divided by the number of non-missing values.
  • Impute with median - Alteryx will replace missing values with the median value of all values in the column. The median is calculated as the value that is halfway into the column if the values in the column were arranged from smallest to largest. If the column contains an even number of values, the median is calculated as the average or mean of the two middle values.
  • Impute with mode - Alteryx will replace missing values with the mode. The mode is calculated as the value that occurs most often. If no values repeat, then there is no mode.
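As a rough illustration of the three imputation methods in the list above, the following Python sketch computes each fill value from the non-missing entries of a column and substitutes it for the missing ones. It uses only the Python standard library and is not how Alteryx implements the tool. The helper names `column_mode` and `impute`, the method strings, and the use of `None` for a missing value are assumptions made for this example.

```python
from collections import Counter
from statistics import mean, median


def column_mode(values):
    """Return the most frequent value, or None when no value repeats."""
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    return value if count > 1 else None


def impute(column, method):
    """Replace None entries in a numeric column using the named method."""
    # Only the observed (non-missing) values take part in the calculation.
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    if method == "mean":
        fill = mean(observed)
    elif method == "median":
        fill = median(observed)
    elif method == "mode":
        # If no value repeats there is no mode, so fill stays None
        # and the missing entries are left as they are.
        fill = column_mode(observed)
    else:
        raise ValueError(f"unknown method: {method}")
    return [fill if v is None else v for v in column]


# Hypothetical column: observed values are 2, 2, 3, 13.
scores = [2, None, 2, 3, 13, None]
print(impute(scores, "mean"))    # fill value: 20 / 4 = 5
print(impute(scores, "median"))  # fill value: (2 + 3) / 2 = 2.5
print(impute(scores, "mode"))    # fill value: 2
```

The sample column shows why the choice matters: the single large value 13 pulls the mean up to 5, while the median and mode stay close to the typical values.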

Run the workflow to apply the configuration.
