Transforms

Note

Transforms are a part of the underlying language, which is not directly accessible to users. This content is maintained for reference purposes only. For more information on the user-accessible equivalent to transforms, see Transformation Reference.

In Wrangle, a transform is an action applied to your dataset. Each step of your recipe corresponds to a fully specified transform.

Tip

To see transforms by category, click the sort buttons in the Category header in the online documentation.

Transform	Category	Description
Case Transform	Other	Performif/then/else or case logic on the rows in your dataset.
Comment Transform	Other	Inserts a non-functional comment as a recipe step.
Countpattern Transform	Search and Replace	Counts the number of instances of a specified pattern in a column and writes that value into a newly generated column. Source column is unchanged.
Deduplicate Transform	Manage Rows	Removes exact duplicate rows from your dataset. Duplicate rows are identified by exact matches between values. For example, two strings with different capitalization do not match.
Delete Transform	Manage Rows	Deletes a set of rows in your dataset, based on a condition specified in the `row` expression. If the conditional expression is `true`, then the row is deleted.
Derive Transform	Manage Columns	Generate a new column where the values are the output of the `value` expression. Expression can be calculated based on values specified in the `group` parameter. Output column can be named as needed.
Drop Transform	Manage Columns	Removes the specified column or columns permanently from your dataset.
Extract Transform	Search and Replace	Extracts a subset of data from one column and inserts it into a new column, based on a specified string or pattern. The source column in unmodified.
Extractkv Transform	Search and Replace	Extracts key-value pairs from a source column and writes them to a new column. Source column must be of String type, although the data can be formatted as other data types.
Extractlist Transform	Search and Replace	Extracts a set of values based on a specified pattern from a source column of any data type. The generated column contains an array of occurrences of the specified pattern. While the new column contains array data, the data type of the new column is sometimes inferred as String.
Filter Transform	Manage Rows	Keep or delete rows in your dataset based on a defined type of filter.
Flatten Transform	Nested Data	Unpacks array data into separate rows for each value.
Header Transform	Initial Parsing	Uses one row from the dataset sample as the header row for the table. Each value in this row becomes the name of the column in which it is located.
Keep Transform	Manage Rows	Retains a set of rows in your dataset, which are specified by the conditional in the `row` expression. All other rows are removed from the dataset.
Merge Transform	Manage Columns	Merges two or more columns in your dataset to create a new column of String type. Optionally, you can insert a delimiter between the merged values.
Move Transform	Manage Columns	Moves the specified column or columns before or after another column in your dataset.
Nest Transform	Nested Data	Creates an Object or Array of values using column names and their values as key-value pairs for one or more columns. Generated column type is determined by the `into` parameter.
Pivot Transform	Nested Data	The `pivot` transform can be used to aggregate or pivot your data into columns and aggregate the results. Reshape your dataset into summary information. When you aggregate data, calculations are performed on column values, which are then grouped and ordered based on specified parameters. When you pivot data, the values of a selected column become new columns in the dataset, each of which contains a summary calculation that you specify. This calculation can be based on all rows for totals across the dataset or based on group of rows you define in the transform.
Rename Transform	Manage Columns	Renames one or more columns to specified names or append or prepend column names with specific values.
Replace Transform	Search and Replace	Replaces values within the specified column or columns based on the string literal, pattern, or location within the cell value, as specified in the transform.
Set Transform	Search and Replace	Replaces all values in the specified column with the specified value, which can be a literal or an expression. You can specify an optional `row:` parameter, containing a conditional test to identify the rows where the replacement is to be made within the column.
Settype Transform	Manage Columns	Sets the data type of the specified column. This transform does not modify the source values. The data in the column is re-inferred against the specified data type, which can change the results of column profiling.
Split Transform	Initial Parsing	Splits the specified column into separate columns of data based on the delimiters in the transform. Delimiters can be specified in a number of methods described below.
Splitrows Transform	Initial Parsing	Splits a column of values into separate rows of data based on the specified delimiter. You can split rows only on String literal values. Pattern-based row splitting is not supported.
Sort Transform	Manage Rows	Sorts the dataset based on one or more columns in ascending or descending order. You can also sort based on the order of rows when the dataset was created.
Unnest Transform	Nested Data	Unpacks nested data from an Array or Object column to create new rows or columns based on the keys in the source data. This transform works differently on columns of Object or Array type.
Unpivot Transform	Nested Data	Reshapes the layout of data by merging one or more columns into key and value columns. Keys are the names of input columns, and the values are the cell values from the source columns. Rows of data are duplicated, once for each input column.
Valuestocols Transform	Manage Columns	For each unique value in a column, a separate column is created. For each row that contains the value in the source column, an indicator value is inserted in the new column. This value can be a literal value or the output of a function. If no indicator value is generated, a null value is written.
Window Transform	Aggregation	The `window` transform enables you to perform summations and calculations based on a rolling window of data relative to the current row. For example, you can compute the rolling average for a specified column for the current row value and the nine preceding rows. This transform is particularly useful for processing time or otherwise sequential data.

Transforms

Search results