Reshaping Steps

You can reshape the row and column structure of your data through a variety of transformations.

In the Transformer page:

  • You can create, modify, and delete columns to narrow your data to the most meaningful information.

  • You can reshape your data through pivots and aggregations.

  • Nested data in the form of Arrays or Objects (key-value pairs) can be un-nested across columns and rows for easier manipulation. As needed, patterned data can be re-nested through transformations that are easy to select and manipulate.

Tip

When reshaping your data from its original form, you may find it useful to build your pivots and aggregations as separate recipes created from your current recipe. This approach preserves the original structure and lets you explore more significant transformations as needed.

Transformations to Reshape Your Dataset

Some recipe steps change the number of rows in the dataset, which has wider effects on the dataset and its samples.

These reshaping steps include the following transformations:

| Transformation | Documentation |
| --- | --- |
| Splitrows | Initial Parsing Steps |
| Expand Arrays into Rows | Working with Arrays |
| Filter Rows (keep or delete) | Remove Data |
| Pivot Table | Pivot Data |
| Unpivot Columns | Unpivot Columns |
| Join Datasets | Join Window |
| Union Datasets | Union Page |
| Select Lookup from the column menu | Lookup Wizard |
| Remove Duplicate Rows | Remove Data |
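To make the effect on row counts concrete, the following is a minimal plain-Python sketch, not the product's implementation. The helper names (`remove_duplicates`, `unpivot`), the column names, and the data are all hypothetical and chosen only for illustration.

```python
# Hypothetical data: one row per store, one column per month.
rows = [
    {"store": "A", "jan": 10, "feb": 12},
    {"store": "B", "jan": 0, "feb": 7},
    {"store": "B", "jan": 0, "feb": 7},  # exact duplicate of the previous row
]

def remove_duplicates(rows):
    """Keep the first occurrence of each identical row."""
    seen, out = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

def unpivot(rows, id_col, value_cols):
    """Turn each listed column into its own key/value row."""
    return [
        {id_col: r[id_col], "key": c, "value": r[c]}
        for r in rows
        for c in value_cols
    ]

deduped = remove_duplicates(rows)                      # 3 rows become 2
kept = [r for r in deduped if r["jan"] > 0]            # filter: 2 rows become 1
long_form = unpivot(deduped, "store", ["jan", "feb"])  # 2 rows become 4
```

Removing duplicates and filtering shrink the row count, while unpivoting grows it. Any of these changes can affect how well an earlier sample represents the data.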

Samples and reshaping your datasets

When one of these transformations is applied and rows are removed from your dataset:

  • Any samples generated before the step was added are invalidated and cannot be used.

  • If you edit steps in your recipe before this added transformation, any samples that you generated after the step are invalidated and cannot be used.

  • A valid initial sample is always available for use.

For more information, see Samples Panel.

Build Pivot Tables

You can reshape your data by building pivot tables. Pivot tables are useful when you want to calculate aggregation functions, such as sums, maximums, and averages, for one or more columns of data.

In the following example, the data is reshaped to include the sum of POS_Sales for each distinct value in the Daily column across the values in the Sales_Description column:

Figure: Reshape your data using pivot tables

For more information, see Pivot Data.
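The pivot described above can be approximated in plain Python. This is a sketch only: it assumes that Daily values become rows and Sales_Description values become columns, the sample values are invented, and `pivot_sum` is a hypothetical helper rather than a product function. Filling missing combinations with 0.0 is also a simplification.

```python
from collections import defaultdict

# Column names come from the example above; the values are invented.
rows = [
    {"Daily": "2024-01-01", "Sales_Description": "Promo", "POS_Sales": 10.0},
    {"Daily": "2024-01-01", "Sales_Description": "Regular", "POS_Sales": 5.0},
    {"Daily": "2024-01-01", "Sales_Description": "Promo", "POS_Sales": 2.5},
    {"Daily": "2024-01-02", "Sales_Description": "Regular", "POS_Sales": 7.0},
]

def pivot_sum(rows, row_key, col_key, value_key):
    """Sum value_key per (row_key, col_key) pair; one output row per row_key value."""
    totals = defaultdict(lambda: defaultdict(float))
    for r in rows:
        totals[r[row_key]][r[col_key]] += r[value_key]
    columns = sorted({r[col_key] for r in rows})
    return [
        # Assumption: combinations with no source rows are filled with 0.0.
        {row_key: key, **{c: totals[key].get(c, 0.0) for c in columns}}
        for key in sorted(totals)
    ]

pivoted = pivot_sum(rows, "Daily", "Sales_Description", "POS_Sales")
```

Four input rows collapse into two output rows, one per distinct Daily value, with one column for each Sales_Description value.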

Create Aggregations

An aggregation is a computation across a grouped set of rows. Designer Cloud Powered by Trifacta Enterprise Edition provides a wide range of aggregation functions that you can apply:

  • To an entire column (called a flat aggregation)

  • To generate a new column

  • To reshape your entire table

For more information, see Create Aggregations.
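The three uses listed above can be sketched in plain Python. This illustrates the concepts and is not the product's behavior: the data is invented, and the output column name `sum_POS_Sales` is a hypothetical choice.

```python
from collections import defaultdict

rows = [
    {"Whse_Name": "North", "POS_Sales": 10.0},
    {"Whse_Name": "North", "POS_Sales": 5.0},
    {"Whse_Name": "South", "POS_Sales": 7.0},
]

# Flat aggregation: a single value computed over the entire column.
flat_total = sum(r["POS_Sales"] for r in rows)

# Grouped totals: one sum per distinct Whse_Name.
group_totals = defaultdict(float)
for r in rows:
    group_totals[r["Whse_Name"]] += r["POS_Sales"]

# New column: every row is kept and receives its group's total.
with_total = [{**r, "sum_POS_Sales": group_totals[r["Whse_Name"]]} for r in rows]

# Reshape: the table is replaced by one row per group.
reshaped = [
    {"Whse_Name": name, "sum_POS_Sales": total}
    for name, total in sorted(group_totals.items())
]
```

Generating a new column preserves the row count. Reshaping reduces the table to one row per group, which is why it counts as a reshaping step.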

Nest and Unnest

You can combine data in separate columns into single-column values stored in Arrays or Objects (maps). Similarly, data from an Array or Object column can be converted into new rows or columns based on the keys in the source data.
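As a rough illustration of nesting and unnesting, the sketch below uses plain Python dictionaries and lists to stand in for Object and Array columns. The function names, column names, and data are hypothetical and do not correspond to product transformations.

```python
rows = [
    {"id": 1, "color": "red", "size": "M"},
    {"id": 2, "color": "blue", "size": "L"},
]

def nest_to_object(row, columns, target):
    """Combine the listed columns into one Object (dict) column named target."""
    out = {k: v for k, v in row.items() if k not in columns}
    out[target] = {c: row[c] for c in columns}
    return out

def unnest_to_columns(row, source):
    """Spread the keys of an Object column back out into separate columns."""
    out = {k: v for k, v in row.items() if k != source}
    out.update(row[source])
    return out

def expand_array_to_rows(row, source):
    """Produce one output row per element of an Array column."""
    return [{**row, source: item} for item in row[source]]

nested = [nest_to_object(r, ["color", "size"], "attrs") for r in rows]
flat_again = [unnest_to_columns(r, "attrs") for r in nested]
expanded = expand_array_to_rows({"id": 1, "tags": ["a", "b"]}, "tags")
```

Nesting and unnesting to columns leave the row count unchanged, whereas expanding an Array into rows multiplies it by the number of elements.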

Select Columns

You can select a set of columns to replace the current dataset completely. See Select.

Delete Columns

You can reshape your data by deleting unwanted columns in the dataset. You can delete a single column or multiple columns.

  • To delete a column from your dataset, click the required column and select Delete from the column drop-down.

  • If you select Delete others, all columns except the selected column are deleted.

Tip

To delete multiple columns, select them in the data grid or column browser. Then select Delete from the column menu.

Figure: Reshape your data using Delete columns

These menu choices are converted into recipe steps that use the Delete columns transformation.

| Transformation Name | Delete columns |
| --- | --- |
| Parameter: Columns | Multiple |
| Parameter: Columns | Whse_Name |
| Parameter: Action | Delete selected columns |

Tip

When using the Delete columns transformation, you can use the tilde (~) character between the start and end column names to delete a range of columns.

See Delete Data.
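The delete options above, including the tilde range, can be sketched as follows. This is an illustration under stated assumptions: the range is treated as inclusive and as following the left-to-right column order, which the text above does not confirm. The helper names and all column names other than Whse_Name are hypothetical.

```python
def expand_range(columns, spec):
    """Resolve 'start~end' to the run of column names between them.

    Assumption: the range is inclusive and follows dataset column order.
    """
    if "~" not in spec:
        return [spec]
    start, end = spec.split("~", 1)
    i, j = columns.index(start), columns.index(end)
    if i > j:
        i, j = j, i
    return columns[i:j + 1]

def delete_columns(rows, columns, spec):
    """Drop a single column or a 'start~end' range from every row."""
    doomed = set(expand_range(columns, spec))
    return [{k: v for k, v in r.items() if k not in doomed} for r in rows]

def delete_others(rows, keep):
    """Keep only the named columns, mirroring the Delete others menu choice."""
    return [{k: v for k, v in r.items() if k in keep} for r in rows]

columns = ["Whse_Name", "Region", "Daily", "POS_Sales"]
rows = [{"Whse_Name": "North", "Region": "EU", "Daily": "2024-01-01", "POS_Sales": 10.0}]

trimmed = delete_columns(rows, columns, "Region~Daily")
only_name = delete_others(rows, {"Whse_Name"})
```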

Split Columns

You can split a column based on one or more known delimiters or based on index positions in the data. See Split Column.
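The two splitting approaches can be sketched in plain Python. The function names and sample values are hypothetical, and the product's own handling of edge cases, such as consecutive delimiters, may differ.

```python
import re

def split_on_delimiters(value, delimiters):
    """Split a string wherever any of the known delimiters occurs."""
    pattern = "|".join(re.escape(d) for d in delimiters)
    return re.split(pattern, value)

def split_at_positions(value, positions):
    """Split a string at fixed character index positions."""
    bounds = [0, *sorted(positions), len(value)]
    return [value[a:b] for a, b in zip(bounds, bounds[1:])]

by_delimiter = split_on_delimiters("2024-01-15 08:30", ["-", " ", ":"])
by_position = split_at_positions("AB1234XY", [2, 6])
```

Each element of the resulting list would correspond to one new column in the dataset.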