Skip to main content

Delete Data

A key task in cleaning up your data is to remove unwanted columns and rows, which can simplify future transformations and improve job execution performance. Designer Cloud provides multiple mechanisms for removing data from your dataset.

Tip

When you are deleting data, you should consider if that data may have other uses in the future or for other users. If so, you should consider doing the data removal through a separate recipe off of your current recipe, which preserves the data for other uses in the current recipe.

Delete Columns

You can delete one or more columns based on the following:

  • By selection

  • Through transformation

    Tip

    When you delete through transformation steps, you have additional controls at your disposal.

By selection

You can delete a single column or multiple columns:

  • To delete a column from your dataset, click the column and select Delete from the column drop-down.

  • If you select Delete others, all other remaining columns are deleted except the selected column.

Tip

To delete multiple columns, select them in the data grid or column browser. Then selectDeletefrom the column menu.

The column or columns are removed from the data grid, and a new step is added to your recipe.

Through transformation

You can delete columns through the transformation steps.

Steps:

  1. In the Transformer page, click Delete columns.

  2. The Delete columns transformation is populated in the Transformer Builder.

  3. Select one or more columns, as required:

    1. Multiple: Select one or more columns from the drop-down list.

    2. All: Select all columns in the dataset.

      Note

      This step removes all columns in your dataset.

    3. Range: Specify a start and ending columns. All columns inclusive of start and end are deleted.

    4. Advanced: Specify the columns using a comma-separated list. Ranges of columns can be specified using the tilde (~) character. Examples:

      Entry

      Description

      Store_Nbr~Daily

      Columns from Store_Nbr to Daily in the dataset are deleted.

      Store_Name,Store_Manager,Store_Nbr~Daily

      The following columns are deleted: Store_NameStore_ManagerStore_Nbr to Daily

  4. From the Action area, select one of the following options:

    1. Delete selected columns: Deletes only the selected columns.

    2. Delete unselected columns: Deletes all other remaining columns except the selected columns.

  5. To delete columns, click Add.

Example transformation:

The following transformation deletes the columns between Store_Nbr and Daily, inclusive.

Transformation Name

Delete columns

Parameter: Columns

Advanced

Parameter: Column

Store_Nbr~Daily

Parameter: Action

Delete selected columns

Delete Rows

Since rows do not have an identifying header, you must identify the rows to remove in your dataset based on a specified condition. You can delete rows based on the following:

  • By selection

  • By custom conditions

By selection

You can delete rows by selecting values. You are prompted for data filtering suggestions when you select values in:

  • column histograms

  • column data quality bars

  • cells or values within a cell

When you make a selection, select the Delete rowstransformation in the context panel. The Transform Builder contains a transformation to filter rows based on the the condition that you have selected. For example, if you selected the value California in the State column, then the transformation is specified to filter out rows in which State=California.

In the Transform Builder, you must decide if the transformation keeps matching rows (deleting all others) or deletes matching rows. In the following example, rows in which State=California are selected for deletion:

Transformation Name

Filter rows

Parameter: Condition

Custom formula

Parameter: Type of formula

Custom single

Parameter: Condition

State == "California"

Parameter: Action

Delete matching rows

By custom conditions

You can delete a set of rows based on a condition specified in the condition column. If the conditional expression is true, then the selected rows are deleted.

  1. In the Transformer page, click the Recipe icon. The Recipe panel is displayed.

  2. In the Search Transformations panel, enter Filter in.

  3. In the Filter rows transformation, enter the required details:

    1. Condition: Filter based on the condition type that you select in the drop-down. Some condition types do not support specifying the condition by formula.

    2. Column: The column containing the values to filter. For example, action_count.

    3. Values or Formula: Specify the values or the formula used to determine the condition.

      1. If these values are present, then the condition evaluates to true.

      2. The formula must evaluate to true or false.

    4. Action: The action to be performed to the rows based on the specified conditions.

    5. In the following example, the rows where the action_count column values fall between 1 and 10 are deleted:

      Transformation Name

      Filter rows

      Parameter: Condition

      Custom formula

      Parameter: Type of formula

      Custom single

      Parameter: Condition

      (action_count >= 1) && (action_count <= 10)

      Parameter: Action

      Delete matching rows

      Tip

      You can apply logical operators such as && (logical AND) above to build more sophisticated logical tests.

  4. To add the recipe to the step, click Add. The dataset rows are filtered based on the configured transformation.