Change Dataset Dialog

In Flow View, you can change the source that is used for your dataset. This lets you apply the same recipe across datasets that share the same schema. When the source dataset is changed, a new sample is automatically generated for you.

For example, suppose you build your recipe on a week's worth of sales data, which is sourced from an imported dataset based on a CSV file called Week01-Sales.csv. When the next week's source data is dropped in the appropriate directory, you can:

  1. Import the new dataset,

  2. Edit the recipe,

  3. Change the source to the new file, and

  4. Execute a job immediately to process the new week of data.

Note

A dataset source can be an imported dataset, a reference dataset, or a recipe. Subsequent changes to the source data affect your dataset in development.

Notes and Limitations:

  • If the schema of the new source differs from the schema of the original source, your recipe is likely to break when the new source is selected.

  • You can swap your original source dataset with an imported dataset, reference dataset, or a recipe. If needed, you can swap back to the original source at any time.

  • If you have enabled relational connections, swapping relational sources may not work if they are from different database vendors.

  • Data-dependent transforms, such as header and valuestocols, use the data that was present in the sample at the time that they were added to the recipe. As a result, applying the recipe to another source can cause unexpected changes or breakages.

  • You cannot undo or redo source swaps.
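Because a schema mismatch is the most common cause of a broken recipe after a swap, it can help to compare column headers before you change the source. The following is a minimal sketch in Python, run outside the product, and is not part of the application. The function names (read_header, schema_diff) and the sample sales columns are hypothetical and chosen only for illustration.

```python
import csv
import io


def read_header(csv_text):
    """Return the column names from the first row of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


def schema_diff(old_header, new_header):
    """Report columns missing from, added to, or reordered in the new source."""
    old_set, new_set = set(old_header), set(new_header)
    missing = [c for c in old_header if c not in new_set]
    added = [c for c in new_header if c not in old_set]
    # Same columns in a different order can still break position-based steps.
    reordered = not missing and not added and old_header != new_header
    return {"missing": missing, "added": added, "reordered": reordered}


# Hypothetical contents standing in for Week01-Sales.csv and the next week's file.
week01 = "date,store_id,sku,units,revenue\n2024-01-01,12,A100,3,29.97\n"
week02 = "date,store_id,sku,revenue,discount\n2024-01-08,12,A100,19.98,0.10\n"

diff = schema_diff(read_header(week01), read_header(week02))
print(diff)
# {'missing': ['units'], 'added': ['discount'], 'reordered': False}
```

If the check reports missing or added columns, recipe steps that reference those columns are the ones most likely to need attention after the swap. A header comparison does not catch differences in data types or values, so data-dependent transforms may still behave differently on the new source.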

Steps:

  1. To change a data source, open the flow containing it.

  2. In Flow View, do one of the following:

    1. Click the imported dataset icon. Then, click Replace.

      Note

      This action removes the imported dataset and all links (edges) coming out of it. The replacement must be reconnected with any downstream objects.

    2. Click the recipe icon. Then, click Change input.

      Note

      This action substitutes only the primary input of the recipe. It does not affect any datasets that are integrated through joins, unions, lookups, or other multi-dataset options.

  3. Select the new source:

    Note

    You can select data from any flow to which you have access. Subsequent changes to the selected source are inherited by your dataset.

    Figure: Change Dataset Dialog

    1. If replacing an imported dataset, you can import new data as the replacement. Click Import Datasets. For more information, see Import Data Page.

  4. Click Replace or Change.

  5. Your dataset now uses the selected dataset as its source, and the current recipe in the Transformer page is applied to the new source.