Overview of Orchestration

Note

This feature may not be available in all product editions. For more information on available features, see Compare Editions.

Orchestration is a set of features that supports the scheduled execution of task sequences in the Dataprep by Trifacta platform. Tasks can be external processes, data transformation jobs, HTTP requests, and more.

In the following sections, you can review short summaries of specific features and explore more detailed information on them.

Terms

  • plan: A sequence of tasks that are executed from the platform or on assets to which you have access. To orchestrate tasks, you build a plan. A plan can be scheduled for execution, triggered manually, or invoked via API.

  • trigger: A condition under which a task is executed. In many cases, the trigger for a task is based on the schedule for the plan.

  • task: A unit of execution in the platform.

  • snapshot: The state of the plan captured when the plan is triggered. The plan is executed against this snapshot. For more information on snapshots, see "Plan Execution" below.

Task Types

The following types of tasks are available.

  • HTTP task: A request submitted to a third-party server as part of a plan run.

  • Slack task: Send a message with information about the plan run to a specified Slack channel.

  • Delete task: Delete files and folders from backend data storage.

  • Flow task: An ad-hoc or scheduled execution of the transformations required to produce one or more selected outputs from a flow.
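As a rough mental model, a plan can be pictured as an ordered list of typed tasks. The sketch below is illustrative only: `TaskType`, `Task`, and `Plan` are hypothetical names chosen for this example and are not part of the product or its API.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskType(Enum):
    """The four task types listed above (hypothetical enum, not a product API)."""
    HTTP = "http"
    SLACK = "slack"
    DELETE = "delete"
    FLOW = "flow"


@dataclass
class Task:
    name: str
    task_type: TaskType


@dataclass
class Plan:
    name: str
    # Tasks are kept in the order in which they should be executed.
    tasks: list[Task] = field(default_factory=list)

    def add_task(self, task: Task) -> None:
        self.tasks.append(task)


plan = Plan("nightly-refresh")
plan.add_task(Task("run sales flow", TaskType.FLOW))
plan.add_task(Task("notify channel", TaskType.SLACK))
plan.add_task(Task("clean temp files", TaskType.DELETE))
print([task.task_type.value for task in plan.tasks])
```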

Limitations

  • You cannot define parameter overrides at the plan level.

    • Plans inherit parameter values from the objects referenced in the plan's tasks.

    • If overrides are applied to parameters in assets that are part of a plan, those overrides are passed to the plan at the time of task execution.

Basic Task

You create and schedule a plan using the following basic steps.

  1. Create the plan. A plan is the container for the definitions of tasks, triggers, and other objects. See Plans Page.

  2. In Plan View, you specify the objects that are part of your plan. See Plan View Page.

    1. Schedule: The schedule defines the set of triggers that queue the plan for execution.

      1. Trigger: A trigger defines the schedule and frequency at which the plan is executed. A plan can have multiple triggers (for example, one for monthly executions and one for weekly executions).

    2. Task(s): Next, you specify the tasks that are executed in order.

  3. As needed, you can apply override values to any flow parameters. These overrides are applied during a plan run. For more information, see Manage Parameters Dialog for Plans.

  4. To test:

    1. Click Run now.

    2. To track progress, click the Runs link.

    3. In the Run Details page, you can track the progress.

    4. The first task runs to completion before the second task is started.

    5. Individual tasks are executed as separate jobs, which you can track through the Job History page. See Job History Page.

    6. When the plan has completed, you can verify the results through the Job details page. See Job Details Page.

  5. If you are satisfied with the plan definition and your test run, no further action is required. The plan executes according to its scheduled triggers.
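The run behavior described in the test steps, where each task completes before the next one starts, can be sketched as a simple loop. This is a minimal illustration, not the platform's implementation: `run_plan`, the status strings, and the choice to stop after a failed task are all assumptions made for this example.

```python
from typing import Callable

# A task is modeled as a name plus a callable that returns True on success.
TaskDef = tuple[str, Callable[[], bool]]


def run_plan(tasks: list[TaskDef]) -> list[tuple[str, str]]:
    """Run tasks one at a time, in the order listed."""
    results = []
    for name, action in tasks:
        succeeded = action()  # returns only when the task has finished
        results.append((name, "Complete" if succeeded else "Failed"))
        if not succeeded:
            # Assumption for this sketch: a failed task ends the run.
            break
    return results


run = run_plan([
    ("run sales flow", lambda: True),
    ("notify channel", lambda: False),
    ("clean temp files", lambda: True),
])
print(run)
```

In this sketch the third task never starts, because the second one reports a failure.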

Plan Scheduling

Through the Plan View page, you can configure the scheduled executions of the plan. Plan schedules are defined using triggers.

  • These schedules are independent of schedules for other asset types.

  • You cannot create schedules for individual tasks.
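To illustrate how several triggers can combine into one plan schedule, the sketch below treats a plan as due on a given date if any of its triggers fires on that date. The `Trigger` class, its `frequency` and `day` fields, and `plan_is_due` are hypothetical and do not reflect how the platform stores or evaluates schedules.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Trigger:
    frequency: str  # "weekly" or "monthly" (hypothetical values)
    day: int        # weekly: 0=Monday .. 6=Sunday; monthly: day of the month

    def fires_on(self, d: date) -> bool:
        if self.frequency == "weekly":
            return d.weekday() == self.day
        if self.frequency == "monthly":
            return d.day == self.day
        raise ValueError(f"unknown frequency: {self.frequency}")


def plan_is_due(triggers: list[Trigger], d: date) -> bool:
    """A plan is queued for execution if any one of its triggers fires."""
    return any(trigger.fires_on(d) for trigger in triggers)


# One plan with two triggers: every Monday, and the first day of each month.
triggers = [Trigger("weekly", 0), Trigger("monthly", 1)]
print(plan_is_due(triggers, date(2024, 1, 8)))  # a Monday
print(plan_is_due(triggers, date(2024, 2, 1)))  # first of the month
print(plan_is_due(triggers, date(2024, 1, 9)))  # neither
```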

Plan Execution

When a plan is triggered for execution, a snapshot of the plan is taken. This snapshot is used to execute the plan. Tasks are executed in the sequence listed in Plan View.

Important notes:

Note

Any subsequent changes to the flows, datasets, recipes, and outputs referenced in the plan's tasks can affect subsequent executions of the plan. For example, subsequent removal of a dataset in a flow referenced in a task can cause the task to fail to execute properly.
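The distinction between the plan snapshot and the objects that the plan references can be sketched as follows. This is a simplified model under stated assumptions: the snapshot is represented as a deep copy of the plan definition, flows live in a separate registry and are referenced by identifier, and all names (`flows`, `take_snapshot`, `execute`) are hypothetical.

```python
import copy

# Hypothetical registry: flow id -> names of the datasets in that flow.
flows = {"flow-1": {"orders"}, "flow-2": {"customers"}}

# The plan definition only references flows by id.
plan = {"name": "nightly", "tasks": [{"flow_id": "flow-1", "dataset": "orders"}]}


def take_snapshot(plan: dict) -> dict:
    """Assumption: a snapshot is a copy of the plan definition at trigger time."""
    return copy.deepcopy(plan)


def execute(snapshot: dict, flows: dict) -> list[str]:
    """Run each task in the snapshot against the current state of the flows."""
    statuses = []
    for task in snapshot["tasks"]:
        datasets = flows.get(task["flow_id"], set())
        statuses.append("Complete" if task["dataset"] in datasets else "Failed")
    return statuses


snapshot = take_snapshot(plan)

# Editing the plan after it was triggered does not change the snapshot.
plan["tasks"].append({"flow_id": "flow-2", "dataset": "customers"})
first_run = execute(snapshot, flows)

# Removing a dataset from a referenced flow affects later executions.
flows["flow-1"].discard("orders")
later_run = execute(take_snapshot(plan), flows)

print(first_run, later_run)
```

In this model the first run sees only the task that existed when the snapshot was taken, while the later run fails on the task whose dataset was removed.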

At the flow level, you can define webhooks and email notifications that are triggered based on the successful generation of outputs. When you execute a plan that generates an output configured with one of these messages, the message is triggered and delivered to stakeholders.

Note

Webhook messages and email notifications cannot be directly triggered based on a plan's execution. However, you can create HTTP-based tasks to send messages based on a plan task's execution.
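As an illustration of the kind of message an HTTP-based task might submit to a third-party server, the sketch below builds a JSON POST request with Python's standard library. It does not show how an HTTP task is configured in the product. The endpoint URL, the payload fields, and `build_notification_request` are assumptions made for this example, and the request is constructed but not sent.

```python
import json
import urllib.request


def build_notification_request(
    url: str, plan_name: str, task_name: str, status: str
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST describing a task's outcome."""
    # Hypothetical payload shape; a real receiver defines its own schema.
    payload = {"plan": plan_name, "task": task_name, "status": status}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_notification_request(
    "https://example.com/hooks/plan-status",  # placeholder endpoint
    "nightly-refresh",
    "run sales flow",
    "Complete",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it. That call is left out here so that the example runs without network access.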

Tip

When a flow email notification is triggered through a plan, the internal identifier for the plan is included in the email.

See "Webhooks" and "Email notifications" above.

Enable

Enable the following setting:

Plans feature

Plan sharing, import, and export must also be enabled. For more information, see Dataprep Project Settings Page.

Logging

For more information on debugging plans, see Diagnose Failed Plan Runs.