Dataflow Running Environment

Integrated into Google Cloud Platform, Dataflow is a scalable service for processing data pipelines. When a job is specified, the Trifacta Application manages the process of queuing and executing the job on Dataflow. Jobs are executed using a default Compute Engine service account, although in some product editions they can be executed using a user-specified service account.

Tip

Dataflow is suitable for larger jobs.

Tip

In the Run Job page, select Dataflow to run the job on this running environment.

Note

Dataflow integration is enabled by default. The default IAM permissions enable job execution on Dataflow. For more information, see Required Dataprep User Permissions.

For more information, see https://cloud.google.com/dataflow.

Limitations

Note

When jobs are run on Dataflow, filenames and paths cannot contain the at-sign (@), which is reserved in Dataflow.
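
As a minimal sketch of how this limitation could be checked before running a job, the following Python snippet flags any path that contains the at-sign. The function name `find_invalid_paths` and the example bucket paths are hypothetical and are not part of the product; the check simply tests for the reserved character.

```python
def find_invalid_paths(paths):
    """Return the paths that contain the at-sign (@), which is reserved in Dataflow."""
    return [path for path in paths if "@" in path]


# Hypothetical example paths, for illustration only.
candidate_paths = [
    "gs://example-bucket/input/orders.csv",
    "gs://example-bucket/input/user@example.csv",
]

for path in find_invalid_paths(candidate_paths):
    print(f"Rename before running on Dataflow: {path}")
```

Renaming or moving any flagged file before selecting Dataflow in the Run Job page avoids the limitation.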