Dataflow Running Environment
Integrated into Google Cloud Platform, Dataflow is a scalable service for processing data pipelines. When a job is specified, the Trifacta Application manages the process of queuing and executing the job on Dataflow. Jobs are executed using a default Compute Engine service account, although in some product editions they can be executed using a user-specified service account.
Tip
Dataflow is suitable for larger jobs.
Tip
In the Run Job page, select Dataflow to run the job on this running environment.
Note
Dataflow integration is enabled by default. The default IAM permissions enable job execution on Dataflow. For more information, see Required Dataprep User Permissions.
For more information, see https://cloud.google.com/dataflow.
Limitations
Note
When jobs are run on Dataflow, filenames and paths cannot contain the at-sign (@), which is reserved in Dataflow.
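As a minimal sketch of how this limitation might be checked before a job is submitted, the following Python snippet flags any path that contains the at-sign. The function name `find_at_sign_paths` and the example bucket paths are hypothetical and are not part of the product; the snippet only performs a plain string check and does not call Dataflow or the Trifacta Application.

```python
def find_at_sign_paths(paths):
    """Return the subset of paths that contain the at-sign (@).

    Hypothetical helper for illustration only: it performs a simple
    substring check and does not contact Dataflow.
    """
    return [path for path in paths if "@" in path]


# Example paths are assumptions made for this illustration.
candidate_paths = [
    "gs://example-bucket/input/orders.csv",
    "gs://example-bucket/input/orders@2024.csv",
]

flagged = find_at_sign_paths(candidate_paths)
for path in flagged:
    print(f"Rename before running on Dataflow: {path}")
```

Renaming or moving any flagged file so that neither its name nor its path contains the at-sign should avoid the conflict described above.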