Configure Running Environments
This section provides an overview of how to configure the running environments that are accessible from your deployment of the Trifacta Application.
A running environment is the set of services that are used to execute a job.
A job can include tasks to do the following:
Ingest data
Transform data
Profile data
Sample data
Generate results
A running environment can be hosted on the Trifacta node or across a cluster that is connected to the product.
Trifacta Photon
Hosted on the Trifacta node, Trifacta Photon is an in-memory running environment designed for high performance on small- to medium-sized jobs.
Note
This feature may not be available in all product editions. For more information on available features, see Compare Editions.
Configuration:
Trifacta Photon may need to be enabled in your project or workspace.
For more information, see Workspace Settings Page.
EMR
Amazon Elastic MapReduce (EMR) is a managed-cluster data platform for processing large volumes of data from disparate sources. This scalable platform is used for running jobs and can handle data processing tasks of any size.
Configuration:
EMR is enabled by default and requires no additional configuration.
If you are accessing AWS resources using IAM roles, those roles must contain policies to run jobs on EMR. For more information, see Required AWS Account Permissions.
Snowflake
Note
This feature may not be available in all product editions. For more information on available features, see Compare Editions.
Snowflake provides cloud-based data storage and analytics as a service. If all of your source datasets and outputs are in Snowflake locations and other conditions are met, then the entire execution of the transformations can occur in Snowflake. For more information, see https://www.snowflake.com.
For datasets and outputs that are hosted in Snowflake, you can configure the Trifacta Application to perform the transformation steps of your job in Snowflake. Because no data needs to be transferred to or from the data warehouse, performance should be significantly better.
Tip
Jobs must be enabled for execution in Snowflake for each flow. For more information, see Flow Optimization Settings Dialog.
Limitations:
Note
Snowflake is not a running environment that you explicitly select or specify as part of a job. If all of the requirements are met, then the job is executed in Snowflake. For more information on limitations, see Overview of Job Execution.
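The eligibility rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not product code: the JobSpec class, its field names, and the runs_in_snowflake function are hypothetical, and the other_conditions_met flag stands in for the additional requirements listed in Overview of Job Execution.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobSpec:
    """Hypothetical description of a job; not a Trifacta API object."""
    source_locations: List[str]      # where each source dataset lives, e.g. "snowflake"
    output_locations: List[str]      # where each output is written
    flow_snowflake_enabled: bool     # per-flow setting (Flow Optimization Settings Dialog)
    other_conditions_met: bool = True  # placeholder for the remaining documented requirements


def runs_in_snowflake(job: JobSpec) -> bool:
    """Return True only when every requirement for Snowflake execution holds."""
    locations = job.source_locations + job.output_locations
    # All sources and outputs must be in Snowflake; an empty job never qualifies.
    all_in_snowflake = bool(locations) and all(loc == "snowflake" for loc in locations)
    return all_in_snowflake and job.flow_snowflake_enabled and job.other_conditions_met


if __name__ == "__main__":
    eligible = JobSpec(["snowflake"], ["snowflake"], flow_snowflake_enabled=True)
    mixed = JobSpec(["snowflake", "s3"], ["snowflake"], flow_snowflake_enabled=True)
    print(runs_in_snowflake(eligible))
    print(runs_in_snowflake(mixed))
```

In this sketch, a single source or output outside Snowflake, or a flow without the setting enabled, is enough to make the job ineligible. The user never chooses Snowflake directly; the outcome follows from the conditions.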
Configuration:
For more information, see Snowflake Running Environment.