After you have installed the Designer Cloud Powered by Trifacta platform software and databases in your Microsoft Azure infrastructure, complete the following steps to perform the basic integration between the Trifacta node and Azure resources, such as the backend storage layer and the running environment cluster.
Note
This section includes only basic configuration for required platform functions and integrations with Azure. Please use the links in this section to access additional details on these key features.
Tip
When you save changes from within the Designer Cloud Powered by Trifacta platform, your configuration is automatically validated, and the platform is automatically restarted.
These steps require admin access to your Azure deployment.
To create an Azure Active Directory (AAD) application, please complete the following steps in the Azure console.
Steps:
Create registered application:
In the Azure console, navigate to Azure Active Directory > App Registrations.
Create a New App. Name it trifacta.
Note
Retain the Application ID and Directory ID for configuration in the Designer Cloud Powered by Trifacta platform.
Create a client secret:
Navigate to Certificates & secrets.
Create a new Client secret.
Note
Retain the value of the Client secret for configuration in the Designer Cloud Powered by Trifacta platform.
Add API permissions:
Navigate to API Permissions.
Add Azure Key Vault with the user_impersonation permission.
For additional details, see Configure for Azure.
Please complete the following steps in the Azure portal to create a Key Vault and to associate it with the Alteryx registered application.
Note
A Key Vault is required for use with the Designer Cloud Powered by Trifacta platform.
Steps:
Log into the Azure portal.
Complete the form for creating a new Key Vault resource:
Name: Provide a reasonable name for the resource. Example: <clusterName>-<applicationName>-<group/organizationName>. Or, you can use trifacta.
Location: Pick the location used by the cluster.
For other fields, add appropriate information based on your enterprise's preferences.
To create the resource, click Create.
Note
Retain the DNS Name value for later use.
In the Azure portal, you must assign access policies so that the application principal of the Alteryx registered application can access the Key Vault.
Steps:
In the Azure portal, select the Key Vault you created. Then, select Access Policies.
In the Access Policies window, select the Alteryx registered application.
Click Add Access Policy.
Select the following secret permissions (at a minimum):
Get
Set
Delete
Recover
Select the Alteryx application principal.
Assign the policy you just created to that principal.
For additional details, see Configure Azure Key Vault.
In the Azure console, you must create or modify the backend datastore for use with the Designer Cloud Powered by Trifacta platform. Supported datastores:
Note
You should review the limitations for your selected datastore before configuring the platform to use it. After the base storage layer has been defined in the platform, it cannot be modified.
Datastore | Notes |
---|---|
ADLS Gen2 | Supported for use with Azure Databricks cluster only. See ADLS Gen2 Access. |
ADLS Gen1 | See ADLS Gen1 Access. |
WASB | Only the WASBS protocol is supported. See WASB Access. |
In the Azure console, you must create or modify the running environment where jobs are executed by the Designer Cloud Powered by Trifacta platform. Supported running environments:
Note
You should review the limitations for your selected running environment before configuring the platform to use it.
Running Environment | Notes |
---|---|
Azure Databricks | See Configure for Azure Databricks. |
Please complete the following sections as soon as you can access the Trifacta Application.
This section contains a set of configuration steps required to enable basic functionality in the Designer Cloud Powered by Trifacta platform, as well as the methods by which you can apply the configuration.
Please complete the following steps to configure the Designer Cloud Powered by Trifacta platform and to integrate it with Azure resources.
Please complete the following configuration steps in the Designer Cloud Powered by Trifacta platform.
Note
If you are integrating with Azure Databricks and are using Managed Identities for authentication, please skip this section. That configuration is covered in a later step.
Note
Except as noted, these configuration steps are required for all Azure installs. These values must be extracted from the Azure portal.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Azure registered application values:
"azure.applicationId": "<azure_application_id>",
"azure.directoryId": "<azure_directory_id>",
"azure.secret": "<azure_secret>",
Parameter | Description |
---|---|
azure.applicationId | Application ID for the Alteryx registered application that you created in the Azure console |
azure.directoryId | The directory ID for the Alteryx registered application |
azure.secret | The Secret value for the Alteryx registered application |
Configure Key Vault:
"azure.keyVaultUrl": "<url_of_key_vault>",
Parameter | Description |
---|---|
azure.keyVaultUrl | URL of the Azure Key Vault that you created in the Azure console |
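For reference, a completed set of these values might look like the following. All values shown here are illustrative placeholders (sample IDs and a sample vault name), not real credentials; use the values you retained from the Azure portal.
"azure.applicationId": "12345678-90ab-cdef-1234-567890abcdef",
"azure.directoryId": "abcdef12-3456-7890-abcd-ef1234567890",
"azure.secret": "<your_client_secret_value>",
"azure.keyVaultUrl": "https://trifacta-keyvault.vault.azure.net/",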
Save your changes and restart the platform.
For additional details, see Configure for Azure and Configure Azure Key Vault.
The Designer Cloud Powered by Trifacta platform supports integration with the following backend datastores on Azure.
ADLS Gen2
ADLS Gen1
WASB
Please complete the following configuration steps in the Designer Cloud Powered by Trifacta platform.
Note
Integration with ADLS Gen2 is supported only on Azure Databricks.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Enable ADLS Gen2 as the base storage layer:
"webapp.storageProtocol": "abfss",
"hdfs.enabled": false,
"hdfs.protocolOverride": "",
Parameter | Description |
---|---|
webapp.storageProtocol | Sets the base storage layer for the platform. Set this value to abfss. Note: After this parameter has been saved, you cannot modify it. You must re-install the platform to change it. |
hdfs.enabled | For ADLS Gen2 access, set this value to false. |
hdfs.protocolOverride | For ADLS Gen2 access, this special parameter should be empty. It is ignored when the storage protocol is set to abfss. |
Configure ADLS Gen2 access mode. The following parameter must be set to system:
"azure.adlsgen2.mode": "system",
Set the protocol whitelist and base URIs for ADLS Gen2:
"fileStorage.whitelist": ["abfss"], "fileStorage.defaultBaseUris": ["abfss://filesystem@storageaccount.dfs.core.windows.net/"],
Parameter
Description
fileStorage.whitelist
A comma-separated list of protocols that are permitted to read and write with ADLS Gen2 storage.
Note
The protocol identifier
"abfss"
must be included in this list.fileStorage.defaultBaseUris
For each supported protocol, this param must contain a top-level path to the location where Designer Cloud Powered by Trifacta platform files can be stored. These files include uploads, samples, and temporary storage used during job execution.
Note
A separate base URI is required for each supported protocol. You may only have one base URI for each protocol.
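Assembled together, a minimal ADLS Gen2 configuration might look like the following. The filesystem and storage account names (exampledata, examplestorage) are placeholders for illustration only; substitute your own values.
"webapp.storageProtocol": "abfss",
"hdfs.enabled": false,
"hdfs.protocolOverride": "",
"azure.adlsgen2.mode": "system",
"fileStorage.whitelist": ["abfss"],
"fileStorage.defaultBaseUris": ["abfss://exampledata@examplestorage.dfs.core.windows.net/"],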
Save your changes.
The Java VFS service must be enabled for ADLS Gen2 access. For more information, see Configure Java VFS Service in the Configuration Guide.
For additional details, see ADLS Gen2 Access.
ADLS Gen1 access leverages HDFS protocol and storage, so additional configuration is required.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Enable ADLS Gen1 as the base storage layer:
"webapp.storageProtocol": "adl",
"hdfs.enabled": false,
Parameter | Description |
---|---|
webapp.storageProtocol | Sets the base storage layer for the platform. Set this value to adl. Note: After this parameter has been saved, you cannot modify it. You must re-install the platform to change it. |
hdfs.enabled | For ADLS Gen1 storage, set this value to false. |
These parameters specify the base location and protocol for storage. Only one datastore can be specified:
"fileStorage": { "defaultBaseUris": [ "<baseURIOfYourLocation>" ], "whitelist": ["adl"] }
Parameter
Description
filestorage.defaultBaseURIs
Set this value to the base location for your ADLS Gen1 storage area. Example:
adl://<YOUR_STORE_NAME>.azuredatalakestore.net
whitelist
This list must include
adl
.Save your changes.
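Put together, an ADLS Gen1 configuration might look like the following; the store name examplestore is a placeholder for illustration only.
"webapp.storageProtocol": "adl",
"hdfs.enabled": false,
"fileStorage": {
  "defaultBaseUris": [ "adl://examplestore.azuredatalakestore.net" ],
  "whitelist": ["adl"]
}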
For additional details, see ADLS Gen1 Access.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Enable WASB as the base storage layer:
"webapp.storageProtocol": "wasbs",
"hdfs.enabled": false,
Parameter | Description |
---|---|
webapp.storageProtocol | Sets the base storage layer for the platform. Set this value to wasbs. Note: After this parameter has been saved, you cannot modify it. You must re-install the platform to change it. The wasb protocol is not supported. |
hdfs.enabled | For WASB blob storage, set this value to false. |
Save your changes.
In the following sections, you configure how the platform acquires the SAS token to use for WASB access, from one of the following sources:
From platform configuration
From the Azure key vault
When integrating with WASB, the platform must be configured to use a SAS token to gain access to WASB resources. This token can be made available in either of the following ways, each of which requires separate configuration.
Via Designer Cloud Powered by Trifacta platform configuration:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Locate and specify the following parameter:
"azure.wasb.fetchSasTokensFromKeyVault": false,
Parameter | Description |
---|---|
azure.wasb.fetchSasTokensFromKeyVault | For acquiring the SAS token from platform configuration, set this value to false. |
Save your changes and restart the platform.
Via Azure Key Vault:
To require the Designer Cloud Powered by Trifacta platform to acquire the SAS token from the Azure key vault, please complete the following configuration steps.
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Locate and specify the following parameter:
"azure.wasb.fetchSasTokensFromKeyVault": true,
Parameter | Description |
---|---|
azure.wasb.fetchSasTokensFromKeyVault | For acquiring the SAS token from the key vault, set this value to true. |
To apply this configuration change, log in as an administrator to the Trifacta node. Then, edit trifacta-conf.json. For more information, see Platform Configuration Methods.
Locate the azure.wasb.stores configuration block.
Apply the appropriate configuration as specified below.
Tip
The default container must be specified as the first set of elements in the array. All containers listed after the first one are treated as extra stores.
"azure.wasb.stores": [ { "sasToken": "<DEFAULT_VALUE1_HERE>", "keyVaultSasTokenSecretName": "<DEFAULT_VALUE1_HERE>", "container": "<DEFAULT_VALUE1_HERE>", "blobHost": "<DEFAULT_VALUE1_HERE>" }, { "sasToken": "<VALUE2_HERE>", "keyVaultSasTokenSecretName": "<VALUE2_HERE>", "container": "<VALUE2_HERE>", "blobHost": "<VALUE2_HERE>" } ] },
Parameter | Description |
---|---|
sasToken | If the SAS token is acquired from platform configuration, set this value to the SAS token to use, if applicable. Example value: ?sv=2019-02-02&ss=bfqt&srt=sco&sp=rwdlacup&se=2022-02-13T00:00:00Z&st=2020-02-13T00:00:00Z&spr=https&sig=<redacted> See below for the command to execute to generate a SAS token. If the SAS token is acquired from the Azure Key Vault, set this value to an empty string. Note: Do not delete the entire line. Leave the value empty. |
keyVaultSasTokenSecretName | If the SAS token is acquired from the Azure Key Vault, set this value to the secret name of the SAS token in the Azure key vault to use for the specified blob host and container. If needed, you can generate and apply a per-container SAS token for use in this field for this specific store. Details are below. If the SAS token is acquired from platform configuration, set this value to an empty string. Note: Do not delete the entire line. Leave the value empty. |
container | Apply the name of the WASB container. Note: If you are specifying different blob host and container combinations for your extra stores, you must create a new Key Vault store. See above for details. |
blobHost | Specify the blob host of the container. Example value: storage-account.blob.core.windows.net Note: If you are specifying different blob host and container combinations for your extra stores, you must create a new Key Vault store. See above for details. |
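As an illustration only, a two-store configuration in which SAS tokens are fetched from the Azure Key Vault might look like the following; the secret names, container names, and blob host are hypothetical placeholders.
"azure.wasb.stores": [
  {
    "sasToken": "",
    "keyVaultSasTokenSecretName": "trifacta-default-sas",
    "container": "trifacta-data",
    "blobHost": "examplestorage.blob.core.windows.net"
  },
  {
    "sasToken": "",
    "keyVaultSasTokenSecretName": "archive-sas",
    "container": "archive",
    "blobHost": "examplestorage.blob.core.windows.net"
  }
],
In this sketch, the first entry is the default container, and the second entry is treated as an extra store.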
Save your changes and restart the platform.
For additional details, see WASB Access.
Tip
At this point, you should be able to load data from your backend datastore, if data is available. You can try to run a small job on Photon, which is native to the Trifacta node. You cannot yet run jobs on an integrated cluster.
The Designer Cloud Powered by Trifacta platform can run jobs on the following running environments.
Note
You may integrate with only one of these environments.
The following parameters should be configured for all Azure running environments.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Parameters:
"webapp.runInTrifactaServer": true,
"webapp.runinEMR": false,
"webapp.runInDataflow": false,
Parameter | Description |
---|---|
webapp.runInTrifactaServer | When set to true, the platform recommends and can run smaller jobs on the Trifacta node, which uses the embedded Photon running environment. Tip: Unless otherwise instructed, the Photon running environment should be enabled. |
webapp.runinEMR | For Azure, set this value to false. |
webapp.runInDataflow | For Azure, set this value to false. |
Save your changes.
The Designer Cloud Powered by Trifacta platform can be configured to integrate with supported versions of Azure Databricks clusters to run jobs in Spark.
Note
Before you attempt to integrate, you should review the limitations around this integration. For more information, see Configure for Azure Databricks.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Configure the following parameters to enable job execution on the specified Azure Databricks cluster:
"webapp.runInDatabricks": true,
"webapp.runWithSparkSubmit": false,
Parameter | Description |
---|---|
webapp.runInDatabricks | Defines whether the platform runs jobs in Azure Databricks. Set this value to true. |
webapp.runWithSparkSubmit | For all Azure Databricks deployments, this value should be set to false. |
Configure the following Azure Databricks-specific parameters:
"databricks.serviceUrl": "<url_to_databricks_service>",
Parameter | Description |
---|---|
databricks.serviceUrl | URL to the Azure Databricks Service where Spark jobs will be run. Example: https://westus2.azuredatabricks.net |
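For reference, a completed Azure Databricks configuration might look like the following; the service URL shown is the example region URL from the table above and is not necessarily your workspace URL.
"webapp.runInDatabricks": true,
"webapp.runWithSparkSubmit": false,
"databricks.serviceUrl": "https://westus2.azuredatabricks.net",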
Note
If you are using instance pooling on the cluster, additional configuration is required. See Configure for Azure Databricks.
Save your changes and restart the platform.
For additional details, see Configure for Azure Databricks.
Tip
At this point, you should be able to load data from your backend datastore and run jobs on an integrated cluster.
The Designer Cloud Powered by Trifacta platform supports the following methods of authentication when hosted in Azure.
The platform can be configured to integrate with your enterprise's Azure Active Directory provider. For more information, see Configure SSO for Azure AD.
If you are not applying your enterprise SSO authentication to the Designer Cloud Powered by Trifacta platform, platform users must be created and managed through the application.
Self-managed:
Users can be permitted to self-register their accounts and manage their password reset requests:
Note
Self-created accounts are permitted to import data, generate samples, run jobs, and generate and download results. Admin roles must be assigned manually through the application.
See Configure User Self-Registration in the Configuration Guide
See Enable Self-Service Password Reset in the Configuration Guide
Admin-managed:
If users are not permitted to create their accounts, an admin must do so:
See Create User Account in the Admin Guide
See Create Admin Account in the Admin Guide
Tip
Users who are authenticated or have been provisioned user accounts should be able to log in to the Trifacta Application and begin using the product.
Note
You can try to verify operations using the Trifacta Photon running environment at this time.
After you have applied a configuration change to the platform and restarted, you can use the following steps to verify that Designer Cloud Powered by Trifacta Enterprise Edition is working correctly.
You can access complete product documentation online and in PDF format. From within the product, select Help menu > Documentation.
Topic | Description | Configuration Guide sections |
---|---|---|
User Access | You can enable self-service user registration or create users through the admin account. | |
Relational Connections | The platform can integrate with a variety of relational datastores. | |
Compressed Clusters | The platform can integrate with some compressed running environments. | |
High Availability | The platform can integrate with a highly available cluster. The Trifacta node can be configured to use other nodes in case of a failure. | |
Features | Some features must be enabled and can be configured through platform configuration. | Feature flags: Miscellaneous Configuration |
Services | Some platform services support additional configuration options. | |