API Task - Deploy a Flow
Overview
In this task, you learn how to deploy a flow from a Development (Dev) instance of the platform to a Production (Prod) instance. After you have created and finished a flow in a Dev instance, you can deploy it to an environment designed primarily for production execution of jobs for finished flows (the Prod instance). For more information on managing these deployments, see Overview of Deployment Manager.
Prerequisites
Finished flow: This example assumes that you have finished development of a flow with the following characteristics:
Single dataset imported from a table through a Redshift connection
Single JSON output
Separate Dev and Prod instances: Although it is possible to deploy flows to the same instance in which they are developed, this example assumes that you are deploying from a Dev instance to a completely separate Prod instance. The following implications apply:
Separate user accounts to access Dev (User1) and Prod (Admin2) instances.
Tip
You should do all of your recipe development and testing in Dev/Test. Avoid making changes in a Prod environment.
Note
Although these are separate user accounts, the assumption is that the same admin-level user is using these accounts through the APIs.
New connections must be created in the Prod instance to access the production version of the database table.
Task
In this example, your environment contains separate Dev and Prod instances, each of which has a different set of users.
Item | Dev | Prod |
---|---|---|
Environment | http://wrangle-dev.example.com:3005 | http://wrangle-prod.example.com:3005 |
User | User1 | Admin2 |
Source DB | devWrangleDB | prodWrangleDB |
Source Table | Dev-Orders | Prod-Orders |
Connection Name | Dev Redshift Conn | Prod Redshift Conn |
Tip
Dev environment work can be done through the UI, which may be easier.
Note
User1 has no access to the Prod instance.
Example Flow:
User1 is creating a flow, which is used to wrangle weekly batches of orders for the enterprise. The flow contains:
A single imported dataset that is created from a Redshift database table.
A single recipe that modifies the imported dataset.
A single output to a JSON file.
Production data is hosted in a different Redshift database, so the Prod connection is different from the Dev connection.
Steps:
Build in Dev instance: User1 creates the flow and iterates on building the recipe and running jobs until a satisfactory output can be generated in JSON format.
Export: When User1 is ready to push the flow to production, User1 exports the flow and downloads the export package ZIP file to the local desktop.
Deploy to Prod instance:
Admin2 creates a new deployment in the Prod instance.
Admin2 creates a new connection (Prod Redshift Conn) in the Prod instance.
Admin2 creates new import rules in the Prod instance to map from the old connection (Dev Redshift Conn) to the new one (Prod Redshift Conn).
Admin2 uploads the export ZIP package.
Test deployment: Through Flow View in the Prod instance, Admin2 runs a job and verifies that the results are as expected.
Set schedule: Using cron, Admin2 sets a schedule to run the active release for this deployment once per week.
Each week, the Prod-Orders table must be refreshed with data.
The dataset is now operational in the Prod environment.
Step - Get Flow Id
The first general step is for the Dev user (User1) to get the flowId and export the flow from the Dev instance.
Steps:
Tip
If it's easier, you can gather the flowId from the user interface in Flow View. In the following example, the flowId is 21:
http://www.wrangle-dev.example.com:3005/flows/21
Through the APIs, you can retrieve the list of flows using the following call:
Endpoint
http://www.wrangle-dev.example.com:3005/v4/flows
Authentication
Required
Method
GET
Request Body
None.
The response should be status code 200 - OK with a response body like the following:
{
  "data": [
    {
      "id": 21,
      "name": "Intern Training",
      "description": "null",
      "createdAt": "2019-01-08T18:14:37.851Z",
      "updatedAt": "2019-01-08T18:57:26.824Z",
      "creator": {"id": 2},
      "updater": {"id": 2},
      "folder": {"id": 1},
      "workspace": {"id": 1}
    },
    {
      "id": 19,
      "name": "example Flow",
      "description": null,
      "createdAt": "2019-01-08T17:25:21.392Z",
      "updatedAt": "2019-01-08T17:30:30.959Z",
      "creator": {"id": 2},
      "updater": {"id": 2},
      "folder": {"id": 4},
      "workspace": {"id": 1}
    }
  ]
}
Retain the flow identifier (21) for later use.
Note
You have identified the flow to export.
For more information, see https://api.trifacta.com/ee/9.7/index.html#operation/listFlows
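For reference, here is a minimal Python sketch of this call using the requests library. It assumes basic authentication against the Dev instance; the credentials are placeholders.

```python
import requests

DEV_BASE = "http://www.wrangle-dev.example.com:3005"
DEV_AUTH = ("User1", "<password>")  # placeholder credentials; basic auth assumed

# List the flows visible to User1 and print their ids and names.
resp = requests.get(f"{DEV_BASE}/v4/flows", auth=DEV_AUTH)
resp.raise_for_status()  # expect 200 - OK
for flow in resp.json()["data"]:
    print(flow["id"], flow["name"])  # e.g. 21 Intern Training
```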
Step - Export a Flow
Export the flow to your local desktop.
Tip
This step may be easier to do through the UI in the Dev instance.
Steps:
Export flowId=21:
Endpoint
http://www.wrangle-dev.example.com:3005/v4/flows/21/package
Authentication
Required
Method
GET
Request Body
None.
The response should be status code 200 - OK. The response body is the flow package itself. Download and save this file to your local desktop. Let's assume that the filename you choose is flow-WrangleOrders.zip.
For more information, see https://api.trifacta.com/ee/9.7/index.html#operation/getFlowPackage
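A minimal Python sketch of the download, under the same assumptions as above (placeholder credentials, basic auth):

```python
import requests

DEV_BASE = "http://www.wrangle-dev.example.com:3005"
DEV_AUTH = ("User1", "<password>")  # placeholder credentials

# Download the export package for flowId=21 and save it to disk.
resp = requests.get(f"{DEV_BASE}/v4/flows/21/package", auth=DEV_AUTH, stream=True)
resp.raise_for_status()  # expect 200 - OK
with open("flow-WrangleOrders.zip", "wb") as f:
    for chunk in resp.iter_content(chunk_size=8192):
        f.write(chunk)
```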
Step - Create Deployment
In the Prod environment, you can create the deployment from which you can manage the new flow. Note that the following information has changed for this environment:
Item | Prod env value |
---|---|
userId | Admin2 |
baseURL | http://www.wrangle-prod.example.com:3005 |
Steps:
Through the APIs, you can create a deployment using the following call:
Endpoint
http://www.wrangle-prod.example.com:3005/v4/deployments
Authentication
Required
Note
Username and password credentials must be submitted for the Admin2 account.
Method
POST
Request Body
{ "name": "Production Orders" }
The response should be status code 201 - Created with a response body like the following:
{
  "id": 3,
  "name": "Production Orders",
  "updatedAt": "2017-11-27T23:48:54.340Z",
  "createdAt": "2017-11-27T23:48:54.340Z",
  "creator": {"id": 1},
  "updater": {"id": 1}
}
Retain the deploymentId (3) for later use.
For more information, see https://api.trifacta.com/ee/9.7/index.html#operation/createDeployment
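A minimal Python sketch of the same call, now against the Prod instance with the Admin2 account (credentials are placeholders):

```python
import requests

PROD_BASE = "http://www.wrangle-prod.example.com:3005"
PROD_AUTH = ("Admin2", "<password>")  # placeholder credentials

# Create the deployment that will hold releases of this flow.
resp = requests.post(f"{PROD_BASE}/v4/deployments",
                     auth=PROD_AUTH,
                     json={"name": "Production Orders"})
resp.raise_for_status()  # expect 201 - Created
deployment_id = resp.json()["id"]  # 3 in this example
```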
Step - Create Connection
When a flow is exported, its connections are not included in the export. Before you import the flow into a new environment:
Connections must be created or recreated in the Prod environment. In some cases, you may need to point to production versions of the data contained in completely different databases.
Rules must be created to remap the connections that the imported flow uses.
This section and the following one step through these processes.
Steps:
From the Dev environment, you collect the connection information for the flow:
Endpoint
http://www.wrangle-dev.example.com:3005/v4/connections
Authentication
Required
Note
Username and password credentials must be submitted for the User1 account.
Method
GET
Request Body
None.
The response should be status code 200 - OK with a response body like the following:
{
  "data": [
    {
      "id": 9,
      "host": "dev-redshift.example.com",
      "port": 5439,
      "vendor": "redshift",
      "params": {
        "connectStrOpts": "",
        "defaultDatabase": "devWrangleDB",
        "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
      },
      "ssl": false,
      "vendorName": "redshift",
      "name": "Dev Redshift Conn",
      "description": "",
      "type": "jdbc",
      "isGlobal": true,
      "credentialType": "iamRoleArn",
      "credentialsShared": true,
      "uuid": "b8014610-ce56-11e7-9739-27deec2c3249",
      "disableTypeInference": false,
      "createdAt": "2017-11-21T00:55:50.770Z",
      "updatedAt": "2017-11-21T00:55:50.770Z",
      "credentials": [{"user": "devDBuser"}],
      "creator": {"id": 2},
      "updater": {"id": 2},
      "workspace": {"id": 1}
    }
  ],
  "count": {"owned": 1, "shared": 0, "count": 1}
}
You retain the above information for use in Production.
In the Prod environment, you create the new connection using the following call:
Endpoint
http://www.wrangle-prod.example.com:3005/v4/connections
Authentication
Required
Note
Username and password credentials must be submitted for the Admin2 account.
Method
POST
Request Body
{ "host": "prod-redshift.example.com", "port": 1433, "vendor": "redshift", "params": { "connectStrOpts": "", "defaultDatabase": "prodWrangleDB", "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS" }, "vendorName": "redshift", "name": "Redshift Conn Prod", "description": "", "isGlobal": true, "type": "jdbc", "ssl": false, "credentialType": "iamRoleArn", "credentials": [ { "username": "prodDBUser", "password": "<password>", "iamRoleArn": "iam:aws:12345" } ] }
The response should be status code 201 - Created with a response body like the following:
{
  "id": 12,
  "host": "prod-redshift.example.com",
  "port": 5439,
  "vendor": "redshift",
  "params": {
    "connectStrOpts": "",
    "defaultDatabase": "prodWrangleDB",
    "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS"
  },
  "ssl": false,
  "name": "Prod Redshift Conn",
  "description": "",
  "type": "jdbc",
  "isGlobal": true,
  "credentialType": "iamRoleArn",
  "credentialsShared": true,
  "uuid": "fa7e06c0-0143-11e8-8faf-27c0392328c5",
  "disableTypeInference": false,
  "createdAt": "2018-01-24T20:20:11.181Z",
  "updatedAt": "2018-01-24T20:20:11.181Z",
  "credentials": [{"username": "prodDBUser"}],
  "creator": {"id": 2},
  "updater": {"id": 2}
}
When you hit the /v4/connections endpoint again, you can retrieve the connectionId for this connection. In this case, the connectionId value is 12.
See https://api.trifacta.com/ee/9.7/index.html#operation/createConnection
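A minimal Python sketch of the connection creation; the password and IAM role values are the placeholders from the example body above:

```python
import requests

PROD_BASE = "http://www.wrangle-prod.example.com:3005"
PROD_AUTH = ("Admin2", "<password>")  # placeholder credentials

# Recreate the connection in the Prod instance, pointed at the production database.
conn_body = {
    "host": "prod-redshift.example.com",
    "port": 5439,
    "vendor": "redshift",
    "params": {
        "connectStrOpts": "",
        "defaultDatabase": "prodWrangleDB",
        "extraLoadParams": "BLANKSASNULL EMPTYASNULL TRIMBLANKS TRUNCATECOLUMNS",
    },
    "vendorName": "redshift",
    "name": "Prod Redshift Conn",
    "description": "",
    "isGlobal": True,
    "type": "jdbc",
    "ssl": False,
    "credentialType": "iamRoleArn",
    "credentials": [{
        "username": "prodDBUser",
        "password": "<password>",        # placeholder
        "iamRoleArn": "iam:aws:12345",   # placeholder
    }],
}
resp = requests.post(f"{PROD_BASE}/v4/connections", auth=PROD_AUTH, json=conn_body)
resp.raise_for_status()  # expect 201 - Created
prod_connection_id = resp.json()["id"]  # 12 in this example
```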
Step - Create Import Rules
Now that you have defined the connection to use to acquire the production data from within the production environment, you must create an import rule to remap from the Dev connection to the Prod connection within the flow definition. This rule is applied during the import process to ensure that the flow is working after it has been imported.
In this case, you must remap the uuid value for the Dev connection, which is written into the flow definition, to the connectionId value from the Prod instance.
For more information on import rules, see API Task - Define Deployment Import Mappings.
Steps:
From the Dev environment, you collect the connection information for the flow, using the same GET /v4/connections call shown in the previous step. From that response, you retain the following value, which uniquely identifies the connection object, regardless of the instance to which it belongs:
"uuid": "b8014610-ce56-11e7-9739-27deec2c3249",
Against the Prod environment, you now create an import mapping rule:
Endpoint
http://www.wrangle-prod.example.com:3005/v4/deployments/3/objectImportRules
Authentication
Required
Method
PATCH
Request Body
[
  {
    "tableName": "connections",
    "onCondition": {"uuid": "b8014610-ce56-11e7-9739-27deec2c3249"},
    "withCondition": {"id": 12}
  }
]
The response should be status code 200 - OK with a response body like the following:
{ "deleted": [] }
Since the method is a PATCH, you are updating the rule set that applies to all imports for this deployment. In this case, there were no pre-existing rules, so the response indicates that nothing was deleted. If another set of import rules is submitted later, the rule you just created is deleted and replaced.
See https://api.trifacta.com/ee/9.7/index.html#operation/updateObjectImportRules
See https://api.trifacta.com/ee/9.7/index.html#operation/updateValueImportRules
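A minimal Python sketch of the rule update, reusing the deployment and connection ids from the earlier steps (placeholder credentials):

```python
import requests

PROD_BASE = "http://www.wrangle-prod.example.com:3005"
PROD_AUTH = ("Admin2", "<password>")  # placeholder credentials

# Remap the Dev connection (identified by its uuid in the flow definition)
# to the Prod connection (id 12) for all imports into deploymentId=3.
rules = [{
    "tableName": "connections",
    "onCondition": {"uuid": "b8014610-ce56-11e7-9739-27deec2c3249"},
    "withCondition": {"id": 12},
}]
resp = requests.patch(f"{PROD_BASE}/v4/deployments/3/objectImportRules",
                      auth=PROD_AUTH, json=rules)
resp.raise_for_status()  # expect 200 - OK
```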
Step - Import Package to Create Release
You are now ready to import the package to create the release.
Steps:
Against the Prod environment, you now import the package:
Endpoint
http://www.wrangle-prod.example.com:3005/v4/deployments/3/releases
Authentication
Required
Method
POST
Request Body
The request body must include the following key and value combination, submitted as form data:
key | value |
---|---|
data | "@path-to-flow-WrangleOrders.zip" |
The response should be status code 201 - Created with a response body like the following:
{
  "importRuleChanges": {
    "object": [
      {
        "tableName": "connections",
        "onCondition": {"uuid": "b8014610-ce56-11e7-9739-27deec2c3249"},
        "withCondition": {"id": 12}
      }
    ],
    "value": []
  },
  "flowName": "Wrangle Orders"
}
See https://api.trifacta.com/ee/9.7/index.html#operation/importPackageForDeployment
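A minimal Python sketch of the upload; requests sends the file as multipart form data under the data key (credentials and local path are placeholders):

```python
import requests

PROD_BASE = "http://www.wrangle-prod.example.com:3005"
PROD_AUTH = ("Admin2", "<password>")  # placeholder credentials

# Upload the export package to create a new release for deploymentId=3.
with open("flow-WrangleOrders.zip", "rb") as f:
    resp = requests.post(f"{PROD_BASE}/v4/deployments/3/releases",
                         auth=PROD_AUTH,
                         files={"data": f})
resp.raise_for_status()  # expect 201 - Created
print(resp.json()["flowName"])  # "Wrangle Orders"
```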
Step - Activate Release
When a package is imported into a release, the release is automatically set as the active release for the deployment. If, at some point in the future, you need to change the active release, you can use the following endpoint to do so.
Steps:
Against the Prod environment, use the following endpoint:
Endpoint
http://www.wrangle-prod.example.com:3005/v4/releases/5
Authentication
Required
Method
PATCH
Request Body
{ "active": true }
The response should be status code 200 - OK with a response body like the following:
{
  "id": 5,
  "updater": {"id": 3},
  "updatedAt": "2017-11-28T00:06:12.147Z"
}
See https://api.trifacta.com/ee/9.7/index.html#operation/patchRelease
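A minimal Python sketch, assuming the release to activate has releaseId=5 (credentials are placeholders):

```python
import requests

PROD_BASE = "http://www.wrangle-prod.example.com:3005"
PROD_AUTH = ("Admin2", "<password>")  # placeholder credentials

# Make releaseId=5 the active release for its deployment.
resp = requests.patch(f"{PROD_BASE}/v4/releases/5",
                      auth=PROD_AUTH,
                      json={"active": True})
resp.raise_for_status()  # expect 200 - OK
```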
Step - Run Deployment
You can now execute a test run of the deployment to verify that the job executes properly.
Note
When you run a deployment, you run the primary flow in the active release for that deployment. Running the flow generates the output objects for all recipes in the flow.
Note
For datasets with parameters, you can apply parameter overrides in the request body of the following API call. For more information, see https://api.trifacta.com/ee/9.7/index.html#operation/runDeployment
Steps:
Against the Prod environment, use the following endpoint:
Endpoint
http://www.wrangle-prod.example.com:3005/v4/deployments/3/run
Authentication
Required
Method
POST
Request Body
None.
The response should be status code 201 - Created with a response body like the following:
{
  "data": [
    {
      "reason": "JobStarted",
      "sessionId": "dd6a90e0-c353-11e7-ad4e-7f2dd2ae4621",
      "id": 33
    }
  ]
}
See https://api.trifacta.com/ee/9.7/index.html#operation/runDeployment
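A minimal Python sketch of the run call (credentials are placeholders):

```python
import requests

PROD_BASE = "http://www.wrangle-prod.example.com:3005"
PROD_AUTH = ("Admin2", "<password>")  # placeholder credentials

# Run the active release of deploymentId=3; jobs are started for the flow's outputs.
resp = requests.post(f"{PROD_BASE}/v4/deployments/3/run", auth=PROD_AUTH)
resp.raise_for_status()  # expect 201 - Created
for job in resp.json()["data"]:
    print(job["id"], job["reason"])  # e.g. 33 JobStarted
```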
Step - Iterate
If you need to make changes to fix issues related to running the job:
Recipe changes should be made in the Dev environment and then moved into the Prod deployment through another export and import of the flow.
Connection issues:
Check Flow View in the Prod instance to see if there are any red dots on the objects in the package. If so, your import rules need to be fixed.
Verify that you can import data through the connection.
Output problems could be related to permissions on the target location.
Step - Set up Production Schedule
When you are satisfied with how the production version of your flow is working, you can use a third-party scheduling tool to execute the job on a regular basis.
The tool must call the Run Deployment endpoint and then verify that the output has been properly generated.
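As one possible approach, a small wrapper script like the following could be invoked from cron each week. The script name, path, and schedule are hypothetical; it simply calls the Run Deployment endpoint shown above:

```python
#!/usr/bin/env python3
"""run_weekly_deployment.py - hypothetical cron wrapper for the Run Deployment call.

Example crontab entry (every Monday at 06:00):
0 6 * * 1 /usr/bin/python3 /opt/scripts/run_weekly_deployment.py
"""
import sys
import requests

PROD_BASE = "http://www.wrangle-prod.example.com:3005"
PROD_AUTH = ("Admin2", "<password>")  # placeholder credentials

def main() -> int:
    # Run the active release for deploymentId=3.
    resp = requests.post(f"{PROD_BASE}/v4/deployments/3/run", auth=PROD_AUTH)
    if resp.status_code != 201:
        print(f"Run failed: {resp.status_code} {resp.text}", file=sys.stderr)
        return 1
    for job in resp.json()["data"]:
        print(f"Started job {job['id']} ({job['reason']})")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```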