Set Up AWS Account and VPC for Private Data

Private data processing runs a data processing cluster for Alteryx Analytics Cloud inside your AWS account and VPC. This combination of software, your infrastructure, and AWS resources managed by Alteryx is referred to as a private data plane. This page describes how to set up your AWS account and VPC so that Alteryx Analytics Cloud can create a private data plane there.

Note

The AWS Account and VPC setup requires access and permissions to the AWS Console. If you don’t have this access, contact your IT team to complete this step.

Shared Responsibility Model

In the private data handling scenario, Alteryx Analytics Cloud requires clear boundaries of ownership. The shared responsibility matrix represents these boundaries of ownership.

| Area | Customer | Alteryx, Inc. |
|---|---|---|
| AWS Account-Wide Resources | Account Details, IAM Credentials, IAM Policy | Specification |
| VPC | Infrastructure, Subnets, Routing, Endpoints | Specification |
| Cloud Resources | | S3, EKS, IAM Roles, IAM Policies, Secrets Manager, EMR Serverless, EC2 |
| Software | | On-demand Jobs, Long-running Services |

Account-Wide Resources

At the highest level, Alteryx requires a set of permissions to run a private data plane. However, you will own the AWS account, the IAM credentials, and the IAM policy.

Alteryx provides a CloudFormation template that defines the necessary permissions. You can use it to complete this step.

Virtual Private Cloud

At the next level down, Alteryx defines a specification for the VPC. This includes the definition of a number of subnets, CIDR blocks, route tables, and endpoints.

You must implement the VPC according to this specification. Alteryx provides CloudFormation templates for subnet and route table creation, which you can use to complete this step.

Cloud Resources

Once you’ve completed setup of the AWS account and VPC, sign in to Alteryx Analytics Cloud to trigger the provisioning process that creates your private data processing cluster. The list of resources varies depending on which services you enable in the private data plane, but includes temporary storage, a Kubernetes cluster, compute nodes, secret management, and elastic Spark processing.

The Alteryx Analytics Cloud will create and maintain these resources for you using automated provisioning pipelines in Terraform.

Software

After provisioning the needed resources, Alteryx deploys and maintains the software necessary to process your data within the private cluster. This includes a few long-running services and on-demand jobs.

Setup Steps

Step 1: Select the AWS Account

Select the account where you want to run your private data plane.

Because IAM credentials are scoped to the entire account, the most secure way to run a private data plane is in a dedicated AWS account. A dedicated account is recommended but not required.

Plan to run the private data plane in the same region as the S3 bucket you selected for private data storage, as well as any data sources you want to connect to Alteryx Analytics Cloud. This improves performance and reduces egress costs.

The VPC created in the AWS account should be dedicated to Alteryx Analytics Cloud. You can set up connectivity to private data sources using VPC peering, transit gateways, PrivateLink, or other methods.

Step 2: Configure IAM

With your AWS account in place, the next step is to set up the IAM user account and access keys.

Step 2a: Create an IAM User (Service Account)
  1. Create an IAM user with the name: aac_automation_sa. Ensure that this user doesn't have console access.

  2. On Set Permissions, select Next.

  3. Tag the IAM user:

    | Key Name | Value |
    |---|---|
    | AACResource | aac_iam_user |

  4. Select Create User.

  5. Generate an access key:

    1. Select the new IAM user and then select the Security credentials tab.

    2. Select Create access key.

    3. Select Other under Access key best practices & alternatives. Then select Next.

    4. Select Create access key.

Note

You need the IAM user access key and secret key later when you provision the cloud resources and deploy software.
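
The console steps in Step 2a can also be scripted. The following is a minimal sketch, assuming a boto3-style IAM client is passed in. The function name create_service_account is hypothetical; only the user name and tag values come from this page.

```python
"""Sketch: scripted version of Step 2a (assumes a boto3-style IAM client)."""

USER_NAME = "aac_automation_sa"  # user name given in Step 2a
USER_TAG = {"Key": "AACResource", "Value": "aac_iam_user"}  # tag from Step 2a


def create_service_account(iam_client, user_name=USER_NAME):
    """Create the tagged IAM user and return (access key ID, secret key).

    iam_client is expected to behave like boto3.client("iam").
    No login profile is created, so the user gets no console access.
    """
    iam_client.create_user(UserName=user_name, Tags=[USER_TAG])
    response = iam_client.create_access_key(UserName=user_name)
    key = response["AccessKey"]
    return key["AccessKeyId"], key["SecretAccessKey"]
```

In practice you might call create_service_account(boto3.client("iam")) and store the returned secret key securely right away, because AWS only returns it when the key is created.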

Step 2b: Create the IAM Policy and Bind to the Service Account

Create a custom IAM policy named AAC_SA_Policy with the following policy document. We recommend using the JSON tab instead of the visual editor. Alteryx Analytics Cloud requires some wildcard (*) permissions to run, so expect some security warnings when you create the policy.

Note

You can run a CloudFormation template from Alteryx to assist with Step 2b.

The CloudFormation template doesn't tag the policy. If you use the template, remember to complete Step 2c after running the template to assign a tag to this policy.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::*:role/*",
            "Condition": {
                "StringEqualsIfExists": {
                    "iam:PassedToService": [
                        "ec2.amazonaws.com",
                        "ec2.amazonaws.com.cn"
                    ]
                }
            }
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "eks:CreateAddon",
                "eks:CreateNodegroup",
                "eks:DeleteAddon",
                "eks:DeleteCluster",
                "eks:DeleteNodegroup",
                "eks:DescribeAddon",
                "eks:DescribeCluster",
                "eks:DescribeNodegroup",
                "eks:DescribeUpdate",
                "eks:TagResource",
                "eks:ListNodegroups",
                "eks:ListUpdates",
                "eks:UpdateNodegroupVersion",
                "eks:UntagResource",
                "eks:UpdateNodegroupConfig",          
                "eks:UpdateClusterConfig",
                "eks:UpdateClusterVersion",
                "iam:CreateServiceLinkedRole",
                "kms:CreateGrant",
                "kms:Decrypt",
                "kms:DescribeKey",
                "kms:Encrypt",
                "kms:GetKeyPolicy",
                "kms:GetKeyRotationStatus",
                "kms:ListGrants",
                "kms:ListResourceTags",
                "kms:ListRetirableGrants",
                "kms:PutKeyPolicy",
                "kms:RetireGrant",
                "kms:RevokeGrant",
                "kms:ScheduleKeyDeletion",
                "kms:TagResource",
                "kms:UntagResource"
            ],
            "Resource": [
                "arn:aws:eks:*:*:addon/*/*/*",
                "arn:aws:eks:*:*:cluster/*",
                "arn:aws:eks:*:*:nodegroup/*/*/*",
                "arn:aws:eks:*:*:identityproviderconfig/*/*/*/*",
                "arn:aws:kms:*:*:key/*",
                "arn:aws:iam::*:role/*"
            ]
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": [
                "iam:AttachRolePolicy",
                "iam:CreateOpenIDConnectProvider",
                "iam:CreatePolicy",
                "iam:CreatePolicyVersion",
                "iam:CreateRole",
                "iam:DeleteOpenIDConnectProvider",
                "iam:DeletePolicy",
                "iam:DeletePolicyVersion",
                "iam:DeleteRole",
                "iam:DeleteRolePolicy",
                "iam:DetachRolePolicy",
                "iam:GetOpenIDConnectProvider",
                "iam:GetPolicy",
                "iam:GetPolicyVersion",
                "iam:GetRole",
                "iam:GetRolePolicy",
                "iam:GetUser",
                "iam:GetUserPolicy",
                "iam:ListAttachedRolePolicies",
                "iam:ListAttachedUserPolicies",
                "iam:ListGroupsForUser",
                "iam:ListInstanceProfilesForRole",
                "iam:ListPolicyTags",
                "iam:ListPolicyVersions",
                "iam:ListRolePolicies",
                "iam:PassRole",
                "iam:PutRolePolicy",
                "iam:TagOpenIDConnectProvider",
                "iam:TagPolicy",
                "iam:TagRole",
                "iam:UntagOpenIDConnectProvider",
                "iam:UntagPolicy",
                "iam:UntagRole",
                "iam:UpdateOpenIDConnectProviderThumbprint",
                "iam:UpdateRole",
                "iam:UpdateAssumeRolePolicy"
            ],
            "Resource": [
                "arn:aws:iam::*:policy/*",
                "arn:aws:iam::*:oidc-provider/*",
                "arn:aws:iam::*:user/*",
                "arn:aws:iam::*:role/*"
            ]
        },
        {
            "Sid": "VisualEditor3",
            "Effect": "Allow",
            "Action": [
                "autoscaling:*",
                "ec2:*",
                "eks:CreateCluster",
                "eks:ListClusters",
                "elasticloadbalancing:*",
                "iam:GetAccountName",
                "iam:ListAccountAliases",
                "iam:ListRoles",
                "iam:CreateInstanceProfile",
                "iam:DeleteInstanceProfile",
                "iam:GetInstanceProfile",
                "iam:TagInstanceProfile",
                "iam:UntagInstanceProfile", 
                "iam:RemoveRoleFromInstanceProfile", 
                "iam:AddRoleToInstanceProfile", 
                "kms:CreateKey",
                "logs:CreateLogGroup",
                "logs:DeleteLogGroup",
                "logs:DescribeLogGroups",
                "logs:ListTagsLogGroup",
                "logs:PutRetentionPolicy",
                "logs:TagResource",
                "logs:UntagResource",
                "logs:TagLogGroup",
                "logs:UntagLogGroup",
                "networkmanager:Describe*",
                "networkmanager:Get*",
                "networkmanager:List*",
                "s3:CreateBucket",
                "s3:DeleteBucket",
                "s3:DeleteBucketPolicy",
                "s3:DeleteBucketWebsite",
                "s3:DeleteObject",
                "s3:DeleteObjectVersion",
                "s3:DeleteObjectVersionTagging",
                "s3:GetAccelerateConfiguration",
                "s3:GetBucketAcl",
                "s3:GetBucketCORS",
                "s3:GetBucketLocation",
                "s3:GetBucketLogging",
                "s3:GetBucketObjectLockConfiguration",
                "s3:GetBucketOwnershipControls",
                "s3:GetBucketPolicy",
                "s3:GetBucketPolicyStatus",
                "s3:GetBucketPublicAccessBlock",
                "s3:GetBucketRequestPayment",
                "s3:GetBucketTagging",
                "s3:GetBucketVersioning",
                "s3:GetBucketWebsite",
                "s3:GetEncryptionConfiguration",
                "s3:GetLifecycleConfiguration",
                "s3:GetObject",
                "s3:GetObjectAcl",
                "s3:GetObjectVersion",
                "s3:GetObjectVersionAcl",
                "s3:GetObjectVersionAttributes",
                "s3:GetObjectVersionForReplication",
                "s3:GetObjectVersionTagging",
                "s3:GetObjectVersionTorrent",
                "s3:GetReplicationConfiguration",
                "s3:ListAllMyBuckets",
                "s3:ListBucket",
                "s3:ListBucketVersions",
                "s3:PutAccelerateConfiguration",
                "s3:PutBucketAcl",
                "s3:PutBucketCORS",
                "s3:PutBucketLogging",
                "s3:PutBucketObjectLockConfiguration",
                "s3:PutBucketOwnershipControls",
                "s3:PutBucketPolicy",
                "s3:PutBucketPublicAccessBlock",
                "s3:PutBucketRequestPayment",
                "s3:PutBucketTagging",
                "s3:PutBucketVersioning",
                "s3:PutBucketWebsite",
                "s3:PutEncryptionConfiguration",
                "s3:PutLifecycleConfiguration",
                "s3:PutObject",
                "s3:PutObjectAcl",
                "s3:PutObjectVersionAcl",
                "s3:PutObjectVersionTagging",
                "sts:GetCallerIdentity"
            ],
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor4",
            "Effect": "Allow",
            "Action": "secretsmanager:*",
            "Resource": "arn:aws:secretsmanager:*:*:secret:*"
        }
    ]
}
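
The security warnings mentioned above come from statements that use service-wide actions or a wildcard resource. As a rough aid for reviewing the policy before you create it, the following sketch lists those statements. The helper name wildcard_findings is hypothetical, and SAMPLE_POLICY is an abbreviated excerpt of the policy above rather than the full document.

```python
import json


def wildcard_findings(policy):
    """Return (sid, reason) pairs for statements with broad permissions."""
    findings = []
    for statement in policy.get("Statement", []):
        sid = statement.get("Sid", "<no Sid>")
        actions = statement.get("Action", [])
        resources = statement.get("Resource", [])
        # IAM accepts either a single string or a list for both fields.
        if isinstance(actions, str):
            actions = [actions]
        if isinstance(resources, str):
            resources = [resources]
        service_wide = [a for a in actions if a.endswith(":*")]
        if service_wide:
            findings.append((sid, "service-wide actions: " + ", ".join(service_wide)))
        if "*" in resources:
            findings.append((sid, 'Resource is "*"'))
    return findings


# Abbreviated excerpt of the policy above, for illustration only.
SAMPLE_POLICY = json.loads("""
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor3",
            "Effect": "Allow",
            "Action": ["autoscaling:*", "ec2:*", "eks:CreateCluster"],
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor4",
            "Effect": "Allow",
            "Action": "secretsmanager:*",
            "Resource": "arn:aws:secretsmanager:*:*:secret:*"
        }
    ]
}
""")

for sid, reason in wildcard_findings(SAMPLE_POLICY):
    print(f"{sid}: {reason}")
```

To review the complete policy, load the full JSON document in place of the excerpt.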
Step 2c: Tag the IAM Policy
  1. Tag the custom IAM policy created in Step 2b:

    | Key Name | Value |
    |---|---|
    | AACResource | aac_sa_custom_policy |

  2. Attach the AAC_SA_Policy IAM policy to the aac_automation_sa service account created in Step 2a.
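
Step 2c can be scripted in the same way. This is a minimal sketch, assuming a boto3-style IAM client and that you already know the ARN of the AAC_SA_Policy policy. The function name tag_and_attach_policy is hypothetical.

```python
"""Sketch: scripted version of Step 2c (assumes a boto3-style IAM client)."""

POLICY_TAG = {"Key": "AACResource", "Value": "aac_sa_custom_policy"}  # from Step 2c
USER_NAME = "aac_automation_sa"  # service account from Step 2a


def tag_and_attach_policy(iam_client, policy_arn, user_name=USER_NAME):
    """Tag the custom policy, then attach it to the service account."""
    iam_client.tag_policy(PolicyArn=policy_arn, Tags=[POLICY_TAG])
    iam_client.attach_user_policy(UserName=user_name, PolicyArn=policy_arn)
```

The policy ARN appears on the policy's summary page in the IAM console. It is passed in here rather than guessed, because it contains your account ID.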

Step 3: Create a VPC

Create the VPC after you create the IAM policy:

  1. Create a new VPC in one of the supported regions.

  2. Select the VPC and more option.

  3. Configure a /18 and a /21 CIDR block in the VPC. You might need to create the VPC with a single CIDR block and then select Edit CIDRs to add the second.

  4. Select 3 in the Number of Availability Zones (AZs) section.

  5. Select 0 in the Number of public subnets section.

  6. Select 0 in the Number of private subnets section.

  7. Select None in the NAT gateways section.

  8. Enable the S3 Gateway VPC endpoint within the VPC.

  9. Enable DNS hostnames and resolution.

  10. Tag the VPC:

    | Tag Name | Value |
    |---|---|
    | AACResource | aac_vpc |
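
Before you move on, it can help to confirm that the VPC carries both required CIDR block sizes. The following sketch uses only the Python standard library. The function name missing_prefixes is hypothetical, and the sample CIDRs are the example values used later on this page.

```python
import ipaddress

REQUIRED_PREFIXES = {18, 21}  # Step 3 asks for a /18 and a /21 CIDR block


def missing_prefixes(vpc_cidrs):
    """Return the required prefix lengths that no VPC CIDR block provides."""
    present = {ipaddress.ip_network(cidr).prefixlen for cidr in vpc_cidrs}
    return sorted(REQUIRED_PREFIXES - present)


print(missing_prefixes(["10.64.0.0/18", "10.10.0.0/21"]))  # both present
print(missing_prefixes(["10.64.0.0/18"]))                  # the /21 is missing
```

An empty list means both block sizes are present. A non-empty list names the prefix lengths you still need to add through Edit CIDRs.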

Note

Connections to private data sources require network paths between the VPC and the data source. As defined in the shared responsibility matrix, you set up these network paths in accordance with your own network policies and preferences.

Step 4: Tag Transit Gateway and Internet Gateway

If your network setup uses a transit gateway or an internet gateway, set them up and tag them now.

| Tag Name | Value |
|---|---|
| AACResource | aac |

Note

The CloudFormation templates and Terraform code use these tag values for configuration.

Step 5: Configure Subnets

A private data plane requires up to 5 subnet groups depending on what you want to run within your data plane. Each group contains 3 individual subnets, each in a different availability zone.

Alteryx provides a single CloudFormation template that can assist with Steps 5 and 6.

Used for Standard Cloud Execution
  • eks_control group (required): The EKS control plane uses this subnet group to accept incoming job execution requests.

  • eks_node group (required): The EKS cluster uses this subnet group to execute Alteryx software jobs (connectivity, conversion, processing, publishing).

  • public group (required): This group doesn’t run any services, but the eks_node group uses it for egress out of the cluster.

Used for EMR
  • private group (optional): Use this group if you enable EMR processing within your private data plane. EMR services do not run in the cluster, but the IP space is needed to interact with the Amazon EMR Serverless endpoints.

Used for Cloud Execution for Desktop
  • option group (optional): Use this group if you enable Cloud Execution for Desktop within your private data plane. If you enable this option, an AMI swarm runs in this subnet group to handle Designer Desktop processing jobs running in the cloud.

Create Subnets in the VPC

Every subnet must have the AACSubnet tag set to the name of its subnet group.

Create subnets and tag them following this example (modify values, as needed, to meet your network architecture):

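| CIDRs | Subnet Name | Subnet | AZ | Tag Name | Tag Value |
|---|---|---|---|---|---|
| 10.64.0.0/18 | eks_node | 10.64.0.0/21 | AZa | AACSubnet | eks_node |
| 10.64.0.0/18 | eks_node | 10.64.8.0/21 | AZb | AACSubnet | eks_node |
| 10.64.0.0/18 | eks_node | 10.64.16.0/21 | AZc | AACSubnet | eks_node |
| 10.10.0.0/21 | private | 10.10.0.0/24 | AZa | AACSubnet | private |
| 10.10.0.0/21 | private | 10.10.1.0/24 | AZb | AACSubnet | private |
| 10.10.0.0/21 | private | 10.10.2.0/24 | AZc | AACSubnet | private |
| 10.10.0.0/21 | option | 10.10.4.0/26 | AZa | AACSubnet | option |
| 10.10.0.0/21 | option | 10.10.4.64/26 | AZb | AACSubnet | option |
| 10.10.0.0/21 | option | 10.10.4.128/26 | AZc | AACSubnet | option |
| 10.10.0.0/21 | public | 10.10.5.0/27 | AZa | AACSubnet | public |
| 10.10.0.0/21 | public | 10.10.5.32/27 | AZb | AACSubnet | public |
| 10.10.0.0/21 | public | 10.10.5.64/27 | AZc | AACSubnet | public |
| 10.10.0.0/21 | eks_control | 10.10.5.96/27 | AZa | AACSubnet | eks_control |
| 10.10.0.0/21 | eks_control | 10.10.5.128/27 | AZb | AACSubnet | eks_control |
| 10.10.0.0/21 | eks_control | 10.10.5.160/27 | AZc | AACSubnet | eks_control |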

Note

This table is shown as an example for subnet configuration in a VPC with 10.10.0.0/21 and 10.64.0.0/18 CIDRs. Choose the CIDR blocks that fit your routing architecture.
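
When you adapt the example to your own CIDR blocks, a quick offline check can catch subnets that fall outside the VPC, overlap each other, or leave a group short of 3 availability zones. The following is a sketch using the Python standard library. The function name validate_plan and the data layout are hypothetical; the values are the example values from this page.

```python
import ipaddress
from collections import defaultdict

VPC_CIDRS = ["10.64.0.0/18", "10.10.0.0/21"]

# (subnet group, subnet CIDR, availability zone label) from the example table.
SUBNETS = [
    ("eks_node", "10.64.0.0/21", "AZa"),
    ("eks_node", "10.64.8.0/21", "AZb"),
    ("eks_node", "10.64.16.0/21", "AZc"),
    ("private", "10.10.0.0/24", "AZa"),
    ("private", "10.10.1.0/24", "AZb"),
    ("private", "10.10.2.0/24", "AZc"),
    ("option", "10.10.4.0/26", "AZa"),
    ("option", "10.10.4.64/26", "AZb"),
    ("option", "10.10.4.128/26", "AZc"),
    ("public", "10.10.5.0/27", "AZa"),
    ("public", "10.10.5.32/27", "AZb"),
    ("public", "10.10.5.64/27", "AZc"),
    ("eks_control", "10.10.5.96/27", "AZa"),
    ("eks_control", "10.10.5.128/27", "AZb"),
    ("eks_control", "10.10.5.160/27", "AZc"),
]

REQUIRED_GROUPS = ("eks_control", "eks_node", "public")


def validate_plan(vpc_cidrs, subnets, required_groups=REQUIRED_GROUPS):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    vpc_networks = [ipaddress.ip_network(cidr) for cidr in vpc_cidrs]
    parsed = [(group, ipaddress.ip_network(cidr), az) for group, cidr, az in subnets]

    # Every subnet must sit inside one of the VPC CIDR blocks.
    for group, network, _ in parsed:
        if not any(network.subnet_of(vpc) for vpc in vpc_networks):
            problems.append(f"{group} {network} is outside the VPC CIDRs")

    # No two subnets may overlap.
    for index, (group_a, net_a, _) in enumerate(parsed):
        for group_b, net_b, _ in parsed[index + 1:]:
            if net_a.overlaps(net_b):
                problems.append(f"{group_a} {net_a} overlaps {group_b} {net_b}")

    # Each group needs exactly 3 subnets, each in a different AZ.
    zones = defaultdict(list)
    for group, _, az in parsed:
        zones[group].append(az)
    for group in required_groups:
        if group not in zones:
            problems.append(f"required group {group} is missing")
    for group, azs in zones.items():
        if len(azs) != 3 or len(set(azs)) != 3:
            problems.append(f"{group} needs 3 subnets in 3 different AZs")
    return problems


print(validate_plan(VPC_CIDRS, SUBNETS))
```

The check only covers the rules stated on this page. It does not know about address space already used elsewhere in your network.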

Step 6: Subnet Route Tables

Create the route table for your subnets. Route table entries for the subnets are as follows:

Note

Alteryx provides a single CloudFormation template that can assist with Steps 5 and 6.

Note

Your <gateway id> could be either a NAT gateway created per AZ or a transit gateway, depending on your network architecture.

| Subnet Name | Route Destination | Target | Comments |
|---|---|---|---|
| eks_node | /18 CIDR Block | local | Configure the same routes in the route tables for the subnets in all 3 AZs. |
| eks_node | /21 CIDR Block | local | |
| eks_node | <s3 prefix id> | <vpce endpoint id> | |
| eks_node | 0.0.0.0/0 | <gateway id> | |
| private | /18 CIDR Block | local | Configure the same routes in the route tables for the subnets in all 3 AZs. |
| private | /21 CIDR Block | local | |
| private | <s3 prefix id> | <vpce endpoint id> | |
| private | 0.0.0.0/0 | <gateway id> | 0.0.0.0/0 should egress to the public network. |
| eks_control | /18 CIDR Block | local | Configure the same routes in the route tables for the subnets in all 3 AZs. |
| eks_control | /21 CIDR Block | local | |
| eks_control | <s3 prefix id> | <vpce endpoint id> | |
| eks_control | 0.0.0.0/0 | <gateway id> | |
| public | /18 CIDR Block | local | Configure the same routes in the route tables for the subnets in all 3 AZs. |
| public | /21 CIDR Block | local | |
| public | 0.0.0.0/0 | <internet gateway id> | |
| option | /18 CIDR Block | local | Configure the same routes in the route tables for the subnets in all 3 AZs. |
| option | /21 CIDR Block | local | |
| option | <s3 prefix id> | <vpce endpoint id> | |
| option | 0.0.0.0/0 | <gateway id> | 0.0.0.0/0 should egress to the public network. |
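
The route entries follow one pattern: every group routes both VPC CIDR blocks locally, the public group sends 0.0.0.0/0 to the internet gateway, and every other group sends S3 traffic to the gateway endpoint and 0.0.0.0/0 to your gateway. The following sketch expresses that pattern as data so you can compare it with your route tables. The function name expected_routes and the dictionary keys are hypothetical, and the targets are the placeholders used on this page rather than real IDs.

```python
# Placeholders from the route table on this page; substitute your own IDs.
PLACEHOLDER_TARGETS = {
    "s3_prefix_list": "<s3 prefix id>",
    "s3_endpoint": "<vpce endpoint id>",
    "gateway": "<gateway id>",
    "internet_gateway": "<internet gateway id>",
}


def expected_routes(group, cidr_18, cidr_21, targets=PLACEHOLDER_TARGETS):
    """Return (destination, target) pairs for one subnet group's route table."""
    routes = [(cidr_18, "local"), (cidr_21, "local")]
    if group == "public":
        # The public group only needs a default route to the internet gateway.
        routes.append(("0.0.0.0/0", targets["internet_gateway"]))
    else:
        routes.append((targets["s3_prefix_list"], targets["s3_endpoint"]))
        routes.append(("0.0.0.0/0", targets["gateway"]))
    return routes


for group in ("eks_node", "private", "eks_control", "public", "option"):
    print(group, expected_routes(group, "10.64.0.0/18", "10.10.0.0/21"))
```

The same list applies to the route table of each of the 3 AZ subnets in a group.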

Step 7: Adjust the Quota

Your private data plane requires a quota increase on these services. Adjust the Applied quota value numbers as follows:

Amazon EMR Serverless
  • Quota Name: Max concurrent vCPUs per account.

  • Quota Description: Maximum number of vCPUs that can be concurrently run in this account in the current Region.

  • AWS default quota value: 16

  • Applied quota value: 1024

Amazon EC2
  • Quota Name: Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances

  • Quota Description: Maximum number of vCPUs assigned to the Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances

  • AWS default quota value: 5

  • Applied quota value: 2500

Steps to Request Quota Increase
  1. Sign in to the AWS account console.

  2. Search for Service Quotas and select the service.

  3. Select AWS services from the navigation pane on the left.

  4. Search for the service (for example, Amazon EMR Serverless or Amazon EC2).

  5. Select the quota name.

  6. Select Request quota increase.

  7. Request the specified quota increase.
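
To keep track of which quotas still need a request, you can compare the applied values in your account with the targets listed above. This is a small sketch. The names REQUIRED_QUOTAS and increases_needed are hypothetical, and the applied values must be read from the Service Quotas console or supplied by you.

```python
# Target values from this page, keyed by (service, quota name).
REQUIRED_QUOTAS = {
    ("Amazon EMR Serverless", "Max concurrent vCPUs per account"): 1024,
    ("Amazon EC2", "Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances"): 2500,
}

# AWS default values as listed on this page.
AWS_DEFAULTS = {
    ("Amazon EMR Serverless", "Max concurrent vCPUs per account"): 16,
    ("Amazon EC2", "Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances"): 5,
}


def increases_needed(applied):
    """Return (service, quota, current, target) for quotas below their target."""
    needed = []
    for (service, quota), target in REQUIRED_QUOTAS.items():
        current = applied.get((service, quota), 0)
        if current < target:
            needed.append((service, quota, current, target))
    return needed


for service, quota, current, target in increases_needed(AWS_DEFAULTS):
    print(f"{service}: raise '{quota}' from {current} to {target}")
```

With the default values, both quotas are reported as needing an increase.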

Special Instructions for CloudFormation Templates

The provided CloudFormation templates cover Step 2b (create the IAM policy), Step 5 (create subnets in the VPC), and Step 6 (subnet route tables). You must complete the other steps in the AWS Console.

You must have the AdministratorAccess permission to execute these templates.

  1. Download ‘iam-template.yaml’ and ‘vpc-template.yaml’:

    1. To set up IAM policy for the service account, use URI: https://prod-css-public-storage.s3.amazonaws.com/CFT/pdh-iam-cft.yaml.

    2. To set up VPC subnets and route tables, use URI: https://prod-css-public-storage.s3.amazonaws.com/CFT/pdh-subnets-cft.yaml.

  2. Sign in to your AWS account. Search for AWS CloudFormation and open the service.

  3. Select the AWS region where you’d like to provision your private data plane.

  4. Select Create Stack.

  5. Under Prerequisite - Prepare template, select Template is ready.

  6. Under Specify template, select Upload a template file.

  7. Select Choose file and select the downloaded YAML file.

  8. Enter the stack name and required parameters, then select Next.

  9. Under Configure stack options, select Next.

  10. Verify all the parameters and select Submit.
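
The console steps above can also be driven from a script. The following is a minimal sketch, assuming a boto3-style CloudFormation client and a template file you have already downloaded. The function name create_stack_from_template is hypothetical. The template's parameter names are not listed on this page, so they are passed through as-is, and whether the template requires a capability acknowledgement such as CAPABILITY_NAMED_IAM is an assumption you need to verify.

```python
"""Sketch: create a stack from a downloaded template (boto3-style client)."""


def create_stack_from_template(cfn_client, stack_name, template_path,
                               parameters=None, capabilities=None):
    """Create a CloudFormation stack and return its stack ID.

    cfn_client is expected to behave like boto3.client("cloudformation").
    parameters is a plain dict of the template's own parameter names and values.
    """
    with open(template_path, encoding="utf-8") as handle:
        template_body = handle.read()

    request = {
        "StackName": stack_name,
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": value}
            for key, value in (parameters or {}).items()
        ],
    }
    # Assumption: only needed if the template creates named IAM resources.
    if capabilities:
        request["Capabilities"] = list(capabilities)

    response = cfn_client.create_stack(**request)
    return response["StackId"]
```

Create the client for the region where you want to provision the private data plane, which corresponds to the region selection in the console steps.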

This completes the setup for your AWS account and VPC. The next step is to configure Private Data Processing.