Follow this guide to deploy the Machine Learning module for AWS private data processing.
Before you deploy the Machine Learning module, you must complete these steps on the Set Up AWS Account and VPC for Private Data page...
Configure a VPC dedicated to Alteryx Analytics Cloud (AAC), as described in the Create a VPC section.
Create the service account and attach the base IAM policy to it, as described in the Configure IAM section.
Successfully trigger private data processing provisioning, as described in the Trigger Private Data Handling Provisioning section.
Create a custom IAM policy named AAC_MachineLearning_SA_Policy and use the following policy document. We recommend using the JSON tab instead of the visual editor. AAC requires some * permissions to run, so expect some security warnings when you create the policy.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "VisualEditor0",
"Effect": "Allow",
"Action": "iam:PassRole",
"Resource": "arn:aws:iam::*:role/*",
"Condition": {
"StringEqualsIfExists": {
"iam:PassedToService": [
"ec2.amazonaws.com",
"ec2.amazonaws.com.cn"
]
}
}
},
{
"Sid": "VisualEditor1",
"Effect": "Allow",
"Action": [
"eks:*",
"iam:CreateServiceLinkedRole",
"kms:CreateGrant",
"kms:Decrypt",
"kms:DescribeKey",
"kms:Encrypt",
"kms:GetKeyPolicy",
"kms:GetKeyRotationStatus",
"kms:ListGrants",
"kms:ListResourceTags",
"kms:ListRetirableGrants",
"kms:PutKeyPolicy",
"kms:RetireGrant",
"kms:RevokeGrant",
"kms:ScheduleKeyDeletion",
"kms:TagResource",
"kms:UntagResource"
],
"Resource": [
"arn:aws:eks:*:*:addon/*/*/*",
"arn:aws:eks:*:*:cluster/*",
"arn:aws:eks:*:*:nodegroup/*/*/*",
"arn:aws:eks:*:*:identityproviderconfig/*/*/*/*",
"arn:aws:eks:*:*:access-entry/*/*/*",
"arn:aws:kms:*:*:key/*",
"arn:aws:iam::*:role/*"
]
},
{
"Sid": "VisualEditor2",
"Effect": "Allow",
"Action": [
"iam:AttachRolePolicy",
"iam:CreateOpenIDConnectProvider",
"iam:CreatePolicy",
"iam:CreatePolicyVersion",
"iam:CreateRole",
"iam:DeleteOpenIDConnectProvider",
"iam:DeletePolicy",
"iam:DeletePolicyVersion",
"iam:DeleteRole",
"iam:DeleteRolePolicy",
"iam:DetachRolePolicy",
"iam:GetOpenIDConnectProvider",
"iam:GetPolicy",
"iam:GetPolicyVersion",
"iam:GetRole",
"iam:GetRolePolicy",
"iam:GetUser",
"iam:GetUserPolicy",
"iam:ListAttachedRolePolicies",
"iam:ListAttachedUserPolicies",
"iam:ListGroupsForUser",
"iam:ListInstanceProfilesForRole",
"iam:ListPolicyTags",
"iam:ListPolicyVersions",
"iam:ListRolePolicies",
"iam:PassRole",
"iam:PutRolePolicy",
"iam:TagOpenIDConnectProvider",
"iam:TagPolicy",
"iam:TagRole",
"iam:UntagOpenIDConnectProvider",
"iam:UntagPolicy",
"iam:UntagRole",
"iam:UpdateOpenIDConnectProviderThumbprint",
"iam:UpdateRole",
"iam:UpdateAssumeRolePolicy"
],
"Resource": [
"arn:aws:iam::*:policy/*",
"arn:aws:iam::*:oidc-provider/*",
"arn:aws:iam::*:user/*",
"arn:aws:iam::*:role/*"
]
},
{
"Sid": "VisualEditor3",
"Effect": "Allow",
"Action": [
"autoscaling:*",
"ec2:*",
"eks:CreateCluster",
"eks:ListClusters",
"elasticloadbalancing:*",
"iam:GetAccountName",
"iam:ListAccountAliases",
"iam:ListRoles",
"iam:CreateInstanceProfile",
"iam:DeleteInstanceProfile",
"iam:GetInstanceProfile",
"iam:TagInstanceProfile",
"iam:UntagInstanceProfile",
"iam:RemoveRoleFromInstanceProfile",
"iam:AddRoleToInstanceProfile",
"kms:CreateKey",
"logs:CreateLogGroup",
"logs:DeleteLogGroup",
"logs:DescribeLogGroups",
"logs:ListTagsLogGroup",
"logs:PutRetentionPolicy",
"logs:TagResource",
"logs:UntagResource",
"logs:TagLogGroup",
"logs:UntagLogGroup",
"logs:ListTagsForResource",
"networkmanager:Describe*",
"networkmanager:Get*",
"networkmanager:List*",
"s3:CreateBucket",
"s3:DeleteBucket",
"s3:DeleteBucketPolicy",
"s3:DeleteBucketWebsite",
"s3:DeleteObject",
"s3:DeleteObjectVersion",
"s3:DeleteObjectVersionTagging",
"s3:GetAccelerateConfiguration",
"s3:GetBucketAcl",
"s3:GetBucketCORS",
"s3:GetBucketLocation",
"s3:GetBucketLogging",
"s3:GetBucketObjectLockConfiguration",
"s3:GetBucketOwnershipControls",
"s3:GetBucketPolicy",
"s3:GetBucketPolicyStatus",
"s3:GetBucketPublicAccessBlock",
"s3:GetBucketRequestPayment",
"s3:GetBucketTagging",
"s3:GetBucketVersioning",
"s3:GetBucketWebsite",
"s3:GetEncryptionConfiguration",
"s3:GetLifecycleConfiguration",
"s3:GetObject",
"s3:GetObjectAcl",
"s3:GetObjectVersion",
"s3:GetObjectVersionAcl",
"s3:GetObjectVersionAttributes",
"s3:GetObjectVersionForReplication",
"s3:GetObjectVersionTagging",
"s3:GetObjectVersionTorrent",
"s3:GetReplicationConfiguration",
"s3:ListAllMyBuckets",
"s3:ListBucket",
"s3:ListBucketVersions",
"s3:PutAccelerateConfiguration",
"s3:PutBucketAcl",
"s3:PutBucketCORS",
"s3:PutBucketLogging",
"s3:PutBucketObjectLockConfiguration",
"s3:PutBucketOwnershipControls",
"s3:PutBucketPolicy",
"s3:PutBucketPublicAccessBlock",
"s3:PutBucketRequestPayment",
"s3:PutBucketTagging",
"s3:PutBucketVersioning",
"s3:PutBucketWebsite",
"s3:PutEncryptionConfiguration",
"s3:PutLifecycleConfiguration",
"s3:PutObject",
"s3:PutObjectAcl",
"s3:PutObjectVersionAcl",
"s3:PutObjectVersionTagging",
"sts:GetCallerIdentity",
"memorydb:CreateSubnetGroup",
"memorydb:CreateUser",
"memorydb:CreateAcl",
"memorydb:CreateCluster",
"memorydb:TagResource",
"memorydb:DescribeSubnetGroups",
"memorydb:DescribeUsers",
"memorydb:DescribeACLs",
"memorydb:DescribeClusters",
"memorydb:ListTags",
"memorydb:DeleteUser",
"memorydb:DeleteSubnetGroup",
"memorydb:DeleteAcl",
"memorydb:DeleteCluster",
"memorydb:UpdateAcl",
"memorydb:UpdateCluster",
"memorydb:UpdateSubnetGroup",
"memorydb:UpdateUser"
],
"Resource": "*"
},
{
"Sid": "VisualEditor4",
"Effect": "Allow",
"Action": "secretsmanager:*",
"Resource": "arn:aws:secretsmanager:*:*:secret:*"
}
]
}
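The security warnings mentioned above come from the wildcard entries in the policy. As a minimal sketch for reviewing them before you paste the document into the console, the following Python script lists the wildcard actions and the statements that apply to all resources. The `POLICY_EXCERPT` string is a trimmed excerpt of the policy used only for illustration, and the `wildcard_report` helper is a hypothetical name, not part of AAC or AWS tooling.

```python
import json

# Trimmed excerpt of the policy document above, for illustration only.
# To review the real policy, load the complete JSON instead.
POLICY_EXCERPT = """
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": ["eks:*", "kms:Decrypt"],
      "Resource": ["arn:aws:eks:*:*:cluster/*", "arn:aws:kms:*:*:key/*"]
    },
    {
      "Sid": "VisualEditor3",
      "Effect": "Allow",
      "Action": ["autoscaling:*", "ec2:*", "s3:GetObject"],
      "Resource": "*"
    },
    {
      "Sid": "VisualEditor4",
      "Effect": "Allow",
      "Action": "secretsmanager:*",
      "Resource": "arn:aws:secretsmanager:*:*:secret:*"
    }
  ]
}
"""


def as_list(value):
    """IAM allows Action and Resource to be a string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def wildcard_report(policy):
    """Map each statement Sid to its wildcard actions and whether Resource is "*"."""
    report = {}
    for statement in policy["Statement"]:
        actions = sorted(a for a in as_list(statement["Action"]) if "*" in a)
        all_resources = "*" in as_list(statement["Resource"])
        if actions or all_resources:
            report[statement["Sid"]] = {
                "wildcard_actions": actions,
                "all_resources": all_resources,
            }
    return report


report = wildcard_report(json.loads(POLICY_EXCERPT))
for sid, info in report.items():
    print(sid, info["wildcard_actions"], "all resources:", info["all_resources"])
```

The script only reads the JSON. It does not call AWS, so you can run it on any machine with Python 3.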
Tag the custom IAM policy created in Step 1a.
Tag Name | Value |
---|---|
AACResource | aac_sa_custom_policy |
Attach the AAC_MachineLearning_SA_Policy IAM policy to the aac_automation_sa service account created on the Set Up AWS Account and VPC for Private Data page.
Note
AAC_MachineLearning_SA_Policy is an example policy name. You can choose any name for the policy, but the name must start with AAC_MachineLearning.
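If you script the policy setup, a quick check of the name and tag can catch mistakes before you create anything. The following Python sketch is one way to do that. The function names are hypothetical, the prefix comparison is assumed to be case-sensitive, and the character rule reflects the general IAM limit for policy names (up to 128 characters from letters, digits, and `+ = , . @ - _`).

```python
import re

# Prefix required by this guide; treated as case-sensitive (an assumption).
REQUIRED_PREFIX = "AAC_MachineLearning"

# General IAM rule for policy names: 1 to 128 characters from
# letters, digits, and + = , . @ - _
IAM_NAME_RE = re.compile(r"[\w+=,.@-]{1,128}", re.ASCII)


def check_policy_name(name):
    """Return None if the name is acceptable, otherwise a short explanation."""
    if not name.startswith(REQUIRED_PREFIX):
        return f"name must start with {REQUIRED_PREFIX}"
    if not IAM_NAME_RE.fullmatch(name):
        return "name has characters IAM does not allow or is longer than 128 characters"
    return None


def policy_tags():
    """Tag from the table above, as a list of Key/Value pairs."""
    return [{"Key": "AACResource", "Value": "aac_sa_custom_policy"}]


print(check_policy_name("AAC_MachineLearning_SA_Policy"))
print(check_policy_name("MachineLearning_SA_Policy"))
print(policy_tags())
```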
Note
If you purchased Designer Cloud and Machine Learning, then configure the subnets as mentioned in the Designer Cloud setup guide. Both Designer Cloud and Machine Learning resources share the same subnets.
Machine Learning in the private data plane requires up to 4 subnet groups. Each group contains 3 individual subnets, each in a different availability zone.
eks_control group (required): The EKS control plane uses this subnet to accept incoming job execution requests.
eks_node group (required): The EKS cluster uses this subnet to execute Alteryx software jobs (for example, connectivity, conversion, processing, and publishing).
public group (required): This group doesn’t run any services, but the eks_node group uses it for egress out of the cluster.
aac_private group (required): This group runs services private to the private data processing.
Configure subnets in the aac_vpc VPC.
Create subnets and tag them following this example. Modify values, as needed, to meet your network architecture…
CIDRs | Subnet Name | Subnet | AZ | Tag Name | Tag Value |
---|---|---|---|---|---|
10.64.0.0/18 | eks_node | 10.64.0.0/21 | AZa | AACSubnet | eks_node |
10.64.0.0/18 | eks_node | 10.64.8.0/21 | AZb | AACSubnet | eks_node |
10.64.0.0/18 | eks_node | 10.64.16.0/21 | AZc | AACSubnet | eks_node |
10.10.0.0/21 | eks_control | 10.10.0.0/27 | AZa | AACSubnet | eks_control |
10.10.0.0/21 | eks_control | 10.10.0.32/27 | AZb | AACSubnet | eks_control |
10.10.0.0/21 | eks_control | 10.10.0.64/27 | AZc | AACSubnet | eks_control |
10.10.0.0/21 | public | 10.10.0.128/27 | AZa | AACSubnet | public |
10.10.0.0/21 | public | 10.10.0.160/27 | AZb | AACSubnet | public |
10.10.0.0/21 | public | 10.10.0.192/27 | AZc | AACSubnet | public |
10.10.0.0/21 | private | 10.10.1.0/25 | AZa | AACSubnet | private |
10.10.0.0/21 | private | 10.10.1.128/25 | AZb | AACSubnet | private |
10.10.0.0/21 | private | 10.10.2.0/25 | AZc | AACSubnet | private |
Important
You must tag subnets with Tag Name and Tag Value as mentioned in the table.
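When you adapt the example values to your own network, it can help to check the plan before you create anything. The following Python sketch uses the standard `ipaddress` module to confirm that each subnet sits inside its parent CIDR, that no two subnets overlap, and that each group has 3 subnets in 3 different availability zones. The `PLAN` list copies the example table, `AZa`, `AZb`, and `AZc` are the table's placeholders rather than real zone names, and `validate_plan` is a hypothetical helper.

```python
import ipaddress
from collections import defaultdict

# (parent CIDR, group name = AACSubnet tag value, subnet CIDR, AZ placeholder),
# copied from the example table. Replace with your own values.
PLAN = [
    ("10.64.0.0/18", "eks_node", "10.64.0.0/21", "AZa"),
    ("10.64.0.0/18", "eks_node", "10.64.8.0/21", "AZb"),
    ("10.64.0.0/18", "eks_node", "10.64.16.0/21", "AZc"),
    ("10.10.0.0/21", "eks_control", "10.10.0.0/27", "AZa"),
    ("10.10.0.0/21", "eks_control", "10.10.0.32/27", "AZb"),
    ("10.10.0.0/21", "eks_control", "10.10.0.64/27", "AZc"),
    ("10.10.0.0/21", "public", "10.10.0.128/27", "AZa"),
    ("10.10.0.0/21", "public", "10.10.0.160/27", "AZb"),
    ("10.10.0.0/21", "public", "10.10.0.192/27", "AZc"),
    ("10.10.0.0/21", "private", "10.10.1.0/25", "AZa"),
    ("10.10.0.0/21", "private", "10.10.1.128/25", "AZb"),
    ("10.10.0.0/21", "private", "10.10.2.0/25", "AZc"),
]


def validate_plan(plan):
    """Return a list of problems found in the subnet plan (empty if none)."""
    problems = []
    seen = []
    zones = defaultdict(list)
    for parent, group, cidr, az in plan:
        net = ipaddress.ip_network(cidr)
        if not net.subnet_of(ipaddress.ip_network(parent)):
            problems.append(f"{cidr} is outside {parent}")
        for other in seen:
            if net.overlaps(other):
                problems.append(f"{cidr} overlaps {other}")
        seen.append(net)
        zones[group].append(az)
    for group, azs in zones.items():
        if len(azs) != 3 or len(set(azs)) != 3:
            problems.append(f"{group} needs 3 subnets in 3 different AZs, got {azs}")
    return problems


problems = validate_plan(PLAN)
print(problems or "subnet plan looks consistent")
```

The check is purely local arithmetic on the CIDR strings. It does not confirm that the subnets or tags exist in your AWS account.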
Create the route table for your subnets.
Note
This route table is an example.
Subnet Name | Route Destination | Target | Comments |
---|---|---|---|
eks_node | /18 CIDR Block | Local | Configure the same routes to all 3 AZs subnet routing tables. |
eks_node | /21 CIDR Block | Local | |
eks_node | <s3 prefix id> | <vpce endpoint id> | |
eks_node | 0.0.0.0/0 | <gateway id> | |
eks_control | /18 CIDR Block | Local | Configure the same routes to all 3 AZs subnet routing tables. |
eks_control | /21 CIDR Block | Local | |
eks_control | <s3 prefix id> | <vpce endpoint id> | |
eks_control | 0.0.0.0/0 | <gateway id> | |
public | /18 CIDR Block | Local | Configure the same routes to all 3 AZs subnet routing tables. |
public | /21 CIDR Block | Local | |
public | 0.0.0.0/0 | <internet gateway id> | |
Note
Your <gateway id> can be either a NAT gateway created per AZ or a transit gateway, depending on your network architecture.
Your private data plane requires a quota increase on these services. Adjust the Applied quota value numbers as follows:
Quota Name: Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances
Quota Description: Maximum number of vCPUs assigned to the Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances.
AWS default quota value: 5
Applied quota value: 2500
Sign in to the AWS account console.
Search for Service Quotas and select the service.
Select AWS Services from the left navigation pane.
Search for the service (for example, Amazon EMR Serverless or Amazon EC2).
Select the quota name.
Select Request quota increase.
Request the specified quota increase.
Host | Port | Protocol | Purpose |
---|---|---|---|
443 | HTTPS | Retrieve feature flags from Unleash. |
Warning
If you modify or delete any of the public cloud resources provisioned by AAC after private data handling is provisioned, the state becomes inconsistent. This inconsistency causes errors when jobs run or when you deprovision the private data plane handling setup.
You trigger data plane provisioning from the Admin Console in AAC. You need Workspace Admin privileges in a workspace to see it.
From the AAC landing page, select the Profile menu and then select Workspace Admin.
From the Admin Console, select Private Data Handling and then select Processing.
Select the Machine Learning checkbox and then select Update.
Selecting Update triggers the deployment of the cluster and resources in the AWS account. This runs a set of validation checks to verify the correct configuration of the AWS account.
Note
The provisioning process takes approximately 35–40 minutes to complete.
After provisioning completes, you can view the created resources (for example, EC2 instances and node groups) in the AWS console. Don't modify these resources yourself. Manual changes might cause issues with the function of the private data plane.