Machine Learning in Azure

Follow this guide to deploy the Machine Learning module for Azure private data processing.

Prerequisite

Before you deploy the Machine Learning module, complete these steps on the Set Up Azure Subscription and Vnet for Private Data page:

  1. Configure a resource group dedicated to Alteryx Analytics Cloud (AAC), as described in the Create Resource Group section.

  2. Configure a Vnet dedicated to AAC, as described in the Configure Virtual Private Network section.

  3. Create an app registration and attach the base IAM role to the service account, as described in the Configure IAM section.

  4. Trigger private data processing provisioning, as described in the Trigger Private Data Handling Provisioning section, and confirm that it succeeds.

Subscription Setup

Step 1: Configure IAM

Step 1a: Create IAM Custom Role

Create a custom IAM role named AAC_MachineLearning_SA_Role and use the following role document. We recommend using the JSON tab instead of the visual editor. AAC requires some wildcard (*) permissions to run, so expect security warnings when you create the role.

Note

AAC_MachineLearning_SA_Role is an example role name. You can choose any name for the role, but the name must start with AAC_MachineLearning.

Important

You must update the "assignableScopes" value of this custom role. Replace <subscription ID> with your subscription ID.

{
    "properties": {
        "roleName": "AAC_MachineLearning_SA_Role",
        "description": "Custom role for provisioning AAC private data handling",
        "assignableScopes": [
            "/subscriptions/<subscription ID>"
        ],
        "permissions": [
            {
                "actions": [
                    "Microsoft.Authorization/*/read",
                    "Microsoft.Compute/availabilitySets/*",
                    "Microsoft.Compute/locations/*",
                    "Microsoft.Compute/virtualMachines/*",
                    "Microsoft.Compute/virtualMachineScaleSets/*",
                    "Microsoft.Compute/cloudServices/*",
                    "Microsoft.Compute/disks/write",
                    "Microsoft.Compute/disks/read",
                    "Microsoft.Compute/disks/delete",
                    "Microsoft.Network/applicationGateways/backendAddressPools/join/action",
                    "Microsoft.Network/loadBalancers/backendAddressPools/join/action",
                    "Microsoft.Network/loadBalancers/inboundNatPools/join/action",
                    "Microsoft.Network/loadBalancers/inboundNatRules/join/action",
                    "Microsoft.Network/loadBalancers/probes/join/action",
                    "Microsoft.Network/loadBalancers/read",
                    "Microsoft.Network/locations/*",
                    "Microsoft.Network/networkInterfaces/*",
                    "Microsoft.Network/networkSecurityGroups/join/action",
                    "Microsoft.Network/networkSecurityGroups/read",
                    "Microsoft.Network/networkSecurityGroups/write", 
                    "Microsoft.Network/networkSecurityGroups/delete",
                    "Microsoft.Network/publicIPAddresses/join/action",
                    "Microsoft.Network/publicIPAddresses/read",
                    "Microsoft.Network/virtualNetworks/read",
                    "Microsoft.Network/virtualNetworks/subnets/join/action",
                    "Microsoft.RecoveryServices/locations/*",
                    "Microsoft.ResourceHealth/availabilityStatuses/read",
                    "Microsoft.Resources/deployments/*",
                    "Microsoft.Resources/subscriptions/resourceGroups/read",
                    "Microsoft.Resources/subscriptions/read",
                    "Microsoft.Resources/subscriptions/operationresults/read",
                    "Microsoft.Storage/storageAccounts/*",
                    "Microsoft.ContainerService/fleets/read",
                    "Microsoft.ContainerService/fleets/listCredentials/action",
                    "Microsoft.ContainerService/managedClusters/listClusterAdminCredential/action",
                    "Microsoft.ContainerService/managedClusters/accessProfiles/listCredential/action",
                    "Microsoft.ContainerService/managedClusters/read",
                    "Microsoft.ContainerService/managedClusters/runcommand/action",
                    "Microsoft.KeyVault/*",
                    "Microsoft.Network/virtualNetworks/subnets/read",
                    "Microsoft.ContainerService/managedClusters/write",
                    "Microsoft.ManagedIdentity/userAssignedIdentities/read",
                    "Microsoft.ManagedIdentity/userAssignedIdentities/assign/action",
                    "Microsoft.Network/routeTables/routes/write",
                    "Microsoft.Network/routeTables/routes/read",
                    "Microsoft.ContainerService/managedClusters/listClusterUserCredential/action",
                    "Microsoft.ContainerService/managedClusters/delete",
                    "Microsoft.ContainerService/managedClusters/agentPools/read",
                    "Microsoft.ContainerService/managedClusters/agentPools/write",
                    "Microsoft.ContainerService/managedClusters/agentPools/delete",
                    "Microsoft.ContainerService/managedClusters/availableAgentPoolVersions/read",
                    "Microsoft.ContainerService/managedClusters/agentPools/upgradeNodeImageVersion/write",
                    "Microsoft.ManagedIdentity/userAssignedIdentities/federatedIdentityCredentials/read",
                    "Microsoft.ManagedIdentity/userAssignedIdentities/federatedIdentityCredentials/write",
                    "Microsoft.ManagedIdentity/userAssignedIdentities/federatedIdentityCredentials/delete",
                    "Microsoft.ManagedIdentity/userAssignedIdentities/delete",
                    "Microsoft.ManagedIdentity/userAssignedIdentities/write",
                    "Microsoft.Network/routeTables/read",
                    "Microsoft.Network/routeTables/write",
                    "Microsoft.Authorization/roleAssignments/write",
                    "Microsoft.Authorization/roleAssignments/delete",
                    "Microsoft.Cache/redis/*",
                    "Microsoft.Network/privateEndpoints/read",
                    "Microsoft.Network/privateEndpoints/write",
                    "Microsoft.Network/privateEndpoints/delete",
                    "Microsoft.Network/privateDnsZones/read",
                    "Microsoft.Network/privateDnsZones/write",
                    "Microsoft.Network/privateDnsZones/delete",
                    "Microsoft.Network/privateDnsZones/SOA/read",
                    "Microsoft.Network/privateDnsZones/SOA/write",
                    "Microsoft.Network/privateEndpoints/privateDnsZoneGroups/write",
                    "Microsoft.Network/privateEndpoints/privateDnsZoneGroups/read",
                    "Microsoft.Network/privateEndpoints/privateDnsZoneGroups/delete",
                    "Microsoft.Network/privateDnsZones/virtualNetworkLinks/read",
                    "Microsoft.Network/privateDnsZones/virtualNetworkLinks/write",
                    "Microsoft.Network/privateDnsZones/virtualNetworkLinks/delete",
                    "Microsoft.Network/privateDnsZones/join/action",
                    "Microsoft.Network/virtualNetworks/join/action",
                    "Microsoft.Insights/autoScaleSettings/write",
                    "Microsoft.Insights/autoScaleSettings/read",
                    "Microsoft.Insights/autoScaleSettings/delete"
                ],
                "notActions": [],
                "dataActions": [
                    "Microsoft.ContainerService/fleets/*",
                    "Microsoft.ContainerService/managedClusters/*"
                ],
                "notDataActions": []
            }
        ]
    }
}
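
Before you paste the role document into the JSON tab, it can help to fill in the placeholder and check the name prefix programmatically. The following Python sketch is one way to do that. It is not part of the AAC tooling: the helper name prepare_role_document and the trimmed sample document are assumptions made for illustration, and the all-zero GUID is a stand-in for your subscription ID.

```python
import json

ROLE_NAME_PREFIX = "AAC_MachineLearning"
PLACEHOLDER = "<subscription ID>"


def prepare_role_document(template: str, subscription_id: str, role_name: str) -> dict:
    """Return the role document with the placeholder and role name filled in.

    Hypothetical helper: it only edits the JSON text and never calls Azure.
    """
    if not role_name.startswith(ROLE_NAME_PREFIX):
        raise ValueError(f"role name must start with {ROLE_NAME_PREFIX!r}")
    document = json.loads(template.replace(PLACEHOLDER, subscription_id))
    document["properties"]["roleName"] = role_name
    return document


# Trimmed sample of the role document above (actions shortened for brevity).
SAMPLE_TEMPLATE = """
{
    "properties": {
        "roleName": "AAC_MachineLearning_SA_Role",
        "assignableScopes": ["/subscriptions/<subscription ID>"],
        "permissions": [{"actions": ["Microsoft.Authorization/*/read"]}]
    }
}
"""

# The all-zero GUID is a stand-in, not a real subscription.
role = prepare_role_document(
    SAMPLE_TEMPLATE,
    subscription_id="00000000-0000-0000-0000-000000000000",
    role_name="AAC_MachineLearning_SA_Role",
)
print(json.dumps(role["properties"]["assignableScopes"]))
```

In practice you would load the full role document from a file instead of SAMPLE_TEMPLATE and paste the result of json.dumps(role, indent=4) into the JSON tab.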

Step 1b: Bind Custom Role to App Registration in the Subscription

Add the AAC_MachineLearning_SA_Role IAM custom role to the aac_automation_sa service account created in the App Registration section.
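
If you prefer the Azure CLI to the portal for this binding, the assignment can be expressed as a single az role assignment create call. The Python sketch below only builds and prints that command; it does not run it. The helper name role_assignment_command is hypothetical, the bracketed values are placeholders you replace, and subscription scope is assumed because the step binds the role in the subscription.

```python
import shlex
from typing import List


def role_assignment_command(app_id: str, subscription_id: str,
                            role_name: str = "AAC_MachineLearning_SA_Role") -> List[str]:
    """Build an `az role assignment create` call at subscription scope."""
    return [
        "az", "role", "assignment", "create",
        "--assignee", app_id,          # application (client) ID of the app registration
        "--role", role_name,           # custom role created in Step 1a
        "--scope", f"/subscriptions/{subscription_id}",
    ]


# Placeholders, not real identifiers.
command = role_assignment_command(
    app_id="<application (client) ID>",
    subscription_id="<subscription ID>",
)
print(shlex.join(command))
```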

Step 2: Route Table

Create the route table for your subnets.

Important

You must configure the Vnet with a network connection to the internet in your subscription.

Note

This route table is an example.

| Address Prefix | Next Hop Type |
|----------------|---------------|
| /18 CIDR Block | v-net         |
| /22 CIDR Block | v-net         |
| 0.0.0.0/0      | <gateway_ID>  |

Note

Your <gateway_ID> can be either a NAT gateway created per AZ or a transit gateway, depending on your network architecture.
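
To see how the example routes interact, the following Python sketch models the route table and picks the most specific (longest-prefix) route for a destination, which is how overlapping routes are resolved. The concrete prefixes are an assumption: they reuse the sample 10.64.0.0/18 and 10.10.0.0/22 address spaces from Step 3a, and the function name next_hop is hypothetical.

```python
import ipaddress

# Example routes: the prefixes reuse the sample address spaces from Step 3a,
# and "<gateway_ID>" stays a placeholder for your own gateway.
ROUTES = [
    (ipaddress.ip_network("10.64.0.0/18"), "v-net"),
    (ipaddress.ip_network("10.10.0.0/22"), "v-net"),
    (ipaddress.ip_network("0.0.0.0/0"), "<gateway_ID>"),
]


def next_hop(destination: str) -> str:
    """Return the next hop of the most specific route that contains destination."""
    address = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if address in net]
    # The default route 0.0.0.0/0 always matches, so matches is never empty.
    _, hop = max(matches, key=lambda item: item[0].prefixlen)
    return hop


print(next_hop("10.64.12.7"))    # inside the /18 block
print(next_hop("10.10.1.20"))    # inside the /22 block
print(next_hop("203.0.113.10"))  # anything else leaves through the gateway
```

Traffic between the two address spaces stays inside the Vnet, and only destinations outside both blocks fall through to the default route.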

Step 3: Configure Subnet

Note

If you purchased Designer Cloud and Machine Learning, then configure the subnets as mentioned in the Designer Cloud setup guide. Both Designer Cloud and Machine Learning resources share the same subnets.

Machine Learning in private data processing requires 3 subnets:

  • aac_aks_node (required): The AKS cluster uses this subnet to execute Alteryx software jobs (connectivity, conversion, processing, publishing).

  • aac_public (required): This subnet doesn’t run any services, but the aac_aks_node subnet uses it for egress out of the cluster. Delegate this subnet to Microsoft.ContainerService/managedClusters, which grants the AKS service permission to inject the API server pods and internal load balancer into that subnet.

  • aac_private (required): This subnet runs services that are private to private data processing.

Step 3a: Create Subnet in the Vnet

Configure subnets in the aac_vpc Vnet.

Follow this example to create subnets with the subnet name, subnet size, and other configurations. Modify the values as needed to fit your network architecture. Attach the network security group created in the Create Network Security Group section to the subnets.

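| Address Space | Subnet Name  | Subnet       | Service Endpoints                     | Route Table                                                                                         | Notes                                               |
|---------------|--------------|--------------|---------------------------------------|-----------------------------------------------------------------------------------------------------|-----------------------------------------------------|
| 10.64.0.0/18  | aac_aks_node | 10.64.0.0/19 | Microsoft.Storage, Microsoft.KeyVault | Attaching this subnet to a route table is optional. Azure sets up the network during AKS creation. | AKS Cluster Subnet                                  |
| 10.10.0.0/22  | aac_public   | 10.10.0.0/25 | Microsoft.Storage                     | Attach to the route table created in Step 2.                                                        | Delegate Microsoft.ContainerService/managedClusters |
| 10.10.0.0/22  | aac_private  | 10.10.1.0/24 | Microsoft.KeyVault                    | Attach to the route table created in Step 2.                                                        |                                                     |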

Important

10.64.0.0/18 and 10.10.0.0/22 are example address spaces. Each subnet name must match the name shown in the table.
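
If you change the example ranges, a quick sanity check can catch subnets that fall outside their address space or overlap each other. The Python sketch below runs that check on the example values with the standard ipaddress module. The function name check_subnets and the tuple layout are assumptions for illustration; the check does not query Azure.

```python
import ipaddress
from itertools import combinations

# (address space, subnet name, subnet) taken from the example values above.
SUBNETS = [
    ("10.64.0.0/18", "aac_aks_node", "10.64.0.0/19"),
    ("10.10.0.0/22", "aac_public", "10.10.0.0/25"),
    ("10.10.0.0/22", "aac_private", "10.10.1.0/24"),
]
REQUIRED_NAMES = {"aac_aks_node", "aac_public", "aac_private"}


def check_subnets(rows):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    names = {name for _, name, _ in rows}
    for missing in sorted(REQUIRED_NAMES - names):
        problems.append(f"missing subnet {missing}")
    parsed = []
    for space, name, subnet in rows:
        net = ipaddress.ip_network(subnet)
        if not net.subnet_of(ipaddress.ip_network(space)):
            problems.append(f"{name} ({subnet}) is outside {space}")
        parsed.append((name, net))
    for (name_a, net_a), (name_b, net_b) in combinations(parsed, 2):
        if net_a.overlaps(net_b):
            problems.append(f"{name_a} overlaps {name_b}")
    return problems


print(check_subnets(SUBNETS))  # the example values yield no problems: []
```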

Step 4: Quota Adjustment

Adjust quotas per these parameters:

CPU Limits

  • Quota Name: Total Regional vCPUs

    • Scope: Regional

    • Azure default quota value: 10

    • Applied quota value: 2500

  • Quota Name: Standard Basv2 Family vCPUs

    • Scope: Regional

    • Azure default quota value: 10

    • Applied quota value: 2500
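
As a rough aid, the following Python sketch compares the limits you currently have against the applied values listed above and reports which quotas still need an increase. The current limits are hypothetical sample data; you could read the real ones from the Quotas page in the Azure portal or from az vm list-usage. The function name quotas_to_raise is an assumption for illustration.

```python
# Applied quota values from the list above.
REQUIRED_VCPUS = {
    "Total Regional vCPUs": 2500,
    "Standard Basv2 Family vCPUs": 2500,
}


def quotas_to_raise(current_limits: dict) -> dict:
    """Map each quota that is below the required value to the value to request."""
    return {
        name: required
        for name, required in REQUIRED_VCPUS.items()
        if current_limits.get(name, 0) < required
    }


# Hypothetical current limits: one quota is still at the Azure default of 10.
current = {"Total Regional vCPUs": 2500, "Standard Basv2 Family vCPUs": 10}
print(quotas_to_raise(current))
```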

Step 5: Feature Registration on Subscription

To enable host-based encryption on AKS, enable the Encryption at Host feature on your subscription.
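
One way to enable this feature is through the Azure CLI, where it is registered as EncryptionAtHost under the Microsoft.Compute namespace. The Python sketch below lists the usual command sequence and shows how to read the registration state from the JSON that az feature show prints. The helper names are hypothetical, the sample output is trimmed to the one field the check reads, and nothing here runs the CLI for you.

```python
import json
from typing import List

NAMESPACE = "Microsoft.Compute"
FEATURE = "EncryptionAtHost"


def registration_commands() -> List[List[str]]:
    """Azure CLI calls to register the feature, poll it, and refresh the provider."""
    return [
        ["az", "feature", "register", "--namespace", NAMESPACE, "--name", FEATURE],
        ["az", "feature", "show", "--namespace", NAMESPACE, "--name", FEATURE],
        ["az", "provider", "register", "--namespace", NAMESPACE],
    ]


def is_registered(feature_show_output: str) -> bool:
    """Return True when `az feature show` JSON output reports state Registered."""
    state = json.loads(feature_show_output).get("properties", {}).get("state")
    return state == "Registered"


# Trimmed sample of `az feature show` output, for illustration only.
sample_output = '{"name": "Microsoft.Compute/EncryptionAtHost", "properties": {"state": "Registering"}}'

for command in registration_commands():
    print(" ".join(command))
print(is_registered(sample_output))
```

Registration can take several minutes, so the state typically reads Registering for a while before it changes to Registered.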

Private Data Processing

Caution

If you modify or remove any AAC-provisioned public cloud resource after private data handling is provisioned, the setup is left in an inconsistent state. This inconsistency causes errors during job execution or when you deprovision the private data handling setup.

Step 1: Trigger Machine Learning Deployment

You trigger Machine Learning provisioning from the Admin Console inside AAC. You need Workspace Admin privileges within a workspace to see the Admin Console.

  1. From the AAC landing page, select the Profile menu and then select Workspace Admin.

  2. From the Admin Console, select Private Data Handling and then select Processing.

  3. Select the Machine Learning checkbox and then select Update.

Selecting Update triggers the deployment of the cluster and resources in the Azure subscription. The deployment runs a set of validation checks to verify that the Azure subscription is configured correctly.

Note

The provisioning process takes approximately 35–40 minutes to complete.

After provisioning completes, you can view the created resources (for example, VM instances and node pools) in the Azure portal. Don't modify these resources yourself. Manual changes can break private data processing.

Step 2: Assign Key Vault Permission to User-Managed Identity

After private data handling is provisioned successfully, AAC creates a user-managed identity called aac-k8s-user-identity in your Azure subscription. This identity allows the Kubernetes service account to retrieve the private data storage access credentials from the key vault.

  1. Sign in to the Azure subscription in which you provisioned private data storage and private data handling.

  2. In the Resource menu, select the key vault where you stored the private data storage credential.

  3. Select Access policies and then select Create.

  4. Select Get under Secret Management Operations (Secret permissions) and then select Next.

  5. Under the Principal tab, search for aac-k8s-user-identity and then select it.

  6. Select Next.

  7. Under Application (optional), select Next.

  8. Select Create.
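
The portal steps above can also be approximated with the Azure CLI: look up the identity's principal ID, then grant it the Get secret permission with az keyvault set-policy. The Python sketch below only builds and prints those two commands. The helper names and the bracketed values are placeholders and assumptions, and the sketch assumes that the key vault uses the access policy permission model, as the portal steps do.

```python
import shlex
from typing import List

IDENTITY_NAME = "aac-k8s-user-identity"


def identity_lookup_command(resource_group: str) -> List[str]:
    """Build the call that prints the principal (object) ID of the identity."""
    return [
        "az", "identity", "show",
        "--name", IDENTITY_NAME,
        "--resource-group", resource_group,
        "--query", "principalId",
        "--output", "tsv",
    ]


def set_policy_command(vault_name: str, principal_id: str) -> List[str]:
    """Build the call that grants the identity the Get secret permission."""
    return [
        "az", "keyvault", "set-policy",
        "--name", vault_name,
        "--object-id", principal_id,
        "--secret-permissions", "get",
    ]


# Placeholders, not real resource names or IDs.
print(shlex.join(identity_lookup_command("<resource group>")))
print(shlex.join(set_policy_command("<key vault name>", "<principal ID>")))
```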