
Set Up GCP Project and VPC for Private Data

Google Cloud Platform (GCP) private data processing involves running an Alteryx Analytics Cloud (AAC) data processing cluster inside your GCP project and VPC. This combination of your infrastructure with Alteryx-managed GCP resources and software is commonly referred to as private data processing.

This page focuses on how to set up your GCP project and VPC for private data processing on AAC.

Note

The GCP project and VPC setup require access and permissions to the GCP console. If you don’t have this access, please contact your IT team to complete this step.

Caution

Never delete resources provisioned for Private Data Processing.

Setup Steps

Important

To continue with these steps, you must have the GCP Owner RBAC role assigned to you.

Step 1: Select the GCP Project

Select the project where you’d like to run your private data processing.

To improve performance and reduce egress costs, your Google storage and private data handling GKE cluster should be in the same region that you selected for private data storage. This also applies to any data sources that you want to connect to AAC.

The VPC created in the GCP project should be dedicated to AAC. You can set up connectivity to private data sources using VPC peering, transit gateways, PrivateLink, or others.

Important

Set up only one private data handling instance per GCP project.

Step 2: Enable Google APIs

To create cloud resources for Private Data Handling, you must enable APIs in the project.

  1. From the GCP console, select APIs & Services.

  2. Select ENABLED APIS AND SERVICES.

  3. Enable these APIs:

    1. Cloud Logging API

    2. Cloud Monitoring API

    3. Compute Engine API

    4. Secret Manager API

    5. Service Networking API

    6. Cloud Asset API

    7. Kubernetes Engine API

    8. Google Cloud Memorystore for Redis API
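If you prefer the command line to the console, the same eight APIs can be enabled with a single `gcloud services enable` call. The following is a minimal sketch that only builds the command. The mapping from console display names to service names and the placeholder project ID `my-aac-project` are assumptions added for illustration, so verify them against your project before running anything.

```python
import shlex

# Service names for the eight APIs listed above. This mapping is an
# illustration added here; it does not come from the original page.
REQUIRED_APIS = {
    "Cloud Logging API": "logging.googleapis.com",
    "Cloud Monitoring API": "monitoring.googleapis.com",
    "Compute Engine API": "compute.googleapis.com",
    "Secret Manager API": "secretmanager.googleapis.com",
    "Service Networking API": "servicenetworking.googleapis.com",
    "Cloud Asset API": "cloudasset.googleapis.com",
    "Kubernetes Engine API": "container.googleapis.com",
    "Google Cloud Memorystore for Redis API": "redis.googleapis.com",
}


def build_enable_command(project_id: str) -> list:
    """Return the argv for one `gcloud services enable` call covering all APIs."""
    return [
        "gcloud", "services", "enable",
        *REQUIRED_APIS.values(),
        f"--project={project_id}",
    ]


if __name__ == "__main__":
    # "my-aac-project" is a placeholder project ID.
    print(shlex.join(build_enable_command("my-aac-project")))
```

The script prints the command instead of executing it, so you can review it before pasting it into a shell where the gcloud CLI is authenticated.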

Step 3: Configure IAM

With your GCP project in place, set up the service account and its access key.

Step 3a: Create a Service Account

  1. Create a service account with the name aac-automation-sa.

  2. Generate keys with the key type as JSON.

  3. Store the JSON Blob file.

Note

You'll need the service key JSON Blob file to provision the cloud resources in a later step.
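Before you paste the key into AAC, it can help to confirm that the file is intact and belongs to the right service account. The sketch below checks a key's JSON text for the fields a Google service account key normally carries. The function name `check_service_account_key` and the exact set of checks are illustrative assumptions, not part of the AAC product.

```python
import json

# Fields that a Google service account JSON key normally contains.
EXPECTED_FIELDS = (
    "type", "project_id", "private_key_id", "private_key", "client_email",
)


def check_service_account_key(blob: str, expected_name: str = "aac-automation-sa") -> list:
    """Return a list of problems found in the key JSON; empty means it looks usable."""
    try:
        key = json.loads(blob)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    email = key.get("client_email")
    if isinstance(email, str) and email and not email.startswith(expected_name + "@"):
        problems.append(f"client_email does not belong to {expected_name}")
    return problems


if __name__ == "__main__":
    # Placeholder values only; a real key file holds an actual private key.
    sample = json.dumps({
        "type": "service_account",
        "project_id": "my-aac-project",
        "private_key_id": "placeholder",
        "private_key": "placeholder",
        "client_email": "aac-automation-sa@my-aac-project.iam.gserviceaccount.com",
    })
    print(check_service_account_key(sample) or "key looks structurally valid")
```

This is a structural check only. It does not prove that the key is still active or that the account has the required roles.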

Step 3b: IAM Binding to the Service Account

Assign these roles to the aac-automation-sa service account:

  • Secret Manager Admin: roles/secretmanager.admin

  • Service Account Admin: roles/iam.serviceAccountAdmin

  • Service Account User: roles/iam.serviceAccountUser

  • Project IAM Admin: roles/resourcemanager.projectIamAdmin

  • Service Account Key Admin: roles/iam.serviceAccountKeyAdmin

  • Compute Network Viewer: roles/compute.networkViewer

  • Cloud KMS Viewer: roles/cloudkms.viewer

Important

GCP doesn't allow wildcard (*) in the policy document. GCP also has limitations on the number of individual permissions assigned to a custom role. Therefore, you must assign the service account a set of GCP-managed predefined roles.
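The seven role bindings above can also be applied from the command line, one `gcloud projects add-iam-policy-binding` call per role. The sketch below only generates those commands. It assumes the standard service account email format (`NAME@PROJECT_ID.iam.gserviceaccount.com`), and the project ID `my-aac-project` is a placeholder.

```python
import shlex

# Predefined roles listed in Step 3b.
ROLES = [
    "roles/secretmanager.admin",
    "roles/iam.serviceAccountAdmin",
    "roles/iam.serviceAccountUser",
    "roles/resourcemanager.projectIamAdmin",
    "roles/iam.serviceAccountKeyAdmin",
    "roles/compute.networkViewer",
    "roles/cloudkms.viewer",
]


def service_account_email(project_id: str, name: str = "aac-automation-sa") -> str:
    """Build the service account email, assuming the standard address format."""
    return f"{name}@{project_id}.iam.gserviceaccount.com"


def build_binding_commands(project_id: str) -> list:
    """Return one add-iam-policy-binding argv per required role."""
    member = f"serviceAccount:{service_account_email(project_id)}"
    return [
        ["gcloud", "projects", "add-iam-policy-binding", project_id,
         f"--member={member}", f"--role={role}"]
        for role in ROLES
    ]


if __name__ == "__main__":
    # "my-aac-project" is a placeholder project ID.
    for command in build_binding_commands("my-aac-project"):
        print(shlex.join(command))
```

Generating one command per role keeps each binding independently reviewable, which is useful when a security team has to approve the role list.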

Step 4: Configure Virtual Private Network

Step 4a: Create a VPC Network

  1. Create a virtual network.

  2. Select Subnet creation mode = Custom.

  3. Disable or delete the default firewall rules.

  4. Select Dynamic routing mode = Global.

  5. The VPC requires 1 subnet. Configure the subnet as shown in this table:

| Subnet Name | Subnet Size | Secondary Subnet Name | Secondary Subnet Size |
|---|---|---|---|
| aac-private | 10.10.10.0/24 | N/A | N/A |

Important

The subnet IP addresses and sizes in the table are shown as an example.

Modify the values as needed to meet your network architecture. The subnet region must be the region where Private Data Handling is to be provisioned.

The subnet name must match the name shown in the table.
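As a rough command-line equivalent of the network steps, the sketch below builds two gcloud commands: one that creates a custom-mode VPC with global dynamic routing, and one that creates the aac-private subnet. It validates the subnet range first. The VPC name `aac-vpc`, the region `us-central1`, and the project ID are placeholders, and the sketch does not cover the firewall rule step.

```python
import ipaddress
import shlex


def build_network_commands(project_id: str, vpc_name: str, region: str,
                           subnet_range: str = "10.10.10.0/24") -> list:
    """Return gcloud argv lists that create the VPC and the aac-private subnet."""
    # Raises ValueError if the range is not a valid network (for example,
    # when host bits are set, as in 10.10.10.5/24).
    network = ipaddress.ip_network(subnet_range, strict=True)

    create_vpc = [
        "gcloud", "compute", "networks", "create", vpc_name,
        "--subnet-mode=custom",        # Subnet creation mode = Custom
        "--bgp-routing-mode=global",   # Dynamic routing mode = Global
        f"--project={project_id}",
    ]
    create_subnet = [
        "gcloud", "compute", "networks", "subnets", "create", "aac-private",
        f"--network={vpc_name}",
        f"--region={region}",
        f"--range={network}",
        f"--project={project_id}",
    ]
    return [create_vpc, create_subnet]


if __name__ == "__main__":
    # All three arguments are placeholder values.
    for command in build_network_commands("my-aac-project", "aac-vpc", "us-central1"):
        print(shlex.join(command))
```

The subnet name is hard-coded because it has to match the table, while the range stays a parameter because the table's value is only an example.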

Step 4b: Subnet Route Table

Important

You must configure the VPC with a network connection to the internet in your project.

Note

The <gateway_ID> can be either a NAT gateway or an internet gateway, depending on your network architecture.

This is an example subnet route table:

| Address Prefix | Next Hop |
|---|---|
| /24 CIDR Block (aac-private) | aac-vpc |
| 0.0.0.0/0 | <gateway_ID> |
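To see how a route table like this one directs traffic, the sketch below applies longest-prefix matching to the two example routes: the most specific prefix that contains the destination wins. It uses the example 10.10.10.0/24 range for the aac-private subnet and is a simplified model for illustration, not a description of how GCP implements routing.

```python
import ipaddress

# The two example routes, using the sample aac-private range.
ROUTES = [
    ("10.10.10.0/24", "aac-vpc"),
    ("0.0.0.0/0", "<gateway_ID>"),
]


def next_hop(destination: str, routes=ROUTES) -> str:
    """Return the next hop of the most specific route containing the destination."""
    address = ipaddress.ip_address(destination)
    matches = []
    for prefix, hop in routes:
        network = ipaddress.ip_network(prefix)
        if address in network:
            matches.append((network.prefixlen, hop))
    if not matches:
        raise LookupError(f"no route to {destination}")
    # The longest prefix (largest prefix length) is the most specific route.
    return max(matches, key=lambda match: match[0])[1]


if __name__ == "__main__":
    print(next_hop("10.10.10.25"))  # inside the subnet: stays in the VPC
    print(next_hop("8.8.8.8"))      # anything else: leaves through the gateway
```

Traffic between addresses in the subnet stays inside the VPC, and everything else falls through to the default route, which is why the cluster needs a working gateway to reach the internet.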

Step 5: Trigger Private Data Handling Provisioning

You trigger private data processing provisioning from the Admin Console inside AAC. You need Workspace Admin privileges within a workspace to see it.

  1. From the AAC landing page, select the Profile menu and then select Workspace Admin.

  2. From the left navigation panel, select Private Data Handling and then select Processing.

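Caution

Modifying or deleting any AAC-provisioned public cloud resources after private data processing has been provisioned leads to an inconsistent state. This inconsistency causes errors during job execution or when deprovisioning the private data processing setup.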

Make sure that Private Data Storage shows Successfully Configured before you proceed. If the status is Not Configured, go to GCS as Private Data Storage first, then return to this step.

Under the Processing section, enter the required Environment Details from the GCP Project and VPC setup steps you just completed:

  1. Enter the GCP Project ID.

  2. Select the Region of the GCP project you want to use for private data processing.

  3. Enter the VPC Name.

  4. Enter the GKE Control Plane Address Range.

  5. Copy and paste the contents of the JSON Blob file you created in Step 3a.

  6. Select Create.
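Because validation errors only surface after you select Create, a quick local check of the values can save a round trip. The sketch below is a hypothetical preflight helper, not part of AAC: it checks the project ID against the usual GCP format (6 to 30 characters, lowercase letters, digits, and hyphens) and checks that the GKE control plane range is a valid CIDR block that does not overlap the subnet. The non-overlap rule is an assumption based on general GKE networking guidance, and all sample values are placeholders.

```python
import ipaddress
import re

# GCP project IDs: 6-30 characters, start with a lowercase letter,
# contain lowercase letters, digits, or hyphens, and do not end with a hyphen.
PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def check_environment(project_id: str, vpc_name: str,
                      subnet_range: str, control_plane_range: str) -> list:
    """Return a list of problems with the Environment Details; empty means none found."""
    problems = []
    if not PROJECT_ID_RE.fullmatch(project_id):
        problems.append(f"project ID has an unexpected format: {project_id!r}")
    if not vpc_name:
        problems.append("VPC name is empty")
    try:
        subnet = ipaddress.ip_network(subnet_range)
        control_plane = ipaddress.ip_network(control_plane_range)
    except ValueError as exc:
        return problems + [f"invalid CIDR block: {exc}"]
    # Assumption: the control plane range must not collide with the subnet.
    if subnet.overlaps(control_plane):
        problems.append(
            f"control plane range {control_plane} overlaps subnet {subnet}")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(check_environment("my-aac-project", "aac-vpc",
                            "10.10.10.0/24", "172.16.0.0/28"))
```

The helper returns every problem it finds rather than stopping at the first one, so a single run shows everything that needs fixing.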

Selecting Create triggers the deployment of the cluster and resources in the GCP project. This runs a set of validation checks to verify the correct configuration of the GCP project. If there are incorrectly configured permissions, or the creation or tagging of the VPC resources is not correct, you receive an error message with a description of how to fix the error.

Once the initial validation checks complete, provisioning commences. A message box on the screen periodically refreshes with status updates.

Note

The provisioning process takes approximately 35–40 minutes to complete.

After the provisioning completes, you can view the created resources (for example, VM instances and node pools) through the GCP console. Do not modify them yourself. Manual changes might cause issues with the function of private data processing.