Data Sources, Credentials, and Connections
This article describes data management in Alteryx Analytic Hub. Data management is done primarily by Data Stewards and Site Admins. The role you have on the Hub determines the level of permissions you have for working with data on the Hub.
A connection is a set of all the information needed to successfully connect to a database. It is a combination of a data source and a credential. Workflows contain a virtual reference that points to the connection held within the Analytics Hub.
A connection links two components which are managed independently:
- Data Source—where the data is located. This usually means information like hostname and port and other optional parameters. A data source always has an owner. Data sources are also the primary input for metadata loaders. There can be multiple connections created out of a data source, but each connection can have at most one credential assigned to it.
- Credential—the information used to authenticate against the data source. This is usually a combination of a username and a password.
- See the information about a data source, such as the server and database name, port number used in the connection, and the database driver used to connect.
- See the list of data sources a user can access.
- Open a connection to see the username of the associated credential.
- Replace the originally used credential with another one.
- Modify the data source by changing the database or server name.
- See the password inside a credential.
- Modify the data source by changing technology.
Connections are bound to the data source. You create a new connection along with adding a new data source. For an existing data source, you can add new, or edit and share connections by choosing a data source on the Data Sources page.
Supported Data Sources
Analytics Hub fully supports these data source technologies:
- Microsoft SQL Server
- Amazon Redshift
Full support includes metadata loading, workflow dependency resolution, and usage in workflows as Input and Output tools.
Other Data Sources are partially supported only as Input and Output tools in workflows. These data sources can't be used with metadata loading or workflow dependencies. These data sources require the appropriate ODBC driver and DSN configuration to be defined on Analytics Hub engine workers.
All of the drivers for the supported technologies are installed with the Analytics Hub installer except for MySQL ODBC 8.0.20 (64-bit), which you have to install separately.
Add a Data Source
Users with the role Site Admin, Data Steward, or Contributor can add data sources.
To add a data source, follow these steps:
- Go to Data Sources and select Add Data Source.
- Select a Technology.
- Enter the required information. The Data Source name can contain a maximum of 63 characters.
- Select Next.
- Enter the connection details. Optionally, pick the Credential.
- Select Add.
Private and Shareable Connections
Connections can be either Private or Shareable. By design, a Private connection can’t be shared. The Private or Shareable state of a connection is defined by the Credential that is a part of the Connection. To make a connection shareable, go to Credentials. Select the credential you want to allow to be shared, and select the option Allowed in shared connections.
Even though you pass credentials with a shared connection, the person you’re sharing with can’t see the password in the credential or change the credential in any way. No credential object shows up in the credentials list of the person you’re sharing with.
Share Data Sources and Connections
Both data sources and connections can be shared and used by other users or groups. Data Stewards and Site Admin can access all data sources on the site.
You can share:
- A Connection
- A Data Source with an Associated Connection
- A Data Source without a Connection
Share a Connection
When you share a connection with a user, you are automatically sharing its data source information. Sharing a connection causes a data source to appear in the data source list of the user with whom you're sharing.
- To share a connection with someone who already has access to the data source, go to the data source you want to share in Data Sources.
- Find the connection you want to share in the Connections table, and select the vertical ellipsis icon > Share.
- Select the person you want to share a connection with and select Share connection.
- Choose the Connection you want to share and select Share. If the connection you want to share appears dimmed, you have to edit the Connection and turn on Allowed in shared connections.
Share a Data Source
- To share a Data Source from the Data Sources table, select the vertical ellipsis icon > Share. If you're on the details page of the Data Source, select Share.
- Begin typing in the Add Members or Groups field or scroll through the drop-down menu.
- Optionally, you can select Connections to share. If the connection you want to share appears dimmed, you have to edit the Connection and turn on Allowed in shared connections.
- To finish, select Share. Now you can see that the Data Source/Connection is shared. The number of shares changed.
Share a Data Source without a Connection
You can share a Data Source without specifying a Connection. If you choose this path, the users you share the Data Source with will need to create their own Connection and associate it with the Data Source.
Primary and Associated Data Sources
In case a data source is pointing to the same host as another one, we can organize the data sources into Primary or Associated relationships to avoid duplication. The Primary and Associated Data Sources are identified with help of icons with the following letters: A for Alternative or Associated, P for Primary and S for Standalone Data Source.
To see what data sources are associated with the primary one, choose the data source that is primary and select Associated data sources.
Set a Data Source as a Primary Data Source
To set a data source as the primary data source, follow these steps:
- Go to Data Sources. Select the data source you want to set as a primary source.
- Select vertical ellipsis icon > Set as primary data source.
- In the window Set as primary data source, choose the data source you want to set as the alternative. Select Set.
Associate a Data Source
The Standalone data source will become an alternative (associated) source.
- Go to Main Menu > Data Sources. Select the data source you want to associate another data source with.
- Select vertical ellipsis icon > Associate Data Source.
- In the window Associate with another data source, choose the data source you want to set as primary. Select Set. The current data source will be set as alternative data source, and the selected data source will be set as primary data source.
Change a Data Source or Connection
Easily change a data source or connection. Go to the details page of a Workflow.
- Select Run > Advanced Settings.
- Select the arrows icon.
- Select a new Data Source or Connection.
- Select Change.
Harvest Metadata to Build a Data Catalog
To build a data catalog, you first need to harvest metadata. After setting up a data source with a connection, you need to configure the metadata loader. Only Data Stewards and Site Admins have permission to configure metadata loaders.
- Go to Data Sources and select the data source for which you want to configure the metadata loader. Select the vertical ellipsis icon > Load Metadata > Configure.
- In the Configure Metadata Loader modal, select the connection you want to use.
- Enter schemas in the Schemas to Load field. If you leave it blank, all schemas will be imported.
- The metadata loader is configured, and you can select the vertical ellipsis icon > Load Metadata > Sync Now to build the data catalog. The sync will take a while to finish.
Note for Snowflake Users: You have to have the Snowflake role of Account Administrator to load metadata for a Snowflake database. Otherwise the loader will fail.