Databricks Volumes

Connection Type

ODBC (64-bit)

Driver Configuration Requirements

The host must be a Databricks Unity Catalog cluster JDBC/ODBC Server hostname.

Type of Support

In-Database Write

Validated On

Databricks Cluster and SQL Warehouse Simba Apache Spark Driver 2.6.23.1039

Driver Details

In-Database processing requires 64-bit database drivers.

Alteryx Tools Used to Connect

In-Database Workflow Processing

Connect In-DB Tool

Data Stream In Tool

Caution

  • Databricks Volumes is only supported using DCM.

  • Databricks Volumes is only supported using DSN-less connections.

  • Databricks Volumes is only supported for Unity Catalog.

  • Writing to Databricks Unity Catalog is only supported using the In-DB tools.

  • Alteryx supports MergeInDB for Databricks Unity Catalog. For details, go to the Write Data In-DB tool.

Configure In-DB Connection

  1. Open the Manage In-DB Connections window.

  2. Select Databricks Unity Catalog in the Data Source dropdown.

  3. Select New to create a new connection.

  4. Enter a Connection Name.

  5. On the Read tab, select Setup Connection to open the DCM connection manager for Databricks Unity Catalog. The DCM Connection Manager is pre-filtered to show only Apache Spark ODBC DSN-less with Simba Databricks Unity Catalog connections.

  6. Select an existing connection or select +New to create a new connection. Go to Databricks Unity Catalog for instructions on configuring a new connection with DCM.

  7. On the Write tab, select Databricks UC Volumes Bulk Loader (Avro) in the dropdown.

  8. Select Setup Connection to open the DCM Connection Manager for the Databricks Connection. The DCM Connection Manager is pre-filtered to show only Apache Spark ODBC Bulk DSN-less with Databricks UC Volumes connections.

  9. Select an existing connection or select +New to create a new connection. See below for configuring a new connection using DCM.

  10. Select Apply and OK to save the connection and close the window.

  11. If the In-DB Connections Manager was accessed through the Connect In-DB tool, the Choose Table or Specify Query window loads and allows you to select the tables.
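Under the hood, the Read connection configured above resolves to a DSN-less ODBC connection string for the Simba Apache Spark driver. As a rough illustration, the sketch below assembles such a string; the hostname, HTTP path, and token are hypothetical placeholders, and the exact keys DCM emits may differ.

```python
def spark_odbc_connection_string(host: str, http_path: str, token: str,
                                 port: int = 443) -> str:
    """Assemble a DSN-less ODBC connection string for the Simba Spark driver."""
    parts = {
        "Driver": "Simba Spark ODBC Driver",  # name registered with the ODBC manager
        "Host": host,
        "Port": port,
        "HTTPPath": http_path,                # Databricks compute resource URL path
        "SSL": 1,                             # Databricks endpoints require TLS
        "ThriftTransport": 2,                 # 2 = HTTP transport
        "AuthMech": 3,                        # 3 = username/password authentication
        "UID": "token",                       # literal "token" when using a PAT
        "PWD": token,
    }
    return ";".join(f"{k}={v}" for k, v in parts.items())

# Placeholder values for illustration only
conn_str = spark_odbc_connection_string(
    "adb-1234567890123456.7.azuredatabricks.net",
    "/sql/1.0/warehouses/abc123",
    "dapiXXXXXXXXXXXX",
)
```

Note how the personal-access-token convention from the credential steps surfaces here: the username is the literal string `token` and the PAT is passed as the password.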

Configure Apache Spark ODBC Bulk DSN-less with Databricks UC Volumes in DCM

This connection is used for writing data to Databricks Unity Catalog using Volumes staging.

  1. Open Data Connection Manager and navigate to Apache Spark ODBC Bulk DSN-less with Databricks UC Volumes.

    - From an Input tool or the In-DB Connection Manager, DCM is pre-filtered.

    - From the File Menu, go to File > Manage Connections > +New > Apache Spark > Apache Spark ODBC Bulk DSN-less with Databricks UC Volumes.

  2. Enter a Data Source Name.

  3. Enter the Databricks Unity Catalog Host name.

  4. The Port is set to 443 by default. Change as needed.

  5. Enter the HTTP Path. The HTTP Path is the URL of the Databricks compute resource.

  6. Enter the Catalog. This sets the catalog that is used for writing data and creating tables.

  7. Enter the Schema. This sets the schema that is used for writing data and creating tables.

  8. Enter the full path for the Databricks Volume in the format /Volumes/<catalog>/<schema>/<volume>/<path/to/folder>.

  9. Select Save to save the Data Source.

  10. Select +Connect Credential to add a Credential.

    1. Select an Authentication Method.

    2. To use a Personal Access Token, select Username and password as the authentication method and set the username to token.

    3. To use Azure AD, go to Databricks Azure OAuth Authentication.

    4. Select an Existing Credential or select Create New Credential to create a new credential and enter the Personal Access Token or the information for Azure AD.

  11. Select Link to link the credential to the Data Source.

  12. Select Connect.
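Step 8 above requires the Volume staging path in a specific layout. As a minimal sketch, the hypothetical helpers below build such a path and loosely check it against the documented /Volumes/<catalog>/<schema>/<volume>/<path/to/folder> form; the catalog, schema, and volume names are illustrative placeholders.

```python
import re

def volume_path(catalog: str, schema: str, volume: str, folder: str = "") -> str:
    """Return a staging path in the /Volumes/<catalog>/<schema>/<volume>/... form."""
    path = f"/Volumes/{catalog}/{schema}/{volume}"
    if folder:
        path += "/" + folder.strip("/")
    return path

def is_valid_volume_path(path: str) -> bool:
    """Loose check: at least catalog, schema, and volume segments must be present."""
    return re.match(r"^/Volumes/[^/]+/[^/]+/[^/]+(/.+)?$", path) is not None

# Placeholder names for illustration only
staging = volume_path("main", "sales", "staging", "alteryx/bulk")
# staging == "/Volumes/main/sales/staging/alteryx/bulk"
```

A check like this can catch a truncated path (for example, one missing the volume segment) before the bulk loader fails at run time.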