Apache Spark ODBC
Connection Type | ODBC (64-bit) |
Driver Configuration Requirements | For optimal performance, enable the Fast SQLPrepare option in the driver's Advanced Options so that Alteryx can retrieve metadata without running a query. |
Driver Details | In-Database processing requires 64-bit database drivers. |
Type of Support | Read & Write, In-Database |
Validated On | Database Version: 2.3.1.3.0.1.0-187; ODBC Client Version: 2.6.18.1030 |
For more information about the Simba ODBC driver, see the Simba ODBC documentation.
Alteryx Tools Used to Connect
Standard Workflow Processing
In-database Workflow Processing
To use the Apache Spark ODBC driver, you must have Apache Spark SQL enabled. Not all Hadoop distributions support Apache Spark. If you are unable to connect using Apache Spark ODBC, contact your Hadoop vendor for instructions on how to set up the Apache Spark server correctly.
If you have issues reading or writing Unicode® characters, open the configuration for the Simba Apache Spark ODBC driver and, under Advanced Options, select the “Use SQL Unicode Types” option.
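As a rough illustration, the two Advanced Options named on this page (Fast SQLPrepare and Use SQL Unicode Types) can also be expressed as connection-string keys when a DSN-less connection is used. The sketch below assumes the key names `FastSQLPrepare` and `UseUnicodeSqlCharacterTypes` and the value `1` for "enabled"; confirm both against the installation guide for your driver version. The helper `advanced_options_fragment` is hypothetical and exists only for this example.

```python
# Assumed mapping from the option labels shown in the driver dialog to
# connection-string keys. Verify the key names in the driver's
# installation guide before relying on them.
ADVANCED_OPTION_KEYS = {
    "Fast SQLPrepare": "FastSQLPrepare",
    "Use SQL Unicode Types": "UseUnicodeSqlCharacterTypes",
}


def advanced_options_fragment(enabled_options):
    """Return a 'Key=1;Key=1' fragment for the given option labels.

    Raises KeyError if a label is not in ADVANCED_OPTION_KEYS, so typos
    fail loudly instead of being silently dropped.
    """
    parts = []
    for label in enabled_options:
        key = ADVANCED_OPTION_KEYS[label]
        parts.append(f"{key}=1")  # assumption: 1 means "enabled"
    return ";".join(parts)


fragment = advanced_options_fragment(
    ["Fast SQLPrepare", "Use SQL Unicode Types"]
)
print(fragment)
```

The resulting fragment would be appended to the rest of a connection string. When you configure the driver through a DSN instead, the same settings are checkboxes in the Advanced Options dialog.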
Read Support
Install and configure the Apache Spark ODBC driver:
Spark Server Type: Select the server type that matches the version of Apache Spark you are running. If you are running Apache Spark 1.1 or later, select Apache SparkThriftServer.
Authentication Mechanism: See the installation guide downloaded with the Simba Apache Spark driver to configure this setting based on your setup.
To set up the driver Advanced Options, see the installation guide downloaded with the Simba Apache Spark driver.
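The settings above can be sketched as a DSN-less ODBC connection string. This is a minimal sketch under several assumptions, not a definitive configuration: the driver name `Simba Spark ODBC Driver`, the key names `Host`, `Port`, `SparkServerType`, and `AuthMech`, the value `3` for the Spark Thrift Server type, the value `0` for no authentication, the host `spark.example.com`, and the port `10000` are all illustrative. Check each one against the installation guide and your own environment.

```python
def build_connection_string(host, port, auth_mech, server_type=3,
                            driver="Simba Spark ODBC Driver"):
    """Assemble a DSN-less ODBC connection string from key=value pairs.

    server_type=3 is assumed to select the Spark Thrift Server
    (Apache Spark 1.1 or later). auth_mech depends on your cluster setup;
    see the driver's installation guide for the supported values.
    """
    settings = {
        "Driver": "{" + driver + "}",  # braces guard spaces in the name
        "Host": host,
        "Port": str(port),
        "SparkServerType": str(server_type),
        "AuthMech": str(auth_mech),
    }
    return ";".join(f"{key}={value}" for key, value in settings.items())


# Hypothetical host and port, with authentication assumed to be disabled.
conn_str = build_connection_string("spark.example.com", 10000, auth_mech=0)
print(conn_str)
```

A string like this could be passed to any ODBC client, for example `pyodbc.connect(conn_str, autocommit=True)` in Python, to check that the driver reaches the server before you configure the connection in Alteryx.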
Write Support
For both standard and in-database workflows, use the Data Stream In tool to write to Apache Spark. Write support is via HDFS.
Limitations
Cloudera ended support for the Spark Thrift JDBC/ODBC server with Cloudera Enterprise version CDH 6.0. For more information, see the Cloudera documentation: "Unsupported Features in CDH 6.0.1" and "Unsupported Interfaces and Features".