If you are integrating with a Hadoop cluster, the associated Hadoop dependencies must be installed on the Trifacta node.
The Hadoop dependencies for the latest supported version of each Hadoop distribution are included in the Alteryx software distribution.
Supported Versions:
Configure for EMR in the Configuration Guide
Not required for:
Note
If you are integrating with one of the following running environments, please skip installing Hadoop dependencies.
Azure running environments:
Azure Databricks
Hadoop dependencies for other versions of the Hadoop distribution can be acquired from the Alteryx FTP site using one of the following methods.
Log in: https://ftp.trifacta.com/login
Browse to the following directory:
Releases/Trifacta_x.y/hadoop/
where x.y corresponds to the release number that you are installing (e.g. Release 6.8).
Download the following file:
hadoop-deps.tar.gz
Example using wget, for Release 6.8:
wget --user CustomerUsername --ask-password ftps://ftp.trifacta.com/Releases/Trifacta_6.8/hadoop/hadoop-deps.tar.gz
Example using sftp, for Release 6.8:
sftp CustomerUsername@ftp.trifacta.com:Releases/Trifacta_6.8/hadoop/hadoop-deps.tar.gz .
Example using curl, for Release 6.8:
curl -O -C - -u CustomerUsername:CustomerPassword ftps://ftp.trifacta.com/Releases/Trifacta_6.8/hadoop/hadoop-deps.tar.gz
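The download path differs between releases only in the x.y release number, so the URL can be built from that number. The following is a minimal sketch, assuming the host and directory layout shown in the examples above. The function name hadoop_deps_url is hypothetical and is not part of the product.

```shell
# Print the download URL for a given release number (for example, 6.8).
# Host and path layout are copied from the examples above;
# hadoop_deps_url is a hypothetical helper name.
hadoop_deps_url() {
  case "$1" in
    # Reject empty input, non-numeric characters, and malformed dot usage.
    "" | *[!0-9.]* | *.*.* | .* | *.)
      echo "usage: hadoop_deps_url <x.y>" >&2
      return 1
      ;;
    # Accept digits.digits, such as 6.8.
    *.*)
      printf 'ftps://ftp.trifacta.com/Releases/Trifacta_%s/hadoop/hadoop-deps.tar.gz\n' "$1"
      ;;
    # Reject a bare number with no minor version.
    *)
      echo "usage: hadoop_deps_url <x.y>" >&2
      return 1
      ;;
  esac
}

# Example (not run here):
#   wget --user CustomerUsername --ask-password "$(hadoop_deps_url 6.8)"
```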
Access the FTP server via your preferred FTP client.
Browse to the following directory:
Releases/Trifacta_x.y/hadoop/
where x.y corresponds to the release number that you are installing (e.g. Release 6.8).
Download the following file:
hadoop-deps.tar.gz
If needed, transfer the download to the Trifacta node.
Extract it to the following directory:
sudo tar -xzvf hadoop-deps.tar.gz --directory /opt/trifacta/
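Before extracting as root, it can be useful to check what the archive will create under /opt/trifacta/. The sketch below lists the top-level entries of the archive without unpacking it. It assumes that the download is a gzip-compressed tar file. The function name archive_top_level is hypothetical.

```shell
# List the distinct top-level entries of a gzip-compressed tar archive
# without extracting it. archive_top_level is a hypothetical helper name.
archive_top_level() {
  # -t lists entries, -z reads gzip, -f names the archive file.
  # cut keeps the first path component; sort -u removes repeats.
  tar -tzf "$1" | cut -d/ -f1 | sort -u
}

# Example (not run here):
#   archive_top_level hadoop-deps.tar.gz
```

If the listing shows entries other than the directory you expect, inspect the archive before extracting it into the installation directory.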
Note
After you extract the files to the target directory, verify that the ownership of the new directory (/opt/trifacta/hadoop-deps/) and its subfolders matches the ownership settings for the rest of the Alteryx installation in /opt/trifacta.
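One way to perform this ownership check is sketched below. It assumes GNU stat and find, as found on most Linux distributions. The function name ownership_mismatches is hypothetical. It prints every path under the target directory whose owner or group differs from that of the reference directory, so empty output means that ownership already matches.

```shell
# Print paths under a target directory whose owner or group differs from
# a reference directory. ownership_mismatches is a hypothetical helper name.
#   $1: reference directory, for example /opt/trifacta
#   $2: directory to check, for example /opt/trifacta/hadoop-deps
ownership_mismatches() {
  # Read the numeric owner and group IDs of the reference (GNU stat syntax).
  ref_uid=$(stat -c '%u' "$1") || return 1
  ref_gid=$(stat -c '%g' "$1") || return 1
  # Report anything whose owner or group is different.
  find "$2" \( ! -user "$ref_uid" -o ! -group "$ref_gid" \) -print
}

# Example (not run here):
#   ownership_mismatches /opt/trifacta /opt/trifacta/hadoop-deps
#
# If paths are reported, GNU chown can copy ownership from the reference:
#   sudo chown -R --reference=/opt/trifacta /opt/trifacta/hadoop-deps
```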