Input Data Tool
Use Input Data to add data to your workflow by connecting to a file or database.
The Input Data tool has one anchor. The output anchor only displays the results of your actions. Use the Join Tool to join the results of an Input Data tool with the original data.
The Input Data tool Configuration window has two sections: Connect a File or Database and Options.
Connect a File or Database
With the Input Data tool on the canvas, follow these steps:
- In the Configuration window, select the Connect a File or Database dropdown.
- Designer displays the Data connections window. Configure your data connection using one of these: Recent, Saved, Files, Data Sources, or Gallery.
Recent connections contains recently configured files and data connections. You can select a recent connection or select Clear list to delete recent connections.
Select a Saved connection. To rename and edit your connections, use Manage Data Connections.
To connect to a file in a local or network directory, there are four options:
- In File connections, select Select file to browse to a file.
- Drag a file to File connections.
- In All supported file types, select a file type extension to browse to a file of that type.
- Close Data connections and drag a file directly onto the canvas.
In the file browse window, type a wildcard as part of the file path.
Consider a case where you have multiple data files with both:
- Multiple files are read using the wildcard format such as *.csv or 2019*.csv in a single Input Data tool, as long as the files all contain the same number of fields, and that the data types for each field are the same. Designer sets the number of fields and the file types based on the first file read. Any subsequent files that do not match are skipped and a warning displays. It is not possible to control which file is read first when using a wildcard syntax like *.csv; it is up to the system which file is designated as the first.
- Similar names in the same directory.
Type the file name they have in common and add an * to substitute all subsequent characters or a ? to substitute one character. Include the file extension that is common to all files when specifying the file names.
This path brings in every .csv file in the data\datafiles directory with a file name that begins with
It would bring in
This path brings in every .csv file contained within the data\datafiles directory with a file name that begins with
ABCD_ + 1character.
On Select Excel Input, select one of the Excel inputs:
- Select a sheet to choose from the sheets available in the Excel file.
- Select a range (cells, rows, or columns): Optionally, select a range of cells, rows, or columns within the sheet.
- Select a named range to choose from the named ranges available in the Excel file.
- Import only the list of sheet names to create output with a single column containing sheet names as values. None of the related data is output when this option is selected.
If the Excel sheet that you select is open, an error message is displayed and you will not be able to proceed. Close the sheet, select the Refresh button in the Select Excel Input dialog, and finish the input process.
When you select a .gz, .tgz or .zip file, the Select file type to extract window displays. Use the drop-down to select the type of files to use.
Select files: Check the box next to the file you want to extract.
To see all files in the Gzip or Zip file, including files that are not supported by Alteryx, select Other Files under Select file type to extract. Select a file type to Parse other files as.
Alteryx doesn't support Gzip and Zip files in Gallery.
|Alteryx Spatial Zip||.sz|
|Comma Separated Values||.csv|
|ESRI Personal GeoDatabase||.mdb|
|Extensible Markup Language||.xml|
|Google Earth/Google Maps||.kml|
|MapInfo Professional Interchange Format||.mif|
|MapInfo Professional Table||.tab|
|Microsoft Access 2000-2003||.mdb|
|Microsoft Access 2007, 2010, 2013, 2016||.accdb|
|Microsoft Excel Binary||.xlsb|
|Microsoft Excel 1997-2003||.xls|
|Microsoft Excel Macro-Enabled||.xlsm|
Unknown file types include any other file types that Alteryx can't read natively. If you use an Input data tool to read an unknown file type, the Resolve File Type window displays with the following options:
- Read it as a built in type: Select the file type that resembles the structure of your file.
- Read it as a fixed width text file: When you select this option, you can assign columns by using a sliding ruler. See Flat file layout for more information on this option.
- Read it as a delimited text file: Select this option if there is a single delimiter that signifies column breaks in your data. Select an appropriate delimiter type, and optionally select whether the First Row Contains Field Names in your data.
Data sources displays supported and frequently used data sources.
- Tools: If you select Quick connect for a tool you have not installed, a browser opens to the Alteryx gallery for you to download and install that tool. Read the instructions on the page carefully. Once the tool is installed, the Input Data tool changes on the canvas to the tool you selected from the Data sources tab.
- Data sources:
- ODBC launches the ODBC connection window that displays a filtered list of DSNs on the system that use that particular driver.
- OleDB launches the native Windows OleDB manager.
- OCI launches the Native Oracle OCI connection manager. From here, select the Net Service Name as defined in your tnsnames.ora file that you want to use for this connection as well as the username and password credentials.
- Bulk opens a special dialog allowing you set up a bulk connection for the selected connection type.
- Quick connect: For SQL or Oracle Quick connect, you can either use a pre-existing saved connection or create a new saved connection. Refer to the following for details:
- All other Quick connections use another tool.
Select Quick connect under HDFS to create a new Hadoop database connection.
Alteryx connects to a Hadoop Distributed File System and reads .csv and .avro files. All Hadoop distributions implementing the HDFS standard are supported.
Configure HDFS Connections
HDFS can be read using httpfs (port 14000), webhdfs (port 50070), or Knox Gateway (8443). Consult with your Hadoop administrator for which to use. If you have a Hadoop High Availability (HA) cluster, your Hadoop admin must explicitly enable httpfs.
MapR may not support webhdfs.
In the HDFS Connection window:
- Select a server configuration: HTTPFS, WebHDFS, or Knox Gateway.
- Host: Specify the installed instance of the Hadoop server. The entry must be a URL or IP address.
- Port: Displays the default port number for httpfs (14000), webhdfs (50070), or Knox Gateway (8443), or enter a specific port number.
- URL: The URL defaults based on the Host. The URL can be modified.
- User Name: Depending on the cluster setup, specify the user name and password for access.
- httpfs: A user name is needed, but it can be anything.
- webhdfs: The user name is not needed.
- Knox Gateway: A user name and password is needed. Self-signed certificates are not supported in Alteryx. Use a trusted certificate when configuring Knox authentication.
- Kerberos: Select a Kerberos authentication option for reading and writing to HDFS. The option you choose depends on how your IT admin configured the HDFS server:
- None: No authentication is used.
- Kerberos MIT: Alteryx uses the default MIT ticket to authenticate with the server. You must first acquire a valid ticket using the MIT Kerberos Ticket Manager.
- Kerberos SSPI: Alteryx uses Windows Kerberos keys for authentication, which are obtained when logging in to Windows with your Windows credentials. The User Name and Password fields are therefore not available.
- (Recommended) Click Test to test the connection.
- Select OK.
- Specify the path of the file (for example,
path/to/file.csv), or browse to the file and select it.
- Select the Avro or CSV file format and click OK.
To connect to HDFS for in-database processing, use the Connect In-DB Tool.
You can also make a Generic connection or a 32-bit connection to databases.
Before you connect to a database, consider the following:
- Both ODBC and OleDB connection types support spatial connections. Alteryx auto-detects if a database supports spatial functionality and displays the required configurations.
- To connect to a database for in-database processing, see In-Database Overview.
Point to an option and select a saved or shared data connection to connect it, or select Manage to view and edit connections.
All Connections: Displays a list of connections saved to your computer plus connections shared with from a Gallery.
My Computer: Displays a list of connections saved to your computer.
Gallery: Displays a list connections shared with you from a Gallery.
Add a Gallery: Opens the Gallery Login screen. Use your user name and password to log in. After logging in, return to Saved Data Connections and point to the Gallery in the list to view connections shared from the Gallery.
See Manage data connections for more on managing saved and shared data connections and troubleshooting.
|Hadoop Distributed File System (HDFS)|
|Hadoop Distributed File System (HDFS)|
|Hortonworks||Hadoop Distributed File System (HDFS)|
|MapR||Hadoop Distributed File System (HDFS)|
|Microsoft||Microsoft Analytics Platform System|
|Microsoft Azure Data Lake Store|
|Microsoft Azure SQL Data Warehouse|
|Microsoft Azure SQL Database|
|Microsoft Cognitive Services|
|Microsoft Dynamics CRM|
|Microsoft SQL Server|
Gallery displays each gallery and its URL that has been added on the local computer. A list below each gallery name contains the saved connections stored on the server you have access to.
Select +Gallery to add another gallery.
Select file format options. Options vary based on the file or database to which you connect. See File Format Options.
Preview Data Layout
Preview the data layout in the Preview window. The data layout preview is limited to 100 records, or fewer in wide files. If data contains 1500 columns, at least one record and up to two rows are displayed. To view all data, use a Browse Tool.
Use Classic Mode
To use classic mode:
- Select Options > User Settings > Edit User Settings.
- On the Defaults tab, check the box Use classic mode for the Input/Output tool menu options.
- Select OK.
- Select the canvas or F5 to refresh.
You can now use the Input Data tool classic mode to select your files and data sources.
Convert an Output Data Tool to an Input Data Tool
You can convert an Output Data tool to an Input Data tool.
- Select and hold (or right-click) the Output Data tool you want to convert to an Input Data tool.
- Select Convert To Input Data.
The output data becomes the input data.