Input Data Tool
One Tool Example
Input Data has a One Tool Example. Go to Sample Workflows to learn how to access this and many other examples directly in Alteryx Designer.
Use Input Data to add data to your workflow by connecting it to a file or database.
Note
The use of wildcards to read multiple CSV files with the Hadoop Distributed File System produces an error.
Configure the Tool
The Input Data Configuration window has 2 sections: Connect a File or Database and Options.
Connect a File or Database
The below steps describe the process with Data Connection Manager (DCM) disabled. When enabled, the Connect a File or Database dropdown is replaced with a Set Up a Connection button opening the Data Connections window displaying only data sources supported by DCM, and opening DCM by selecting a technology.
With the Input Data tool on the canvas, follow these steps:
In the Configuration window, select the Connect a File or Database dropdown.
Designer displays the Data connections window. Configure your data connection using one of these: Recent, Saved, Files, Data Sources, or Server.
The Recent connections section contains recently configured files and data connections. You can select a recent connection or select Clear list to delete recent connections.
Select a Saved connection. To rename and edit your connections, use Manage Saved Connections.
To connect to a file in a local or network directory, there are 4 options:
In File connections, select Select file to browse to a file.
Drag a file to File connections.
In All supported file types, select a file type extension to browse to a file of that type.
Close Data connections and drag a file directly onto the canvas.
Data Sources
Data sources displays supported and frequently used data sources.
Tools: If you select Quick connect for a tool you have not installed, a browser opens to the Alteryx Community for you to download and install that tool. Read the instructions on the page carefully. Once the tool is installed, the Input Data tool changes on the canvas to the tool you selected from the Data sources tab.
Data sources:
ODBC launches the ODBC connection window that displays a filtered list of DSNs on the system that use that particular driver.
OleDB launches the native Windows OleDB manager.
OCI launches the Native Oracle OCI connection manager. From here, select the Net Service Name as defined in your tnsnames.ora file that you want to use for this connection as well as the username and password credentials.
Bulk opens a special dialog allowing you set up a bulk connection for the selected connection type.
Quick connect: For SQL or Oracle Quick connect, you can either use a pre-existing saved connection or create a new saved connection. Refer to these pages for details:
All other Quick connections use another tool.
Server
Server displays each Alteryx Server UI and its URL that has been added on the local computer. A list below each Server name contains the saved connections stored on the Server you have access to.
Select + Server to add another Alteryx Server connection.
In the file browse window, enter a wildcard as part of the file path.
Consider a case where you have multiple data files with...
The same number of fields where data types for each field are the same.
Similar names in the same directory.
Note
Multiple files are read using the wildcard format, like *.csv or 2019*.csv in a single Input Data tool, as long as the files all contain the same number of fields, where the data types for each field are the same. Designer sets the number of fields and the file types based on the 1st file read. Any subsequent files that do not match are skipped and a warning displays. It is not possible to control which file is read 1st when using a wildcard syntax like *.csv. It is up to the system which file is designated as the first.
Enter the file name that the multiple files have in common and add an * (asterisk) to substitute all subsequent characters or a ? to substitute 1 character. Include the file extension that is common to all files when you specify the file names.
Example 1
This path brings in every .csv file in the data\datafiles
directory with a file name that begins with ABCD
.
data\datafiles\ABCD*.csv
It brings in ABCD_4.csv
and ABCD_012.csv
.
Example 2
This path brings in every .csv file contained within the data\datafiles
directory with a file name that begins with ABCD_
and has 1 additional character.
data\datafiles\ABCD_?.csv
On Select Excel Input, select 1 of the Excel inputs:
Select a sheet to choose from the sheets available in the Excel file.
Select a range (cells, rows, or columns): Optionally, select a range of cells, rows, or columns within the sheet.
Select a named range to choose from the named ranges available in the Excel file.
Import only the list of sheet names to create output with a single column containing sheet names as values. None of the related data is output when this option is selected.
If the Excel sheet that you select is open, an error message displays and you will not be able to proceed. Close the sheet, select the Refresh button in the Select Excel Input dialog, and finish the input process.
The Input Data tool skips columns that are not defined. A defined column is a column that either contains data or previously contained data that was cleared by the user. To include columns that are not defined, specify a range via the Select a range (cells, rows or columns) option in the Select Excel Input dialog.
See Gzip File Support and Zip File Support for more information.
When you select a .gz, .tgz or .zip file, the Select file type to extract window displays. Use the dropdown to select the type of files to use.
Select files: Check the box next to the file you want to extract.
To see all files in the Gzip or Zip file, including files that are not supported by Alteryx, select Other Files under Select file type to extract. Select a file type to Parse other files as.
Important
Alteryx doesn't support Gzip and Zip files in Gallery.
.cydb | |
.yxdb | |
Alteryx Spatial Zip | .sz |
.avro | |
Comma Separated Values | .csv |
dBase | .dbf |
ESRI Personal GeoDatabase | .mdb |
ESRI Shapefile | .shp |
Extensible Markup Language | .xml |
.flat | |
GIS | .grc, .grd |
Google Earth/Google Maps | .kml |
.gz, .tgz | |
IBM SPSS | .sav |
.json | |
MapInfo Professional Interchange Format | .mif |
MapInfo Professional Table | .tab |
Microsoft Access 2000-2003 | .mdb |
Microsoft Access 2007, 2010, 2013, 2016 | .accdb |
Microsoft Excel Binary | .xlsb |
Microsoft Excel 1997-2003 | .xls |
.xlsx | |
Microsoft Excel Macro-Enabled | .xlsm |
QlikView | .qvx |
SAS | .sas7bdat |
SQLite | .sqlite |
SRC Geography | .geo |
Text | .txt, .asc |
.zip |
Unknown file types include any other file types that Alteryx can't read natively. If you use an Input data tool to read an unknown file type, the Resolve File Type window displays with these options:
Read it as a built in type: Select the file type that resembles the structure of your file.
Read it as a fixed width text file: When you select this option, you can assign columns by using a sliding ruler. Go to Flat File Layout for more information on this option.
Read it as a delimited text file: Select this option if there is a single delimiter that signifies column breaks in your data. Select an appropriate delimiter type, and optionally select whether the First Row Contains Field Names in your data.
Options
Select file format options. Options vary based on the file or database to which you connect. Go to File Format Options for more information.
Preview Data Layout
Preview the data layout in the Preview window. The data layout preview is limited to 100 records or fewer in wide files. If data contains 1500 columns, at least 1 record, and up to 2 rows display. To view all data, use a Browse tool.
Duplicate Column Names
If your input file contains multiple columns with the same name, Designer automatically renames the duplicate columns according to these rules:
Duplicate Name Ends with 1 or 9
If the last character in the duplicate column name is either 1 or 9, Designer appends an underscore (_) and a number, starting with 2 to the duplicate column name.
Original Column Name | Duplicate Column (Renamed by Designer) |
---|---|
A1 | A1_2 |
A9 | A9_2 |
Duplicate Name Ends with a Digit Between 2-8 (Inclusive)
If the last character in the duplicate column is a digit between 2 and 8 (inclusive), Designer increments that digit to rename the duplicate column name.
However, if the 2nd to last character is also a digit, Designer appends an underscore (_) and a number, starting with 2 to the duplicate column name.
Original Column Name | Duplicate Column (Renamed by Designer) |
---|---|
A2 | A3 |
A5 | A6 |
A22 | A22_2 |
Duplicate Name Ends with a Letter or Special Character
If the last character in the duplicate column is a letter or special character, Designer adds a number (starting with 2) to rename the duplicate column name.
Original Column Name | Duplicate Column (Renamed by Designer) |
---|---|
age | age2 |
registered? | registered?2 |
Note
Visual Query Builder can't display multi-byte characters correctly. Use the tables tab instead.
Convert an Output Data Tool to an Input Data Tool
You can convert an Output Data tool to an Input Data tool.
Right-click the Output Data tool you want to convert to an Input Data tool.
Select Convert To Input Data.
The output data becomes the input data.