Python Tool
The Python tool is a code editor for Python users. Proficiency in Python is recommended before using this tool. For information about the functions in the Alteryx library, run:
from ayx import Alteryx
Alteryx.help()
Python support
Designer accepts custom Python code. Alteryx does not provide support for custom Python code.
Alteryx public gallery compatibility
Planning to publish your workflow to gallery.alteryx.com? You must first apply for an exemption. This restriction does not apply to private instances of Alteryx Server and Alteryx Gallery.
Getting started
The Python tool configuration window interface resembles a Jupyter Notebook. If you are unfamiliar with Jupyter Notebooks, go to Help > User Interface Tour or Help > Notebook Help. For code assistance, see the additional references that are available under the tool's Help option.
Install additional data science packages if needed
The Python tool includes the following common data science packages:
- ayx: Alteryx API
- jupyter: Jupyter metapackage.
- matplotlib: Python plotting package.
- numpy: NumPy, array processing for numbers, strings, records, and objects.
- pandas: Powerful data structures for data analysis, time series, and statistics.
- requests: Python HTTP for Humans.
- scikit-learn: A set of Python modules for machine learning and data mining.
- scipy: SciPy, Scientific Library for Python.
- six: Python 2 and 3 compatibility utilities.
- SQLAlchemy: Database Abstraction Library.
- statsmodels: Statistical computations and models for Python.
Additional package installation
Depending on your Designer installation type, you can install additional packages using Package.installPackages. The example below installs keras.
from ayx import Package
Package.installPackages("keras")
- If you are running a non-admin installation of Alteryx, you can install additional Python packages without any special permissions.
- If you are running an admin installation of Alteryx, you must run Alteryx as administrator to install additional Python packages. If you are unable to run Alteryx as administrator, you cannot install additional Python packages.
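Before installing anything, it can help to check which packages are already importable. The following is a minimal sketch using only the standard library; missing_packages is a hypothetical helper name, not part of the ayx API. Note that a package's import name can differ from its install name (for example, scikit-learn is imported as sklearn), so the check works on import names.

```python
import importlib.util


def missing_packages(import_names):
    """Return the top-level import names that are not currently installed."""
    missing = []
    for name in import_names:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


to_install = missing_packages(["json", "keras"])

# Inside the Python tool (assumption: ayx is importable only within Designer),
# each missing package could then be installed with the call shown above:
# from ayx import Package
# for name in to_install:
#     Package.installPackages(name)
```

Because json ships with Python, it never appears in to_install; whether keras does depends on the environment.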
Connect inputs
The Python tool accepts multiple inputs. Once inputs are connected, you must run the workflow to cache the incoming data streams.
To access an incoming data connection:
- Import the Alteryx library: from ayx import Alteryx
- Access the connection and assign it to a variable to use as a data reference:
- Use the connection name: Alteryx.read("<connection name>")
- Read in all connection names and reference the returned zero-indexed list: Alteryx.read(Alteryx.getIncomingConnectionNames()[<index number>])
Run your workflow before beginning to work with the Python tool. Running your workflow caches your data and makes it accessible to the Python tool. Your data is then treated as a pandas data frame. More information about pandas data frames can be found at pandas.pydata.org.
from ayx import Alteryx
data1 = Alteryx.read("#1")
from ayx import Alteryx
data2 = Alteryx.read(Alteryx.getIncomingConnectionNames()[1])
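The two access patterns above can be combined to read every incoming connection in one pass. The sketch below is illustrative only: read_all_inputs and FakeAlteryx are hypothetical names, and the fake object stands in for the ayx module so the code runs outside Designer (it returns plain lists rather than pandas data frames).

```python
def read_all_inputs(alteryx):
    """Read every incoming connection into a dict keyed by connection name."""
    names = alteryx.getIncomingConnectionNames()
    return {name: alteryx.read(name) for name in names}


class FakeAlteryx:
    """Stand-in for the ayx Alteryx module; not part of the real library."""

    def __init__(self, tables):
        self._tables = tables

    def getIncomingConnectionNames(self):
        return list(self._tables)

    def read(self, name):
        return self._tables[name]


demo = FakeAlteryx({"#1": [{"id": 1}], "#2": [{"id": 2}]})
inputs = read_all_inputs(demo)
```

Inside the Python tool, you would pass the real module instead: from ayx import Alteryx, then read_all_inputs(Alteryx).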
Configure the tool
Run your workflow before beginning to work with the Python tool.
You should start development using Interactive mode. That way all errors, warnings, and print statements display in the Jupyter Notebook. Use Production mode to improve speed when you have completed development and just want to run your code through a standard Python interpreter.
Click Interactive to set Interactive mode. This mode lets you interact with the incoming data through a Jupyter Notebook without having to re-run the workflow to see the results of your code.
When you run the workflow:
- Alteryx caches a copy of the incoming data and makes it available to the Python tool.
After making changes upstream, re-run the workflow to refresh the cached data. This ensures that the cached data represents the actual incoming data.
- The Jupyter shell executes the code in the Jupyter Notebook.
- If your code calls Alteryx.write(), the Jupyter shell sends the results through the output anchors.
- The Jupyter Notebook displays any errors, warnings, and print statements. This is the same as selecting Run All.
Click Production to set Production mode. In Production mode, Alteryx consolidates all Python cells from the Jupyter Notebook into a single, read-only script. It is this read-only script that Alteryx uses to pass your code to the Python interpreter.
When you run the workflow:
- Alteryx bypasses the Jupyter shell and runs the read-only script through a standard Python interpreter. No results, errors, warnings, or print statements are printed to the Jupyter Notebook.
To edit the Production-mode script, click Interactive mode and then edit the cells in the Jupyter Notebook. Once your edits are complete, click Production mode.
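To make the idea of consolidating cells concrete, here is a rough sketch of joining the code cells of a notebook into one script. This is not Alteryx's implementation; it only assumes the standard Jupyter notebook JSON layout (a "cells" list whose entries have "cell_type" and "source"), and notebook_to_script is a hypothetical name.

```python
import json


def notebook_to_script(notebook_json):
    """Join the code cells of a Jupyter notebook (JSON text) into one script."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as one string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source)
    return "\n\n".join(chunks) + "\n"


example = json.dumps({
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": ["# Notes"]},
        {"cell_type": "code", "source": ["x = 1\n", "y = x + 1"]},
        {"cell_type": "code", "source": "print(y)"},
    ],
})
script = notebook_to_script(example)
```

The markdown cell is dropped and the two code cells are concatenated in order, which is why output that normally appears under a cell has nowhere to display in Production mode.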
Set data storage format
There are two data storage format options: SQLite and YXDB.
To use SQLite storage format:
- Click the Alteryx menu within the tool's configuration window.
- Select Sqlite override.
To use YXDB storage format:
- Click the Alteryx menu within the tool's configuration window.
- Deselect Sqlite override to remove the checkmark.
Feature | SQLite | YXDB |
Blob | Not supported | Supported |
Spatial objects | Not supported | Supports passing spatial objects between the Python tool and other tools. It is helpful to use the metadata tags when creating spatial object outputs from the Python tool. |
Column limitation | Limit is 2000 | No limitation |
Null values note | Numeric/byte columns containing null values will be converted to a data type of float64 (double-precision float). |
If you are not changing the arrangement of the rows or using GeoSpatial Python, Alteryx recommends that you split the GeoSpatial data off the dataset and rejoin it after the Python tool, because the conversion between Alteryx binary and GeoSpatial text is slow.
Import a file or directory
You can bring existing code into the tool in two ways, depending on how much control you want over relative paths. To import a single existing Python script or Jupyter Notebook, use the Alteryx import function from the Alteryx menu. To import a directory, or to manage relative paths yourself, use the import command in a cell.
To import a Python script or Jupyter Notebook
- Click the Alteryx menu and then select Import Script.
- Click Choose File and then navigate to a .py or .ipynb file.
- Click Import.
Alteryx imports the file.
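For the import command route, one common approach is to add the directory to sys.path and then import a module from it. The sketch below is an assumption about how you might manage the path yourself, not a documented Alteryx procedure; the module name my_workflow_helpers and the temporary folder are hypothetical stand-ins for your own script directory.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout: a folder that holds my_workflow_helpers.py.
script_dir = tempfile.mkdtemp()
module_path = os.path.join(script_dir, "my_workflow_helpers.py")
with open(module_path, "w") as handle:
    handle.write("def double(value):\n    return value * 2\n")

# Make the folder importable, then use the ordinary import command.
if script_dir not in sys.path:
    sys.path.insert(0, script_dir)
importlib.invalidate_caches()

import my_workflow_helpers

result = my_workflow_helpers.double(21)
```

In a real workflow you would point script_dir at an existing folder instead of creating a temporary one.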
Use the Kernel menu
- Stop processing: Click the Kernel menu and then select Interrupt to stop processing.
- Restart processing: Click the Kernel menu and then select a Restart option to restart the processing of the interactive environment.
- Clear intermediate results: Click the Kernel menu and then select Reconnect to clear the workbook of intermediate results.
- Change kernel does not provide functionality.
- It is recommended that you do not select Shutdown.
Follow best practices
The following best practices will help you work with the Python tool successfully.
Use Alteryx.getWorkflowConstant when referring to a workflow constant such as Engine.WorkflowDirectory. Otherwise, the result or output of the command permanently replaces the command in your Jupyter Notebook when you run your code. Avoid using % wrappers in workflow constants. For example, to get the value of Engine.WorkflowDirectory, use the following:
from ayx import Alteryx
Alteryx.getWorkflowConstant("Engine.WorkflowDirectory")
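A typical use of this constant is building file paths relative to the workflow. The sketch below shows one way to do that; workflow_path and FakeAlteryx are hypothetical names, and the fake object (with a made-up directory value) replaces the ayx module so the code runs outside Designer.

```python
import os


def workflow_path(alteryx, *parts):
    """Build a path under the workflow directory from the given parts."""
    base = alteryx.getWorkflowConstant("Engine.WorkflowDirectory")
    return os.path.join(base, *parts)


class FakeAlteryx:
    """Stand-in for the ayx Alteryx module; not part of the real library."""

    def getWorkflowConstant(self, name):
        # Made-up value for illustration only.
        constants = {"Engine.WorkflowDirectory": os.path.join("workflows", "demo")}
        return constants[name]


path = workflow_path(FakeAlteryx(), "data", "lookup.csv")
```

Inside the Python tool, pass the real module: from ayx import Alteryx, then workflow_path(Alteryx, "data", "lookup.csv").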
Output data from the tool
Use Alteryx.write to output data from the tool.
- To send data to other tools on the canvas, use Alteryx.write(<pandas data frame>, <output anchor number>).
Alteryx.write(df,1)
- Alteryx.write only accepts pandas data frames. If you have data in another format, use the pandas library to convert it to a pandas data frame. The pandas library is pre-installed with Designer and can be accessed in the Jupyter Notebook using import pandas.
- You can send up to five data frames to the output anchors.
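When a notebook writes to several anchors, a small wrapper can catch a bad anchor number before anything is sent. This is a sketch under stated assumptions: it assumes anchors are numbered 1 through 5 (the source says only that up to five data frames can be sent and shows anchor 1), and write_outputs and RecordingAlteryx are hypothetical names. The recorder stands in for the ayx module and accepts any object, whereas the real Alteryx.write accepts only pandas data frames.

```python
MAX_OUTPUT_ANCHORS = 5  # assumption: anchors are numbered 1 through 5


def write_outputs(alteryx, frames_by_anchor):
    """Validate every anchor number, then write each frame in anchor order."""
    for anchor in frames_by_anchor:
        if not isinstance(anchor, int) or not 1 <= anchor <= MAX_OUTPUT_ANCHORS:
            raise ValueError(f"Output anchor must be 1 to 5, got {anchor!r}")
    for anchor, frame in sorted(frames_by_anchor.items()):
        alteryx.write(frame, anchor)


class RecordingAlteryx:
    """Stand-in for the ayx Alteryx module that records write calls."""

    def __init__(self):
        self.calls = []

    def write(self, frame, anchor):
        self.calls.append((anchor, frame))


recorder = RecordingAlteryx()
write_outputs(recorder, {2: "summary frame", 1: "detail frame"})
```

Validating all anchors before the first write means a typo in one anchor number does not leave the outputs half written.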