In-Database Overview

Version:
Current
Last modified: August 17, 2020

In-Database Tools

In-Database Standard Tools

Link
Blue icon with binoculars.

Browse In-DB Tool

Link
Blue icon with database being plugged in.

Connect In-DB Tool

Link
Blue icon with a stream-like object flowing into a database.

Data Stream In Tool

Link
Blue icon with a stream-like object flowing out of a database.

Data Stream Out Tool

Link
Blue icon showing an input into a database.

Dynamic Input In-DB Tool

Link
Blue icon showing an output from a database.

Dynamic Output In-DB Tool

Link
Blue icon showing light filtering through a prism.

Filter In-DB Tool

Link
Blue icon with beaker full of bubbles.

Formula In-DB Tool

Link
Blue shape filled with two overlapping circles.

Join In-DB Tool

Link
Macro Input Tool Icon

Macro Input Tool

Link
Macro Output In-DB Tool

Macro Output In-DB Tool

Link
Blue icon with beaker pouring liquid.

Sample In-DB Tool

Link
Blue icon with a checkmark selecting one of three options.

Select In-DB Tool

Link
Blue icon with sigma.

Summarize In-DB Tool

Link
Transpose In-DB Tool Icon

Transpose In-DB Tool

Link
Blue icon with a double helix.

Union In-DB Tool

Link
Blue icon with floppy disk.

Write Data In-DB Tool

In-Database Predictive Tools

Link
Boosted Model Tool Icon

Boosted Model Tool

Link
Decision Tree Tool Icon

Decision Tree Tool

Link
Forest Model Tool Icon

Forest Model Tool

Link
Linear Regression Tool Icon

Linear Regression Tool

Link
Logistic Regression Icon

Logistic Regression Tool

Link
Score Tool

Score Tool

Change Tool Versions

When a predictive tool with in-database support is placed on the canvas with another In-DB tool, the predictive tool automatically changes to the In-DB version. To manually change the version of the tool...

  1. Right-click the tool.
  2. Point to Choose Tool Version.
  3. Select a different version of the tool.

See Predictive Analytics for more about predictive in-database support.

In-Database Tool Anchors and Connections

Since in-database workflow processing occurs within the database, the In-Database tools are not compatible with the standard Alteryx tools. Several visual indicators show connection compatibility.

Visualization of Standard tool connection.

Standard tools use a green arrow anchor to connect to another tool. The connection displays as a single line.

Visualization of In-Database tool connection.

In-Database tools use a blue square database anchor to connect to another In-Database tool. The connection between two In-Database tools displays as a double line. Due to the nature of in-database processing, connection progress is not displayed.

To connect standard tools to In-Database tools, use the Dynamic Input and Output tools or the Data Stream tools.

How In-Database Works

Data is streamed into and out of an in-database workflow using the Data Stream In and Data Stream Out tools, or by connecting directly to a database using the Connect In-DB tool. The Data Stream In and Data Stream Out tools use an In-DB anchor to connect to In-DB tools, and a standard workflow anchor to connect to standard workflow tools.

Chart showing many different connections

How It Works

  1. Define the connection to the database using the Connect In-DB tool, or use the Data Stream In tool to stream data from a standard workflow into a temporary table in the database.
  2. Connect other In-Database tools to the workflow to process the data.
  3. Use the Write Data In-DB tool to create or update a table in the database, or use the Data Stream Out tool to stream the In-DB results to a workflow for standard processing.
  4. Run the workflow to process the data in the data platform. No results are returned to the Alteryx Engine until the complete in-database workflow has been processed.

In-Database Performance Benefits

Workflow processing can take longer when a large amount of data is streamed in and out of a database. In-database processing can be used to speed up a workflow.

For example, in a standard workflow, a large database table is pulled into memory to be joined with a small spreadsheet. The majority of run time is spent streaming in the database records. In an In-Database workflow, the small spreadsheet is streamed into the database, reducing run time substantially.

Frequently Asked Questions

 

What permissions are required to output data to the database using the Write In-DB tool?

Write privileges are required to create a table in the database.

What happens to the temporary tables created by the Data Stream In tool?

The temporary tables are deleted at the end of the run. If Alteryx crashes while the Data Stream In tool is being run, then the next time that an in-database workflow is run, all temp files created by Alteryx in the database in the previous three days are cleaned out.

Why can't I create temporary tables?

To allow visibility of a temporary table across sessions, Alteryx must create a permanent table that is eventually deleted at the end of a workflow. It is necessary to have CREATE permissions to stream in data from a database and write data to a database. The exception is Microsoft SQL Server.

Are the underlying rules for a database time out maintained during an in-database process, or does Alteryx modify or circumvent those?

The underlying rules are maintained during the process the same as with the database connections via the standard Input Data and Output Data tools. If there is a database timeout or if there is a limit to the number of queries per day that you can run, it will affect your connection to the database.

How is the SQL statement created for an in-database workflow?

A SELECT statement is triggered by the Connect In-DB tool and additional queries are created by downstream tools and nested within this query. The addition of one of the following three tools completes the query and sends it to the underlying database: Write In-DB, Data Stream Out, Browse In-DB.

You can input your own SQL statement in the Query box for the Connect In-DB tool, which also gets embedded within the SELECT statement.

When is the SQL statement processed?

The SQL query for the underlying database is triggered at runtime for each Browse In-DB, Data Stream Out or Write Data In-DB tool.

How is Browse caching enabled?

The Browse Data In-DB tool can be configured to cache the data as a .yxdb file when the workflow is run.

Once the data is cached, if the workflow is re-run and the database connection or query (including the number of records to browse) has not changed, the query will not be re-run. Instead, the data will be pulled from the cache.

An output message indicates whether or not the data was cached. Clicking the link will open the data results as a .yxdb file in a separate window.

When is the in-database cache used?

The in-database cache is used any time a workflow is re-run without changes to upstream tools. Making a change to any upstream tool will trigger a new query and a new cache will be created.

Does the “Browse first [100] records” option impact the amount of data processed by the workflow?

No, the “Browse first [100] records” option only limits the number of records displayed in the Browse In-DB tool. Other tools in the workflow will process the number of records that pass through at any given point.

What does the error "Inconsistent datatypes: expected - got CLOB" mean?

The field has CLOB/LOB datatype and will not work with most of the comparison operators in the Filter or Formula tools. The error reflects that no columns are returned, even when the data matches the comparison. This is expected behavior with SQL and Oracle, as they do not support comparisons with LOB data

Was This Helpful?

Running into problems or issues with your Alteryx product? Visit the Alteryx Community or contact support.