Object Terms

Terminology applicable to Dataprep by Trifacta.

Note

This list is not comprehensive.

These terms apply to the objects that you import, create, and generate in Dataprep by Trifacta.

author

In an application role, the author privilege allows the highest level of access, except for ownership, to application objects. This privilege can be applied to object types within an assignable role. See Overview of Authorization.

collaborator

Anyone who has been provided editor- or author-level access to an object. See Overview of Sharing.

data quality rule

A data quality rule is a pass/fail test of your data against a condition that you define. Data quality rules can be created to validate your data against the meaning of the data and to assess your efforts to transform it. For more information, see Data Quality Rules Panel.
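The pass/fail idea can be sketched outside the product as a condition evaluated against every row. This is only an illustration, not how Dataprep by Trifacta implements rules; the helper `evaluate_rule`, the sample rows, and the condition are all hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

def evaluate_rule(rows: List[Row], condition: Callable[[Row], bool]) -> Dict[str, int]:
    """Count the rows that pass and fail a user-defined condition (hypothetical helper)."""
    passed = sum(1 for row in rows if condition(row))
    return {"passed": passed, "failed": len(rows) - passed}

# Hypothetical data and rule: every order total must be non-negative.
orders = [
    {"id": 1, "total": 20},
    {"id": 2, "total": -5},
    {"id": 3, "total": 0},
]
summary = evaluate_rule(orders, lambda row: row["total"] >= 0)
print(summary)  # {'passed': 2, 'failed': 1}
```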

dataset with parameters

An imported dataset that has been created with parameterized references. It is typically used to collect multiple assets that share an identical structure and are stored in similar locations or under similar filenames. For example, if you stored orders in individual files for each week in a single directory, you could create a dataset with parameters to capture all of those files in a single object, even if more files are added at a later time.

The path to the asset or assets is specified with one or more of the following types of parameters: Datetime, Wrangle, regular expression, wildcard, or variable.

See Overview of Parameterization.
See Create Dataset with Parameters.
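As a minimal sketch of the wildcard case, the weekly orders example can be modeled with Python's standard `fnmatch` module. The file names, the pattern, and the helper `match_dataset` are hypothetical, and the product's own wildcard matching may behave differently.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing; more weekly files may be added later.
files = [
    "orders/orders-week-01.csv",
    "orders/orders-week-02.csv",
    "orders/readme.txt",
    "orders/orders-week-03.csv",
]

def match_dataset(paths, pattern):
    """Return the paths that match a wildcard pattern, in sorted order."""
    return sorted(p for p in paths if fnmatchcase(p, pattern))

matched = match_dataset(files, "orders/orders-week-*.csv")
print(matched)
# ['orders/orders-week-01.csv', 'orders/orders-week-02.csv', 'orders/orders-week-03.csv']
```

Because the pattern is evaluated each time it is applied, a file such as `orders/orders-week-04.csv` added later would be picked up without changing the definition.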

data type

A data type refers to the expected class of values for a column of data. A data type defines the types of information that are expected and can include specific formatting of that information. Column values that do not meet the expectations of the column data type are determined to be invalid for the data type.
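To illustrate what "invalid for the data type" means, the following sketch checks string values against an integer type. The helper names and sample values are hypothetical, and the validation rules of the product's own data types are not reproduced here.

```python
def is_valid_integer(value: str) -> bool:
    """Return True if the value can be read as an integer (illustrative rule only)."""
    try:
        int(value)
        return True
    except ValueError:
        return False

def invalid_values(column, validator):
    """Collect the values in a column that do not meet the type's expectations."""
    return [value for value in column if not validator(value)]

# Hypothetical column that is expected to contain integers.
quantity = ["12", "7", "n/a", "3.5"]
print(invalid_values(quantity, is_valid_integer))  # ['n/a', '3.5']
```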

editor

In an application role, the editor privilege allows viewing and modifying application objects. This privilege can be applied to object types within an assignable role. See Overview of Authorization.

flow

A container for holding a set of related imported datasets, recipes, and output objects. Flows are managed in the Flow View page.

See Application Asset Overview.
See Flow View Page.

flow parameter

A named reference that you can apply in your recipe steps. When applied, the flow parameter is replaced with its corresponding value, which may be the default value or an override value. See Overview of Parameterization.
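The replacement behavior can be sketched with Python's `string.Template`. The `$region` placeholder syntax is Python's convention and is used here only for illustration; the step text, parameter name, and helper `apply_flow_parameters` are hypothetical and do not reflect the product's actual reference syntax.

```python
from string import Template

def apply_flow_parameters(step_text, defaults, overrides=None):
    """Replace each named reference with its override value if present, else its default."""
    values = {**defaults, **(overrides or {})}
    return Template(step_text).substitute(values)

# Hypothetical recipe step that references a parameter named "region".
step = "keep rows where region == '$region'"

print(apply_flow_parameters(step, {"region": "us"}))
# keep rows where region == 'us'
print(apply_flow_parameters(step, {"region": "us"}, {"region": "eu"}))
# keep rows where region == 'eu'
```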

imported dataset

A reference to an object that contains data to be wrangled in Dataprep by Trifacta. An imported dataset is created when you specify the file(s) or table(s) that you wish to read through a connection.

See Application Asset Overview.
See Import Data Page.

job

A job is the sequence of processing steps that applies each step of your recipe, in order, across the entire dataset to generate the desired set of results.

See Application Asset Overview.
See Run Job Page.

macro

A macro is a sequence of one or more reusable recipe steps. Macros can be configured to accept parameterized inputs, so that their functionality can be tailored to the recipe in which they are referenced.

See Application Asset Overview.
See Overview of Macros.

output

Associated with a recipe, an output is a user-defined set of files or tables, formats, and locations where results are written after a job run on the recipe has completed.

See Application Asset Overview.
See Flow View Page.

output destinations

An output may contain one or more destinations, each of which defines a file type, filename, and location where the results of the output are written.

See Run Job Page.
See Flow View Page.

output parameter

You can create variable or timestamp parameters that can be applied to parts of the file or table paths of your outputs. Variable values can be specified at the time of job execution.

See Overview of Parameterization.
See Run Job Page.
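As a rough sketch, a parameterized output path can be thought of as a template whose timestamp is filled in from the job run time and whose variables are supplied at execution. The `{name}` template syntax, the date format, and the helper `build_output_path` are assumptions made for this example, not the product's syntax.

```python
from datetime import datetime

def build_output_path(template: str, run_time: datetime, variables: dict) -> str:
    """Fill a path template with a timestamp and with variable values given at run time."""
    values = dict(variables, timestamp=run_time.strftime("%Y%m%d"))
    return template.format(**values)

# Hypothetical template with one variable parameter and one timestamp parameter.
path = build_output_path(
    "results/{region}/orders_{timestamp}.csv",
    datetime(2024, 1, 15),
    {"region": "us"},
)
print(path)  # results/us/orders_20240115.csv
```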

plan

A plan is a sequence of triggers and tasks that can be applied across multiple flows. For example, you can schedule plans to execute sequences of flows at a specified frequency. For more information, see Overview of Operationalization.

privilege

Note

This feature may not be available in all product editions. For more information on available features, see Compare Editions.

A privilege determines the level of access to a type of Alteryx object. Privileges are assigned using roles. For more information, see Overview of Authorization.

parameter override

A value that is applied instead of the default or inherited value for a parameter. A parameter override may be applied at the flow level or at the time of job execution.

See Manage Parameters Dialog.
See Run Job Page.
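The following sketch shows one way to resolve a parameter value across the two override levels named above. The precedence used here (job-time value first, then flow-level value, then default) is an assumption for illustration; check the referenced pages for the actual behavior. The helper `resolve_parameter` is hypothetical.

```python
def resolve_parameter(name, defaults, flow_overrides=None, job_overrides=None):
    """Return the first value found: job-time override, flow-level override, then default.

    The lookup order is an assumption made for this example.
    """
    for source in (job_overrides or {}, flow_overrides or {}):
        if name in source:
            return source[name]
    return defaults[name]

defaults = {"region": "us"}
print(resolve_parameter("region", defaults))                                          # us
print(resolve_parameter("region", defaults, flow_overrides={"region": "eu"}))         # eu
print(resolve_parameter("region", defaults, {"region": "eu"}, {"region": "apac"}))    # apac
```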

recipe

A sequence of steps that transforms one or more datasets into a desired output. Recipes are built in the Transformer page using a sample of the dataset or datasets. When a job is executed, the steps of the recipe are applied in the listed order to the imported dataset or datasets to generate the output.

See Application Asset Overview.
See Recipe Panel.
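Conceptually, a recipe behaves like an ordered list of functions applied one after another to the data. The sketch below models that idea in plain Python; the step functions, the row layout, and `run_recipe` are hypothetical and are not the product's recipe language.

```python
def drop_missing_totals(rows):
    """Step 1: remove rows that have no total."""
    return [row for row in rows if row["total"] is not None]

def add_doubled_total(rows):
    """Step 2: derive a new column from the total."""
    return [{**row, "total_doubled": row["total"] * 2} for row in rows]

def run_recipe(rows, steps):
    """Apply each step in the listed order, feeding each result into the next step."""
    for step in steps:
        rows = step(rows)
    return rows

recipe = [drop_missing_totals, add_doubled_total]

# Hypothetical full dataset processed when a job is executed.
dataset = [{"id": 1, "total": 10}, {"id": 2, "total": None}, {"id": 3, "total": 20}]
result = run_recipe(dataset, recipe)
print(result)
# [{'id': 1, 'total': 10, 'total_doubled': 20}, {'id': 3, 'total': 20, 'total_doubled': 40}]
```

Order matters in this sketch: running `add_doubled_total` first would fail on the row whose total is missing.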

reference

A pointer to the output of a recipe. A reference can be used in other flows, so that those flows get the latest version of the output from the referenced recipe.

See Application Asset Overview.
See Flow View Page.

reference dataset

A reference that has been imported into another flow.

See Application Asset Overview.
See Flow View Page.

results

A set of generated files or tables containing the results of processing a selected recipe, its datasets, and all upstream dependencies. See Job Details Page.

results profile

Optionally, you can create a profile of your generated results. This profile is available through the Trifacta Application and may assist in analyzing or troubleshooting issues with your dataset. See Overview of Visual Profiling.

role

Note

This feature may not be available in all product editions. For more information on available features, see Compare Editions.

A role is a set of privileges that governs access levels to one or more types of objects. For more information, see Overview of Authorization.

sample

When you review and interact with your data in the data grid, you are seeing the current state of your recipe applied to a sample of the dataset. If the entire dataset is smaller than the defined limit, you are interacting with the entire dataset.

You can create new samples using one of several supported sampling techniques. See Overview of Sampling.

schedule

You can associate a single schedule with a flow. A schedule is a combination of one or more trigger times and the one or more scheduled destinations that are generated when the trigger is hit. A schedule must have at least one trigger and at least one scheduled destination in order to work.

See also trigger and scheduled destination.
See Overview of Scheduling.
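The requirement that a schedule needs at least one trigger and at least one scheduled destination can be expressed as a simple check. The `Schedule` class, its field names, and the sample values are hypothetical and exist only to illustrate the rule.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Schedule:
    """Hypothetical model of a schedule attached to a single flow."""
    triggers: List[str] = field(default_factory=list)
    scheduled_destinations: List[str] = field(default_factory=list)

    def is_runnable(self) -> bool:
        """A schedule works only with at least one trigger and one scheduled destination."""
        return len(self.triggers) >= 1 and len(self.scheduled_destinations) >= 1

complete = Schedule(triggers=["daily 02:00"], scheduled_destinations=["orders recipe"])
missing_destination = Schedule(triggers=["daily 02:00"])

print(complete.is_runnable())             # True
print(missing_destination.is_runnable())  # False
```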

scheduled destination

When a schedule's trigger is fired, each recipe that has a scheduled destination associated with it is queued for execution. When the job completes, the outputs specified in the scheduled destination are generated. A recipe may have only one scheduled destination, and a scheduled destination may have multiple outputs (publishing actions) associated with it.

See also schedule and trigger.
See Overview of Scheduling.

schema

A schema defines the column names, data types, and ordering of your dataset. Schemas apply to relational datasources and some file types, such as Avro or Parquet. For more information, see Overview of Schema Management.

snapshot

When a plan is triggered, a snapshot of all tasks in the plan is taken. The tasks of the plan are executed against this snapshot. Subsequent revisions to these objects may impact the execution of the plan. For more information, see Overview of Operationalization.

target

A set of columns, their order, and their formats to which you are attempting to wrangle your dataset; in other words, the schema that you want your dataset to match. You can assign a target to your recipe. The target schema is then superimposed on the columns in the data grid, allowing you to make simple selections to transform your dataset to match the column names, order, and formats of the target. See Overview of Target Schema Mapping.

task

A task is an executable action that is part of a plan. For example, when a plan is triggered, the first task in the plan is queued for execution, which may be to execute all of the recipes and their dependencies in a flow. For more information, see Overview of Operationalization.

trigger

A trigger is a periodic time associated with a schedule. When a trigger's time occurs, all of the flows associated with the trigger are queued for execution.

A schedule can have multiple triggers. See also schedule and scheduled destination.
For more information on flow-based triggers, see Overview of Scheduling.

variable (dataset)

A replacement for the parts of a file path to data that change with each refresh. A variable can be overridden as needed at job runtime.

See Overview of Parameterization.
See Create Dataset with Parameters.

viewer

In an application role, the viewer privilege allows read-only access to application objects. This privilege can be applied to object types within an assignable role. See Overview of Authorization.

