Zero-shot Text Classification

The Zero-shot Text Classification tool assigns scored categories to bodies of text based on a category list you define. For example, you can feed in newspaper articles, define the category labels "Politics" and "Technology", and the tool returns a probability for the relevance of each label. The Zero-shot Text Classification tool doesn’t require training data. It runs a Hugging Face transformer model with ONNX Runtime.

Alteryx Intelligence Suite Required

This tool is part of Alteryx Intelligence Suite. Intelligence Suite requires a separate license and add-on installer to Designer. After you install Designer, install Intelligence Suite and start your free trial.

Language Support

The Zero-shot Text Classification tool only supports English at this time.

Tool Components

The Zero-shot Text Classification tool has 3 anchors (2 inputs and 1 output):

D input anchor: Use the D input anchor to connect the text data you want to categorize.
L input anchor: Use the L input anchor to pass category labels to the tool.
Output anchor: Use the output anchor to pass the scored categories for each body of text downstream.

Configure the Tool

Add a Zero-shot Text Classification tool to the canvas.
Use the D input anchor to connect the Zero-shot Text Classification tool to the text data you want to use in the workflow.
If you have large bodies of text, split the text into smaller sections or pre-process your text with the Text Pre-processing or Text Summary tools.
Use the L input anchor to pass the category labels to the Zero-shot Text Classification tool. You can use the Text Input tool to create your list of category labels.
Select the Column with Text you want to analyze. The tool doesn’t require training data.
Select the Column with Labels for the categories you want to score.
(Optional) Select Multi-label Classification to treat categories independently from each other. Use this option to determine if your text belongs to more than 1 category.
Run the workflow.
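
To illustrate what the Multi-label Classification option changes, here is a minimal Python sketch of how zero-shot scoring is commonly done with entailment-based transformer models. It is not the tool's actual implementation. The function names (softmax, score_labels) and the logit values are hypothetical, and the assumption that the tool scores labels this way is ours, not something this page states.

```python
import math

def softmax(values):
    """Convert a list of raw scores (logits) into probabilities that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def score_labels(entailment, contradiction, multi_label=False):
    """Score each label for one body of text.

    entailment / contradiction: dicts mapping label -> hypothetical model logit.
    """
    labels = list(entailment)
    if multi_label:
        # Each label is scored on its own (entailment vs. contradiction),
        # so the scores are independent and do not need to sum to 1.
        return {
            label: softmax([contradiction[label], entailment[label]])[1]
            for label in labels
        }
    # Labels compete with each other, so the scores sum to 1.
    probs = softmax([entailment[label] for label in labels])
    return dict(zip(labels, probs))

# Made-up logits for one news article (illustration only).
entailment = {"Politics": 2.0, "Technology": 1.0}
contradiction = {"Politics": -1.0, "Technology": -0.5}

single = score_labels(entailment, contradiction)
multi = score_labels(entailment, contradiction, multi_label=True)
print(single)
print(multi)
```

In the single-label case the scores are shares of one total, so a high score for one category lowers the others. In the multi-label case a text can score high in several categories at once.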

Output

The output includes 2 sets of columns:

Column for each category label. Each column represents the degree to which the text in each row is associated with that category. A higher value in the category column indicates a greater probability that the text belongs to that category.
Column that contains the category label with the highest probability value if you use more than 1 category label.
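
The shape of one output row can be sketched in Python as follows. This is an illustration only: the column names "Text" and "Top_Label", the helper build_output_row, and the score values are hypothetical and are not the names the tool produces.

```python
def build_output_row(text, scores):
    """Build one output row: a score column per label, plus the top label.

    scores: dict mapping category label -> probability (made-up values here).
    """
    row = {"Text": text}
    row.update(scores)  # one column per category label
    if len(scores) > 1:
        # The top-label column is only present with more than 1 category label.
        row["Top_Label"] = max(scores, key=scores.get)
    return row

row = build_output_row(
    "Parliament passed the new budget.",
    {"Politics": 0.91, "Technology": 0.09},
)
print(row)
```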
