Use Sentiment Analysis to determine whether text data reflects positive, negative, or neutral sentiment.
We recommend that you don't use the Text Pre-processing tool to process text data for use with the Sentiment Analysis tool. The Text Pre-processing tool can remove features that the Sentiment Analysis tool relies on to determine sentiment.
Currently, the Sentiment Analysis tool can only analyze data that contains characters used in the English language.
The Sentiment Analysis tool has 2 anchors:
- Input anchor: Use the input anchor to connect the text data you want to analyze.
- Output anchor: Use the output anchor to pass the data you've analyzed downstream.
Configure the Tool
- Add a Sentiment Analysis tool to the canvas.
- Use the anchors to connect the Sentiment Analysis tool to the text data you want to use in the workflow.
- Select the Algorithm you want to use to perform a sentiment analysis.
- Select the Text Field you want to analyze.
- Run the workflow.
The Sentiment Analysis tool has some advanced options.
Currently, only one algorithm is available.
The Valence Aware Dictionary and sEntiment Reasoner (VADER) algorithm measures the valence and magnitude of emotion in text. The valence of emotion refers to whether it is positive or negative. The magnitude of emotion refers to how positive or negative it is. VADER also identifies text that is not emotional, meaning text that is neutral in its valence.
To use punctuation to parse text into sentences before analysis, check the box for Find Sentiment at Sentence Level.
Many sentiment-analysis algorithms, including the VADER algorithm, are tuned to find sentiment at the sentence level. This means the algorithms parse the text into sentences, analyze each sentence individually, and then return the average compound sentence score for the whole body of text. For the algorithms to parse sentences, your text data has to contain end punctuation.
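The sentence-level approach described above can be sketched roughly as follows. This is a minimal illustration, not the tool's implementation: the function names `split_sentences` and `average_compound` are hypothetical, the splitting rule (end punctuation followed by whitespace) is an assumption, and `toy_score` is a stand-in scorer used only so the example runs without the VADER lexicon.

```python
import re
from statistics import mean
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Split text after end punctuation (., !, ?) followed by whitespace.

    Assumption: this simple rule stands in for whatever parser the tool uses.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def average_compound(text: str, score_sentence: Callable[[str], float]) -> float:
    """Score each sentence on its own, then average the compound scores."""
    sentences = split_sentences(text)
    if not sentences:
        # Assumption: treat empty text as neutral.
        return 0.0
    return mean(score_sentence(sentence) for sentence in sentences)


def toy_score(sentence: str) -> float:
    """Hypothetical stand-in for a real compound scorer such as VADER."""
    lowered = sentence.lower()
    if "great" in lowered:
        return 0.5
    if "terrible" in lowered:
        return -0.5
    return 0.0


review = "The food was great. The service was terrible! We left."
sentences = split_sentences(review)
overall = average_compound(review, toy_score)
```

Text without end punctuation comes back from `split_sentences` as a single sentence, which is why the option depends on punctuation being present in your data.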
To categorize the text data as "positive," "negative," or "neutral" in your output, check the box for Output Categorical Sentiment. Then you can use the Max Negative Classification and Min Positive Classification fields to define the range for each category using the compound sentiment score.
Max Negative Classification defines how sensitive to negative sentiment the algorithm should be. Increase this parameter to broaden the range of negative sentiment the algorithm can detect. Min Positive Classification defines how sensitive to positive sentiment the algorithm should be. Decrease this parameter to broaden the range of positive sentiment the algorithm can detect.
The compound sentiment score ranges from -1 to 1. The algorithm categorizes any number between -1 and Max Negative Classification as "negative"; any number between Max Negative Classification and Min Positive Classification as "neutral"; and any number between Min Positive Classification and 1 as "positive."
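As a rough illustration of how the two thresholds partition the compound score, consider the sketch below. The function name `categorize` is hypothetical, the threshold values are illustrative rather than the tool's defaults, and the handling of scores that fall exactly on a threshold is an assumption, since the description above does not specify it.

```python
def categorize(compound: float, max_negative: float, min_positive: float) -> str:
    """Map a compound score in [-1, 1] to a sentiment category.

    Assumption: scores exactly equal to a threshold are assigned to the
    "negative" or "positive" side rather than to "neutral".
    """
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound score must be between -1 and 1")
    if max_negative > min_positive:
        raise ValueError("max_negative must not exceed min_positive")
    if compound <= max_negative:
        return "negative"
    if compound >= min_positive:
        return "positive"
    return "neutral"


# Illustrative thresholds only.
narrow = categorize(-0.2, max_negative=-0.5, min_positive=0.5)
# Increasing Max Negative Classification broadens the negative range.
broad = categorize(-0.2, max_negative=-0.1, min_positive=0.5)
```

With the narrower setting, a score of -0.2 is categorized as neutral. After Max Negative Classification is raised to -0.1, the same score is categorized as negative.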
The Sentiment Analysis tool outputs up to five columns. Four columns are included by default. The fifth column appears if you choose the Output Categorical Sentiment option.
These are the columns in the order they appear:
- negative_sentiment: This column displays the score for how negative a piece of text is, ranging from 0 to 1, with 0 being the least negative and 1 being the most negative. The score represents the proportion of words that fall in this category. The scores of negative sentiment, neutral sentiment, and positive sentiment should sum to approximately 1.
- neutral_sentiment: This column displays the score for how neutral a piece of text is, ranging from 0 to 1, with 0 being not neutral (in other words, either positive or negative) and 1 being the most neutral. The score represents the proportion of words that fall in this category. The scores of negative sentiment, neutral sentiment, and positive sentiment should sum to approximately 1.
- positive_sentiment: This column displays the score for how positive a piece of text is, ranging from 0 to 1, with 0 being not positive and 1 being the most positive. The score represents the proportion of words that fall in this category. The scores of negative sentiment, neutral sentiment, and positive sentiment should sum to approximately 1.
- compound_sentiment_score: This column displays a score ranging from -1 to 1. Negative numbers indicate negative sentiment, and positive numbers indicate positive sentiment. -1 is the most negative score, 0 is the most neutral score, and 1 is the most positive score.
- sentiment_category: The sentiment category derives from the compound sentiment score, and it includes positive, negative, and neutral categories. What the algorithm classifies as positive, neutral, and negative depends on the settings for Max Negative Classification and Min Positive Classification.
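If you process the output downstream, a quick sanity check on each row might look like the sketch below. The column names come from the list above, while the helper `row_is_consistent`, the tolerance value, and the sample rows are hypothetical and only illustrate the ranges described.

```python
import math
from typing import Mapping


def row_is_consistent(row: Mapping[str, float], tolerance: float = 0.01) -> bool:
    """Check one output row against the documented score ranges.

    Assumption: a tolerance of 0.01 is an acceptable reading of
    "sum to approximately 1".
    """
    parts = (
        row["negative_sentiment"],
        row["neutral_sentiment"],
        row["positive_sentiment"],
    )
    in_range = all(0.0 <= part <= 1.0 for part in parts)
    sums_to_one = math.isclose(sum(parts), 1.0, abs_tol=tolerance)
    compound_ok = -1.0 <= row["compound_sentiment_score"] <= 1.0
    return in_range and sums_to_one and compound_ok


# Hypothetical rows for illustration; not real tool output.
good_row = {
    "negative_sentiment": 0.1,
    "neutral_sentiment": 0.6,
    "positive_sentiment": 0.3,
    "compound_sentiment_score": 0.4,
}
bad_row = {
    "negative_sentiment": 0.5,
    "neutral_sentiment": 0.5,
    "positive_sentiment": 0.5,
    "compound_sentiment_score": 0.4,
}
```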