FINDNTH Function
Returns the position of the nth occurrence of a letter or pattern in the input string where a specified matching string is located in the provided column. You can search either from left or right.
A column reference can refer to a column of String, Object, or Array type, which makes the
FINDNTH
function useful for filtering data before it has been completely un-nested into tabular data.Returned value is from the beginning of the string, regardless of the string index value.
If no match is found, the function returns a null value.
Wrangle vs. SQL: This function is part of Wrangle, a proprietary data transformation language. Wrangle is not SQL. For more information, see Wrangle Language.
Basic Usage
Column reference example:
findnth(Message, `{hashtag}`, 2, left, true)
Output: Searches the Message column value for second occurrence of the hashtag
pattern from the left of the column by considering the case-sensitive matching. If a match is found, returns the index value of the location in the string where the pattern appears.
String literal example:
findnth('Hello, World','o', 2, left ,false)
Output: Searches the string Hello, World
for the 2nd occurrence of the value o
from the left of the column by ignoring the case. In this case, the returned value is 8
.
Syntax and Arguments
findnth(string_to_search, search_for, match_number, direction, ignore_case)
Argument | Required? | Data Type | Description |
---|---|---|---|
string_to_search | Y | string | Name of the column, function returning a string, or string literal to be applied to the function |
search_for | Y | string | Name of column, function returning a string, or string literal or pattern to find |
match_number | Y | integer | Count of characters from the start of the value to include in the match |
direction | N | string | The direction to search the string from. This can be either left or right. Default is left. |
ignore_case | N | boolean | If |
For more information on syntax standards, see Language Documentation Syntax Notes.
string_to_search
Item to be searched. Valid values can be:
String literals must be quoted (
'Hello, World'
).Column reference to any type that can be inferred as a string, which encompasses all values.
Function returning a string value.
Multiple values and wildcards are not supported.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
Yes | String literal, function returning a string, or column reference (String, Array, or Object) | myColumn |
search_for
Column of strings, function returning a string, string literal or pattern to find. An input value can be a literal, Wrangle , or a regular expression.
Multiple values and wildcards are not supported.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
Yes | String literal or pattern | 'Hello' |
match_number
The number of string or pattern matches to find. For example, a value of 2
instructs the FINDNTH
function to begin searching from the second occurrence of the character or pattern in the column or a string value.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
Yes | integer (non-negative) | '2' |
Value must be a non-negative integer value.
Value must a non-negative integer. If the value is
0
, then the match fails for all strings.
direction
The direction to search the string from. The direction can be either left or right. By default, it is left.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
No | string | left |
ignore_case
If true
, the FINDNTH
function ignores case when trying to match the string literal or pattern value.
Default value is false
, which means that case-sensitive matching is performed by default.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
No | Boolean | true |
Examples
Tipp
For additional examples, see Common Tasks.
Example - Filter Hashtag messages
Source:
The table below contains your customer's tweet messages. You are interested in the second hashtag that is listed in each message.
Tipp
You can use either the FIND or the FINDNTH function for the first value in the string. However, you must use the FINDNTH function to find the second or later values in the string.
User Name | Location | Messages |
---|---|---|
Eugenie | U.K | #dataprep #businessintelligence #CommitToCleanData #London |
Jeniffer | NewYork | Learn how #NewYorklife# #bigdata #dataprep #NewYork # |
Patrick | Berlin | #bigdata #machine learning #datawrangling #Berlin |
Christy | SanFrancisco | #predictivetransformation, #businessintelligence, #startwiththeuser, #machinelearning #SFO |
Dave | Paris | #commitocleandata, #pivot, #aggregation, #bigdata, #dataprep, #machinelearning |
Transformation:
First, you must identify the records that contain the pattern hashtag
. The following transform generates a new column containing the position of the 2nd hashtag in the Messages column. This value can be retrieved using the {hashtag}
Alteryx pattern.
Transformation Name | |
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | FINDNTH(Message, `{hashtag}`, 2, left, true) |
Parameter: New column name | find2ndhashtag_Message |
User Name | Location | Messages | find2ndhashtag_Message |
---|---|---|---|
Eugenie | U.K | #dataprep #businessintelligence #CommitToCleanData #London | 11 |
Jeniffer | NewYork | Learn how #NewYorklife# #bigdata #dataprep #NewYork # | 26 |
Patrick | Berlin | #bigdata #machinelearning #datawrangling #Berlin | 10 |
Christy | SanFrancisco | #predictivetransformation, #businessintelligence, #startwiththeuser, #machinelearning #SFO | 28 |
Dave | Paris | #commitocleandata, #pivot, #aggregation, #bigdata, #dataprep, #machinelearning | 20 |
When the step is added to the recipe, the find2ndhashtag_Message
column is generated, containing the index value returned by the FINDNTH
function. In this case, each row contains at least two instances of the {hashtag}
Alteryx pattern, and the generated column contains the index position where it occurs in the Message
column.
The next step is to extract the hashtag from the Messages
column to convert into a meaningful data.
Transformation Name | |
---|---|
Parameter: Column to extract from | Messages |
Parameter: Option | Custom text or pattern |
Parameter: Text to extract | `{hashtag}` |
Parameter: Number of matches to extract | 2 |
Anmerkung
The number of matches to extract parameter defaults to 1, meaning that the transformation extracts a maximum of one value from each cell. This value can be set from 1-50. In this case, you should set to the value to 2
, since you are interested in the second hashtag.
The above transformation generates two new columns with the first two extracted hashtag values from the Messages column. Next steps are:
Delete the column containing the first hashtag value.
Rename the
Messages2
column to beSecond_hashtag_value
.
Results:
User Name | Location | Messages | Second_hashtag_value | find2ndhashtag_Message |
---|---|---|---|---|
Eugenie | U.K | #dataprep #businessintelligence #CommitToCleanData #London | #businessintelligence | 11 |
Jeniffer | NewYork | Learn how #NewYorklife# #bigdata #dataprep #NewYork # | #bigdata | 26 |
Patrick | Berlin | #bigdata #machine #learning #datawrangling #Berlin | #machinelearning | 10 |
Christy | SanFrancisco | #predictivetransformation, #businessintelligence, #startwiththeuser, #machinelearning #SFO | #businessintelligence | 28 |
Dave | Paris | #commitocleandata, #pivot, #aggregation, #bigdata, #dataprep, #machinelearning | #pivot | 20 |