Removes all characters from a string that are not letters, numbers, accented Latin characters, or whitespace.
Nota
Non-Latin letters are also removed.
Suggerimento
This function also removes common punctuation, such as the following:
. , ! & ?
To preserve these characters, you might replace them with an alphanumeric text string. For example, the question mark might be replaced by:
zzQUESTIONMARKzz
After the function has been applied, you can replace these strings with the original values.
Wrangle vs. SQL: This function is part of Wrangle, a proprietary data transformation language. Wrangle is not SQL. For more information, see Wrangle Language.
Column reference example:
removesymbols(MyName)
Output: Returns the value in the MyName
column value with all non-alphanumeric characters removed.
String literal example:
removesymbols('Héllõ, Wõrlds!?!?')
Output: Returns the string: Héllõ Wõrlds
.
Wildcard example:
removesymbols($col)
Output: Strips all non-alphanumeric or space characters from all columns in the dataset.
removesymbols(column_string)
Argument | Required? | Data Type | Description |
---|---|---|---|
column_string | Y | string | Name of the column or string literal to be applied to the function |
For more information on syntax standards, see Language Documentation Syntax Notes.
Name of the column or string constant to be trimmed of symbols.
Missing string or column values generate missing string results.
String constants must be quoted (
'Hello, World'
).Multiple columns and wildcards are not supported.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
Yes | String literal or column reference | myColumn |
Suggerimento
For additional examples, see Common Tasks.
This example demonstrates functions that can be used to clean up strings.
Functions:
Item | Description |
---|---|
TRIM Function | Removes leading and trailing whitespace from a string. Spacing between words is not removed. |
REMOVEWHITESPACE Function | Removes all whitespace from a string, including leading and trailing whitespace and all whitespace within the string. |
REMOVESYMBOLS Function | Removes all characters from a string that are not letters, numbers, accented Latin characters, or whitespace. |
Source:
In the following (space)
and (tab)
indicate space keys and tabs, respectively. Carriage return and newline characters are also supported by whitespace functions.
Strings | source |
---|---|
String01 | this source(space)(space) |
String02 | (tab)(tab)this source |
String03 | (tab)(tab)this source(space)(space) |
String04 | this source's? |
String05 | Why, you @#$%^&*()! |
String06 | this söurce |
String07 | (space)this söurce |
String08 | à mañana |
Transformation:
The following transformation steps generate new columns using each of the string cleanup functions:
Transformation Name | |
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | TRIM(source) |
Parameter: New column name | 'trim_source' |
Transformation Name | |
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | REMOVEWHITESPACE(source) |
Parameter: New column name | 'removewhitespace_source' |
Transformation Name | |
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | REMOVESYMBOLS(source) |
Parameter: New column name | 'removesymbols_source' |
Results:
Strings | source | removesymbols_source | removewhitespace_source | trim_source |
---|---|---|---|---|
String01 | this source(space)(space) | this source(space)(space) | thissource | this source |
String02 | (tab)(tab)this source | (tab)(tab)this source | thissource | this source |
String03 | (tab)(tab)this source(space)(space) | (tab)(tab)this source(space)(space) | thissource | this source |
String04 | this source's? | this sources | thissource's? | this source's? |
String05 | "Why, you @#$%^&*()!" | Why you | Why,you@#$%^&*()! | Why, you @#$%^&*()! |
String06 | this söurce | this söurce | thissöurce | this söurce |
String07 | (space)this söurce | (space)this söurce | thissöurce | this söurce |
String08 | à mañana | à mañana | àmañana | à mañana |