Skip to main content

SUBSTITUTE Function

Replaces found string literal or pattern or column with a string, column, or function returning strings.

Input can be specified as a column reference, a function returning a string, or a string literal, although string literal usage is rare.

  • A column reference can refer to a column of String type.

  • If no match is found, the function returns the source string.

  • If multiple matches are found in a single string, all replacements are made.

Wrangle vs. SQL: This function is part of Wrangle, a proprietary data transformation language. Wrangle is not SQL. For more information, see Wrangle Language.

Basic Usage

Column reference example:

substitute(myURL,`{ip-address}`,myDomain)

Output: Searches the myURL column values for sub-strings that match valid IP addresses. Where matches are found, they are replaced with the corresponding value in the myDomain column.

Function reference example:

substitute(upper(companyName),'ACME','New ACME')

Output: Searches the uppercase version of values from the companyName column for the string literal ACME. When found, these matches are replaced by New ACME in the companyName column.

Syntax and Arguments

substitute(string_source,string_pattern,string_replacement[,ignore_case, pattern_before, pattern_after])

Argument

Required?

Data Type

Description

string_source

Y

string

Name of the column, a function returning a string, or string literal to be applied to the function

string_pattern

Y

string

String literal or pattern or a column or a function returning strings to find

string_replacement

Y

string

String literal, column or function returning a string to use as replacement

ignore_case

N

string

When true, matching is case-insensitive. Default is false.

pattern_before

N

string

String literal or pattern to find before finding the string_pattern value.

pattern_after

N

string

String literal or pattern to find after finding the string_pattern value.

For more information on syntax standards, see Language Documentation Syntax Notes.

string_source

Name of the item to be searched. Valid values can be:

  • String literals must be quoted ('Hello, World').

  • Column reference to any type that can be inferred as a string, which encompasses all values

  • Functions that return string values

Multiple values and wildcards are not supported.

Usage Notes:

Required?

Data Type

Example Value

Yes

String literal or column reference (String, Array, or Object)

myColumn

string_pattern

String to find. This value can be a string literal, a Wrangle , a regular expression, a column, or a function returning a String value.

  • String literals must be quoted ('Hello, World').

  • Multiple values and wildcards are not supported.

Usage Notes:

Required?

Data Type

Example Value

Yes

String value or pattern or column reference (String)

'Hello'

string_replacement

Value with which to replacement any matched patterns. Value can be a string, a function returning string values, or a column reference containing strings.

  • String literals must be quoted ('Hello, World').

  • Column reference to any type that can be inferred as a string, which encompasses all values.

  • Multiple values and wildcards are not supported.

Usage Notes:

Required?

Data Type

Example Value

Yes

String literal, column reference (String, Array, or Object), or function returning String value

'##REDACTED##'

ignore_case

When true, matches are case-insensitive. Default is false.

Note

This argument is not required. By default, matches are case-sensitive.

Usage Notes:

Required?

Data Type

Example Value

No

String value

'false'

pattern_before

String literal or pattern to find in a position before the pattern to match.

Note

This argument is not permitted when string_ pattern or string_replacement is of column data type.

Astuce

Use this argument if there are potentially multiple instances of the pattern to match in the source.

Usage Notes:

Required?

Data Type

Example Value

No

String literal or pattern

`{digit}{3}`

pattern_after

String literal or pattern to find in a position after the pattern to match.

Note

This argument is not permitted when string_ pattern or string_replacement is of column data type.

Astuce

Use this argument if there are potentially multiple instances of the pattern to match in the source.

Usage Notes:

Required?

Data Type

Example Value

No

String literal or pattern

' '

Examples

Astuce

For additional examples, see Common Tasks.

Example - Partial obfuscation of credit card numbers

Source:

Suppose you have the following transactional data, which contains customer credit card numbers.

TransactionId

CreditCardNum

AmtDollars

T001

4111-1111-1111-1111

100.29

T002

5500-0000-0000-0004

510.21

T003

3400-0000-0000-009

162.13

T004

3000-0000-0000-04

294.12

For security purposes, you wish to redact the first three sets of digits, so only the last set of digits appears.

Transformation:

To make the substitution, you must first change the type of the column to be a string:

Transformation Name

Change column data type

Parameter: Columns

CreditCardNum

Parameter: New type

'String'

You can then use the following transformation to perform the pattern-based replacement of four-digit sets that end in a dash with XXXX:

Transformation Name

Edit column with formula

Parameter: Columns

CreditCardNum

Parameter: Formula

substitute(CreditCardNum, `{digit}+\-`, 'XXXX-')

To indicate that the column no longer contains valid information, you might choose to rename it like in the following:

Transformation Name

Rename columns

Parameter: Option

Manual rename

Parameter: Column

CreditCardNum

Parameter: New column name

'CreditCardNumOBSCURED'

Results:

TransactionId

CreditCardNumOBSCURED

AmtDollars

T001

XXXX-XXXX-XXXX-1111

100.29

T002

XXXX-XXXX-XXXX-0004

510.21

T003

XXXX-XXXX-XXXX-009

162.13

T004

XXXX-XXXX-XXXX-04

294.12