Skip to main content

Countpattern Transform

Note

Transforms are a part of the underlying language, which is not directly accessible to users. This content is maintained for reference purposes only. For more information on the user-accessible equivalent to transforms, see Transformation Reference.

Counts the number of instances of a specified pattern in a column and writes that value into a newly generated column. Source column is unchanged.

Basic Usage

countpattern col: myCol on: 'honda'

Output: Generates a new column containing the number of instances of the string honda that appear in each row of the column, myCol.

Syntax and Parameters

countpattern col:column_ref [ignoreCase:true|false] [after:start_point | from: start_point] [before:end_point | to:end_point] [on:'exact_match']

Token

Required?

Data Type

Description

countpattern

Y

transform

Name of the transform

col

Y

string

Source column name

ignoreCase

N

boolean

If true, matching is case-insensitive.

Matching parameters:

Note

At least one of the following parameters must be included to specify the pattern to count: after, before, from, on, to.

Token

Required?

Data Type

Description

after

N

string

String literal or pattern that precedes the pattern to match

before

N

string

String literal or pattern that appears after the pattern to match

from

N

string

String literal or pattern that identifies the start of the pattern to match

on

N

string

String literal or pattern that identifies the pattern to match.

to

N

string

String literal or pattern that identifies the end of the pattern to match

For more information on syntax standards, see Language Documentation Syntax Notes.

col

Identifies the column to which to apply the transform. You can specify only one column.

countpattern col: MyCol on: 'MyString'

Output: Counts the number of instances of the value MyString in the MyCol column and writes this value to a new column.

Usage Notes:

Required?

Data Type

Yes

String (column name)

after

countpattern col: MyCol after: 'entry:'

Output: Counts 1 if there is anything that appears in the MyCol column value after the string entry:. If the value entry: does not appear in the column, the output value is 0.

A pattern identifier that precedes the value or pattern to match. Define the after parameter value using string literals, regular expressions, or Wrangle .

Usage Notes:

Required?

Data Type

No

String (string literal or pattern)

  • The after and from parameters are very similar. from includes the matching value as part of the extracted string.

  • after can be used with either to, on, or before. See Pattern Clause Position Matching

before

A pattern identifier that occurs after the value or pattern to match. Define the pattern using string literals, regular expressions, or Wrangle .

countpattern col: MyCol before: '|'

Output:

  • Counts 1 if there is a value that appears before the pipe character (|) in the MyCol column, and no other pattern parameter is specified. If the before value does not appear in the column, the output value is 0.

  • If another pattern parameter such as after is specified, the total count of instances is written to the new column.

Usage Notes:

Required?

Data Type

No

String or pattern

  • The before and to parameters are very similar. to includes the matching value as part of the extracted string.

  • before can be used with either from, on, or after. See Pattern Clause Position Matching.

from

Identifies the pattern that marks the beginning of the value to match. It can be a string literal, Wrangle , or regular expression. The from value is included in the match.

countpattern col: MyCol from: 'go:'

Output:

  • Counts 1 if contents from MyCol that occur from go:, to the end of the cell when no other pattern parameter is specified. If go: does not appear in the column, the output value is blank.

  • If another pattern parameter such as to is specified, the total count of instances is written to the new column.

Usage Notes:

Required?

Data Type

No

String or pattern

  • The after and from parameters are very similar. from includes the matching value as part of the extracted string.

  • from can be used with either to or before. See Pattern Clause Position Matching.

on

Identifies the pattern to match, which can be a string literal, Wrangle , or regular expression.

countpattern col: MyCol on: `###ERROR`

Tip

You can insert the Unicode equivalent character for this parameter value using a regular expression of the form /\uHHHH/. For example, /\u0013/ represents Unicode character 0013 (carriage return). For more information, see Supported Special Regular Expression Characters.

Usage Notes:

Required?

Data Type

No

String (literal, regular expression, or Wrangle

to

Identifies the pattern that marks the ending of the value to match. Pattern can be a string literal, Wrangle , or regular expression. The to value is included in the match.

countpattern col:MyCol from:'note:' to: `/`

Output:

  • Counts instances from MyCol column of all values that begin with note: up to a backslash character.

  • If a second pattern parameter is not specified, then this value is either 0 or 1.

Usage Notes:

Required?

Data Type

No

String or pattern

  • The before and to parameters are very similar. to includes the matching value as part of the extracted string.

  • to can be used with either from or after. See Pattern Clause Position Matching.

ignoreCase

Indicates whether the match should ignore case or not.

  • Set to true to ignore case matching.

  • (Default) Set to false to perform case-sensitive matching.

countpattern col: MyCol on: 'My String' ignoreCase: true

Output: Counts the instances of the following values if they appear in the MyCol column: My String, my string, My string, etc.

Usage Notes:

Required?

Data Type

No

Boolean

Examples

Tip

For additional examples, see Common Tasks.

Example - counting patterns in tweets

This example demonstrates how to count the number of occurrences of text patterns in a column.

Functions:

Item

Description

IF Function

TheIFfunction allows you to build if/then/else conditional logic within your transforms.

Source:

The dataset below contains fictitious tweet information shortly after the release of an application called, "Myco ExampleApp".

Date

twitterId

isEmployee

tweet

11/5/15

lawrencetlu38141

FALSE

Just downloaded Myco ExampleApp! Transforming data in 5 mins!

11/5/15

petramktng024

TRUE

Try Myco ExampleApp, our new free data wrangling app! See www.example.com.

11/5/15

joetri221

TRUE

Proud to announce the release of Myco ExampleApp, the free version of our enterprise product. Check it out at www.example.com.

11/5/15

datadaemon994

FALSE

Great start with Myco ExampleApp. Super easy to use, and actually fun.

11/5/15

99redballoons99

FALSE

Liking this new ExampleApp! Good job, guys!

11/5/15

bigdatadan7182

FALSE

@support, how can I find example datasets for use with your product?

There are two areas of analysis:

  • For non-employees, you want to know if they are mentioning the new product by name.

  • For employees, you want to know if they are including cross-references to the web site as part of their tweet.

Transformation:

The following counts the occurrences of the string ExampleApp in the tweet column. Note the use of the ignoreCase parameter to capture capitalization differences:

Transformation Name

Count matches

Parameter: Column

tweet

Parameter: Option

Text or pattern

Parameter: Text or pattern to count

'ExampleApp'

Parameter: Ignore case

true

For non-employees, you want to track if they have mentioned the product in their tweet:

Transformation Name

New formula

Parameter: Formula type

Single row formula

Parameter: Formula

if(isEmployee=='FALSE' && countpattern_tweet=='1',true,false)

Parameter: New column name

'nonEmployeeExampleAppMentions'

The following counts the occurrences of example.com in their tweets:

Transformation Name

Count matches

Parameter: Column

tweet

Parameter: Option

Text or pattern

Parameter: Text or pattern to count

'example.com'

Parameter: Ignore case

true

For employees, you want to track if they included the above cross-reference in their tweets:

Transformation Name

New formula

Parameter: Formula type

Single row formula

Parameter: Formula

if(isEmployee=='TRUE' && countpattern_tweet1 == 1, true, false)

Parameter: New column name

'employeeWebsiteCrossRefs'

Results:

After you delete the two columns tabulating the counts, you end up with the following:

Date

twitterId

isEmployee

tweet

employeeWebsiteCrossRefs

nonEmployeeExampleAppMentions

11/5/15

lawrencetlu38141

FALSE

Just downloaded Myco ExampleApp! Transforming data in 5 mins!

false

true

11/5/15

petramktng024

TRUE

Try Myco ExampleApp, our new free data wrangling app! See www.example.com.

true

false

11/5/15

joetri221

TRUE

Proud to announce the release of Myco ExampleApp, the free version of our enterprise product. Check it out at www.example.com.

true

false

11/5/15

datadaemon994

FALSE

Great start with Myco ExampleApp. Super easy to use, and actually fun.

false

true

11/5/15

99redballoons99

FALSE

Liking this new ExampleApp! Good job, guys!

false

true

11/5/15

bigdatadan7182

FALSE

@support, how can I find example datasets for use with your product?

false

false