COUNTIF Function
Generates the count of rows in each group that meet a specific condition. Generated value is of Integer type.
Note
When added to a transformation, this function is applied to the current sample. If you change your sample or run the job, the computed values for this function are updated. Transformations that change the number of rows in subsequent recipe steps do not affect the values computed for this step.
To perform a simple count of rows without conditionals, use the COUNT
function. See COUNT Function.
Wrangle vs. SQL: This function is part of Wrangle, a proprietary data transformation language. Wrangle is not SQL. For more information, see Wrangle Language.
Basic Usage
countif(failed_deliveries >= 10)
Output:Returns the count of records in which the value of thefailed_deliveries
column is greater than or equal to 10.
Syntax and Arguments
countif(test_expression) [group:group_col_ref] [limit:limit_count]
Argument | Required? | Data Type | Description |
---|---|---|---|
test_expression | Y | string | Expression that is evaluated. Must resolve to |
For more information on syntax standards, see Language Documentation Syntax Notes.
For more information on the group
and limit
parameters, see Pivot Transform.
test_expression
This parameter contains the expression to evaluate. This expression must resolve to a Boolean (true
or false
) value.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
Yes | String expression that evaluates to | (LastName == 'Mouse' && FirstName == 'Mickey') |
Examples
Tip
For additional examples, see Common Tasks.
Example - COUNTIF Functions
This example demonstrates how to count the number of values within a group, based on a specified conditional test.
Functions:
Item | Description |
---|---|
COUNTIF Function | Generates the count of rows in each group that meet a specific condition. Generated value is of Integer type. |
COUNTAIF Function | Generates the count of non-null values for rows in each group that meet a specific condition. |
Source:
The following data identifies sales figures by salespeople for a week:
EmployeeId | Date | Sales |
---|---|---|
S001 | 1/23/17 | 25 |
S002 | 1/23/17 | 40 |
S003 | 1/23/17 | 48 |
S001 | 1/24/17 | 81 |
S002 | 1/24/17 | 11 |
S003 | 1/24/17 | 25 |
S001 | 1/25/17 | 9 |
S002 | 1/25/17 | 40 |
S003 | 1/25/17 | |
S001 | 1/26/17 | 77 |
S002 | 1/26/17 | 83 |
S003 | 1/26/17 | |
S001 | 1/27/17 | 17 |
S002 | 1/27/17 | 71 |
S003 | 1/27/17 | 29 |
S001 | 1/28/17 | |
S002 | 1/28/17 | |
S003 | 1/28/17 | 14 |
S001 | 1/29/17 | 2 |
S002 | 1/29/17 | 7 |
S003 | 1/29/17 | 99 |
Transformation:
You are interested in the count of dates during the week when each salesperson sold less than 50 units, not factoring the weekend. First, you try the following:
Transformation Name |
|
---|---|
Parameter: Row labels | EmployeeId |
Parameter: Values | COUNTIF(Sales < 50) |
Parameter: Max columns to create | 1 |
You notice, however, that the blank values, when employees were sick or had vacation, are being counted. Additionally, this step does not filter out the weekend. You must identify the weekend days using the WEEKDAY
function:
Transformation Name |
|
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | WEEKDAY(Date) |
Parameter: New column name | 'DayOfWeek' |
If DayOfWeek > 5
, then it is a weekend date. For further precision, you can use the COUNTAIF
function to remove the nulls:
Transformation Name |
|
---|---|
Parameter: Row labels | EmployeeId |
Parameter: Values | COUNTAIF(Sales, DayOfWeek<6) |
Parameter: Max columns to create | 1 |
The above counts the non-null values in Sales
when the day of the week is not a weekend day, as grouped by individual employee.
Results:
EmployeeId | countaif_Sales |
---|---|
S001 | 5 |
S002 | 4 |
S003 | 4 |