Summarize In-DB Tool

The Summarize In-DB tool summarizes data by grouping, summing, counting, counting distinct fields, and more. The output contains only the result of the calculations.

In-Database enables blending and analysis against large sets of data without moving the data out of a database and can provide significant performance improvements over traditional analysis methods. For more about the In-Database tool category, see In-Database Overview.

Configure the tool

Data fields from the input appear in the Fields section. Click to select the field to perform summaries on (Shift + click to select multiple fields to execute the same summary).

  • Use the Select menu to the right to Select to make multiple field selections Choices include:
    • All: all fields are selected to apply to actions.
    • None: deselects all fields.
    • Numeric: only numeric fields are selected (integers, fixed decimals, floats, doubles) to apply to Actions.
    • String: only string fields are selected to apply to Actions.
    • Spatial: only spatial fields are selected to apply to Actions.

With the field(s) selected, click the Add button.

Make the selection and it will appear in the Actions section. Different summary functions are available depending on the type of data field selected.

  • Summarize functions include:
    • Group by: Combines database records with identical values in a specified field into a single record. All of the resulting data from the records in a group are then summarized. (any non-blob or spatial object has this option)

      If no Group by field is specified, the entire file will be summarized.

    • Count: Count of records in the group.
    • Count Distinct: Count of unique records in the group.
    • Count Non Null: Count of unique records in the group that are not [Null]. A Null field means there is no value set for this field; this is different from having a zero or an empty string.
    • Count Null: Count of unique records on the group that are [Null].
    • Min: Returns minimum value.
    • Max:Returns the maximum value.
    • Numeric summarize functions include:
      • Sum: Returns the sum value for the group. The sum is calculated by adding all of the values of a group.
      • Average: Calculates an average value for the group. The average is calculated by taking the sum of all values divided by the total number of values.
      • Standard Deviation: Calculates the standard deviation for the group. Standard Deviation is a measurement variability used in statistics.
      • Variance: Calculates the Variance for the group. The variance is calculated by taking the Standard Deviation and multiplying it times itself, StdDev^2.

    From the Actions section, you can select the field and use the up, down, and delete buttons to specify field order for the output.

    Rename a field by typing a new field name into the Output Field Name column.

    Properties: Additional properties need specification for certain actions. Actions with additional properties specification include: Concatenate Strings and Finance actions.