Field Summary Tool

Version: Current
Last modified: May 13, 2020

Use the Field Summary tool to analyze data and create a summary report containing descriptive statistics of data in selected columns. Use the tool to gain insight into data and receive recommendations for managing data.

This tool supports numeric, string, spatial, and date/time data. A unique set of descriptive statistics is provided for each data type. See Data Types for more information.

Configure the Tool

Select the fields to produce summary info: Select the check box for each field (column) that you want summary information for.

Sample input data: Check this check box to take a random sample of records. This can reduce the run time of your workflow if you have a large dataset. Each time you run your workflow, a different data sample is displayed. You have 2 sampling options:

  • Random N Records: Specify the Number of Records that you want to randomly sample.
  • Random N% of Records: Specify the Percent of Records that you want to randomly sample.
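
The two sampling options can be illustrated outside of Alteryx with a minimal Python sketch. This is not the tool's implementation: the function names, the optional seed parameter, and the choice to sample without replacement are assumptions made for illustration.

```python
import random


def sample_n_records(records, n, seed=None):
    """Pick n records at random without replacement (hypothetical helper).

    If n exceeds the number of records, all records are returned.
    """
    rng = random.Random(seed)
    return rng.sample(records, min(n, len(records)))


def sample_percent_of_records(records, percent, seed=None):
    """Pick roughly `percent` percent of the records at random (hypothetical helper)."""
    rng = random.Random(seed)
    # Clamp the sample size so it always stays between 0 and len(records).
    n = min(len(records), max(0, round(len(records) * percent / 100)))
    return rng.sample(records, n)


records = list(range(1000))
print(len(sample_n_records(records, 50)))           # Random N Records
print(len(sample_percent_of_records(records, 10)))  # Random N% of Records
```

With no seed, each call draws a different sample, which mirrors the behavior described above where each workflow run uses a different data sample.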

View the Output

The Field Summary tool has 3 outputs:

  • O anchor: An Alteryx data stream with descriptive statistics for selected columns along with recommendations, in the Remarks column, for managing data.
  • R anchor: A static report with a scatterplot and descriptive statistics for selected columns along with recommendations (see Remarks) for managing data in a column. To view the report, add a Browse tool and connect it to the R output. See Browse Tool.
  • I anchor: An interactive dashboard consisting of expandable panels for each column. Each panel consists of a histogram or column chart and summary statistics. To view the dashboard, add a Browse tool and connect it to the I output. See Browse Tool.
    • Hover over a panel to display additional icons.
      • Select the information icon to view additional information.
      • Select the expand icon to open the report in a detailed view.
    • Hover over a bar in the plot to display details.
    • Use Select variable to view to focus on a smaller set of columns.
    • Sort the panels alphabetically or by the percentage of missing values.

The descriptive statistics available in the output depend on the type of data in the columns selected for analysis. Results are listed horizontally. Scroll left to right to see statistics for each data type.

Numeric Data

If a column contains numeric data, these statistics are provided:

  • Min: The minimum value in the data.
  • Max: The maximum value in the data.
  • Median: The median value in the data.
  • Std. Dev.: The standard deviation, a measure of how dispersed the values are in the data.
  • Percent Missing: The percentage of values in the data that are null.
  • Unique Values: The number of unique values in the data.
  • Mean: The average of the data.
  • Layout: Add a Browse tool and connect it to the R output to view the statistics in a visual format. See Browse Tool.
  • Remarks: Recommendations for managing data, if available.
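
To make the numeric statistics concrete, here is a minimal Python sketch that computes them for one column. It is an illustration, not the tool's own calculation. The function name is hypothetical, nulls are represented as None, and two details are assumptions: the sample standard deviation is used, and nulls are excluded from the unique count.

```python
import statistics


def summarize_numeric(values):
    """Compute descriptive statistics for a numeric column (hypothetical helper).

    Expects at least two non-null values; None marks a null.
    """
    present = [v for v in values if v is not None]
    return {
        "Min": min(present),
        "Max": max(present),
        "Median": statistics.median(present),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "Std. Dev.": statistics.stdev(present),
        "Percent Missing": 100 * (len(values) - len(present)) / len(values),
        # Assumption: nulls are not counted as a unique value.
        "Unique Values": len(set(present)),
        "Mean": statistics.fmean(present),
    }


summary = summarize_numeric([4, 8, None, 15, 16, 23, 42, 8])
for name, value in summary.items():
    print(f"{name}: {value}")
```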

String Data

If a column contains string data, these statistics are provided:

  • Percent Missing: The percentage of values in the data that are null.
  • Unique Values: The number of unique values in the data.
  • Shortest Value: The string value with the fewest characters in the data.
  • Longest Value: The string value with the most characters in the data.
  • Min Value Count: The number of values that equal the minimum value.
  • Max Value Count: The number of values that equal the maximum value.
  • Remarks: Recommendations for managing data, if available.
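
A similar sketch for string columns follows. The function name is hypothetical, and None stands in for a null. Min Value Count and Max Value Count are left out because the description above does not say how the minimum and maximum values are determined. When several strings tie on length, this sketch returns the first one it encounters, which is also an assumption.

```python
def summarize_strings(values):
    """Compute descriptive statistics for a string column (hypothetical helper).

    Expects at least one non-null value; None marks a null.
    """
    present = [v for v in values if v is not None]
    return {
        "Percent Missing": 100 * (len(values) - len(present)) / len(values),
        # Assumption: nulls are not counted as a unique value.
        "Unique Values": len(set(present)),
        # min/max with key=len compare strings by character count.
        "Shortest Value": min(present, key=len),
        "Longest Value": max(present, key=len),
    }


summary = summarize_strings(["apple", "fig", None, "banana", "fig"])
for name, value in summary.items():
    print(f"{name}: {value}")
```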

Spatial Data

If a column contains spatial data, these statistics are provided:

  • Percent Missing: The percentage of values in the data that are null.
  • Object Type: The type of spatial object (for example, Point or Polygon) in the data.
  • Avg Area (Sq Miles): The average area, in square miles, of the values in the data.
  • Avg Length (Miles): The average length, in miles, of the values in the data.
  • Avg Num Points: The average number of values in the data that are Points.
  • Remarks: Recommendations for managing data, if available.

Date/Time Data

If a column contains date/time data, these statistics are provided:

  • Percent Missing: The percentage of values in the data that are null.
  • Unique Values: The number of unique values in the data.
  • Latest Date: The latest date in the data, that is, the date furthest in the future.
  • Earliest Date: The earliest date in the data.
  • Interval: The interval of dates (for example, Monthly) in the data.
  • Remarks: Recommendations for managing data, if available.
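
The date/time statistics can be sketched the same way. The function name is hypothetical and None stands in for a null. Interval is omitted because the description above does not say how the tool infers it.

```python
from datetime import date


def summarize_dates(values):
    """Compute descriptive statistics for a date column (hypothetical helper).

    Expects at least one non-null value; None marks a null.
    """
    present = [v for v in values if v is not None]
    return {
        "Percent Missing": 100 * (len(values) - len(present)) / len(values),
        # Assumption: nulls are not counted as a unique value.
        "Unique Values": len(set(present)),
        "Latest Date": max(present),
        "Earliest Date": min(present),
    }


summary = summarize_dates(
    [date(2020, 1, 1), date(2020, 3, 1), None, date(2020, 2, 1)]
)
for name, value in summary.items():
    print(f"{name}: {value}")
```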