The Heat Plot tool uses a heat plot color map to show the joint distribution of two variables, each of which is either a continuous numeric variable or an ordered category (a categorical variable with a natural order, such as income group or educational attainment level). For example, the tool can show the joint distribution of customer satisfaction and customer tenure, highlighting potential problem and success hot spots with respect to how long a customer has been with the company.
This tool uses the R programming language. Go to Options > Download Predictive Tools to install R and the packages used by the R Tool.
Input
An Alteryx data stream that includes two numeric fields. Each field can be a continuous numeric variable or an ordered category. Ordered categories must be coded as consecutive integers, starting at 1 for the lowest category.
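As an illustration of the integer coding described above, here is a minimal Python sketch that could be adapted for preparing data before it reaches the tool. The category ordering, the sample records, and the `encode_ordered` helper are all hypothetical and are not part of the Heat Plot tool.

```python
# Hypothetical category order, lowest to highest (not taken from the tool).
EDUCATION_ORDER = [
    "Elementary school",
    "High school",
    "Some college",
    "Bachelor's degree",
    "Graduate degree",
]


def encode_ordered(values, order):
    """Map each category label to a consecutive integer code starting at 1."""
    codes = {label: i for i, label in enumerate(order, start=1)}
    unknown = sorted(set(values) - set(codes))
    if unknown:
        raise ValueError(f"Labels missing from the ordering: {unknown}")
    return [codes[v] for v in values]


# Made-up sample records for illustration only.
records = ["High school", "Graduate degree", "Elementary school", "High school"]
encoded = encode_ordered(records, EDUCATION_ORDER)
print(encoded)  # [2, 5, 1, 2]
```

Raising an error on unknown labels, rather than silently skipping them, keeps the codes consecutive and avoids gaps that would not match the category labels entered later.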
Configuration Properties
There are four tabs to configure in the Heat Plot tool: Field Selection, Density Estimation, Heat Plot Options, and Graphics Options.
Field Selection
Select the horizontal (x) variable for the chart: Select the numeric field that will act as the horizontal axis of the chart.
Select the vertical (y) variable for the chart: Select the numeric field that will act as the vertical axis of the chart.
Density Estimation
Bandwidth for the smoother: The window size over which the joint density between the fields is assessed. "Auto" (the default and preferred option) calculates a bandwidth based on the ranges of the two fields. Otherwise, enter a number that does not exceed the lesser of the ranges of the two variables in the plot. For example, if one field has values from 1 to 20 and the other has values from 1 to 10, the bandwidth should be smaller, likely much smaller, than 10. Too small a value picks up only areas with very high concentrations of observations (obscuring the underlying relationship), while too large a value results in "over-smoothing", implying a misleading level of density in some areas of the plot.
The number of grid points in each direction: The smoothing is done over a grid of points in each direction. Small grid sizes result in blocky plots, while larger values result in smoother plots. The default is 25; commonly used values are 25, 50, 75, and 100 points.
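To make the roles of the bandwidth and the grid size more concrete, the following is a rough Python sketch of a two-dimensional Gaussian kernel density estimate evaluated on a regular grid. It is not the tool's R implementation, which this page does not describe. It assumes the bandwidth acts as the standard deviation of a Gaussian kernel shared by both axes and that the grid spans the observed range of each field. The function names and sample data are hypothetical.

```python
import math


def check_bandwidth(x, y, bandwidth):
    """Reject a manual bandwidth that exceeds the lesser of the two field ranges."""
    limit = min(max(x) - min(x), max(y) - min(y))
    if not 0 < bandwidth <= limit:
        raise ValueError(f"bandwidth must be in (0, {limit}], got {bandwidth}")
    return bandwidth


def linspace(lo, hi, n):
    """Return n evenly spaced points from lo to hi inclusive (n >= 2)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def grid_density(x, y, bandwidth, grid_points=25):
    """Evaluate a Gaussian kernel density estimate on a square grid.

    Returns (gx, gy, density), where density[i][j] is the estimate
    at the grid location (gx[i], gy[j]).
    """
    h = check_bandwidth(x, y, bandwidth)
    gx = linspace(min(x), max(x), grid_points)
    gy = linspace(min(y), max(y), grid_points)
    # Normalizing constant of a 2-D Gaussian kernel averaged over all points.
    norm = 1.0 / (len(x) * 2.0 * math.pi * h * h)
    density = []
    for px in gx:
        row = []
        for py in gy:
            total = sum(
                math.exp(-((px - xi) ** 2 + (py - yi) ** 2) / (2.0 * h * h))
                for xi, yi in zip(x, y)
            )
            row.append(norm * total)
        density.append(row)
    return gx, gy, density


# Made-up data: customer tenure (x) and satisfaction score (y).
tenure = [1, 2, 2, 3, 8, 9, 9, 10]
satisfaction = [1, 2, 1, 2, 4, 5, 4, 5]
gx, gy, density = grid_density(tenure, satisfaction, bandwidth=1.0, grid_points=25)
print(len(density), len(density[0]))  # 25 25
```

Raising `grid_points` adds more evaluation locations and so reduces blockiness, while raising `bandwidth` spreads each observation's influence over a wider area. In this sketch the cost grows with the number of observations times the square of the grid size, which is one reason to keep the grid modest.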
Heat Plot Options
Main plot title: The main title for the plot.
New horizontal field name (optional): This option allows for an alternative, more descriptive, name for the field on the horizontal axis of the plot.
New vertical field name (optional): This option allows for an alternative, more descriptive, name for the field on the vertical axis of the plot.
Color palette for the plot: This option controls the color theme for the plot. Five different choices are offered, with the appropriate selection being a matter of user taste.
The number of color levels: This option controls the gradation in color levels across the selected color palette. The default is 100; values of 100, 125, 150, 175, and 200 are commonly used. The higher the number, the smoother the plot will appear.
Are the variables actually ordered categories?: Often the underlying numbers (typically integers) represent ordered categories rather than true numeric values (e.g., categories of educational attainment such as elementary school, high school, some college, bachelor's degree, graduate degree). These fields can be coded as consecutive integers, and in the plot, the integers can be replaced with category labels. To do this, check this box, and then enter the desired labels as a comma-separated string (ordered from lowest to highest) for one or both of the variables. The number of labels must match the number of unique values for each variable, or an error will be issued.
Include a plot key?: Checking this box adds a key on the right-hand side of the plot that indicates how the colors in the plot translate to the observed density level. When this box is checked, the user can provide a short descriptive title for the key (the default is "Density") and can choose to label the low and high points of the key with text descriptions instead of the default numeric values.
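The label-count rule for ordered categories described above can be checked before running the tool. The Python sketch below assumes labels are split on commas and paired with the sorted unique integer codes. The `parse_labels` helper and the sample values are hypothetical, and the tool's own error message may differ.

```python
def parse_labels(label_string, values):
    """Split a comma-separated label string and pair it with the sorted unique codes."""
    labels = [part.strip() for part in label_string.split(",")]
    codes = sorted(set(values))
    if len(labels) != len(codes):
        raise ValueError(
            f"Expected {len(codes)} labels (one per unique value), got {len(labels)}"
        )
    return dict(zip(codes, labels))


# Made-up integer codes for a three-level ordered category.
satisfaction_codes = [1, 2, 2, 3, 1]
mapping = parse_labels("Low, Medium, High", satisfaction_codes)
print(mapping)  # {1: 'Low', 2: 'Medium', 3: 'High'}
```

Because the labels are matched to the sorted codes, they must be listed from lowest to highest, as the option description requires.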
Graphics Options
Plot size: Select the width and height dimensions of the resulting plot, using either inches or centimeters.
Graph resolution: Select the resolution of the graph in dots per inch: 1x (96 dpi); 2x (192 dpi); or 3x (288 dpi). Lower resolution creates a smaller file and is best for viewing on a monitor. Higher resolution creates a larger file with better print quality.
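The trade-off between resolution and file size follows from simple arithmetic: the pixel dimensions are the plot size in inches multiplied by the dots per inch. The Python sketch below uses the three resolution settings listed above. The `pixel_size` helper and the rounding to whole pixels are assumptions made for illustration.

```python
CM_PER_INCH = 2.54
# Resolution settings listed in the Graphics Options above.
DPI_BY_SCALE = {"1x": 96, "2x": 192, "3x": 288}


def pixel_size(width, height, units="inches", scale="1x"):
    """Return the (width, height) of the plot in pixels, rounded to whole pixels."""
    if units not in ("inches", "centimeters"):
        raise ValueError(f"Unsupported units: {units}")
    dpi = DPI_BY_SCALE[scale]
    if units == "centimeters":
        width, height = width / CM_PER_INCH, height / CM_PER_INCH
    return round(width * dpi), round(height * dpi)


print(pixel_size(6, 4, "inches", "1x"))  # (576, 384)
print(pixel_size(6, 4, "inches", "3x"))  # (1728, 1152)
```

Going from 1x to 3x triples each dimension, so the image holds nine times as many pixels, which is why higher resolutions produce larger files.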
Output
A graph of the heat plot is output for use in reporting.