Contingency Table Tool
One Tool Example
Contingency Table has a One Tool Example. Visit Sample Workflows to learn how to access this and many other examples directly in Alteryx Designer.
Use Contingency Table to examine up to four variables (fields) and determine how they relate to each other. Its use is similar to that of the Frequency Table tool. The tool produces two outputs. The data output lists every combination of values across the selected fields, with a frequency column and a percent column. The report output shows the combinations of values in tables and also includes row and column percentages.
If you are analyzing only two fields, you can also choose to include the chi-square statistic in the report. A chi-square statistic is used to investigate whether distributions of categorical variables differ from one another.
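The data output can be pictured with a short Python sketch. This is an illustration only, not the tool's implementation: the function name `contingency_rows` and the sample records are hypothetical, and the sketch lists only combinations that actually occur in the data, which is an assumption about how the tool treats empty combinations.

```python
from collections import Counter

def contingency_rows(records, fields):
    """Return one row per observed combination of values, with Frequency and Percent."""
    # Count how often each combination of the selected field values occurs.
    counts = Counter(tuple(rec[f] for f in fields) for rec in records)
    total = sum(counts.values())
    rows = []
    for combo, freq in sorted(counts.items()):
        row = dict(zip(fields, combo))
        row["Frequency"] = freq
        row["Percent"] = freq / total * 100  # (Frequency / Total Records) * 100
        rows.append(row)
    return rows

# Hypothetical sample data.
records = [
    {"Region": "East", "Plan": "Basic"},
    {"Region": "East", "Plan": "Basic"},
    {"Region": "East", "Plan": "Premium"},
    {"Region": "West", "Plan": "Basic"},
]
rows = contingency_rows(records, ["Region", "Plan"])
for row in rows:
    print(row)
```

Passing three or four field names instead of two would produce one row per combination of all selected fields.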
R must be installed for this option to run successfully. Go to Options > Download Predictive Tools and sign in to the Alteryx Downloads and Licenses portal to install R and the packages used by the R tool. Visit Download and Use Predictive Tools.
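To make the chi-square statistic concrete, the following Python sketch computes the Pearson statistic and degrees of freedom for two categorical fields. It is a minimal illustration under stated assumptions, not the R code the tool runs: the function name `chi_square` and the sample data are hypothetical, and no continuity correction is applied, so results for a 2x2 table may differ from what the tool reports.

```python
from collections import Counter

def chi_square(pairs):
    """Pearson chi-square statistic and degrees of freedom for (value1, value2) pairs."""
    observed = Counter(pairs)
    row_totals = Counter(a for a, _ in pairs)
    col_totals = Counter(b for _, b in pairs)
    n = len(pairs)
    stat = 0.0
    for a in row_totals:
        for b in col_totals:
            # Expected count if the two fields were independent.
            expected = row_totals[a] * col_totals[b] / n
            stat += (observed[(a, b)] - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Hypothetical sample data: 100 records, two regions by three plans.
pairs = (
    [("East", "Basic")] * 30 + [("East", "Standard")] * 10 + [("East", "Premium")] * 10
    + [("West", "Basic")] * 10 + [("West", "Standard")] * 10 + [("West", "Premium")] * 30
)
stat, df = chi_square(pairs)
print(stat, df)
```

Turning the statistic into a p-value requires the chi-square distribution with `df` degrees of freedom, which the standard library does not provide; a statistics package would be needed for that step.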
Configure the Tool
- Include chi-squared statistic: A chi-square (χ²) statistic is used to investigate whether distributions of categorical variables differ from one another. The statistic is included in the report output. Select the two fields to analyze via Variable 1 and Variable 2.
- Do not include chi-squared statistic: At least two fields and up to four fields may be selected.
When you select fields for either option, these rules apply:
- Each variable must have unique values. If the values are not unique across the fields, an error will be thrown.
- Certain field types cannot be selected: FixedDecimal, Float, Double, Date, Time, DateTime, Blob, and SpatialObj. Integer field types are allowed but should only be used if the field is truly categorical.
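The selection rules above can be summarized as a small Python check. This is a hedged sketch, not part of Alteryx Designer: `check_selection` is a hypothetical helper, the integer type names in `INTEGER_TYPES` are an assumption, and the rule about unique values is not modeled.

```python
# Field types the rules above list as not selectable.
DISALLOWED_TYPES = {
    "FixedDecimal", "Float", "Double", "Date", "Time", "DateTime", "Blob", "SpatialObj",
}
# Assumed names for integer field types: allowed, but worth a warning.
INTEGER_TYPES = {"Byte", "Int16", "Int32", "Int64"}

def check_selection(fields, include_chi_squared):
    """fields maps field name -> type name. Returns (errors, warnings)."""
    errors, warnings = [], []
    if include_chi_squared:
        if len(fields) != 2:
            errors.append("chi-squared option needs exactly two fields")
    elif not 2 <= len(fields) <= 4:
        errors.append("select at least two and at most four fields")
    for name, type_name in fields.items():
        if type_name in DISALLOWED_TYPES:
            errors.append(f"{name}: type {type_name} cannot be selected")
        elif type_name in INTEGER_TYPES:
            warnings.append(f"{name}: integer field, use only if truly categorical")
    return errors, warnings

# Hypothetical selection of three fields without the chi-squared option.
errors, warnings = check_selection(
    {"Region": "String", "StoreCount": "Int32", "Revenue": "Double"},
    include_chi_squared=False,
)
print(errors)
print(warnings)
```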
View the Output
- D anchor: Data output includes these fields:
| Name | Description |
| --- | --- |
| InputField_SelectedField1 (2, 3, 4) | Original field name of the input data. Depending on how many fields are selected, InputField_SelectedField3 and InputField_SelectedField4 might not be present. The SelectedField part of each name is replaced with the actual selected field name. |
| Frequency | Count of times the value is present in the input data for the given Field Name. |
| Percent | (Frequency / Total Records) * 100 |
- R anchor: Report Output includes a Contingency table for each field selected.
The first record in this output shows any warnings for field types: if any of the selected fields has a numeric data type, a warning is shown. The rest of the report shows a contingency table for each combination of field values. The header of each table shows the fields that were selected and the values of any fields that are not shown in the table. Each table also shows a Total column and rows for Frequency, Percent, Row Percent, and Column Percent.
If the chi-square statistic option is selected, these values are displayed underneath the table:
- Chi-squared: The calculated chi-square value.
- df: Degrees of freedom.
- p-value: The p-value returned by R. The lower the p-value, the stronger the evidence that the variables are dependent on each other.
- I anchor: Interactive Output includes a chart where the viewer can customize what displays with a series of dropdown options.
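The Frequency, Percent, Row Percent, and Column Percent rows of a report table can be sketched in Python for the two-field case. The definitions used here are the conventional ones (cell count divided by the grand total, the row total, and the column total); that the tool computes them exactly this way is an assumption, and `cell_percentages` and the sample data are hypothetical.

```python
from collections import Counter

def cell_percentages(pairs):
    """Frequency and percentages for each (row value, column value) cell."""
    counts = Counter(pairs)
    row_totals = Counter(r for r, _ in pairs)
    col_totals = Counter(c for _, c in pairs)
    n = len(pairs)
    cells = {}
    for (r, c), freq in counts.items():
        cells[(r, c)] = {
            "Frequency": freq,
            "Percent": freq / n * 100,                     # share of all records
            "Row Percent": freq / row_totals[r] * 100,     # share of the cell's row
            "Column Percent": freq / col_totals[c] * 100,  # share of the cell's column
        }
    return cells

# Hypothetical sample data: 8 records.
pairs = (
    [("East", "Basic")] * 3 + [("East", "Premium")] * 1
    + [("West", "Basic")] * 1 + [("West", "Premium")] * 3
)
cells = cell_percentages(pairs)
for key, values in sorted(cells.items()):
    print(key, values)
```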