
Contingency Table Tool

The Contingency Table tool serves a similar purpose to the Frequency Table tool, but instead of examining each field individually, it examines up to four fields (variables) and how they relate to each other. The data output lists every combination of values across the selected fields, with a frequency and a percent column. The report output presents tables of the same combinations of values and adds row and column percentages.

When analyzing exactly two fields, the user can also choose to include the chi-square statistic in the report. A chi-square (χ²) statistic is used to investigate whether the distributions of categorical variables differ from one another. R must be installed for this option to run successfully.

This tool uses the R programming language. Go to Options > Download Predictive Tools to install R and the packages used by the R Tool.

Configuration Properties

When selecting fields for either option, the following rules apply:


D Output: Data output includes the following fields:

Name Description
InputField_SelectedField1 (2, 3, 4) Original field name of the input data.

Depending on how many fields are selected, InputField_SelectedField3 and InputField_SelectedField4 may not be present. The SelectedField placeholder in each name is replaced with the actual selected field name.

Frequency Number of times the given combination of field values occurs in the input data.
Percent (Frequency / Total Records) * 100
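
The Frequency and Percent columns can be illustrated outside the tool with a small Python sketch. This is not the tool's own implementation (which runs in R); the sample records, the field names Region and Plan, and the helper combination_frequencies are all hypothetical. The sketch also assumes that only combinations actually observed in the data are listed.

```python
from collections import Counter

# Hypothetical input records; "Region" and "Plan" are illustrative field names.
records = [
    {"Region": "East", "Plan": "Basic"},
    {"Region": "East", "Plan": "Basic"},
    {"Region": "East", "Plan": "Premium"},
    {"Region": "West", "Plan": "Basic"},
    {"Region": "West", "Plan": "Premium"},
    {"Region": "West", "Plan": "Premium"},
    {"Region": "West", "Plan": "Premium"},
    {"Region": "East", "Plan": "Basic"},
]
selected_fields = ["Region", "Plan"]  # the tool accepts up to four fields


def combination_frequencies(rows, fields):
    """Return one output row per observed combination of field values."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    total = len(rows)
    output = []
    for combo, freq in sorted(counts.items()):
        # Column names follow the InputField_<field name> pattern described above.
        out_row = {f"InputField_{f}": v for f, v in zip(fields, combo)}
        out_row["Frequency"] = freq
        out_row["Percent"] = freq / total * 100  # (Frequency / Total Records) * 100
        output.append(out_row)
    return output


data_output = combination_frequencies(records, selected_fields)
for row in data_output:
    print(row)
```

With these eight sample records, the East/Basic combination occurs three times, so its Percent value is 37.5.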

R Output: Report Output includes a Contingency table for each field selected.

The first record in this output shows any warnings about field types: if any of the selected fields has a numeric data type, a warning is shown. The rest of the report shows a contingency table for each combination of field values. The header of each table shows the fields selected by the user and the values of any fields that are not shown in the table. Each table also includes a Total column and rows for Frequency, Percent, Row Percent, and Column Percent.
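
As a rough illustration of how the four rows in each table cell relate to one another, the following Python sketch computes them for a two-field table. The sample counts and the helper cell_statistics are hypothetical, and the sketch assumes every row and column contains at least one record (otherwise a percentage would divide by zero).

```python
# Hypothetical two-field frequency table: outer keys are values of the row
# field, inner keys are values of the column field.
table = {
    "East": {"Basic": 3, "Premium": 1},
    "West": {"Basic": 1, "Premium": 3},
}


def cell_statistics(freq_table):
    """Compute Frequency, Percent, Row Percent, and Column Percent per cell."""
    row_totals = {r: sum(cols.values()) for r, cols in freq_table.items()}
    col_totals = {}
    for cols in freq_table.values():
        for c, n in cols.items():
            col_totals[c] = col_totals.get(c, 0) + n
    grand_total = sum(row_totals.values())

    stats = {}
    for r, cols in freq_table.items():
        for c, n in cols.items():
            stats[(r, c)] = {
                "Frequency": n,
                "Percent": n / grand_total * 100,        # share of all records
                "Row Percent": n / row_totals[r] * 100,  # share of the row total
                "Column Percent": n / col_totals[c] * 100,  # share of the column total
            }
    return stats


cell_stats = cell_statistics(table)
print(cell_stats[("East", "Basic")])
```

Percent divides by the grand total, while Row Percent and Column Percent divide by the totals of the cell's own row and column.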

If the chi-square statistic option is selected, the following values are displayed underneath the table: Chi-squared, df, and p-value. Chi-squared is the calculated chi-square value, df is the degrees of freedom, and p-value is the probability value returned by R. The lower the p-value, the stronger the evidence that the variables are dependent on each other.
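
To make the three reported values concrete, here is a minimal Python sketch of a Pearson chi-square test of independence on a two-field table. It is an illustration only: the tool itself delegates this calculation to R, and the source does not state whether a continuity correction is applied, so the sketch assumes none. The function names chi_square_test and chi_square_sf and the sample counts are hypothetical, and the sketch assumes every row and column has at least one record.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction series.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        tail += term
        term *= x / (2 * r + 1)
    return tail


def chi_square_test(observed):
    """Return (chi-squared, df, p-value) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the two fields were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df, chi_square_sf(statistic, df)


# Hypothetical 2 x 2 table of observed counts.
chi_squared, df, p_value = chi_square_test([[3, 1], [1, 3]])
print(chi_squared, df, p_value)
```

For this sample table every expected count is 2, which gives a chi-squared value of 2.0 with 1 degree of freedom. Counts this small are used only to keep the arithmetic easy to follow; the chi-square approximation is generally considered unreliable when expected counts are low.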

I Output: Interactive Output includes a chart that the viewer can customize using a series of drop-down options.