Pivot Data
A pivot table summaries data that is sourced from another table. Using pivot tables, you can calculate aggregating functions, such as sums, maximums, and averages for one or more columns of data.
Optionally, these sums can be performed across groups of values from one column and broken out in columns based on the values in another. In Dataprep by Trifacta, a pivot table is composed of the following basic elements:
Pivot table element | Description |
---|---|
Column labels | List of one or more columns whose values are represented as the columns in the generated pivot table. |
Row labels | List of one or more columns whose values become the rows in the generated pivot table. |
Values | Also known as facts, these values are one or more aggregation formulas, which are calculated in the following manner: "Show me the value of this formula computed by each row value for every value represented in the generated table." |
Note
If your aggregation does not include the kind of transformation listed above, in which the data is pivoted from rows into columns, you can use the Group By transformation and an aggregate function. See Create Aggregations.
Building a Pivot Table
Pivot tables are very powerful tools for summarizing and visualizing large-scale volumes of data. In Dataprep by Trifacta, search for pivot table
in the Search panel to create one.
Note
A pivot table completely replaces the source table. Data that is not captured in the pivot definition is lost.
Tip
In your flows, you may find it useful to create your pivot tables in independent recipes that are chained from your primary recipe.
Example Data
Pivot tables are perhaps best explained by example. The following table snippet captures transactional data from a number of stores for a range of products across a set of dates. Transactional values include total sales, quantity, and cost (POS_Sales
, POS_Qty
, and POS_Cost
):
Daily | Store_Nbr | POS_Sales | POS_Qty | POS_Cost | PRODUCT_DESC |
---|---|---|---|---|---|
2/8/13 | 1 | 70 | 7 | 4.97 | ACME LAWN GARDEN BAG CLEAR |
2/7/13 | 2 | 10.62 | 9 | 8.37 | ACME COOKIES CHOC CHIP |
2/7/13 | 2 | 0 | 0 | 0 | ACME SANDWICH BAG |
2/7/13 | 2 | 7.08 | 6 | 5.58 | ACME SODAS SALTED |
2/7/13 | 2 | 3.92 | 2 | 2.82 | ACME SCENTED OIL REFILL-CTRY SUN |
2/7/13 | 2 | 13.44 | 7 | 10.36 | ACME LARGE FUDGE GRAHAMS COOKIES |
2/7/13 | 2 | 0 | 0 | 0 | ACME SUGAR ICE WAFERS VANILLA |
2/7/13 | 3 | 3.16 | 2 | 2.86 | ACME ZOO ANIMAL FRUIT SNACKS 6'S |
2/7/13 | 3 | 3.16 | 2 | 2.78 | ACME WAFERS SUGER ICE |
2/7/13 | 3 | 3.16 | 2 | 2.82 | ACME SCENTED OIL REFILL-CTRY SUN |
2/7/13 | 3 | 6.32 | 4 | 5.92 | ACME RICE CRACKERS ONION |
2/2/13 | 9 | 150 | 30 | 16.2 | ACME FROSTED OATMEAL COOKIE SQUA |
2/2/13 | 9 | 3.5 | 2 | 4.86 | ACME FRUIT SNACK CASTLE ADVENTRS |
2/2/13 | 9 | 90 | 9 | 8.37 | ACME COOKIES CHOC CHIP |
2/2/13 | 9 | 30 | 6 | 3.24 | ACME ASSORTED COOKIES DRP |
2/2/13 | 9 | 70 | 7 | 6.51 | ACME KITCHEN BAG |
2/2/13 | 9 | 170 | 17 | 15.81 | ACME SNACK BAGS RESEALABLE |
2/2/13 | 9 | 20 | 4 | 2.16 | ACME CHEDDARY SN CRACKERS/PROCES |
2/2/13 | 9 | 6.5 | 2 | 8.98 | ACME RICE CRACKERS TERIYAKI |
2/2/13 | 9 | 1.5 | 3 | 1.62 | ACME COOKIE MAPLE LEAF CREME |
2/2/13 | 9 | 30 | 6 | 3.24 | ACME RICE CHIPS CHEDDAR |
2/1/13 | 7 | 190 | 38 | 20.52 | ACME FROSTED OATMEAL COOKIE SQUA |
2/1/13 | 7 | 20 | 2 | 1.86 | ACME COOKIES CHOC CHIP |
2/1/13 | 7 | 10 | 1 | 0.82 | ACME DIGESTIVE RICH TEA BISCUITS |
2/1/13 | 7 | 120 | 24 | 12.96 | ACME ASSORTED COOKIES DRP |
2/1/13 | 7 | 120 | 12 | 11.16 | ACME KITCHEN BAG |
2/1/13 | 7 | 90 | 9 | 8.37 | ACME SNACK BAGS RESEALABLE |
2/1/13 | 7 | 10 | 1 | 0.71 | ACME FUDGE MINT COOKIES SQUARES |
2/1/13 | 7 | 9.5 | 19 | 10.26 | ACME CHEDDARY SN CRACKERS/PROCES |
2/1/13 | 7 | 10 | 1 | 0.82 | ACME COOKIES MAPLE CREAM |
2/1/13 | 7 | 40 | 8 | 4.32 | ACME COOKIE MAPLE LEAF CREME |
Available Aggregations
The Pivot data transformation supports use of any aggregation function. For more information, see Aggregate Functions.
Simple Pivot Table
From the above, suppose you are interested in the sales from each store for each product. You can use the following transformation to compute these aggregated calculations:
Transformation Name |
|
---|---|
Parameter: Column labels | Store_Nbr |
Parameter: Row labels | PRODUCT_DESC |
Parameter: Values | SUM(POS_Sales) |
Parameter: Max number of columns to create | 500 |
In the above transformation:
The Column labels entry specifies the column whose values make up the calculated columns of the pivot table. The calculation is performed across each of these values. In this case, each column contains calculations for separate store numbers.
The Row labels entry specifies the column whose values define the grouping of the calculations. In this case, the sum of the sales column is performed for each product description value for each store.
The Values entry specifies the aggregation function to compute for each cell in the new table. In this case, you are generating the sum of sales for each product description in each store.
By default, this transformation generates a maximum of 50 new columns. However, if the column used for your Column labels contains more than 50 values, you may want to raise this value.
Note
Avoid creating datasets wider than 2500 columns. Very wide datasets can cause performance degradation.
Results:
PRODUCT_DESC | sum_POS_Sales_1 | sum_POS_Sales_2 | sum_POS_Sales_3 | sum_POS_Sales_7 | sum_POS_Sales_9 |
---|---|---|---|---|---|
ACME LAWN GARDEN BAG CLEAR | 70 | 0 | 0 | 0 | 0 |
ACME COOKIES CHOC CHIP | 0 | 10.62 | 0 | 20 | 90 |
ACME SANDWICH BAG | 0 | 0 | 0 | 0 | 0 |
ACME SODAS SALTED | 0 | 7.08 | 0 | 0 | 0 |
ACME SCENTED OIL REFILL-CTRY SUN | 0 | 3.92 | 3.16 | 0 | 0 |
ACME LARGE FUDGE GRAHAMS COOKIES | 0 | 13.44 | 0 | 0 | 0 |
ACME SUGAR ICE WAFERS VANILLA | 0 | 0 | 0 | 0 | 0 |
ACME ZOO ANIMAL FRUIT SNACKS 6'S | 0 | 0 | 3.16 | 0 | 0 |
ACME WAFERS SUGER ICE | 0 | 0 | 3.16 | 0 | 0 |
ACME RICE CRACKERS ONION | 0 | 0 | 6.32 | 0 | 0 |
ACME FROSTED OATMEAL COOKIE SQUA | 0 | 0 | 0 | 190 | 150 |
ACME FRUIT SNACK CASTLE ADVENTRS | 0 | 0 | 0 | 0 | 3.5 |
ACME ASSORTED COOKIES DRP | 0 | 0 | 0 | 120 | 30 |
ACME KITCHEN BAG | 0 | 0 | 0 | 120 | 70 |
ACME SNACK BAGS RESEALABLE | 0 | 0 | 0 | 90 | 170 |
ACME CHEDDARY SN CRACKERS/PROCES | 0 | 0 | 0 | 9.5 | 20 |
ACME RICE CRACKERS TERIYAKI | 0 | 0 | 0 | 0 | 6.5 |
ACME COOKIE MAPLE LEAF CREME | 0 | 0 | 0 | 40 | 1.5 |
ACME RICE CHIPS CHEDDAR | 0 | 0 | 0 | 0 | 30 |
ACME DIGESTIVE RICH TEA BISCUITS | 0 | 0 | 0 | 10 | 0 |
ACME FUDGE MINT COOKIES SQUARES | 0 | 0 | 0 | 10 | 0 |
ACME COOKIES MAPLE CREAM | 0 | 0 | 0 | 10 | 0 |
Conditional Aggregations
Suppose you are interested in only in the sum of sales for store numbers 1-3. To capture a more limited dataset, you can use the SUMIF
aggregation function:
Transformation Name |
|
---|---|
Parameter: Row labels | PRODUCT_DESC |
Parameter: Values | SUMIF(POS_Sales, Store_Nbr<4) |
Parameter: Max number of columns to create | 500 |
Most aggregation functions have a conditional (*IF
) variant.
Multiple Aggregation Levels
None of the axes of a pivot table is limited to a single dimension. You can have multiple Column labels, Row labels, and Values (formulas). In the following transformation, aggregations have been further broken out by date, and an additional formula (Value) has been added.
Note
Adding multiple Column labels and Values can greatly expand the width of the dataset. Generally, adding Row labels does not expand the total count of rows.
Transformation Name |
|
---|---|
Parameter: Column labels | Store_Nbr |
Parameter: Row labels1 | Date |
Parameter: Row labels2 | PRODUCT_DESC |
Parameter: Values1 | SUM(POS_Qty) |
Parameter: Values2 | SUM(POS_Sales) |
Parameter: Max number of columns to create | 500 |
Results:
Note
Following results table is incomplete. Some columns have been omitted for space reasons.
Daily | PRODUCT_DESC | sum_POS_Qty_1 | sum_POS_Sales_1 | sum_POS_Qty_2 | sum_POS_Sales_2 | sum_POS_Qty_3 | sum_POS_Sales_3 |
---|---|---|---|---|---|---|---|
2/8/13 | ACME LAWN GARDEN BAG CLEAR | 7 | 70 | 0 | 0 | 0 | 0 |
2/7/13 | ACME COOKIES CHOC CHIP | 0 | 0 | 9 | 10.62 | 0 | 0 |
2/7/13 | ACME SANDWICH BAG | 0 | 0 | 0 | 0 | 0 | 0 |
2/7/13 | ACME SODAS SALTED | 0 | 0 | 6 | 7.08 | 0 | 0 |
2/7/13 | ACME SCENTED OIL REFILL-CTRY SUN | 0 | 0 | 2 | 3.92 | 2 | 3.16 |
2/7/13 | ACME LARGE FUDGE GRAHAMS COOKIES | 0 | 0 | 7 | 13.44 | 0 | 0 |
2/7/13 | ACME SUGAR ICE WAFERS VANILLA | 0 | 0 | 0 | 0 | 0 | 0 |
2/7/13 | ACME ZOO ANIMAL FRUIT SNACKS 6'S | 0 | 0 | 0 | 0 | 2 | 3.16 |
2/7/13 | ACME WAFERS SUGER ICE | 0 | 0 | 0 | 0 | 2 | 3.16 |
2/7/13 | ACME RICE CRACKERS ONION | 0 | 0 | 0 | 0 | 4 | 6.32 |
2/2/13 | ACME FROSTED OATMEAL COOKIE SQUA | 0 | 0 | 0 | 0 | 0 | 0 |
2/2/13 | ACME FRUIT SNACK CASTLE ADVENTRS | 0 | 0 | 0 | 0 | 0 | 0 |
2/2/13 | ACME COOKIES CHOC CHIP | 0 | 0 | 0 | 0 | 0 | 0 |
2/2/13 | ACME ASSORTED COOKIES DRP | 0 | 0 | 0 | 0 | 0 | 0 |
2/2/13 | ACME KITCHEN BAG | 0 | 0 | 0 | 0 | 0 | 0 |
2/2/13 | ACME SNACK BAGS RESEALABLE | 0 | 0 | 0 | 0 | 0 | 0 |
2/2/13 | ACME CHEDDARY SN CRACKERS/PROCES | 0 | 0 | 0 | 0 | 0 | 0 |
2/2/13 | ACME RICE CRACKERS TERIYAKI | 0 | 0 | 0 | 0 | 0 | 0 |
2/2/13 | ACME COOKIE MAPLE LEAF CREME | 0 | 0 | 0 | 0 | 0 | 0 |
2/2/13 | ACME RICE CHIPS CHEDDAR | 0 | 0 | 0 | 0 | 0 | 0 |
2/1/13 | ACME FROSTED OATMEAL COOKIE SQUA | 0 | 0 | 0 | 0 | 0 | 0 |
2/1/13 | ACME COOKIES CHOC CHIP | 0 | 0 | 0 | 0 | 0 | 0 |
2/1/13 | ACME DIGESTIVE RICH TEA BISCUITS | 0 | 0 | 0 | 0 | 0 | 0 |
2/1/13 | ACME ASSORTED COOKIES DRP | 0 | 0 | 0 | 0 | 0 | 0 |
2/1/13 | ACME KITCHEN BAG | 0 | 0 | 0 | 0 | 0 | 0 |
2/1/13 | ACME SNACK BAGS RESEALABLE | 0 | 0 | 0 | 0 | 0 | 0 |
2/1/13 | ACME FUDGE MINT COOKIES SQUARES | 0 | 0 | 0 | 0 | 0 | 0 |
2/1/13 | ACME CHEDDARY SN CRACKERS/PROCES | 0 | 0 | 0 | 0 | 0 | 0 |
2/1/13 | ACME COOKIES MAPLE CREAM | 0 | 0 | 0 | 0 | 0 | 0 |
2/1/13 | ACME COOKIE MAPLE LEAF CREME | 0 | 0 | 0 | 0 | 0 | 0 |
Group By
If you wish to maintain the original dataset values, you can apply an aggregate function within a single column.