Skip to main content

Pivot Transform

Note

Transforms are a part of the underlying language, which is not directly accessible to users. This content is maintained for reference purposes only. For more information on the user-accessible equivalent to transforms, see Transformation Reference.

When you pivot data, the values from the selected column or columns become new columns in the dataset, each of which contains a summary calculation that you specify.

  • This calculation can be based on all rows for totals across the dataset or based on group of rows you define in the transform.

  • When your data has been pivoted, the unused columns are dropped, leaving only the operative data for faster evaluation.

  • Transform accepts one or more columns as inputs to the pivot.

Tip

This transform is very useful in Preview mode for quick discovery and analysis.

Basic Usage

pivot col: Dates value:SUM(Sales) group: prodId

Output: Reshapes your dataset to include a ProdId column and new columns for each distinct value in the Dates column. Each distinct ProdId value is represented by a separate row in the reshaped dataset. Within each row, the new columns contain a sum of sales for the product for each date in the dataset.

Syntax and Parameters

pivot col:column_ref value: FUNCTION(arg1,arg2) [group: group_col] [limit:int_num]

Token

Required?

Data Type

Description

pivot

Y

transform

Name of the transform

col

N

string

Source column name or names

value

N

string

Expression containing the aggregation function or functions used to pivot the source column data

group

N

string

Column name or names containing the values by which to group for calculation

limit

N

integer (positive)

Maximum number of unique values in a group. Default is 50.

For more information on syntax standards, see Language Documentation Syntax Notes.

col

Identifies the column or columns to which to apply the transform.

Single-column example:

The specified column contains the values that become new columns in the dataset.

pivot col: autoBrand value:AVERAGE(autoPrice) group:State

Output: Reshapes the dataset to contain the State column followed by columns for each distinct value in the autoBrand column. Each value in those columns is the average value in the autoPrice column, as grouped by State value.

Multi-column example:

For multiple input columns, the transform generates new columns for each combination of the inputs. See the example below.

Usage Notes:

Required?

Data Type

No

String (column name)

value

For this transform, the value parameter contains the aggregation function and its parameters, which define the set of rows to which the function is applied.

Note

For the value parameter, you can only use aggregation functions. Nested functions are not supported. For more information, see Aggregate Functions.

Usage Notes:

Required?

Data Type

No

String containing a list of aggregation functions each containing one column reference

group

For this transform, this parameter specifies the column whose values are used to group the dataset prior to applying the specified function. You can specify multiple columns as comma-separated values.

Note

Transforms that use the group parameter can result in non-deterministic re-ordering in the data grid. However, you should apply the group parameter, particularly on larger datasets, or your job may run out of memory and fail. To enforce row ordering, you can use the sort transform. For more information, see Sort Transform.

Usage Notes:

Required?

Data Type

No

String (column name)

limit

The limit parameter defines the maximum number of unique values permitted in a group. If it is not specified, the default value for this parameter is 50.

Note

Be careful setting this parameter too high. In some cases, the application can run out of memory generating the results, and your results can fail.

Usage Notes:

Required?

Data Type

No. Default value is 50.

Integer (positive)

Examples

Tip

For additional examples, see Common Tasks.

Example - Basic Pivot

Source:

The following dataset contains information about sales across one weekend and two states for three different products.

Date

State

ProdId

Sales

3/9/16

CA

Big Trike 9000

500

3/9/16

NV

Fast GoKart 5000

200

3/9/16

CA

SuperQuick Scooter

700

3/9/16

CA

Fast GoKart 5000

900

3/9/16

NV

Big Trike 9000

300

3/9/16

NV

SuperQuick Scooter

250

3/10/16

NV

Fast GoKart 5000

50

3/10/16

NV

Big Trike 9000

400

3/10/16

NV

SuperQuick Scooter

150

3/10/16

CA

Big Trike 9000

600

3/10/16

CA

SuperQuick Scooter

800

3/10/16

CA

Fast GoKart 5000

1100

Transformation 1: Sum of sales by State for each Date

Apply this transformation in Preview only, just so you can see the results:

Transformation Name

Pivot columns

Parameter: Column labels

Date

Parameter: Row labels

State

Parameter: Values

sum(Sales)

Cancel the transformation update.

State

sum_Sales_03/09/2016

sum_Sales_03/10/2016

CA

2100

2500

NV

750

600

Transform 2: Sum of Sales by Date for each State

Transformation Name

Pivot columns

Parameter: Column labels

State

Parameter: Row labels

Date

Parameter: Values

sum(Sales)

Date

sum_Sales_CA

sum_Sales_NV

03/09/2016

2100

750

03/10/2016

2500

600

Cancel the transformation update again.

Transform 3: Sum of Sales by product ID for each State

Transformation Name

Pivot columns

Parameter: Column labels

State

Parameter: Row labels

ProdId

Parameter: Values

sum(Sales)

ProdId

sum_Sales_CA

sum_Sales_NV

Big Trike 9000

1100

700

Fast GoKart 5000

2000

500

SuperQuick Scooter

1500

300

Example - Multi-column Pivot

The Pivot Columns transformation supports the ability to use the values in multiple columns to specify the columns that are generated. When two or more columns are used in the pivot, columns containing values of all possible combinations from the source columns are generated.

In the following source data, sales data on individual products is organized by brand, product name, and month:

Brand

Product

Month

Sales

AAA

Towels

January

95

AAA

Napkins

January

113

B

Towels

January

99

B

Tissues

January

88

AAA

Towels

February

108

AAA

Napkins

February

91

B

Towels

February

85

B

Tissues

February

105

AAA

Towels

March

81

AAA

Napkins

March

92

B

Towels

March

112

B

Tissues

March

104

Transformation:

If you wanted to create summary sales information for each product by month, you might choose to create a pivot on the Product column. However, if you look at the column values, you might notice that the Paper towels product is available for both the AAA and B brands. In this case, you must perform a multi-column pivot on these two columns, like the following:

Transformation Name

Pivot columns

Parameter: Column labels

Brand,Product

Parameter: Row labels

Month

Parameter: Values

average(Sales)

Results:

Month

avg_Sales_B_Towels

avg_Sales_B_Tissues

avg_Sales_AAA_Napkins

avg_Sales_AAA_Towels

January

99

88

113

95

February

85

105

91

108

March

112

104

92

81

Example - Aggregate Values

You can use the pivot transform to perform aggregation calculations on values in a column.

Source:

In the following table, you can review test results from three different tests for 10 students:

StudentId

TestId

Score

s001

t001

98

s001

t002

98

s001

t003

87

s002

t001

92

s002

t002

96

s002

t003

79

s003

t001

99

s003

t002

76

s003

t003

94

s004

t001

93

s004

t002

99

s004

t003

80

s005

t001

79

s005

t002

80

s005

t003

84

s006

t001

93

s006

t002

74

s006

t003

89

s007

t001

86

s007

t002

81

s007

t003

97

s008

t001

82

s008

t002

73

s008

t003

79

s009

t001

96

s009

t002

97

s009

t003

79

s010

t001

80

s010

t002

79

s010

t003

75

Transformation:

For each student, you're interested in a student's average score and maximum score. You create the following pivot transform, which groups computations of average score and maximum score columns by studentId value.

Transformation Name

Pivot columns

Parameter: Row labels

StudentId

Parameter: Values

average(Score),max(Score)

Parameter: Max number of columns to create

50

For the generated average column, you want results rounded to two decimal places:

Transformation Name

Edit column with formula

Parameter: Columns

average_Score

Parameter: Formula

round(average_Score, 2)

Results:

StudentId

average_Score

max_Score

s001

94.33

98

s002

89

96

s003

89.67

99

s004

90.67

99

s005

81

84

s006

85.33

93

s007

88

97

s008

78

82

s009

90.67

97

s010

78

80