EXAMPLE - LIST and UNIQUE Function

This example demonstrates you to extract values from one column of an array into a new column.

Functions:

Item

Description

LIST Function

Extracts the set of values from a column into an array stored in a new column. This function is typically part of an aggregation.

UNIQUE Function

Extracts the set of unique values from a column into an array stored in a new column. This function is typically part of an aggregation.

DATEFORMAT Function

Formats a specified Datetime set of values according to the specified date format. Source values can be a reference to a column containing Datetime values.

You have the following set of orders for two months, and you are interested in identifying the set of colors that have been sold for each product for each month and the total quantity of product sold for each month.

Source:

OrderId

Date

Item

Qty

Color

1001

1/15/15

Pants

1

red

1002

1/15/15

Shirt

2

green

1003

1/15/15

Hat

3

blue

1004

1/16/15

Shirt

4

yellow

1005

1/16/15

Hat

5

red

1006

1/20/15

Pants

6

green

1007

1/15/15

Hat

7

blue

1008

4/15/15

Shirt

8

yellow

1009

4/15/15

Shoes

9

brown

1010

4/16/15

Pants

1

red

1011

4/16/15

Hat

2

green

1012

4/16/15

Shirt

3

blue

1013

4/20/15

Shoes

4

black

1014

4/20/15

Hat

5

blue

1015

4/20/15

Pants

6

black

Transformation:

To track by month, you need a column containing the month value extracted from the date:

 Transformation Name Edit column with formula Date DATEFORMAT(Date, 'MMM yyyy')

You can use the following transform to check the list of unique values among the colors:

 Transformation Name Pivot columns Date unique(Color, 1000) 10

Date

unique_Color

Jan 2015

["green","blue","red","yellow"]

Apr 2015

["brown","blue","red","yellow","black","green"]

Delete the above transform.

You can aggregate the data in your dataset, grouped by the reformatted Date values, and apply the LIST function to the Color column. In the same aggregation, you can include a summation function for the Qty column:

 Transformation Name Pivot columns Date list(Color, 1000),sum(Qty) 10

Results:

Date

list_Color

sum_Qty

Jan 2015

["green","blue","blue","red","green","red","yellow"]

28

Apr 2015

["brown","blue","red","yellow","black","blue","black","green"]

38