Skip to main content

EXAMPLE - Flatten and Valuestocols Transforms

This example shows how you can break out a column of nested values into separate rows and columns of data.

Source:

The following data covers magazine subscriptions for individual customers. Their subscriptions are stored in an array of values. You are interested in who is subscribing to each magazine.

CustId

Subscriptions

Anne Aimes

["Little House and Garden","Sporty Pants","Life on the Range"]

Barry Barnes

["Sporty Pants","Investing Smart"]

Cindy Compton

["Cakes n Pies","Powerlifting Plus","Running Days"]

Darryl Diaz

["Investing Smart","Cakes n Pies"]

Transformation:

When this data is loaded into the Transformer, you might need to apply a header to it. If it is in CSV format, you might need to apply some replace transformations to clean up the Subscriptions column so it looks like the above.

When the Subscriptions column contains cleanly formatted arrays, the column is re-typed as Array type. You can then apply the following transformation:

Transformation Name

Expand Array into rows

Parameter: Column

Subscriptions

Each CustId/Subscription combination is now written to a separate row. You can use this new data structure to break out instances of magazine subscriptions. Using the following transformation, you can add the corresponding CustId value to the column:

Transformation Name

Convert values to columns

Parameter: Column

Subscriptions

Parameter: Fill when present

CustId

Delete the two source columns:

Transformation Name

Delete columns

Parameter: Columns

CustId,Subscriptions

Parameter: Action

Delete selected columns

Results:

Little_House_and_Garden

Sporty_Pants

Life_on_the_Range

Investing_Smart

Cakes_n_Pies

Powerlifting_Plus

Running_Days

Anne Aimes

Anne Aimes

Anne Aimes

Barry Barnes

Barry Barnes

Cindy Compton

Cindy Compton

Cindy Compton

Darryl Diaz

Darry Diaz