FILL Function

Fills any missing or null values in the specified column with the most recent non-blank value, as determined by the specified window of rows before and after the blank value.

The row from which to extract a value is determined by the order in which the rows are organized at the time that the function is executed.
If you are working on a randomly generated sample of your dataset, the values that you see for this function might not correspond to the values that are generated on the full dataset during job execution.
In addition to the column to which to apply the function, the function accepts two other parameters:
- The first integer parameter defines the number of rows before the row being tested to scan for a non-empty value.
- The second integer parameter defines the number of rows after the row being tested to scan for a non-empty value.
- If no non-empty value is found within these rows, the empty value remains empty.
- The default values are -1 and 0 respectively, which performs an unlimited search before the row for a non-empty value to use to fill.
You can use the group and order parameters to define the groups of records and the order of those records to which this function is applied.
This function works with the following transforms:

Wrangle vs. SQL: This function is part of Wrangle, a proprietary data transformation language. Wrangle is not SQL. For more information, see Wrangle Language.

Basic Usage

fill(myNumber,-1,0)

Output: Returns all values from the myNumber column with any null cells filled by the most recent non-blank value.

fill(myNumber,-5,4)

Output: Returns all values from the myNumber column with any null cells filled by the most recent non-empty value within the window 5 rows before the current row and 4 rows after it.

Syntax and Arguments

fill(col_ref, int_rows_before, int_rows_after) order: order_col [group: group_col]

Argument	Required?	Data Type	Description
col_ref	Y	string	Name of column whose values are applied to the function
int_rows_before	Y	integer	Number of rows before current row to scan for non-empty value. Default is `-1`, which scans all rows before.
int_rows_after	Y	integer	Number of rows after current row to scan for non-empty value. Default is`0`, which scans no rows after.

For more information on the order and group parameters, see Window Transform.

For more information on syntax standards, see Language Documentation Syntax Notes.

col_ref

Name of the column whose values are filled when null.

Multiple columns and wildcards are not supported.

Usage Notes:

Required?	Data Type	Example Value
Yes	String (column reference)	myColumn

int_rows_before

Number of rows before the current row to scan for the most recent non-empty value.

Default value is -1, which scans all preceding rows.
0 does not scan before the current row.

Usage Notes:

Required?	Data Type	Example Value
Yes	Integer	5

int_rows_after

Number of rows after the current row to scan for the most recent non-empty value.

Default value is 0, which does not scan rows after the current row.
-1 scans all following rows.

Usage Notes:

Required?	Data Type	Example Value
Yes	Integer	5

Examples

Dica

For additional examples, see Common Tasks.

Example - Fill with quarterly forecast values

Your monthly sales data includes amount sold for each month. However, the source system only provides the quarterly forecast as part of the data for the first month of each quarter. You can use the FILL function to add the prior forecast to each month's data.

Source:

Date	Amount	Forecast_Qtr
1/31/15	523	1400
2/28/15	135
3/31/15	824
4/30/15	305	1500
5/31/15	763
6/30/15	421
7/31/15	606	1600
8/31/15	477
9/30/15	785
10/31/15	443	1700
11/30/15	622
12/31/15	518

Transformation:

You can use the following transformation to fill the prior forecast value for each blank month in the Forecast_Qtr column. Note that the order parameter must be set to Date to establish the proper sorting:

Transformation Name	`Window`
Parameter: Formulas	fill(Forecast_Qtr,-1,0)
Parameter: Order by	Date

You can now delete the Forecast_Qtr column and rename the generated window column to the deleted name.

To see how you are progressing each month, you might use the following transformation, which computes the average forecast for each month:

Transformation Name	`New formula`
Parameter: Formula type	Single row formula
Parameter: Formula	numformat((Forecast_Qtr/3),'####.##')
Parameter: New column name	'Forecast_Month_Avg'

You can then compare this value to the actual Amount value for each month:

Transformation Name	`New formula`
Parameter: Formula type	Single row formula
Parameter: Formula	numformat(((Amount - Forecast_Month_Avg)/Forecast_Month_Avg)*100, '##.00')
Parameter: New column name	'MonthActualVForecast_Pct'

Results:

Date	Amount	Forecast_Qtr	Forecast_Month_Avg	MonthActualVForecast_Pct
1/31/15	523	1400	466.67	12.07
2/28/15	135	1400	466.67	-71.07
3/31/15	824	1400	466.67	76.57
4/30/15	305	1500	500	-39.00
5/31/15	763	1500	500	52.60
6/30/15	421	1500	500	-15.80
7/31/15	606	1600	533.33	13.63
8/31/15	477	1600	533.33	-10.56
9/30/15	785	1600	533.33	47.19
10/31/15	443	1700	566.67	-21.82
11/30/15	622	1700	566.67	9.76
12/31/15	518	1700	566.67	-8.59

FILL Function

Basic Usage

Syntax and Arguments

col_ref

int_rows_before

int_rows_after

Examples

Example - Fill with quarterly forecast values

Resultados da procura