DATEFORMAT Function
Formats a specified Datetime set of values according to the specified date format. Source values can be a reference to a column containing Datetime values.
If the source Datetime value does not include a valid input for this function, a missing value is returned.
Dataprep by Trifacta supports a wide variety of formats for Datetime fields.
You can explore the available Datetime formats through the Transformer page. From a column's type drop-down, selectDate/Time. Then, select the formatting category. From the displayed drop-down, you can select a specific format. When this transform step is added to your recipe, you can edit it to see how the format is specified in Wrangle.
For more information on formatting numeric types, see NUMFORMAT Function.
Wrangle vs. SQL: This function is part of Wrangle, a proprietary data transformation language. Wrangle is not SQL. For more information, see Wrangle Language.
Basic Usage
dateformat(MyDate, 'yyyy-MM-dd')
Output: Returns the valid date values in the MyDate
column converted to year-month-day format.
Syntax and Arguments
dateformat(Datetime_col, date_format_string)
Argument | Required? | Data Type | Description |
---|---|---|---|
Datetime_col | Y | datetime | Name of column containing date values to be formatted |
date_format_string | Y | string | String literal identifying the date format to apply to the value |
For more information on syntax standards, see Language Documentation Syntax Notes.
datetime_col
Name of the column whose date data is to be formatted.
Missing values for this function in the source data result in missing values in the output.
Multiple columns and wildcards are not supported.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
Yes | Datetime | myDate |
date_format_string
String value indicating the date format to apply to the input values.
Dataprep by Trifacta supports Java formatting strings, with some exceptions.
Note
Two-digit values for the year that are older than 80 years from the current year are forward-ported into the future. For example, in a job run on Dec 31, 2021, the date 01/01/41
is interpreted as 01/01/1941. However, if the job is run the next day (January 01, 2022), then the same data is interpreted as 01/01/2041. For more information including workarounds, see Datetime Data Type.
Note
If the platform cannot recognize the date format string, the generated result is written as a string value.
Missing values for this function in the source data result in missing values in the output.
Multiple columns and wildcards are not supported.
Usage Notes:
Required? | Data Type | Example Value |
---|---|---|
Yes | String | 'MM/dd/yyyy' |
Examples
Tip
For additional examples, see Common Tasks.
Example - formatting date values
This example illustrates several ways of wrangling heterogeneous date values, including the use of the DATEFORMAT
function.
Source:
Your dataset includes the following messy date values:
MyDate |
---|
2/1/00 9:20 |
4/5/10 11:25 |
6/7/99 22:00 |
12/20/1894 15:45:00 |
13/7/1999 22:00:00 |
Transformation:
When this data is loaded into the application, it is not immediately recognized as a Datetime column, as the variation among the data complicates deciding on the proper date format. The first three rows look to be in a consistent format, but the other two are problematic.
You can try to change the column to a Datetime type with a format that matches the first three rows. You can select the appropriate format through the type drop-down. When previewed, the transform looks like the following:
Note
Do not add this transform at this time. It is strictly used for reviewing the effects on data quality.
Transformation Name |
|
---|---|
Parameter: Columns | MyDate |
Parameter: New type | Custom or Date/Time |
Parameter: Specify type | 'mm-dd-yy hh:mm:ss','mm*dd*yy*HH:MM' |
When the column is reformatted, you should notice that the last two values in the column are mismatched. In the column histogram, you can see that date ranges include the 1999 date in the third row, so the final row should work if it was a valid date.
The 1894 value looks like an outlier value and could be removed:
Transformation Name |
|
---|---|
Parameter: Condition | Custom formula |
Parameter: Type of formula | Custom single |
Parameter: Condition | matches([MyDate], `12/20/1894`) |
Parameter: Action | Delete matching rows |
For the remaining 1999 row, you can delete it or use the following transforms to conform it to the other rows. Use the following transform to change the 13
month value to a 12
:
Transformation Name |
|
---|---|
Parameter: Column | MyDate |
Parameter: Find | `13/` |
Parameter: Replace with | '12\/' |
Parameter: Match all occurrences | true |
The following two transforms complete the cleanup steps:
Transformation Name |
|
---|---|
Parameter: Column | MyDate |
Parameter: Find | `/1999` |
Parameter: Replace with | '\/99' |
Parameter: Match all occurrences | true |
Transformation Name |
|
---|---|
Parameter: Column | MyDate |
Parameter: Find | `:#+:00` |
Parameter: Replace with | ':00' |
Parameter: Match all occurrences | true |
If you apply the original formatting step, all dates are valid:
Transformation Name |
|
---|---|
Parameter: Columns | MyDate |
Parameter: New type | Custom or Date/Time |
Parameter: Specify type | 'mm-dd-yy hh:mm:ss','mm*dd*yy*HH:MM' |
Now, your Datetime column can be formatted as needed using the dateformat
function. The following step generates a new column that contains year, month, and day information as a single numeric value:
Transformation Name |
|
---|---|
Parameter: Formula type | Single row formula |
Parameter: Formula | dateformat(MyDate, 'yyyyMMdd') |
Results:
The final dataset should look like the following:
MyDate | dateformat_MyDate |
---|---|
2/1/00 9:20 | 20000201 |
4/5/10 11:25 | 20100405 |
6/7/99 22:00 | 19990607 |
12/7/99 22:00 | 19991207 |
Example - Other date formatting variations
Numeric date, year first
Source | Transformation | Results | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
2/15/16 13:26:58.123 3/12/99 2:45:21.456 11/21/11 23:02:18.000 |
| 2016-02-15 1999-03-12 2011-11-21 |
Numeric date, American style
Source | Transformation | Results | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
2/15/16 13:26:58.123 3/12/99 2:45:21.456 11/21/11 23:02:18.000 |
| 2/15/16 3/12/99 11/21/11 |
Full written date
Source | Transformation | Results | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
2/15/16 13:26:58.123 3/12/99 2:45:21.456 11/21/11 23:02:18.000 |
| February 15, 2016 March 12, 1999 November 21, 2011 |
Abbreviated date,including abbreviatedday of week
Source | Transformation | Results | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
2/15/16 13:26:58.123 3/12/99 2:45:21.456 11/21/11 23:02:18.000 |
| Mon Feb 15, 2016 Fri Mar 12, 1999 Mon Nov 21, 2011 |
Full 24-hour time
Source | Transformation | Results | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
2/15/16 13:26:58.123 3/12/99 2:45:21.456 11/21/11 23:02:18.000 |
| 13:26:58.123 2:45:21.456 23:02:18.000 |
Twelve-hour time with AM/PM indicator
Source | Transformation | Results | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
2/15/16 13:26:58.123 3/12/99 2:45:21.456 11/21/11 23:02:18.000 |
Note For this function, use of the lower-case hour indicator ( | 1:26:58 PM 2:45:21 AM 11:02:18 PM |