Avro Data Types

Use the Input Data tool to read uncompressed and Deflate-compressed Avro files and use the Output Data tool to write Avro files.

Input

Only Deflate compression is supported.

Most of the 14 native Avro data types are supported. The type mapping on import is as follows:

  • String: Converted from UTF-8 to V_WString (UTF-16)

  • Bytes: Maintained as Blob (use Blob Tool to convert as necessary)

  • Int: Maintained as Int32

  • Long: Maintained as Int64

  • Float: Maintained as Float

  • Double: Maintained as Double

  • Boolean: Maintained as Bool

  • Null: Not Supported

  • Enum: Converted to String Equivalent

  • Union: Alteryx supports unions with two sub-types. Both sub-types must be equivalent (for example, both int or both double) or one of them must be Null.

    • The Alteryx field type is the type of the non-null branch (or of both branches when both are non-null).

    • If the non-null branch is active, the Alteryx field contains that value.

    • If the null branch is active, the Alteryx field is set to null.

    • Unions that do not meet these rules are imported as JSON into a V_WString (use the JSON Parse tool to convert as necessary). For example, a union with an int as its active branch may be represented as {"int":123}.

  • Fixed: Maintained as Blob (use Blob Tool to convert as necessary)
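The union rule above can be sketched in Python as a small check. This is only an illustration of the rule as worded in this section, not Alteryx's implementation; the helper names `is_supported_union`, `field_type_for_union`, and `parse_union_json` are hypothetical.

```python
import json


def is_supported_union(branches):
    """Return True if the branch types satisfy the rule described above:
    exactly two sub-types, either equivalent or with one of them null."""
    if len(branches) != 2:
        return False
    first, second = branches
    return first == second or "null" in (first, second)


def field_type_for_union(branches):
    """Return the type of the non-null branch, or None when the union
    is not supported (such unions arrive as JSON text instead)."""
    if not is_supported_union(branches):
        return None
    non_null = [b for b in branches if b != "null"]
    return non_null[0] if non_null else "null"


def parse_union_json(text):
    """Split JSON text such as {"int":123} into (branch name, value)."""
    branch, value = next(iter(json.loads(text).items()))
    return branch, value


print(field_type_for_union(["null", "int"]))    # supported: int
print(field_type_for_union(["int", "string"]))  # unsupported: None
print(parse_union_json('{"int":123}'))
```

Inside a workflow, the JSON Parse tool plays the role of `parse_union_json`; the function is shown only to make the shape of the JSON text concrete.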

These Avro types are not supported natively but are imported as JSON into a String (use the JSON Parse tool to convert as necessary):

  • Record: For example, "{"SubField1":7,"SubField2":"Field2"}" for a record containing both int and string fields.

  • Array: For example, "[1,2,3,4,5]" for an array of ints.

  • Map: For example, "{"Key1":Value1,"Key2":Value2}" for a map of string to double.
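To make the JSON representations above concrete, the following Python sketch parses one string of each kind with the standard `json` module. The map values `1.5` and `2.5` are made-up stand-ins for Value1 and Value2; inside Alteryx the JSON Parse tool does this work.

```python
import json

# Record imported as JSON text: becomes a dict of sub-fields.
record = json.loads('{"SubField1":7,"SubField2":"Field2"}')

# Array of ints imported as JSON text: becomes a list.
array = json.loads('[1,2,3,4,5]')

# Map of string to double; the numeric values are illustrative only.
mapping = json.loads('{"Key1":1.5,"Key2":2.5}')

print(record["SubField1"], record["SubField2"])
print(sum(array))
print(sorted(mapping))
```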

Output

When writing Avro files, there are 2 options:

  1. Enable Compression (Deflate): Enabling compression increases the time needed to write the file but, for larger files, also reduces network transfer time. The supported compression uses the DEFLATE algorithm (the same algorithm that underlies gzip) and should be supported natively by other Avro-capable tools such as Hive.

  2. Support Null Values: Selecting this option writes _all_ fields as Unions with a null branch and a value branch. If the Alteryx value is null, the output Avro union has its null branch selected; otherwise, it has its value branch selected.

    If this option is not selected, all output fields are written as their native Avro types (non-union). Alteryx fields that are null are written as their default value (for example, the number 0 for an int32 and an empty string for a string field).

    Consider using a Formula tool to handle Null values with a 'known' value so they can be handled in Hadoop.
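The effect of Support Null Values can be sketched in Python as follows. This is a simplified model of the behavior described above, not the writer's actual code: the helper names, the `DEFAULTS` table, and the order of the union branches (null first) are assumptions made for illustration.

```python
# Default values used when nulls are not supported. Only the two
# examples given in the text are listed here.
DEFAULTS = {"int": 0, "string": ""}


def avro_field_type(avro_type, support_nulls):
    """Schema type for a field: a two-branch union when nulls are
    supported, otherwise the plain (non-union) type.
    The branch order is an assumption."""
    return ["null", avro_type] if support_nulls else avro_type


def value_to_write(value, avro_type, support_nulls):
    """Value that ends up in the file for one Alteryx value."""
    if value is None and not support_nulls:
        return DEFAULTS[avro_type]  # null replaced by the type's default
    return value


print(avro_field_type("int", support_nulls=True))        # union with null
print(value_to_write(None, "int", support_nulls=False))  # default for int
print(value_to_write(None, "int", support_nulls=True))   # null is kept
```

With the option off, a null and a genuine 0 become indistinguishable in the output, which is why replacing nulls with a known value beforehand can be useful.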

The type mapping from Alteryx to Avro is as follows:

  • Bool: Maintained as Boolean

  • Byte, Int16, Int32: Maintained as Int (32-bit)

  • Int64: Maintained as Long (64-bit)

  • Float: Maintained as Float

  • DateTime as Date, Time, DateTime

  • Double: Maintained as Double

  • FixedDecimal: Converted to Double

  • String, V_String, Date, Time, DateTime: Maintained as String (UTF-8)

  • WString, V_WString: Converted to String (UTF-8)

  • Blob, SpatialBlob: Maintained as Bytes
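The output mapping above can be transcribed into a lookup table, sketched here in Python. The dictionary name and the helper `avro_type_for` are hypothetical; the Avro names on the right are the lowercase primitive type names from the Avro specification.

```python
# Alteryx field type -> Avro primitive type, following the list above.
# DateTime is omitted: the list gives String, while the time zone
# section of this page describes timestamp-millis.
ALTERYX_TO_AVRO = {
    "Bool": "boolean",
    "Byte": "int",
    "Int16": "int",
    "Int32": "int",
    "Int64": "long",
    "Float": "float",
    "Double": "double",
    "FixedDecimal": "double",
    "String": "string",
    "V_String": "string",
    "Date": "string",
    "Time": "string",
    "WString": "string",
    "V_WString": "string",
    "Blob": "bytes",
    "SpatialBlob": "bytes",
}


def avro_type_for(alteryx_type):
    """Look up the Avro type for an Alteryx field type name."""
    return ALTERYX_TO_AVRO[alteryx_type]


print(avro_type_for("Int16"))         # int
print(avro_type_for("FixedDecimal"))  # double
```

Note that several Alteryx types collapse onto one Avro type, so the mapping cannot be reversed exactly: a Byte written to Avro is read back as Int32.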

Avro File Read/Write from Alteryx Designer

Previously, DateTime fields in .avro files were always represented using the logical type timestamp-millis (that is, in UTC). The Read and Write options now include a Time Zone dropdown with two values: No Time Zone and Local Time Zone.

  • No Time Zone: The DateTime is treated as timestamp-millis in Avro. This ensures backward compatibility, as no time zone conversion occurs during Avro read or write operations.

  • Local Time Zone: DateTime values are converted. During Avro write, the conversion is based on the Default Time Zone configured in the Runtime settings.
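For orientation, the Avro specification defines timestamp-millis as a long that counts milliseconds since the Unix epoch (1970-01-01 00:00:00 UTC), and it also defines a local-timestamp-millis logical type. The Python sketch below shows that encoding and the two schema fragments. The function names are hypothetical, and how Alteryx builds these values internally is not shown here.

```python
from datetime import datetime, timedelta, timezone

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Schema fragments for the two logical types, as named in the Avro spec.
TIMESTAMP_MILLIS = {"type": "long", "logicalType": "timestamp-millis"}
LOCAL_TIMESTAMP_MILLIS = {"type": "long", "logicalType": "local-timestamp-millis"}


def to_timestamp_millis(moment):
    """Milliseconds since the Unix epoch for a timezone-aware datetime."""
    return (moment - EPOCH_UTC) // timedelta(milliseconds=1)


def from_timestamp_millis(millis):
    """Inverse of to_timestamp_millis; returns a UTC datetime."""
    return EPOCH_UTC + timedelta(milliseconds=millis)


moment = datetime(2026, 2, 11, 22, 0, 0, tzinfo=timezone.utc)
millis = to_timestamp_millis(moment)
print(millis)
print(from_timestamp_millis(millis))
```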

During Avro read, DateTime conversion is based on the stored logicalType and the Default Time Zone specified in the Runtime settings. For example, if the local time zone is IST (UTC+05:30) and the Default Time Zone is CEST (UTC+02:00), a DateTime value of 2026-03-25 22:00:00 is converted during .avro write. Because IST is 3 hours and 30 minutes ahead of CEST, the value is adjusted to 2026-03-25 18:30:00 in CEST.
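The arithmetic in this example can be checked with a few lines of Python. The sketch uses the fixed offsets quoted in the text (UTC+05:30 and UTC+02:00) rather than a time zone database, so it shows only the offset calculation, not how Alteryx resolves time zones.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets taken from the example in the text.
IST = timezone(timedelta(hours=5, minutes=30), "IST")
CEST = timezone(timedelta(hours=2), "CEST")

local_value = datetime(2026, 3, 25, 22, 0, 0, tzinfo=IST)
converted = local_value.astimezone(CEST)

print(converted.strftime("%Y-%m-%d %H:%M:%S"))  # 2026-03-25 18:30:00
```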

Below is a summary of DateTime conversions during Avro write and read operations. In the examples, the Local Time Zone is IST. NTZ stands for No Time Zone and LTZ for Local Time Zone. The first four columns represent the .avro Write scenario, while the next three columns represent the Read scenario for the same Avro data.

| Incoming Time | Workflow Runtime TZ (write) | AVRO TZ (write) | AVRO Write | AVRO TZ (read) | Workflow Runtime TZ (read) | Alteryx AVRO Read |
|---|---|---|---|---|---|---|
| 2026-02-11 22:00:00 | Local | NTZ | 2026-02-11 22:00:00 (timestamp-millis) | NTZ | Local | 2026-02-11 22:00:00 |
| | | | | NTZ | UTC | 2026-02-11 22:00:00 |
| | | | | LTZ | UTC | 2026-02-11 22:00:00 |
| | | | | LTZ | Local | 2026-02-12 03:30:00 |
| 2026-02-11 22:00:00 | Local | LTZ | 2026-02-11 22:00:00 (local-timestamp-millis) | NTZ | Local | 2026-02-11 22:00:00 |
| | | | | LTZ | Local | 2026-02-11 22:00:00 |
| | | | | NTZ | UTC | 2026-02-11 22:00:00 |
| | | | | LTZ | UTC | 2026-02-11 16:30:00 |
| 2026-02-11 22:00:00 | UTC | NTZ | 2026-02-11 22:00:00 (timestamp-millis) | LTZ | UTC | 2026-02-11 22:00:00 |
| | | | | NTZ | UTC | 2026-02-11 22:00:00 |
| | | | | NTZ | Local | 2026-02-11 22:00:00 |
| | | | | LTZ | Local | 2026-02-12 03:30:00 |
| 2026-02-11 22:00:00 | UTC | LTZ | 2026-02-12 03:30:00 (local-timestamp-millis) | LTZ | Local | 2026-02-12 03:30:00 |
| | | | | NTZ | Local | 2026-02-12 03:30:00 |
| | | | | NTZ | UTC | 2026-02-12 03:30:00 |
| | | | | LTZ | UTC | 2026-02-11 22:00:00 |
| 2026-02-11 22:00:00 | CEST | LTZ | 2026-02-12 02:30:00 (local-timestamp-millis) | LTZ | Local | 2026-02-12 02:30:00 |
| | | | | LTZ | UTC | 2026-02-11 21:00:00 |
| | | | | NTZ | UTC | 2026-02-12 02:30:00 |
| | | | | NTZ | Local | 2026-02-12 02:30:00 |
| 2026-02-11 22:00:00 | CEST | NTZ | 2026-02-11 22:00:00 (timestamp-millis) | LTZ | Local | 2026-02-12 03:30:00 |
| | | | | LTZ | UTC | 2026-02-11 22:00:00 |
| | | | | NTZ | UTC | 2026-02-11 22:00:00 |
| | | | | NTZ | Local | 2026-02-11 22:00:00 |