Avro Data Types

Use the Input Data tool to read uncompressed and Deflate-compressed Avro files and use the Output Data tool to write Avro files.

Input

Only Deflate compression is supported.
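As a point of reference for what a reader of these files does, the Avro "deflate" codec stores each data block as raw DEFLATE data (RFC 1951, with no zlib or gzip header). The sketch below is illustrative, not Alteryx code: it uses Python's standard `zlib` module with a negative window-bits value, which is how raw DEFLATE is handled in zlib.

```python
import zlib

# Stand-in for the bytes of one Avro data block (illustrative payload).
payload = b'{"name": "example"}' * 10

# Compress as raw DEFLATE (wbits=-15 suppresses the zlib header),
# matching how an Avro "deflate" data block is stored.
compressor = zlib.compressobj(level=6, wbits=-15)
compressed = compressor.compress(payload) + compressor.flush()

# An Avro reader decompresses the block the same way.
decompressed = zlib.decompress(compressed, wbits=-15)
assert decompressed == payload
```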

Most of the 14 native Avro data types are supported. The type mapping on import is as follows:

The following Avro types are not supported natively, but are instead imported as JSON into a String field (use the JSON Parse Tool to convert them as necessary):
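Downstream of the Input Data tool, such a field simply contains JSON text. As a rough analogue of what the JSON Parse Tool does with it, the hypothetical example below (field contents invented for illustration) decodes the string back into structured values with Python's standard `json` module:

```python
import json

# Hypothetical value of a String field holding a complex Avro type
# (here, an array of records) that was imported as JSON text.
imported_value = '[{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 5}]'

# Decoding the string recovers the structured data, which is the
# step the JSON Parse Tool performs inside a workflow.
items = json.loads(imported_value)
total_qty = sum(item["qty"] for item in items)
print(total_qty)  # 7
```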

Output

When writing Avro files, there are two options:

  1. Enable Compression (Deflate): Enabling compression increases write time but, with larger files, also reduces network transfer time. The supported compression uses the DEFLATE algorithm (the algorithm underlying gzip) and should be readable by other Avro-capable tools such as Hive.
  2. Support Null Values: Selecting this option writes _all_ fields as unions with a null branch and a value branch. If the Alteryx value is null, the output Avro union has its null branch selected; otherwise it has its value branch selected. If this option is not selected, all output fields are written as their native Avro types (non-union), and Alteryx fields that are null are written as their default values (for example, the number 0 for an int32 field and an empty string for a string field).

    Consider using a Formula Tool to replace Null values with a 'known' value so they can be handled in Hadoop.
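The schema-level effect of the Support Null Values option can be sketched as follows. This is not Alteryx code; the record and field names are invented for illustration, and the dictionaries mirror the JSON form of an Avro record schema:

```python
# Illustrative Alteryx fields and their Avro output types.
base_fields = [("id", "long"), ("name", "string")]

# With Support Null Values: every field becomes a ["null", type] union,
# so a null Alteryx value selects the union's null branch.
nullable_schema = {
    "type": "record",
    "name": "Example",  # invented record name
    "fields": [{"name": n, "type": ["null", t]} for n, t in base_fields],
}

# Without it: plain (non-union) types; null Alteryx values fall back
# to defaults such as 0 for numbers and "" for strings.
plain_schema = {
    "type": "record",
    "name": "Example",
    "fields": [{"name": n, "type": t} for n, t in base_fields],
}

print(nullable_schema["fields"][0]["type"])  # ['null', 'long']
print(plain_schema["fields"][0]["type"])     # long
```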

The type mapping from Alteryx to Avro is as follows: