Code Pages

A code page (also referred to as a character set or encoding) is a table that assigns a numeric value to each character. A code page enables a computer to identify characters and display text correctly.
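As a rough illustration of the "table of values" idea, the Python sketch below decodes the same byte value through three different code pages. It uses Python's built-in codecs rather than anything from Alteryx, and the helper name decode_with is made up for this example.

```python
# One byte value, looked up in three different code page tables.
raw = bytes([0xE9])


def decode_with(code_page):
    """Hypothetical helper: interpret `raw` using the named Python codec."""
    return raw.decode(code_page)


western = decode_with("cp1252")   # Windows Western European
cyrillic = decode_with("cp1251")  # Windows Cyrillic
dos = decode_with("cp437")        # original IBM PC (DOS) code page

for name, ch in [("cp1252", western), ("cp1251", cyrillic), ("cp437", dos)]:
    # Each code page maps byte 0xE9 to a different Unicode character.
    print(f"{name}: byte 0xE9 -> {ch!r} (U+{ord(ch):04X})")
```

The byte never changes; only the table used to read it does, which is why text shows the wrong characters when it is read with a code page other than the one it was written with.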

Alteryx supports many code pages that can be selected when inputting and outputting data files via the Input Data Tool and Output Data Tool, or when converting data types using the Blob Convert Tool. Additionally, the ConvertFromCodepage and ConvertToCodepage functions, available within tools that have an expression editor, can use code page identifiers to convert strings between code pages and Unicode (the universal character-encoding standard for all written characters).
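The idea of converting between a code page and Unicode can be sketched in Python as follows. This is not the Alteryx implementation: from_codepage and to_codepage are hypothetical stand-ins, and the table assumes Windows-style numeric code page identifiers mapped to Python codec names.

```python
# Assumption: numeric identifiers follow the Windows code page numbering.
# The mapping below is illustrative and far from complete.
CODEC_BY_ID = {
    1252: "cp1252",       # Windows Western European
    932: "cp932",         # Japanese (Shift-JIS)
    28591: "iso-8859-1",  # ISO 8859-1 Latin 1
    65001: "utf-8",       # Unicode UTF-8
}


def from_codepage(data, code_page_id):
    """Hypothetical helper: decode code page bytes into a Unicode string."""
    return data.decode(CODEC_BY_ID[code_page_id])


def to_codepage(text, code_page_id):
    """Hypothetical helper: encode a Unicode string into code page bytes."""
    return text.encode(CODEC_BY_ID[code_page_id])


text = "日本語"
shift_jis = to_codepage(text, 932)         # bytes in the Japanese code page
round_trip = from_codepage(shift_jis, 932)  # back to a Unicode string

print(shift_jis.hex(), round_trip == text)
```

A conversion only round-trips when the same identifier is used in both directions and the code page contains every character in the string.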

Alteryx assumes that a wide string is Unicode and that a narrow string is Latin 1, so a string that you convert to a different code page will not display correctly. For that reason, use code pages only to override text-encoding issues within a file. Code pages can differ from one computer to another, or can be changed on a single computer, which can lead to data corruption. For the most consistent results, applications should use a Unicode encoding, such as UTF-8 or UTF-16, instead of a specific code page, since Unicode allows different languages to be encoded in the same data stream.
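The display problem described above can be reproduced outside Alteryx. The Python sketch below assumes a string stored as Windows Cyrillic (cp1251) bytes and shows what a reader that expects Latin 1 would display; the variable names are chosen for this example only.

```python
original = "Привет"

# Bytes as they would be stored using the Cyrillic code page.
stored = original.encode("cp1251")

# A reader that assumes Latin 1 maps each byte to the wrong character.
misread = stored.decode("latin-1")

# Reading the same bytes with the correct code page recovers the text.
repaired = stored.decode("cp1251")

# Because Latin 1 maps every byte to exactly one character, the misread
# text can also be turned back into the original bytes and re-decoded.
recovered = misread.encode("latin-1").decode("cp1251")

print(misread)
print(repaired == original, recovered == original)
```

Nothing is lost in this particular case because Latin 1 defines a character for every byte value. With code pages that leave some byte values undefined, or when the misread text is saved and converted again, the damage can become permanent.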

UTF-8 is the most widely used Unicode encoding and is generally the most portable way to store text. Both UTF-8 and UTF-16 are variable-width encodings, but UTF-8 is backward compatible with ASCII, and its files tend to be smaller than UTF-16 files when the text is mostly ASCII.
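To make the size and compatibility comparison concrete, the Python sketch below encodes a few sample strings in both encodings and counts the bytes. The samples and the helper name encoded_sizes are chosen for illustration, and UTF-16 is measured without a byte order mark.

```python
samples = {
    "ascii word": "hello",
    "accented letter": "é",
    "euro sign": "€",
    "emoji": "😀",
}


def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count) for the text."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


sizes = {label: encoded_sizes(text) for label, text in samples.items()}

# Pure ASCII text produces identical bytes in ASCII and UTF-8.
ascii_compatible = "hello".encode("utf-8") == "hello".encode("ascii")

for label, (utf8_len, utf16_len) in sizes.items():
    print(f"{label}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
print("ASCII-compatible:", ascii_compatible)
```

Both encodings use a varying number of bytes per character. UTF-8 is half the size of UTF-16 for the ASCII word, the same size for the accented letter and the emoji, and one byte larger for the euro sign, which is why the size advantage depends on the mix of characters in the data.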

For more information on code pages, see the MSDN Library.