Code Pages

A Code Page (also referred to as Character Set or Encoding) is a table of values where each character has been assigned a numerical representation. A code page enables a computer to identify characters and display text correctly.
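As a rough illustration of the "table of values" idea, the following Python sketch decodes one byte value with three of the code pages listed later on this page. It uses Python's built-in codecs (`cp1252`, `cp1251`, `cp1253`) as stand-ins for the Windows code pages. The helper name `decode_with` is hypothetical and is not part of Alteryx.

```python
# One byte value, three code pages: the table chosen decides which character appears.
raw = bytes([0xE9])


def decode_with(code_page: str) -> str:
    """Look up the byte in the given code page (hypothetical helper)."""
    return raw.decode(code_page)


latin = decode_with("cp1252")     # ANSI - Latin I
cyrillic = decode_with("cp1251")  # ANSI - Cyrillic
greek = decode_with("cp1253")     # ANSI - Greek

print(latin, cyrillic, greek)
```

The byte never changes. Only the table used to interpret it does, which is why text read with the wrong code page looks garbled.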

Alteryx supports many code pages that you can select when you input and output data files via the Input Data tool and Output Data tool, or when you convert data types with the Blob Convert tool. Additionally, the ConvertFromCodepage and ConvertToCodepage functions (available within tools that have an expression editor) can use code page identifiers to convert strings between code pages and Unicode®, the universal character-encoding standard for all written characters as created by the Unicode Consortium.

Alteryx assumes that a wide string is a Unicode® string and a narrow string is a Latin 1 string. Because of this, a string that you convert to a different code page will not display correctly in Alteryx. Therefore, use code pages only to override text-encoding issues within a file. Code pages can differ between computers, or can be changed on a single computer, which can lead to data corruption. For the most consistent results, use a Unicode® encoding, like UTF-8 or UTF-16, instead of a specific code page. Unicode® allows different languages to be encoded in the same data stream.
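The display problem described above can be sketched in Python. This is an illustration of the general effect, not of Alteryx internals: the text is encoded with code page 1250 (Central Europe) and then read back as if it were Latin 1. The sample text and variable names are assumptions made for the example.

```python
# Encode Polish text with a Central European code page, then misread it as Latin 1.
text = "Łódź"

encoded = text.encode("cp1250")        # bytes as stored under code page 1250
misread = encoded.decode("latin-1")    # what a Latin 1 reader would show
restored = encoded.decode("cp1250")    # decoding with the matching code page

print(repr(misread))
print(restored)
```

Decoding with the matching code page restores the original text, while the Latin 1 reading yields different characters for the same bytes.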

UTF-8 is the most widely used encoding and is a portable, generally compact way to store any character. Both UTF-8 and UTF-16 are variable-width encodings, but UTF-8 is compatible with ASCII and its files tend to be smaller than UTF-16 files.
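A short Python sketch of these properties, using only the standard `str.encode` method. The sample strings are arbitrary, and `utf-16-le` is used so that no byte-order mark is added to the counts.

```python
# Compare encoded sizes to show variable width and ASCII compatibility.
ascii_text = "Hello, world"

utf8_bytes = ascii_text.encode("utf-8")
utf16_bytes = ascii_text.encode("utf-16-le")

# For pure ASCII text, UTF-8 output is byte-for-byte identical to ASCII.
same_as_ascii = utf8_bytes == ascii_text.encode("ascii")

# Bytes needed per character in each encoding.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, e-acute, euro sign, emoji
utf8_widths = [len(ch.encode("utf-8")) for ch in samples]
utf16_widths = [len(ch.encode("utf-16-le")) for ch in samples]

print(len(utf8_bytes), len(utf16_bytes), same_as_ascii)
print(utf8_widths, utf16_widths)
```

For mostly ASCII text, UTF-8 uses about half the bytes of UTF-16. For some scripts a character takes 3 bytes in UTF-8 but 2 in UTF-16, which is why UTF-8 files only tend to be smaller.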

For more information on code pages, go to MSDN Library.

To support the same functionality on Linux, Alteryx employs the ICU library. We use the same IDs that are on Windows, converting them with ICU converters. ICU does not support the entire list of Windows encodings, and there can be differences when converting data from one code page to another.

Code Page Identifiers

These code page identifiers are supported with the ConvertFromCodepage and ConvertToCodepage functions. Go to Functions for more information.

ID	Description	Support
37	IBM EBCDIC - U.S./Canada	Original engine and AMP.
500	IBM EBCDIC - International	Original engine and AMP.
932	ANSI/OEM - Japanese Shift-JIS	Original engine and AMP.
949	ANSI/OEM - Korean EUC-KR	Original engine and AMP. Not supported for the Download and Blob Convert tools.
1250	ANSI - Central Europe	Original engine and AMP.
1251	ANSI - Cyrillic	Original engine and AMP.
1252	ANSI - Latin I	Original engine and AMP.
1253	ANSI - Greek	Original engine and AMP.
1254	ANSI - Turkish	Original engine and AMP.
1255	ANSI - Hebrew	Original engine and AMP.
1256	ANSI - Arabic	Original engine and AMP.
1257	ANSI - Baltic	Original engine and AMP.
1258	ANSI/OEM - Vietnamese	Original engine and AMP.
10000	MAC - Roman	Original engine and AMP.
28591	ISO 8859-1 Latin I	Original engine and AMP.
28592	ISO 8859-2 Central Europe	Original engine and AMP.
28593	ISO 8859-3 Latin 3	Original engine and AMP.
28594	ISO 8859-4 Baltic	Original engine and AMP.
28595	ISO 8859-5 Cyrillic	Original engine and AMP.
28596	ISO 8859-6 Arabic	Original engine and AMP.
28597	ISO 8859-7 Greek	Original engine and AMP.
28598	ISO 8859-8 Hebrew: Visual Ordering	Original engine.
28599	ISO 8859-9 Latin 5	Original engine and AMP.
28605	ISO 8859-15 Latin 9	Original engine and AMP.
54936	Simplified Chinese GB18030	Original engine and AMP. Not supported for the Download and Blob Convert tools.
65001	Unicode UTF-8	Original engine and AMP.
1200	Unicode UTF-16	Original engine and AMP.
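
To experiment with these identifiers outside Alteryx, the sketch below maps each ID in the table to the closest Python codec name. The mapping and the helper names `from_code_page` and `to_code_page` are assumptions made for illustration. They are loose analogues of ConvertFromCodepage and ConvertToCodepage, not the Alteryx implementation, and Python's codec tables may differ in detail from the Windows or ICU converters.

```python
# Assumed mapping from the Windows code page IDs above to Python codec names.
CODE_PAGE_TO_CODEC = {
    37: "cp037", 500: "cp500", 932: "cp932", 949: "cp949",
    1250: "cp1250", 1251: "cp1251", 1252: "cp1252", 1253: "cp1253",
    1254: "cp1254", 1255: "cp1255", 1256: "cp1256", 1257: "cp1257",
    1258: "cp1258", 10000: "mac_roman",
    28591: "iso8859_1", 28592: "iso8859_2", 28593: "iso8859_3",
    28594: "iso8859_4", 28595: "iso8859_5", 28596: "iso8859_6",
    28597: "iso8859_7", 28598: "iso8859_8", 28599: "iso8859_9",
    28605: "iso8859_15", 54936: "gb18030",
    65001: "utf_8", 1200: "utf_16_le",
}


def _codec_for(code_page_id: int) -> str:
    """Return the Python codec name for an ID, or raise if it is not listed."""
    try:
        return CODE_PAGE_TO_CODEC[code_page_id]
    except KeyError:
        raise ValueError(f"Unsupported code page ID: {code_page_id}") from None


def from_code_page(data: bytes, code_page_id: int) -> str:
    """Decode bytes stored in the given code page into a Unicode string."""
    return data.decode(_codec_for(code_page_id))


def to_code_page(text: str, code_page_id: int) -> bytes:
    """Encode a Unicode string into bytes for the given code page."""
    return text.encode(_codec_for(code_page_id))


# EBCDIC (ID 37) bytes for an uppercase word, decoded to Unicode.
print(from_code_page(b"\xc8\xc5\xd3\xd3\xd6", 37))
# A Unicode string encoded as UTF-8 (ID 65001).
print(to_code_page("\u00e9", 65001))
```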
