Microsoft Azure Data Lake Store

Version:
Current
Last modified: March 27, 2020
Connection Type

This tool is not automatically installed with Designer. The latest version is available from the Alteryx Analytics Gallery.

Alteryx Tools Used to Connect

Microsoft Azure Data Lake File Input Tool

Microsoft Azure Data Lake File Output Tool

Release Notes
Version Description
v2.0
  • UI upgrade and improved error handling
  • Added support for Gen2 storages
  • Added support for Azure Government, China Cloud and custom endpoints
  • Shared Key authentication support
  • Public application support (own and Alteryx)
  • Multi-tenancy support
  • Excel input and output support
  • Added the ability to use custom delimiters for reading and writing .csv files
  • Compatible with Alteryx Designer version 2019.3 and later.
v1.1.0
  • Fixed end user authentication errors
  • Allowed users to specify redirect URI for end user authentication
v1.0.2
  • Updated Code Page options.
  • Distinguished between encodings with the same language (e.g., ‘Language’ -> ‘Language (specific code)’) and ordered encodings alphabetically.
  • Allowed users to specify the encoding for CSV files in the Output tool.
  • Improved the error message to indicate when an invalid store name is provided.
  • Improved data conversion handling to throw a warning instead of an error when a field is missing a value.
  • Fixed an error where the files/folders displayed were not refreshed after the user changed the store name.
  • Fixed an issue where default value settings were occasionally not respected.
  • Disabled production logging to prevent permissions issues for different installations/configurations of Designer and to support scheduled workflow functionality.
v1.0.1
  • Fixed issue preventing packages from being installed successfully
v1.0.0
  • Initial release for Azure Data Lake File Input and Azure Data Lake File Output

 

The Azure Data Lake Tools allow you to connect to an Azure Data Lake Store resource and read/write data.
Use the Azure Data Lake (ADL) File Input tool to read data from files located in an Azure Data Lake Store (ADLS) into your Alteryx workflow.
To write data from your Alteryx workflow to a file located in an ADLS, use the ADL File Output tool.
The supported file formats are CSV, XLSX, JSON, and Avro. For the Output tool, the Append action is supported only for the CSV format.
All authentication types except Shared Key authenticate against an Azure Active Directory endpoint.

Authentication and Authorization

The Azure Data Lake endpoints for Gen1 and Gen2 storages differ, so during authentication you need to specify which kind of storage you want to connect to. If you are not certain which type of storage you are using, ask your Azure administrator or check in the Microsoft Azure Portal.
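
To illustrate why the storage generation matters, the sketch below builds the default public-cloud host name for each generation. The helper `adls_host` and the account name `mylake` are hypothetical, and the host suffixes are the standard Azure public-cloud ones. National clouds and custom endpoints use different suffixes.

```python
def adls_host(account: str, gen: int) -> str:
    """Return the default public-cloud host for a Data Lake account.

    Illustrative only: national clouds and custom endpoints differ.
    """
    if gen == 1:
        return f"{account}.azuredatalakestore.net"
    if gen == 2:
        return f"{account}.dfs.core.windows.net"
    raise ValueError("gen must be 1 or 2")


print(adls_host("mylake", 1))
print(adls_host("mylake", 2))
```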

TIPS

  • For publishing workflows to Server or AAH, use the Service-to-Service or Shared Key authentication types, so you do not have to re-upload your workflow once your Refresh Token expires.
  • Since loading the metadata can take a long time, you can disable metadata loading by selecting 'Disable Auto Configure' in the Advanced User Settings (Options > User Settings > Edit User Settings > Advanced).

You must be granted permissions to read and write data within an Azure Data Lake Store account. For more information about how these permissions are assigned and applied, see the official Microsoft documentation.

Single vs. Multi-Tenancy

Single-tenant applications are available only in the tenant they were registered in, also known as their home tenant. You or your Azure Administrator can create single-tenant Azure applications and storage under your account which you will use during authentication in Designer. Multi-tenant apps are available to users in both their home tenant and other tenants.

    End-User (Basic)

    The basic End-User authentication is the most convenient way of accessing your ADLS data in Designer. Contact your Azure Administrator to allow the public Alteryx applications in your organization’s Azure tenant. See the Microsoft documentation describing the steps.

    Tenant: common
    ADLS Client ID for the Gen1 Alteryx application: 7fa1a397-27aa-40ad-b47c-a47fa9e600bd
    ADLS Client ID for the Gen2 Alteryx application: 2584cace-63ff-47cb-96d2-d153704f4d75


    After this setup, you and your colleagues can use your normal Microsoft credentials to access the ADLS data.

    End-User (Advanced)

    The advanced End-User authentication supports single- and multi-tenant authentication, and can be used with both public and private applications.
    For the credential setup, see the instructions in the Microsoft documentation.

    Authentication Configuration

    • Tenant ID: You can obtain the tenant ID from your Azure Portal, or rely on the auto-discovery mechanism in Azure by typing “common” in the Tenant ID field. If you have access to multiple tenants, specify the tenant ID explicitly. For more information on multi-tenancy, see the Single vs. Multi-Tenancy section.
    • Client ID: The unique identifier of an Azure application. The Client ID field is mandatory.
    • Client secret: If your application is private, you must provide a client secret. If you are using a public application, leave the field empty.
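
The rules for these three fields can be summarized in a small sketch. The `EndUserConfig` class is hypothetical and is not part of the Alteryx tools. It only encodes the constraints described above, plus the assumption that Azure application (client) IDs are GUIDs.

```python
import uuid
from dataclasses import dataclass


@dataclass
class EndUserConfig:
    """Hypothetical container for the End-User (Advanced) fields."""
    tenant_id: str = "common"   # "common" relies on Azure auto-discovery
    client_id: str = ""         # mandatory
    client_secret: str = ""     # empty for public applications

    @property
    def is_public(self) -> bool:
        # A public application is used without a client secret.
        return not self.client_secret

    def validate(self) -> None:
        if not self.client_id:
            raise ValueError("Client ID is mandatory")
        # Assumption: client IDs are GUIDs; uuid.UUID raises ValueError otherwise.
        uuid.UUID(self.client_id)


cfg = EndUserConfig(client_id="7fa1a397-27aa-40ad-b47c-a47fa9e600bd")
cfg.validate()
print(cfg.tenant_id, cfg.is_public)
```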

    Service-to-Service

    The Service-to-Service authentication is suitable for publishing workflows on Server and Hub.
    For the credential setup, see the instructions in the Microsoft documentation.
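
For orientation, Service-to-Service authentication in Azure Active Directory generally corresponds to the OAuth 2.0 client credentials flow. The sketch below only assembles the token request (it sends nothing) and is not a description of what the connector does internally. The function name and the placeholder IDs are hypothetical.

```python
from urllib.parse import urlencode


def client_credentials_request(tenant_id, client_id, client_secret, gen=2,
                               authority="https://login.microsoftonline.com"):
    """Build the URL and form body of an Azure AD client-credentials token request."""
    # Resource identifiers for Gen2 (Azure Storage) and Gen1 (Data Lake Store).
    resource = "https://storage.azure.com" if gen == 2 else "https://datalake.azure.net"
    url = f"{authority}/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": f"{resource}/.default",
    }
    return url, urlencode(form)


url, body = client_credentials_request("my-tenant-id", "my-client-id", "my-secret")
print(url)
```

Because no user interaction or refresh token is involved, this flow suits workflows that run unattended on Server.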

    Shared Key

    • Shared Key authentication can be used only with Gen2 storages.
    • Publishing to Server works only with Designer and Server 2020.4 and newer, because this authentication method was introduced in the 2020.4 releases.

    With an Azure storage account, Microsoft generates two access keys that can be used to authorize access to your Azure Data Lake via Shared Key authorization. You can find more information about the Shared Key and its usage in the Microsoft documentation.
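
As background, Shared Key authorization signs each request with HMAC-SHA256, using the Base64-decoded access key. The sketch below shows only that signing step. The string-to-sign is a simplified placeholder (the real canonical format is defined in the Microsoft documentation), and the account name and key are made up.

```python
import base64
import hashlib
import hmac


def shared_key_header(account: str, access_key_b64: str, string_to_sign: str) -> str:
    """Return an Authorization header value of the form 'SharedKey account:signature'."""
    key = base64.b64decode(access_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"SharedKey {account}:{signature}"


# Made-up key and placeholder string-to-sign, for illustration only.
demo_key = base64.b64encode(b"not-a-real-key").decode("ascii")
header = shared_key_header("mylake", demo_key, "GET\n\n\n")
print(header)
```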

    Azure National Clouds and Custom Endpoints

    Starting with the v2.0 release, the ADLS connectors support access to custom endpoints. The URLs for the US and China national clouds can be selected on the authentication screen of the connectors in the Authentication Authority Endpoint field.
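
To make the idea of an authentication authority endpoint concrete, the sketch below maps each cloud to its Azure Active Directory login host and appends the tenant. The dictionary keys, the function name, and the custom URL are hypothetical. The hosts are the commonly documented ones for the public, US Government, and China clouds.

```python
# Azure AD authority hosts for the public and national clouds.
AUTHORITY_HOSTS = {
    "public": "https://login.microsoftonline.com",
    "us_government": "https://login.microsoftonline.us",
    "china": "https://login.chinacloudapi.cn",
}


def authority_url(cloud="public", tenant="common", custom_endpoint=None):
    """Return the authority URL; a custom endpoint overrides the cloud lookup."""
    base = custom_endpoint or AUTHORITY_HOSTS[cloud]
    return f"{base.rstrip('/')}/{tenant}"


print(authority_url())
print(authority_url("us_government", "my-tenant-id"))
print(authority_url(custom_endpoint="https://login.example.internal/"))
```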

    Application Setup

    The file storages are accessed via registered applications. The application registration is necessary for all authentication types with the exception of End-User (Basic) and Shared Key. To register the application on Azure Portal, see instructions on Microsoft Documentation Portal.

    Use Microsoft Azure Applications in Alteryx Designer

    1. Add the Azure Data Lake File Input or Azure Data Lake File Output tool to the Designer canvas.
    2. Select the tool to see the Configuration panel on the right.
    3. Fill in the authentication fields with the values available at http://portal.azure.com/. To navigate the Azure Portal, refer to the Microsoft documentation.
    4. Copy the Directory (tenant) ID and Application (client) ID into Designer.
    5. (Optional) Select Use Gen1 if you want to connect to Azure Data Lake Gen1 storage.
    6. Paste Client Secret if connecting in Service-to-Service mode.
    7. Select Connect.

    Data Selection and Configuration Options

    In the Data tab, you can specify the data you would like to use:

    1. Specify the Storage Account Name. This storage must be of the same type (Gen1 or Gen2) as selected on the Authentication page.
    2. For Gen2 storages, specify File System Name.
    3. Once the storage and file system for Gen2 have been selected, you can configure the path of the file you would like to read or write. You can specify the path either by direct input in the File Path field or using the file browser. For the Azure Data Lake File Output tool, you can use the same mechanism to create a new file. 
    4. For Excel files, the Sheet name can be specified in the Sheet field located under the file browser. If left empty, the first sheet will be automatically selected. In case of new files, the sheet will be given the default name “Sheet”.
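
The selection rules above can be sketched as two small helpers. Both function names are hypothetical. The URL layout is the standard public-cloud form for a Gen2 account, and `resolve_sheet` only mirrors the sheet rule described in step 4.

```python
from urllib.parse import quote


def gen2_file_url(account: str, file_system: str, path: str) -> str:
    """Combine storage account, file system, and file path into a Gen2 URL."""
    return (f"https://{account}.dfs.core.windows.net/"
            f"{quote(file_system)}/{quote(path.lstrip('/'))}")


def resolve_sheet(requested: str, existing_sheets: list) -> str:
    """Pick the sheet: explicit name, else the first sheet, else 'Sheet' for a new file."""
    if requested:
        return requested
    if existing_sheets:
        return existing_sheets[0]
    return "Sheet"


print(gen2_file_url("mylake", "raw", "/sales/2020 q1.xlsx"))
print(resolve_sheet("", ["Summary", "Detail"]))
print(resolve_sheet("", []))
```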

    File Formats and Configuration

    The ADLS tools support the following data formats: .csv, .avro, .json and .xlsx.

    • CSV files
      • Read
      • Write: You can overwrite or append to an existing CSV file. 

    Tip

    For compatibility with the Input and Output Data Tools, the encoding should be UTF-8 SIG.

    • JSON files
      • Read: To correctly read JSON files, they must be using UTF-8 encoding without BOM. 
      • Write: The datatype conversion when writing to JSON files has the following limitations: Decimal, Datetime, and Time cells are output as Strings.
    • Avro files
      • Read
      • Write
    • Excel files
      • Read: All data is read as V_Wstrings. 
      • Write
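
The JSON write limitation means that Decimal, Datetime, and Time values arrive in the file as strings, so downstream readers must parse them back. The sketch below reproduces that effect with Python's standard `json` module. The exact string formats the Output tool emits are not stated here, so the ISO formats below are an assumption.

```python
import json
from datetime import datetime, time
from decimal import Decimal


def to_json_value(value):
    """Fallback for json.dumps: emit Decimal, datetime, and time values as strings."""
    if isinstance(value, Decimal):
        return str(value)
    if isinstance(value, (datetime, time)):
        return value.isoformat()  # assumed format, for illustration
    raise TypeError(f"Unsupported type: {type(value).__name__}")


row = {
    "id": 1,
    "price": Decimal("19.90"),
    "created": datetime(2020, 3, 27, 8, 30),
    "opens": time(9, 0),
}
text = json.dumps(row, default=to_json_value)
print(text)
```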

    Additional Details

    • If you cannot read from or write to a folder created by another account, the cause is missing permissions on that folder.
    • If you encounter an error stating that the token may have been revoked, log out and then log back in from the Configuration panel to reauthenticate.

    Token lifetime properties are configurable by the System Administrator.

    The Azure Data Lake Explorer must grant permissions to read and write data within an Azure Data Lake Store account. For more information about how these permissions are assigned and applied, please visit the official Microsoft documentation.

    Limitations

    JSON and Avro: UTF-8 Only

    JSON and Avro are UTF-8 only.
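
Since JSON files must be UTF-8 without a BOM to be read correctly, it can help to check a file before uploading it. The following is a minimal sketch using only the Python standard library. The function name is hypothetical.

```python
import codecs
import json


def load_json_utf8(raw: bytes):
    """Parse JSON bytes, rejecting input that starts with a UTF-8 BOM."""
    if raw.startswith(codecs.BOM_UTF8):
        raise ValueError("JSON input must be UTF-8 without a BOM")
    return json.loads(raw.decode("utf-8"))


good = '{"name": "Åsa"}'.encode("utf-8")
with_bom = '{"name": "Åsa"}'.encode("utf-8-sig")  # "utf-8-sig" prepends a BOM

print(load_json_utf8(good))
print(with_bom.startswith(codecs.BOM_UTF8))
```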
     

    JSON: Silent Conversion Error

    For JSON, there is a silent conversion error if you try to store numbers that are too large for their datatype.
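
Because the conversion error is silent, one defensive option is to check integer ranges before the data reaches the Output tool. The sketch below assumes signed integer fields of the usual 16-, 32-, and 64-bit widths. The type names and the helper functions are hypothetical.

```python
INT_BITS = {"int16": 16, "int32": 32, "int64": 64}


def fits(value: int, type_name: str) -> bool:
    """Return True if value fits a signed integer of the given width."""
    bits = INT_BITS[type_name]
    return -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1


def out_of_range(rows, schema):
    """Return (row_index, field, value) for each value that would overflow."""
    problems = []
    for i, row in enumerate(rows):
        for field, type_name in schema.items():
            if not fits(row[field], type_name):
                problems.append((i, field, row[field]))
    return problems


rows = [{"qty": 120}, {"qty": 40000}]
print(out_of_range(rows, {"qty": "int16"}))
```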

    Writing to Excel Files Limited

    Writing to Excel files is currently limited to only a full file overwrite.

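    Avro Bytes Field Type

    Avro files that contain fields of type bytes are not supported and fail on import.

    Output: Conversion of Alteryx Float Fields to Avro Types

    Float field values in an Alteryx workflow are converted to double in the destination Avro file.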
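
Widening a single-precision float to a double keeps the stored value exactly, but the value can then display with more digits than expected. The sketch below demonstrates this with Python's `struct` module, assuming the Alteryx Float type is a 32-bit float. The function name is hypothetical.

```python
import struct


def widen_float32(value: float) -> float:
    """Round value to 32-bit precision, then return it as a 64-bit Python float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


stored = widen_float32(0.1)
print(stored)         # 0.10000000149011612
print(stored == 0.1)  # False: the double holds the float32 approximation
```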

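    Multiple Connectors with Different Azure Active Directory User Accounts

    The Microsoft Azure Data Lake, OneDrive, and Dynamics CRM connectors support authentication with Microsoft user credentials, such as email and password. In interactive workflows, it is currently not possible to authenticate with different Microsoft user accounts across these connectors. This limitation does not affect scheduled workflows. If you are authenticated with a Microsoft user account in one of these connectors and try to authenticate to another connector with a different Microsoft user account, an error message pop-up appears. To resolve this issue, follow these recommendations:

    • The Azure Active Directory administrator can grant the necessary permissions to a single user account, so that the user building the workflow has one user account with access to the services the workflow needs.

    • Before attempting to log in, log out of all connectors that are authenticated with a different Microsoft user account.

    • Avoid End-User authentication where possible. Use Service-to-Service authentication with the Azure Data Lake connector and Application Login authentication with the Dynamics CRM connector.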
