|Type of Support||
Read & Write, In-Database
Database Version: 1.0.35649
For more information about the Simba Athena ODBC driver, see the Simba ODBC documentation.
To avoid error when saving your workflow to Server, select the Encrypt Password For: All Users of This Machine checkbox in Simba Amazon Redshift ODBC Driver DSN Setup.
Alteryx Tools Used to Connect
Standard Workflow Processing
Configure an ODBC Connection
In the ODBC Data Source Administrator...
- Select the Redshift driver and select Configure.
- Enter your Connection Settings and credentials.
- In the Additional Options area, select the Retrieve Entire Results Into Memory option.
This setting would fetch the entire dataset into the physical memory. If the physical memory is low, this setting is subjected to change based on the data volume and the available physical memory, and you may need to involve your DBA for a recommended setting.
- Select OK to save the connection.
Configure an Amazon Redshift Bulk Connection
To use the bulk connection via the Output Data tool...
- Select the Write to File or Database dropdown and select Other Databases > Amazon Redshift Bulk.
- Select a Data Source Name (or select ODBC Admin to create one). See ODBC and OLEDB Database Connections.
- (Optional) Enter a User Name and Password.
- In the Amazon S3 section, enter or paste your AWS Access Key and AWS Secret Key to access the data for upload.
- In the Secret Key Encryption dropdown, select an encryption option:
- Hide: Hide the password using minimal encryption.
- Encrypt for Machine: Any user on the computer is able to fully use the connection.
- Encrypt for User: The signed-in user can use the connection on any computer.
- In the Endpoint dropdown, select Default to allow Amazon to determine the endpoint automatically based on the bucket you select. To specify an endpoint for private S3 deployments, or if you know a specific bucket region, you can alternately select an endpoint (S3 region), enter a custom endpoint, or select from one of ten previously-entered custom endpoints.
If the Bucket you select is not in the region of the endpoint you specify, this error occurs: "The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint." Select Default to clear the error.
- (Optional) Select Use Signature V4 for Authentication to use Signature Version 4 instead of the default Signature Version 2. This increases security, but connection speeds might be slower. This option is automatically enabled for regions requiring Signature Version 4.
Regions That Require Signature Version 4: Regions created after January 30, 2014 support only Signature Version 4. These regions require Signature Version 4 authentication:
- US East (Ohio) Region
- Canada (Central) Region
- Asia Pacific (Mumbai) Region
- Asia Pacific (Seoul) Region
- EU (Frankfurt) Region
- EU (London) Region
- China (Beijing) Region
- Select a Server-Side Encryption method for uploading to an encrypted Amazon S3 bucket. For more information on Amazon S3 encryption methods, see the Amazon Simple Storage Service Developer Guide.
- None (Default): No encryption method is used.
- SSE-KMS: Use server-side encryption with AWS KMS-managed keys. Optionally provide a KMS Key ID. When you select this method, Use Signature V4 for Authentication is enabled by default.
- In Bucket Name, enter the name of the AWS bucket in which your data objects are stored.
Optionally select Use Redshift Spectrum to connect to Spectrum tables.
When bulk loading data to Amazon Redshift, data is written to incorrect fields when order of fields in workflow output is different from the one in the Redshift database. To workaround this:
- Select the Append Field Map option in the Output Data tool to configure it, even if you don't change the default setting. In the workflow XML of the Output Data tool, this will populate the <AppendMapping mode="ByName" /> tag.
- Change Output Option to Overwrite Table (Drop).
Configure Output Options
You can optionally specify or adjust the following Redshift options. For more information, see the Amazon Redshift Database Developer Guide.
To create Spectrum tables with the Output Data tool, specify both the schema and table name.
Distribution Key is ignored if 'Key' is not selected for Distribution Style. Sort Key is ignored if 'None' is selected for Sort Style.
- Primary Key: Select columns for the Primary Key and adjust the order of columns.
- Distribution Style: Select Even, Key, or All.
- Distribution Key: Select a column for the Distribution Key.
- Sort Style: Select None, Compound, or Interleaved.
- Sort Key: Select columns for the Sort Key and adjust the order of columns.
- Enable Vacuum and Analyze Operations: (Bulk connections only) Enabled by default. When enabled, VACUUM and ANALYZE maintenance commands are executed after a bulk load APPEND to the Redshift database.
- Size of Bulk Load Chunks (1 MB to 102400 MB): To increase upload performance, large files are split into smaller files with a specified integer size, in megabytes. The default value is 128.
- Enable backslash (\) as escape character: (Bulk connections only) Enabled by default. When enabled, a character that immediately follows a backslash character is loaded as column data, even if that character normally is used for a special purpose (for example, delimiter character, quotation mark, embedded newline character, or escape character).
An identifiers are folded to lowercase in the database. In query results, tables and column names are returned as lowercase by default. For more information, see Amazon Names and Identifiers documentation.