Configure Application Limits
This section provides information on various settings that you can specify to apply minimum and maximum limits on the Trifacta Application.
Operating System Limits
Raise ulimit setting
To perform normal operations, the Designer Cloud Powered by Trifacta platform may need to maintain a high number of simultaneously open files, the count of which may exceed the default setting for the operating system (the ulimit).
Note
If the Designer Cloud Powered by Trifacta platform hits the ulimit and is unable to open additional files, jobs may fail, or the platform may be unable to access content. The log may contain something similar to the following error: Failed on local exception: java.net.SocketException: Too many open files.
By default, the operating system sets the limit on the number of open files at 1024. Please complete the following steps to raise this limit.
Tip
The ulimit can be raised to as much as 64000, depending on the capacity of your hardware.
Steps:
If it is running, stop the Designer Cloud Powered by Trifacta platform. See Start and Stop the Platform.
Verify the current ulimit:
ulimit -Hn
Edit the following file: /etc/security/limits.conf
At the bottom of the file, add the following entry, which overrides the defined limit with a value of 16000:
* hard nofile 16000
If the following error is encountered, add the line below after the previous one: "java.lang.OutOfMemoryError: unable to create new native thread". This exception means that the ulimit for processes must be increased, too:
* hard nproc 16000
Save the file and restart the platform. See Start and Stop the Platform.
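After the restart, the limits that a process actually receives can be confirmed programmatically. The following is a minimal sketch using Python's standard resource module on a Unix-like system. The helper names are illustrative, the target of 16000 is taken from the steps above, and the script is assumed to run under the same account that runs the platform.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def meets_target(limit, target=16000):
    """Return True if a limit is unlimited or at least the target value."""
    return limit == resource.RLIM_INFINITY or limit >= target


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft={soft} hard={hard} hard limit ok: {meets_target(hard)}")
```

If the hard limit is still below the target, the new entry in /etc/security/limits.conf may not have been picked up by the session that launched the process.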
Browser Limits
Change body limits
If you encounter log messages indicating that a request submitted from the client is too large, you can raise the limit on the size of body objects submitted from the client.
Note
Raising these values too high can overload the browser.
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Setting | Description |
---|---|
"webapp.bodyParser.urlEncoded.limit": "10mb", | Maximum permitted size of the URL-encoded body of a request submitted from the client. Size is in MB. |
"webapp.bodyParser.json.limit": "10mb", | Maximum permitted size of a JSON object submitted from the client. Size is in MB. |
Change maximum number of rows displayed in browser per join key
For each matching join key value, the Trifacta Application displays a maximum of three rows in the browser for the current sample. As a result, when you join a dataset with repeating key values, you may see fewer rows of data than you expect.
Note
This issue is limited to the sampled data that is displayed in the browser. When you run a job across the entire dataset, the proper number of rows is generated in the output.
For some users, this simplification may be confusing. As needed, you can use the following steps to change the maximum number of rows displayed in the browser for each join key.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Search for and modify the following parameter:
"webapp.client.sampleOutputTuplesPerJoinKey": 3,
Save your changes and restart the platform.
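If you edit the file rather than use the Admin Settings Page, a small helper can keep the change scripted. The following sketch assumes that dotted setting names such as webapp.client.sampleOutputTuplesPerJoinKey correspond to nested JSON objects in trifacta-conf.json; verify that assumption against your own file. The helper name set_dotted and the new value of 5 are illustrative only.

```python
import json


def set_dotted(config, dotted_key, value):
    """Set config["a"]["b"]["c"] = value for the dotted key "a.b.c"."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        # Create intermediate objects when they are missing.
        node = node.setdefault(name, {})
    node[leaf] = value
    return config


# Illustrative in-memory config; a real configuration file has many more keys.
config = {"webapp": {"client": {"sampleOutputTuplesPerJoinKey": 3}}}
set_dotted(config, "webapp.client.sampleOutputTuplesPerJoinKey", 5)
print(json.dumps(config, indent=2))
```

The same helper could be reused for the other dotted settings on this page, under the same assumption about the file layout.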
Change page preview limit
In the Flow and Dataset pages, you can preview the data in datasets that you have imported or are importing. For example, when you click the Eye icon next to a dataset's name, you can see a preview of the data in the dataset, which is useful for ensuring that you have the correct data.
Depending on the size of the datasets, you may wish to increase the limit on the size of preview data. If you are working with wide datasets, you may need to increase the limit so that you can get a solid preview of the contents.
Note
Increasing this preview size may have performance impacts, particularly on lower-quality desktops. You should make adjustments with caution.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Locate the following setting, which defines the number of bytes that are loaded by default in a preview. Maximum permitted value is 1024000 (1 MB).
"webapp.client.previewLoadLimit": 128000,
Save your changes and restart the platform.
After the platform has restarted, you should preview a large dataset to verify that performance is acceptable.
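Before applying a new preview limit, it may help to check the value against the stated maximum. The sketch below is a hypothetical pre-check and not part of the platform. It uses only the two numbers given above: the 128000 value shown in the setting and the 1024000 maximum.

```python
MAX_PREVIEW_LOAD_LIMIT = 1024000   # maximum permitted value stated above
SHOWN_PREVIEW_LOAD_LIMIT = 128000  # value shown in the setting above


def check_preview_load_limit(value):
    """Return value if it is a permitted byte count, else raise ValueError."""
    if value <= 0 or value > MAX_PREVIEW_LOAD_LIMIT:
        raise ValueError(
            f"previewLoadLimit must be between 1 and {MAX_PREVIEW_LOAD_LIMIT}"
        )
    return value


print(check_preview_load_limit(512000))
```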
Ingestion
Maximum record length
By default, the maximum length for an individual record is 20 MB. After rows have been split, individual records can be up to this limit in length.
As needed, you can modify this limit.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Locate the following parameter, which reflects the maximum record length in bytes:
"webapp.maxRecordLength": 20971520,
Modify as needed.
Note
Be careful when you raise this value, which can cause out of memory conditions, empty data grids, and browser crashes. You should raise the value incrementally.
Save your changes and restart the platform.
Timeouts
Change application timeout limits
The front-end application respects the following timeout settings for queries issued to back-end datastores, including the Alteryx database.
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Settings | Description |
---|---|
webapp.timeoutMilliseconds | Overall timeout limit in milliseconds for the front-end application. Default value is |
jsdata.remoteTransformTimeoutMilliseconds | Timeout limit in milliseconds for the Transformer Page. This setting is an override of the previous one. Default value is |
You can change the timeout settings if you are experiencing timeouts or other errors because of long-running queries to external data connections.
Note
In most environments, these settings should not be changed. Lowering them can cause reasonable queries to fail, and raising them too high can cause performance issues. Please adjust them only if you are experiencing very long query times to external sources, especially for database views.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Locate the following configuration. Specify new timeout values in milliseconds:
"webapp.timeoutMilliseconds": 120000,
"jsdata.remoteTransformTimeoutMilliseconds": 180000,
Save your changes and restart the platform.
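The relationship between the two timeouts can be sketched as follows. This is an illustration of the override described above, not the platform's own logic. The function names are hypothetical, and falling back to the overall timeout when the Transformer-specific value is absent is an assumption.

```python
settings = {
    "webapp.timeoutMilliseconds": 120000,
    "jsdata.remoteTransformTimeoutMilliseconds": 180000,
}


def effective_timeout_ms(settings, transformer_page=False):
    """Return the timeout that applies, honoring the Transformer page override."""
    overall = settings["webapp.timeoutMilliseconds"]
    if transformer_page:
        # Assumption: fall back to the overall value if no override is set.
        return settings.get("jsdata.remoteTransformTimeoutMilliseconds", overall)
    return overall


def ms_to_minutes(milliseconds):
    """Convert milliseconds to minutes for easier reading."""
    return milliseconds / 60000


print(ms_to_minutes(effective_timeout_ms(settings)))
print(ms_to_minutes(effective_timeout_ms(settings, transformer_page=True)))
```

With the values shown above, the overall timeout is 2 minutes and the Transformer page timeout is 3 minutes.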
Session timeout
By default, the maximum session duration is one week. If needed, you can change the maximum session duration, as well as other session parameter values.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Modify the following parameters, as needed:
Parameter | Description | Default |
---|---|---|
webapp.session.refreshEmbeddedExpiryDateAfterMinutes | Refresh interval in minutes for the expiration date embedded in the session cookie | 5 |
webapp.session.cookieSecureFlag | Set a secure cookie in the client application. | false |
You apply the following change through the Workspace Settings Page. For more information, see Platform Configuration Methods.
Setting | Description | Default |
---|---|---|
Session duration | Maximum session duration in minutes | 10080 (one week) |
Save your changes and restart the platform.
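Because the session duration is entered in minutes, a quick conversion can help avoid arithmetic slips. The sketch below uses only Python's standard datetime module, and the helper names are illustrative.

```python
from datetime import timedelta


def session_duration(minutes):
    """Express a session duration given in minutes as a timedelta."""
    return timedelta(minutes=minutes)


def minutes_for(days):
    """Return the number of minutes in the given number of days."""
    return days * 24 * 60


print(session_duration(10080))  # the default value listed above
print(minutes_for(30))          # minutes in a 30-day session
```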
Timeout for suggestion card suggestions
By default, the platform waits a specified length of time for the machine learning service to return suggestion cards. When more time is allowed, the service may be able to discover better suggestions based on the currently selected data.
If needed, you can change the delay limit from its default value of 80 milliseconds.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Locate the following setting and change its value:
"feature.mlTransformSuggestions.delayThreshold": 80,
Save your changes and restart the platform.
Jobs
Maximum number of flow jobs launched in parallel
By default, the Designer Cloud Powered by Trifacta platform permits up to four jobs from the same flow to be launched in parallel for execution. If there are more flow job launches than this limit, the additional jobs are queued for execution after one or more of the launched jobs has completed.
Tip
This limit is most relevant when you are running a scheduled job, which can execute all jobs in a flow at the same time.
Max parallel jobs setting | Description |
---|---|
4 | (Default) Up to four jobs from the same flow can be launched and in the process of execution at the same time. |
1 | Jobs from the flow are executed sequentially. |
0 | No limit. All jobs from a flow can be executed at the same time. |
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Locate the following parameter. Modify it according to your needs:
"webapp.jobLaunchingBatchSize": 4,
Save your changes and restart the platform.
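The effect of the three example values in the table above can be sketched with a small model. This is an illustration of the described behavior, not the platform's scheduler. The function name split_launch is hypothetical, and the model shows only the initial launch, not how queued jobs are released later.

```python
def split_launch(jobs, batch_size=4):
    """Split a flow's jobs into (launched immediately, queued)."""
    if batch_size < 0:
        raise ValueError("batch size must be 0 (no limit) or a positive integer")
    if batch_size == 0:
        # 0 means no limit: every job is launched at once.
        return list(jobs), []
    return list(jobs[:batch_size]), list(jobs[batch_size:])


jobs = ["job1", "job2", "job3", "job4", "job5", "job6"]
print(split_launch(jobs))                # default: four launched, two queued
print(split_launch(jobs, batch_size=1))  # sequential execution
print(split_launch(jobs, batch_size=0))  # no limit
```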
Job status polling interval
Periodically, the application polls the running environment to check the status of jobs in transit. This polling occurs in the following areas of the application:
Job History page - Checks to see if running jobs have been resolved.
Flow View page - Checks to see if running jobs have been resolved.
Transformer page - Checks to see if sampling jobs have been resolved.
Note
This setting does not apply to the initial sample, which is derived from the first N rows of the dataset.
As needed, you can modify the interval at which the application polls for job status from these areas. The default value is 5000 milliseconds (5 seconds).
Note
If this setting is lowered too much, polling requests can overlap, resulting in no updates to the application. Application performance can be impeded.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Locate the following setting and change its value:
"webapp.polling.jobStatusInMillis" : 5000,
Save your changes and restart the platform.
Sampling
The following configuration settings define the size of samples stored in the base storage layer and transmitted to the user's web browser for display through the Transformer page.
Size of stored samples
By default, samples are generated and stored in the base storage layer up to 40 MB in size.
Note
This size is applied to all user-generated samples. Modifications to this size can significantly change the volume of data stored in the backend.
Note
If the datasource is compressed or must be converted during ingestion, the stored size of the sample on the base storage layer can exceed this limit.
Tip
This size should be modified in conjunction with any changes to the maximum size of transferred samples, which is described in the following section.
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Setting | Default value | Description |
---|---|---|
| 41943040 | Sets the requested size of samples in bytes that are stored in the base storage layer for each sample. Note This parameter defines the storage size of samples on the backend. Default storage size is four times larger than in previous releases. Note For datasources that must be decompressed or converted during ingest, the actual storage volume may be larger than this limit. |
Sample size load limit
By default, samples that are transferred to the web browser are a maximum of 10 MB in size. If desired, users can increase or decrease this sample size on a per-recipe basis.
As needed, you can configure the following:
The default size of samples displayed in the browser (default is 10 MB)
The maximum size of samples displayed in the browser (default is 40 MB)
Within that maximum, users can override the size of the sample downloaded to their browser based on their own experience.
Notes:
Increasing the sample size may degrade the user experience in the following areas of the Transformer page:
Generation of column details and data grid histograms
Preview card loading time
Time required to complete brushing and linking in histograms
Note
If you increase the sample size above the default setting and encounter unacceptable performance in the above areas, you should reduce the sample size settings.
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Setting | Default value | Description and Notes |
---|---|---|
webapp.client.defaultLoadLimit | 10485760 | Sets the default maximum number of bytes that can be loaded into the browser for samples. Note On a per-recipe basis, users can override this setting through the Transformer page. See Change Recipe Sample Size. |
| 41943040 | Sets the maximum number of bytes that can be loaded into the browser for samples. Note This value cannot be overridden by users. Users can set the sample size in their browser up to this limit and no higher. |
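The byte values in the table correspond to 10 MB and 40 MB, and the interaction between the default, the maximum, and a user override can be sketched as follows. This is a hypothetical model. Whether the application clamps an oversized request or rejects it is not stated above, so the clamping here is an assumption.

```python
MB = 1024 * 1024
DEFAULT_LOAD_LIMIT = 10 * MB  # 10485760, webapp.client.defaultLoadLimit
MAX_LOAD_LIMIT = 40 * MB      # 41943040, the maximum listed in the table


def effective_sample_size(requested=None,
                          default=DEFAULT_LOAD_LIMIT,
                          maximum=MAX_LOAD_LIMIT):
    """Return the sample size in bytes that would be loaded into the browser."""
    if requested is None:
        return default  # no per-recipe override
    # Assumption: requests above the maximum are clamped rather than rejected.
    return min(requested, maximum)


print(effective_sample_size())         # default
print(effective_sample_size(20 * MB))  # per-recipe override within the limit
print(effective_sample_size(64 * MB))  # override above the maximum
```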
Photon random sample load limit
Whenever it is available, Trifacta Photon is used to generate random samples. By default, the Trifacta Photon running environment loads a maximum of 1 GB (1024 MB) of data from the imported dataset for generating a new random sample. This data comes from the top of the file, meaning that rows that are deeper than 1 GB in the source data cannot be included in any generated random sample.
From this selection of data, a sample of the data is derived for display in the data grid. As needed, you can configure the random sample limit to include a larger or smaller volume of maximum data.
Note
Be careful making adjustments to this setting. If the volume of data is too large, you can crash the running environment.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Locate the following setting, which is listed in terms of bytes. The default value listed below corresponds to 1 GB of data:
"webapp.sampleLoadLimit": 1073741824,
Save your changes and restart the platform.
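To judge whether the limit matters for a given source, you can estimate how much of a file falls within the load limit. The sketch below is a rough illustration with a hypothetical function name. It assumes that the file's size in bytes corresponds directly to the bytes that are read, which may not hold for compressed or converted sources.

```python
SAMPLE_LOAD_LIMIT = 1073741824  # webapp.sampleLoadLimit default (1 GB)


def sampled_fraction(file_size_bytes, load_limit=SAMPLE_LOAD_LIMIT):
    """Return the fraction of a file that is eligible for random sampling."""
    if file_size_bytes <= 0:
        raise ValueError("file size must be positive")
    return min(file_size_bytes, load_limit) / file_size_bytes


GB = 1024 ** 3
print(sampled_fraction(GB // 2))  # a file smaller than the limit
print(sampled_fraction(4 * GB))   # a file four times the limit
```

For a 4 GB file under these assumptions, only the first quarter of the data can contribute rows to a random sample.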
Wrangling Limits
Maximum split limits
By default, a single split operation can break up a single column into 250 separate columns. As needed, you can change this maximum value.
Note
Increasing this limit consumes more resources and may overload the Trifacta Application or your browser. Adjust with caution.
Tip
This limit also applies to the extraction of keys and values from Objects and Arrays.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Locate the following setting, which represents the maximum number of columns that can be generated by one of the applicable steps:
"feature.delimSplitColumnLimit": 250,
Save your changes and restart the platform.
Relational limits
Miscellaneous limits
Date range limit
By default, the Designer Cloud Powered by Trifacta platform supports the following date range for Datetime data type validation:
January 1, 1400 - December 31, 2599
This date range is validated against the following default regular expression:
((?:1[4-9]|2[0-5])\d{2})
As needed, you can change the above regular expression to define your preferred date range for the Datetime data type. Your regular expression must be in the following format:
(<your_regular_expression>)
For example, the following regular expression allows dates up to December 31, 9999:
((?:1[4-9]|[2-9][0-9])\d{2})
Note
Use of Alteryx patterns in this field is not supported. The entry must be a valid regular expression.
Steps:
You can apply this change through the Admin Settings Page (recommended) or trifacta-conf.json. For more information, see Platform Configuration Methods.
Locate the following parameter:
webapp.yearFourDigitRegex
Insert your regular expression in the required format.
Save your changes and restart the platform.
You should check your new Datetime date range validation against some sample data.
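A candidate expression can be tried against sample years before it is applied. The sketch below tests the default expression shown above using Python's re module. The helper name is illustrative, and matching the whole four-digit year with fullmatch is an assumption, since the platform's regular expression engine and matching mode may differ from Python's.

```python
import re

# Default expression from this section: years 1400 through 2599.
DEFAULT_YEAR_REGEX = r"((?:1[4-9]|2[0-5])\d{2})"


def year_is_valid(year, pattern=DEFAULT_YEAR_REGEX):
    """Return True if the four-digit year matches the whole pattern."""
    return re.fullmatch(pattern, str(year)) is not None


for year in (1399, 1400, 2599, 2600):
    print(year, year_is_valid(year))
```

Passing your own expression as the pattern argument lets you confirm the first and last years that it accepts.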