AMP Engine Best Practices

The purpose of this document is to answer questions about the new AMP Engine. It is intended for both existing admins and new users.

For more information about AMP Engine, go to the Alteryx AMP Engine and Engine help pages.

For more information about Server system requirements, visit the System Requirements help page.

General Topics

Why do I want to enable AMP engine?

For most use cases AMP engine provides significant performance and efficiency improvements over the original Engine when provided with sufficient system resources. For more information on system resource requirements and recommendations, see the sections How to manage system resources with AMP? and What are the AMP engine system requirements?.
AMP is designed to work with larger volumes of data at a higher velocity and typically executes workflows faster, with more complete usage of machine resources compared to the original Engine.
The original Engine architecture allows for mostly single-threaded processing, where your data is processed sequentially, record by record. In contrast, AMP allows for massively multi-threaded processing: it processes records in packets to improve run times, and tools can run in parallel. AMP also uses more performant algorithms when grouping and sorting records, which can affect the output record order.
The AMPlify Your Workflows article describes some of the performance benefits of using the AMP engine:
- The most commonly used tools will perform best on AMP.
- The benefit of AMP typically increases as data sizes become larger.
- Performance varies based on data sizes, underlying hardware, data center and network infrastructure, Alteryx Server configuration, and workflow construction.

What changes were made in Designer and Server?

Designer has enabled AMP by default for new workflows. For new Server installations or new workers on existing Servers, the default will be to allow both AMP and original Engine execution. Workflow settings determine which engine is used.

What if I want to preserve my current engine system settings?

Note

When you upgrade to Server version 2022.1, we recommend validating your engine choice settings and resource allocations. The new 'Allow Server to manage workflows running simultaneously' functionality, and the change to enable AMP by default, can result in settings changing in your environment.

If you have an existing Server and want to maintain your current system settings, please read these instructions before upgrading.

1. Controller > General > Enable AMP Engine

Before upgrading, note your current settings.
After upgrading, restore the selection to your desired value.

2. Worker > General > Allow Server to manage workflows running simultaneously

Before upgrading, note the number set for ‘workflows allowed to run simultaneously’.
After upgrading, deselect Allow Server to manage workflows running simultaneously.
Input the number you saved for ‘workflows allowed to run simultaneously’.

3. Engine > General > Engine

Before upgrading, note your current settings.
After upgrading, restore the selection to your desired value.

4. Engine > General > Run engine at a lower priority

Before upgrading, note your current settings.
After upgrading, restore the selection to your desired value.

How to manage system resources with AMP?

We recommend you use the new options Allow Server to manage workflows running simultaneously and Allow Server to manage engine resources. We've added logic to keep each engine instance working within the memory and logical CPU constraints defined in the System Settings. Admins must take care not to over-allocate if they set these values manually instead of allowing Server to manage them.

Calculations

When you enable the options to Allow Server to manage workflows running simultaneously and Allow Server to manage engine resources, Server calculates the number of simultaneous jobs, as well as the CPU threads (cores) and amount of memory to allocate per job at service startup. These calculations are based on total available CPU cores and total system memory resources on the host machine. They are also designed to optimize AMP performance for the available hardware based on our benchmarking results. The formulas for those calculations are as follows:

Calculation Formulas

Simultaneous Jobs

number of simultaneous jobs = floor(physical processor count/2)

Memory Limit

memory limit = IF(MongoDB Enabled On Node, (((Total Physical RAM/2) - 4096) / Number of Simultaneous Jobs) , (Total Physical RAM / (Number of Simultaneous Jobs +2)) )

For Server machines that act as both a worker and a controller with the embedded MongoDB, the Memory Limit (MB) is automatically calculated based on this formula:
(((Total Physical RAM/2) - 4096) / Number of Simultaneous Jobs)
For standalone workers, more memory is allocated to run workflows based on this formula:
(Total Physical RAM / (Number of Simultaneous Jobs +2))
If the formulas result in less than 2 GB, set the Memory Limit to the minimum of 2 GB to ensure the engine is able to execute.

Processing Threads

Default Num Processing Threads = [LogicalCores]
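
The three formulas above can be sketched in Python as follows. This is an illustrative sketch only, not Server's actual implementation: the function and parameter names are hypothetical, RAM is assumed to be expressed in MB (as the 4096 constant suggests), the 2 GB floor is applied as 2048 MB, and the guard against fewer than one job is an added assumption.

```python
import math

MIN_MEMORY_LIMIT_MB = 2048  # 2 GB floor described in the text (assumed to be in MB)


def simultaneous_jobs(physical_cores: int) -> int:
    """number of simultaneous jobs = floor(physical processor count / 2)."""
    return math.floor(physical_cores / 2)


def memory_limit_mb(total_ram_mb: float, jobs: int, mongodb_on_node: bool) -> float:
    """Per-job memory limit in MB, following the two documented formulas."""
    if jobs < 1:
        # Assumption: the formulas are only meaningful for at least one job.
        raise ValueError("at least one simultaneous job is required")
    if mongodb_on_node:
        # Worker + controller with embedded MongoDB on the same machine.
        limit = ((total_ram_mb / 2) - 4096) / jobs
    else:
        # Standalone worker: more memory is available to workflows.
        limit = total_ram_mb / (jobs + 2)
    # Never go below the 2 GB minimum the engine needs to execute.
    return max(limit, MIN_MEMORY_LIMIT_MB)


def default_processing_threads(logical_cores: int) -> int:
    """Default Num Processing Threads = [LogicalCores]."""
    return logical_cores
```

For example, a machine with 8 physical cores and 65536 MB of RAM yields 4 simultaneous jobs; with the embedded MongoDB on the node, the per-job memory limit works out to 7168 MB.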

Recommendations for Manual Value Setting

We recommend you follow these guidelines for optimal performance when setting these values manually:

Memory per running workflow: 8 GB per physical core is the recommendation for optimal performance with AMP.
CPU per running workflow: 1 simultaneous running workflow per 2 physical cores.
Number of physical cores per node: For optimal performance, we recommend 8 physical cores per node and horizontal scaling to additional nodes. Typically, this means 4 simultaneous running workflows per node.
Maximum number of AMP engines run in parallel: This depends entirely on hardware. In theory, you could run 16 AMP jobs, or a mix of AMP and original Engine jobs, at once on a worker with 128 logical cores and 160 GB of RAM. At that point, however, disk I/O and network bandwidth are more likely to be the bottleneck. Both the original Engine and AMP are limited by disk I/O and network bandwidth, depending on the size of the data, where it comes from, and where it is written.
Maximum number of AMP engines run while also running the original Engine (E1): Server doesn’t differentiate between AMP and original Engine jobs. Server sends the workflow to Engine, and Engine determines whether to run it with AMP or the original Engine. As such, Server assumes all jobs are AMP if the AMP engine is enabled.

Our formulas to calculate resources already take these recommendations into consideration. For more information, go to Engine.
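
If you set these values manually, a quick sanity check against the guidelines above can help avoid over-allocation. The following Python sketch is one possible way to do that. The class, function, and parameter names are hypothetical, and the checks only encode the guidelines stated in this section (1 workflow per 2 physical cores, 8 GB per physical core) plus the assumption that the combined per-job memory limits should not exceed total RAM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkerHardware:
    physical_cores: int
    total_ram_gb: float


def check_manual_settings(hw: WorkerHardware,
                          simultaneous_workflows: int,
                          memory_limit_gb: float) -> List[str]:
    """Return a list of warnings; an empty list means no guideline is violated."""
    warnings = []

    # Guideline: 1 simultaneous running workflow per 2 physical cores.
    max_by_cpu = hw.physical_cores // 2
    if simultaneous_workflows > max_by_cpu:
        warnings.append(
            f"{simultaneous_workflows} workflows exceeds {max_by_cpu} "
            f"(1 per 2 physical cores)")

    # Assumption: all jobs together should not be allowed more memory than exists.
    if simultaneous_workflows * memory_limit_gb > hw.total_ram_gb:
        warnings.append("combined memory limits exceed total RAM")

    # Guideline: 8 GB of RAM per physical core.
    if hw.total_ram_gb < 8 * hw.physical_cores:
        warnings.append("less than 8 GB of RAM per physical core")

    return warnings
```

With 8 physical cores and 64 GB of RAM, 4 workflows at 16 GB each produce no warnings, while 6 workflows at 16 GB each trigger both the CPU and the memory warning.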

What are the AMP engine system requirements?

For the latest Server System Requirements, see the Server System Requirements help page. We’ve separated our recommendations into 2 different categories: Minimum Hardware Requirements and Recommended Hardware for Computational Intensive Workloads.

Minimum Hardware Requirements

Server minimum hardware requirements are defined as the minimum hardware needed to run a stable installation of Alteryx Server. If you don't meet the minimum requirements, you risk poor performance and random service shutdown on any node where the engine runs.

The following minimum hardware requirements are recommended for the desired number of concurrent workflows:

Minimum System Requirements
Desired # Concurrent Workflows	Memory (GB RAM)	Physical Cores
2	32	4
3	48	6
4	64	8
5	80	10
6	96	12
7	112	14
8	128	16
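
Every row in the table above follows the same pattern: 16 GB of RAM and 2 physical cores per concurrent workflow. A minimal Python sketch of that lookup is shown below. The function name is hypothetical, and the sketch deliberately rejects values outside the 2 to 8 range because the table does not cover them.

```python
def minimum_hardware(concurrent_workflows: int) -> dict:
    """Minimum RAM and physical cores for the desired number of concurrent workflows."""
    if not 2 <= concurrent_workflows <= 8:
        # The table only lists 2 through 8; do not extrapolate beyond it.
        raise ValueError("the table covers 2 to 8 concurrent workflows")
    return {
        "memory_gb": 16 * concurrent_workflows,      # 16 GB per workflow
        "physical_cores": 2 * concurrent_workflows,  # 2 physical cores per workflow
    }
```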

Recommended Hardware for Computational Intensive Workloads

Server hardware recommendations for computationally intensive workloads are defined as the ideal specifications where Server can execute demanding workflows as efficiently as possible. This is essential for reducing congestion on busy systems.

The following hardware specifications are recommended for computational intensive workloads:

Computational Intensive Workload Recommendations
Desired # Concurrent Workflows	Memory (GB RAM)	Physical Cores	Logical Cores*
2	64	8	16
3	96	12	24
4	128	16	32
5	160	20	40
6	192	24	48
7	224	28	56
8	256	32	64

*Logical cores are either vCPUs or logical cores within a physical core. Referring to logical cores is a way of comparing consistently across both physical on-prem servers and virtual servers in the cloud. The table assumes a 2:1 ratio of logical cores (vCPUs) to physical cores, as with Intel hyperthreading or AMD SMT.
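
The recommended table works out to 32 GB of RAM and 8 logical cores (4 physical cores at a 2:1 ratio) per concurrent workflow. Read in reverse, that gives a rough estimate of how many concurrent workflows a given machine can support at the recommended level. The Python sketch below illustrates this. The function name is hypothetical, and results below 2 or above 8 fall outside the range the table lists, so treat them as extrapolation.

```python
def workflows_at_recommended_spec(memory_gb: int, logical_cores: int) -> int:
    """Concurrent workflows supported at 32 GB RAM and 8 logical cores each."""
    by_memory = memory_gb // 32      # 32 GB of RAM per workflow
    by_cores = logical_cores // 8    # 8 logical cores per workflow
    # The scarcer resource determines the result.
    return min(by_memory, by_cores)
```

For instance, a machine with 256 GB of RAM but only 24 logical cores is limited by its cores to 3 concurrent workflows at the recommended level.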

What specific settings are now the default in version 2022.1? What were the default settings before this change?

Prior to 2022.1, AMP was available on Server, but disabled by default.
For release 2022.1 and beyond, for new Server installations, the controller and worker have the default set to allow both AMP and original Engine execution. For existing Servers, the existing controller and worker settings might change, and new workers might have both AMP and original Engine execution enabled by default. If you wish to avoid this, see our note on how to preserve current settings.
The settings for enabling AMP engine in both the controller and worker nodes exist in the Alteryx System Settings. There are now additional settings for managing hardware allocations for each engine. There is also a recommended setting to Allow Server to Manage Engine Resources.
System Settings for existing Server Installations:
1. Controller > General > Enable AMP Engine: If you have ever changed this value, no matter what value you changed it to, that value persists after the upgrade. If you have never changed this setting and always left the default state as unselected, then the checkbox will now be selected, which means AMP is enabled by default.
2. Worker > General > Allow Server to manage workflows running simultaneously: Defaults to True for all workers. When this setting is set to True, you cannot set the number of workflows allowed to run simultaneously.
  1. Workflows allowed to run simultaneously is automatically calculated at service start-up based on the total CPU and memory on this node.
3. Engine > General > Engine: If you have ever changed this value, no matter what value you changed it to, that value persists after the upgrade. If you never changed this setting and always used the default original Engine option, then the new default is set to Both Engines.
4. Engine > General > Allow Server to manage engine resources is a new setting that defaults to False.
5. Engine > General > Memory Limit formula to calculate the default value has changed.
6. Engine > General > Default number of processing threads formula to calculate the default value has changed.
7. Engine > General > Run engine at a lower priority: If you have ever changed this value, no matter what value you changed it to, that value will be persistent after the upgrade. If you have always used the default value False, after the upgrade the new default will be set to True.
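
For items 1, 3, and 7 in the list above, the upgrade follows the same rule: a value you changed at any point persists, while a never-changed value picks up the new default. A minimal Python sketch of that rule is shown below. The dictionary keys are labels chosen for illustration only (not real configuration keys), and the function name is hypothetical.

```python
# New defaults applied by the 2022.1 upgrade to settings that were never changed.
# Keys are illustrative labels, not actual configuration keys.
NEW_DEFAULTS = {
    "Controller > General > Enable AMP Engine": True,
    "Engine > General > Engine": "Both Engines",
    "Engine > General > Run engine at a lower priority": True,
}


def value_after_upgrade(setting: str, current_value, ever_changed: bool):
    """Return the value a setting has after upgrading an existing Server."""
    if ever_changed:
        # Any explicit change persists, whatever value was chosen.
        return current_value
    # Untouched settings move to the new default.
    return NEW_DEFAULTS[setting]
```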
System Settings for new Server Installations:
1. Controller > General > Enable AMP Engine checkbox defaults to True.
2. Worker > General > Allow Server to manage workflows running simultaneously defaults to True for all workers.
  1. Workflows allowed to run simultaneously is automatically calculated at service start-up based on the total CPU and memory on this node.
3. Engine > General > Engine dropdown defaults to Both Engines.
4. Engine > General > Allow Server to manage engine resources defaults to False.
5. Engine > General > Memory Limit formula to calculate the default value has changed.
6. Engine > General > Default number of processing threads formula to calculate the default value has changed.
7. Engine > General > Run engine at a lower priority defaults to True.

Can users change the settings back (that is, turn off AMP in Server)?

1. If the admin doesn’t want to use AMP, they must turn it off manually with the Enable AMP Engine setting in the Controller > General section of System Settings.
2. If the admin wants to turn off AMP on some worker nodes, they can do so with the Engine dropdown in the Engine > General section of System Settings. Both Engines is the default in a new Server environment; change it to Original Engine only to turn off AMP on that node.

Is there a separate Designer memory limit than in Server?

There is not a separate memory limit. In System Settings, the Engine > General > Memory Limit field applies to the engine wherever it runs, including both Designer and Server.
Designer only runs one workflow at a time, so limitations are different and more tolerant.

How should a new system be set up based on these updates?

A new system has AMP enabled and all of the relevant settings configured by default. Admins only need to make changes if they want to turn AMP off. Refer to the answers in What changes were made in Designer and Server?

Ensure that the minimum hardware requirements are met to maintain a stable Server environment.

What is the implication of enabling AMP on Server for existing workflows? Will turning on AMP in Server cause any of my existing workflows to fail?

No, they will run in the exact same way as they did before.

Will allowing both AMP and original Engine execution on Server change the way my existing workflows run?

When a workflow is saved in Designer, its Runtime settings record whether Use AMP Engine is selected. Server honors whichever option was saved in Designer and never overrides the workflow’s engine option. Therefore, allowing both AMP and original Engine execution on Server will not cause workflows that were saved as original Engine to run with the AMP engine. If a workflow is saved with AMP as the engine option and AMP is not enabled in Server, the workflow will NOT fall back to the original Engine; it will fail.
For a workflow that was previously saved as original Engine to run as AMP, the workflow must be re-saved in Designer with the Use AMP Engine setting selected.
Running workflows in AMP can change the order of the resulting output rows because some things are now done in parallel. Keep that in mind and verify if your processes rely on output ordering. If so, there are adjustments you can make to your workflows to ensure the original Engine ordering. For more information, go to Engine Compatibility Mode.
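
The engine-selection rules described in this answer can be sketched as a small decision function. This Python sketch is illustrative only: the function and parameter names are hypothetical, and it assumes the original Engine is always available on Server, since this document does not describe a case where it is disabled.

```python
def resolve_engine(workflow_uses_amp: bool, server_allows_amp: bool) -> str:
    """Return the engine a workflow runs on, or raise if it cannot run."""
    if workflow_uses_amp:
        if not server_allows_amp:
            # An AMP workflow does not fall back to the original Engine.
            raise RuntimeError(
                "workflow was saved with Use AMP Engine, but AMP is not enabled on Server")
        return "AMP"
    # Server never overrides the saved option, so original stays original.
    return "Original Engine"
```

Note that enabling AMP on Server only changes the outcome for workflows saved with Use AMP Engine selected; workflows saved as original Engine resolve the same way either way.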

If I create new workflows and save them with the Use AMP Engine selected and turn on AMP in Server, when the new AMP workflows run alongside my existing original Engine workflows, how will it impact quality of service (QoS)?

You can expect changes in the run time of each workflow.
In general, AMP workflow jobs will run significantly faster with the right number of processing cores.
In some cases, AMP jobs can take longer than they did as original Engine jobs, especially if the workflows are CPU-intensive and the number of threads per workflow is low.
Quality of Service (QoS) will continue to operate the same way as it always has.

Customers want some workflow run times to be consistent because they need workflows to finish running by a certain time. When AMP is enabled, will customers be able to do this?

If the worker is properly resourced, run times for both AMP and original Engine workflows should be predictable, provided you establish a new baseline rather than relying only on historic original Engine performance results. Run times become unpredictable only when a worker’s hardware resources are under-allocated and original Engine and AMP jobs compete for resources.

How can I compare an original Engine workflow to an AMP workflow so I can ensure the results are the same?

It’s possible to save the workflow to Server with AMP enabled, and then also save a copy of it with AMP disabled. Then run each workflow a few times to see which performs better. Also note that AMP workflows tend to run faster when running concurrently with other AMP workflows.

Quality of Service Concerns

Alteryx’s original Engine uses memory differently than the AMP does. How does memory management work with the 2 engines together?

The Engine Configuration Memory Limit applies to the engine, regardless of whether it is the original Engine or AMP. The difference is in how each engine handles that memory limit:

Original Engine will pre-allocate the entire limit.
AMP will allocate what it needs up to the memory limit.
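
The difference between the two behaviors can be illustrated with a simplified model. The Python sketch below is not how either engine is implemented: the function name and the engine labels are hypothetical, and it does not model what happens when a job needs more memory than the limit, which this document does not describe.

```python
def allocated_memory_mb(engine: str, needed_mb: int, memory_limit_mb: int) -> int:
    """Memory a single job holds under the shared Memory Limit (simplified model)."""
    if engine == "original":
        # Original Engine pre-allocates the entire limit, whatever the job needs.
        return memory_limit_mb
    if engine == "amp":
        # AMP allocates what it needs, capped at the limit.
        return min(needed_mb, memory_limit_mb)
    raise ValueError(f"unknown engine: {engine}")
```

Under this model, a job that needs 500 MB with an 8192 MB limit holds 8192 MB on the original Engine but only 500 MB on AMP.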

For customers that have set up multiple workers that run in parallel to maximize thread and core usage, and that already have more than sufficient resources, what is the appeal of turning on AMP?

It’s a more efficient use of resources. Original Engine is multithreaded, but not highly multithreaded. AMP is much more proficient at running jobs in serial. The advantages are in total throughput, which is higher with AMP than it is when you use the original Engine.

Does Alteryx have a resource manager?

Server is now capable of analyzing your hardware and allocating the appropriate resources per engine. It’s not the same level of resource manager as found on an OS, but with this new feature we’ve implemented the ability for Server to manage your resources.

How are workers increased or decreased based on capacity?

We auto-allocate the number of jobs allowed to run simultaneously based on total hardware resources, if the admin configures Server to do so. For more information about worker configuration, go to the Worker help page.

Will this cause resource contention and processor swapping? If yes, does that negatively impact performance?

No, resources are allocated per engine job. Each job would have its own resources available to it, which means there shouldn’t be resource contention between jobs.