AMP Engine Best Practices

Version: 2022.3
Last modified: December 27, 2022

The purpose of this document is to answer questions about the new AMP Engine. It is intended for both existing admins and new users.

For more information about AMP Engine, go to the Alteryx AMP Engine and Engine help pages. 

General Topics

Why do I want to enable AMP engine?
  • For most use cases, AMP engine provides significant performance and efficiency improvements over the original Engine when provided with sufficient system resources. For more information on system resource requirements and recommendations, see the sections "How to manage system resources with AMP?" and "What are the AMP engine system requirements?"
  • AMP is designed to work with larger volumes of data at a higher velocity and typically executes workflows faster, with more complete usage of machine resources compared to the original Engine. 
  • The original Engine architecture allows for mostly single-threaded processing, where your data are processed sequentially, record by record. The new AMP architecture, by contrast, allows for massively multi-threaded processing. AMP processes records in packets to improve run times, and tools can run in parallel. AMP also uses more performant algorithms when grouping and sorting records, which can affect the output record order.
  • The AMPlify Your Workflows article describes some of the performance benefits of using the AMP engine:
    • The most commonly used tools will perform best on AMP. 
    • The benefit of AMP typically increases as data sizes become larger.
    • Performance varies based on data sizes, underlying hardware, data center and network infrastructure, Alteryx Server configuration, and workflow construction. 

Benchmark Results

We used the following measurements for benchmarking:

  • Workflow throughput = (workflows / runtime). Runtime is the total time to run a workflow. 
  • Engine throughput = (workflows / engine_runtime). Engine_runtime is the time when the engine runs a workflow. 
  • Workflows = number of workflows.

When there is only 1 workflow, there is no significant time difference between the engine throughput and the workflow throughput. If there are 2 workflows in a queue with just 1 worker, the runtime of the first workflow impacts the runtime of the second one.
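As a rough illustration of how the two measurements diverge under queueing, the sketch below assumes 2 identical workflows queued on a single worker. It takes each workflow's runtime to be its queue wait plus its engine runtime and sums the per-workflow times. The 60-second figures and this way of totaling times are assumptions for illustration only, not benchmark data.

```python
def throughput(workflow_count: int, total_seconds: float) -> float:
    """Workflows completed per second of measured time."""
    return workflow_count / total_seconds


# Hypothetical numbers: each workflow needs 60 s of engine time.
engine_runtimes = [60.0, 60.0]
# With 1 worker, the second workflow waits for the first to finish.
queue_waits = [0.0, 60.0]

# Assumption: a workflow's runtime is its queue wait plus its engine runtime.
runtimes = [wait + run for wait, run in zip(queue_waits, engine_runtimes)]

engine_throughput = throughput(len(engine_runtimes), sum(engine_runtimes))
workflow_throughput = throughput(len(runtimes), sum(runtimes))

print(f"engine throughput:   {engine_throughput:.4f} workflows/s")
print(f"workflow throughput: {workflow_throughput:.4f} workflows/s")
```

Under these assumptions the engine throughput is unchanged by the queue, while the workflow throughput drops because the second workflow's runtime includes the time it spent waiting.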

Benchmark Results for a Typical Data Prep Task


[Figure: Benchmark results for a typical data prep task.]

Benchmark Results for a Typical CPU Heavy Predictive or Machine Learning Task


[Figure: Benchmark results for a typical CPU heavy predictive or machine learning task.]

For more information, see documentation links at the bottom of this document. 

What changes were made in Designer and Server?

Designer has enabled AMP by default for new workflows. For new Server installations or new workers on existing Servers, the default will be to allow both AMP and original Engine execution. Workflow settings determine which engine is used. 

What if I want to preserve my current engine system settings?

1. Controller > General > Enable AMP Engine

   1. Before upgrading, note your current settings.

   2. After upgrading, restore the selection to your desired value.

2. Worker > General > Allow Server to manage workflows running simultaneously

   1. Before upgrading, note the number set for ‘workflows allowed to run simultaneously’.

   2. After upgrading, deselect ‘Allow Server to manage workflows running simultaneously’.

   3. Input the number you saved for ‘workflows allowed to run simultaneously’.

3. Engine > General > Engine

   1. Before upgrading, note your current settings.

   2. After upgrading, restore the selection to your desired value.

4. Engine > General > Run engine at a lower priority

   1. Before upgrading, note your current settings.

   2. After upgrading, restore the selection to your desired value.

How to manage system resources with AMP?

We recommend you use the new options Allow Server to manage workflows running simultaneously and Allow Server to manage engine resources. We've added logic to keep each engine instance working within the memory and logical CPU constraints defined in the System Settings. Admins must take care not to over-allocate if they set these values manually instead of allowing Server to manage them. 

Calculations

When you enable the options to Allow Server to manage workflows running simultaneously and Allow Server to manage engine resources, Server calculates the number of simultaneous jobs, as well as the CPU threads (cores) and amount of memory to allocate per job at service startup. These calculations are based on total available logical CPU cores and total system memory resources on the host machine. They are also designed to optimize AMP performance for the available hardware based on our benchmarking results. The formulas for those calculations are as follows:

Simultaneous Jobs
number of simultaneous jobs = min(floor(logical processors / 8), (floor(total memory / 8000) - 2))

Memory Limit
memory limit = floor(total memory / (number of simultaneous jobs + 2))

Processing Threads
threads = floor(total logical processors / number of simultaneous jobs)
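The three formulas above can be sketched in a few lines of Python. This is a minimal sketch, not Server's actual implementation: the function name is hypothetical, total memory is assumed to be expressed in MB (the formulas do not state a unit, but the divisor of 8000 suggests roughly 8GB per job), and the clamp to at least 1 job is an added assumption to avoid dividing by zero on small machines.

```python
def server_managed_allocation(logical_processors: int, total_memory: int) -> dict:
    """Apply the simultaneous jobs, memory limit, and thread formulas.

    total_memory is assumed to be in MB (an assumption, not stated in the formulas).
    """
    # number of simultaneous jobs =
    #   min(floor(logical processors / 8), floor(total memory / 8000) - 2)
    jobs = min(logical_processors // 8, total_memory // 8000 - 2)

    # Assumption: never report fewer than 1 job, so the divisions below stay valid.
    jobs = max(jobs, 1)

    # memory limit = floor(total memory / (number of simultaneous jobs + 2))
    memory_limit = total_memory // (jobs + 2)

    # threads = floor(total logical processors / number of simultaneous jobs)
    threads = logical_processors // jobs

    return {"simultaneous_jobs": jobs, "memory_limit": memory_limit, "threads": threads}


# Example: a worker with 128 logical cores and 160000 MB of memory.
print(server_managed_allocation(128, 160000))
```

For the 128-core, 160000 MB example, the sketch yields 16 simultaneous jobs, a memory limit of 8888 per job, and 8 threads per job.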

Recommendations for Manual Value Setting

We recommend you follow these guidelines for optimal performance when setting these values manually:

  • Memory per running workflow: 8GB per workflow is the minimum recommendation for optimal performance with AMP.
  • CPU per running workflow: 1 simultaneous running workflow per 6-8 logical CPU cores.
  • Maximum number of AMP engines run in parallel: This is entirely hardware dependent. In theory, you could run 16 AMP or mixed AMP and original Engine jobs at once if you had a worker with 128 logical cores and 160GB of RAM. At that scale, however, disk I/O and network bandwidth are more likely to be the bottleneck. Both the original Engine and AMP are performance limited by disk I/O and network bandwidth, depending on the size of the data, where it comes from, and where it is output to.
  • Maximum number of AMP engines run while also running the original Engine: Server doesn’t differentiate between an AMP engine job and an original Engine job. Server just sends the workflow to Engine, and Engine determines whether it needs to run via AMP or the original Engine. As such, Server assumes all jobs are AMP if AMP engine is enabled.

We recommend that you reserve 1 job's worth of memory for the operating system and an additional job's worth of memory to avoid a negative impact if you manually run a workflow from Designer running on the Server. Our formulas to calculate resources already take these recommendations into consideration.
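When setting values manually, a quick check against the guidelines above can help catch over-allocation. The following is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, memory is expressed in GB, and the thresholds (8GB per workflow, at least 6 logical cores per workflow, 2 jobs' worth of memory held in reserve) are taken directly from the recommendations in this section.

```python
def check_manual_settings(logical_cores: int, total_memory_gb: float,
                          simultaneous_workflows: int, memory_limit_gb: float) -> list:
    """Return warnings for manual settings that conflict with the guidelines."""
    warnings = []

    # Guideline: 8GB per workflow is the minimum recommendation for AMP.
    if memory_limit_gb < 8:
        warnings.append("memory limit is below 8GB per workflow")

    # Guideline: 1 simultaneous workflow per 6-8 logical CPU cores.
    if logical_cores / simultaneous_workflows < 6:
        warnings.append("fewer than 6 logical cores per simultaneous workflow")

    # Guideline: reserve 1 job's worth of memory for the operating system and
    # 1 more for a workflow run manually from Designer on the Server.
    reserved_jobs = 2
    if (simultaneous_workflows + reserved_jobs) * memory_limit_gb > total_memory_gb:
        warnings.append("memory is over-allocated once the 2 reserved jobs are counted")

    return warnings


# Example: 128 logical cores, 160GB RAM, 16 workflows at 8GB each.
print(check_manual_settings(128, 160, 16, 8))
# Example: 16 logical cores, 32GB RAM, 4 workflows at 8GB each.
print(check_manual_settings(16, 32, 4, 8))
```

The first example produces no warnings. The second produces 2, because 4 workflows leave only 4 cores each and need 48GB once the reserve is included.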

What are the AMP engine system requirements?

For the latest Server System Requirements, see the Server System Requirements help page. We’ve separated our recommendations into 2 categories: Minimum Hardware Requirements and Recommended Hardware for Optimal Performance.

Minimum Hardware Requirements

Server minimum hardware requirements are defined as the minimum hardware needed to run a stable installation of Alteryx Server. If you don't meet the minimum requirements, you risk poor performance and random service shutdown on any node where the engine runs. 

The following minimum hardware requirements are recommended for the desired number of concurrent workflows:

Minimum hardware requirements for Server

 

The green highlighted line is the minimum recommended configuration. The line showing information for 1 concurrent workflow is helpful for you to understand how much you need to increase resources to add 1 additional job to the existing configuration.

Recommended Hardware for Optimal Performance

Server optimal performance hardware recommendations are defined as the sweet spot in hardware where Server can complete workflows as efficiently as possible when using the AMP engine. This is useful to eliminate congestion on busy systems and optimize AMP engine performance and throughput capabilities. 

We recommend the following hardware settings for optimal performance: 

Recommended Hardware for Optimal Performance of Server

 

*Logical cores are either vCPUs or logical cores within a physical core. The standardization to refer to logical cores is a way to consistently compare both physical on-prem servers and virtual servers in the cloud.

What specific settings are now the default in version 2022.1? What were the default settings before this change?
  1. Prior to 2022.1, AMP was available on Server, but disabled by default.
  2. For release 2022.1 and beyond, for new Server installations, the controller and worker will have the default set to allow both AMP and original Engine execution. For existing Servers, the existing controller and worker settings may change, and new workers may have both AMP and original Engine execution enabled by default. If you wish to avoid this, see our note on how to preserve current settings. 
  3. The settings for enabling AMP engine in both the controller and worker nodes exist in the Alteryx System Settings. There are now additional settings for managing hardware allocations for each engine. There is also a recommended setting, Allow Server to manage engine resources.
  4. System Settings for existing Server Installations: 
    1. Controller > General > Enable AMP Engine: If you have ever changed this value, no matter what value you changed it to, that value persists after the upgrade. If you have never changed this setting and always left the default state as unselected, then the checkbox will now be selected, which means AMP is enabled by default.
    2. Worker > General > Allow Server to manage workflows running simultaneously defaults to True for all workers. When this setting is set to True, you are not able to set the number of workflows allowed to run simultaneously.
      1. The number of workflows allowed to run simultaneously is calculated automatically at service start-up, based on the available CPU and memory in the Server environment.
        Worker configuration in Alteryx System Settings.
    3. Engine > General > Engine: If you have ever changed this value, no matter what value you changed it to, that value persists after the upgrade. If you have never changed this setting and always used the default original Engine option, then the new default is Both Engines.
    4. Engine > General > Allow Server to manage engine resources is a new setting that defaults to False.
      Both Engines configuration in Alteryx System Settings.
    5. Engine > General > Memory Limit: The formula used to calculate the default value has changed.
    6. Engine > General > Default number of processing threads: The formula used to calculate the default value has changed.
    7. Engine > General > Run engine at a lower priority: If you have ever changed this value, no matter what value you changed it to, that value persists after the upgrade. If you have always used the default value False, the new default after the upgrade is True.
  5. System Settings for new Server Installations:
    1. Controller > General > Enable AMP Engine checkbox defaults to True.
    2. Worker > General > Allow Server to manage workflows running simultaneously defaults to True for all workers.
      1. The number of workflows allowed to run simultaneously is calculated automatically at service start-up, based on the available CPU and memory in the Server environment.
    3. Engine > General > Engine dropdown defaults to Both Engines.
    4. Engine > General > Allow Server to manage engine resources defaults to False.
    5. Engine > General > Memory Limit: The formula used to calculate the default value has changed.
    6. Engine > General > Default number of processing threads: The formula used to calculate the default value has changed.
    7. Engine > General > Run engine at a lower priority defaults to True.
  6. Can users change the settings back (that is, turn off AMP in Server)?
    1. If the admin doesn’t want to use AMP, they will have to turn it off manually. See the image below for the Engine setting in the Controller General Configuration section of System Settings.
      Controller configuration in Alteryx System Settings.
    2. If the admin wants to turn off AMP on some worker nodes, that can be done in the Engine Configuration section of System Settings, using the Engine dropdown setting. In the image below, it’s set to Both Engines, but it can be changed to Original Engine only. Both Engines is the default in a new Server environment.
      Engine configuration in Alteryx System Settings.
Is the Designer memory limit separate from the Server memory limit?
  • There is not a separate memory limit. In System Settings, the Engine > General > Memory Limit field applies to the engine wherever it runs, including both Designer and Server.
  • Designer only runs one workflow at a time, so limitations are different and more tolerant.
How should a new system be set up based on these updates?

A new system automatically has AMP enabled and all of the relevant settings configured by default. You only need to make changes if you want to turn AMP off. Refer to the answers in "What changes were made in Designer and Server?"

Make sure to follow minimum hardware requirement guidelines for optimum performance and stability.

What is the implication of enabling AMP on Server for existing workflows?

Will turning on AMP in Server cause any of my existing workflows to fail? 

  • No, they will run in the exact same way as they did before.

Will allowing both AMP and original Engine execution on Server change the way my existing workflows run? 

  • When a workflow is saved in Designer, its Runtime settings record whether Use AMP Engine is selected. Whatever option is saved in Designer is honored when the workflow runs in Server. Server never overrides the workflow’s engine option. Therefore, allowing both AMP and original Engine execution on Server will not cause any workflows that were saved as original Engine to run with the AMP engine. If a workflow is saved with AMP as the engine option and AMP is not enabled in Server, the workflow will NOT fall back to the original Engine. It will fail.
  • For a workflow that was previously saved as original Engine to run as AMP, the workflow must be re-saved in Designer with the Use AMP Engine setting selected. 
  • Running workflows in AMP can change the order of the resulting output rows because some things are now done in parallel. Keep that in mind and verify if your processes rely on output ordering. If so, there are adjustments you can make to your workflows to ensure the original Engine ordering.
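The engine selection rules described above can be summarized as a small decision function. This is only an illustrative sketch of the behavior described in this section, not Server code: the function name and return values are hypothetical, and it assumes the original Engine is always allowed on the worker.

```python
def resolve_engine(workflow_uses_amp: bool, server_amp_enabled: bool) -> str:
    """Return which engine runs a workflow, following the rules in this section.

    Assumption: the original Engine is always available on the worker.
    """
    # Server never overrides the workflow's saved engine option, so a workflow
    # saved as original Engine always runs on the original Engine.
    if not workflow_uses_amp:
        return "original"

    # A workflow saved with Use AMP Engine runs on AMP when Server allows it.
    if server_amp_enabled:
        return "amp"

    # There is no fallback to the original Engine: the workflow fails.
    raise RuntimeError(
        "workflow was saved with Use AMP Engine, but AMP is not enabled on Server"
    )


print(resolve_engine(workflow_uses_amp=False, server_amp_enabled=True))
print(resolve_engine(workflow_uses_amp=True, server_amp_enabled=True))
```

The only failing combination is a workflow saved with Use AMP Engine on a Server where AMP is not enabled.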
If I create new workflows, save them with Use AMP Engine selected, and turn on AMP in Server, how will quality of service (QoS) be affected when the new AMP workflows run alongside my existing original Engine workflows?
  • You can expect changes in the run time of each workflow.
  • In general, AMP workflow jobs will run significantly faster with the right number of processing cores.
  • In some cases, AMP jobs can take longer than they did as original Engine jobs, especially if the workflows are CPU-intensive and the number of threads per workflow is low.
  • While your workflows will generally finish faster, a non-AMP workflow might take slightly longer to run, since a competing AMP workflow takes some resources away from it. This doesn’t change how Quality of Service (QoS) works, and some workflows might run faster. QoS continues to operate the same way it always has.
Customers want some workflow run times to be consistent because they need workflows to finish running by a certain time.

When AMP is enabled, will customers be able to do this?

  • Time to completion should still be consistently better with a few exceptions where timing takes a little longer.

If turning on AMP loses the guarantee of consistent run times, do we then recommend users with this business case to turn AMP off?

  • If the worker is properly resourced, both AMP and original Engine workflows should be predictable (measured against a new baseline, rather than against historic original Engine performance results only). Run times become unpredictable only if a worker’s hardware resources are under-allocated, so that the original Engine and AMP compete for resources.
How can I compare an original Engine workflow to an AMP workflow so I can ensure the results are the same?

It’s possible to save the workflow to Server with AMP enabled, and then also save a copy of it with AMP disabled. Then run each workflow a few times to see which performs better. Also note that AMP workflows tend to run faster when running concurrently with other AMP workflows.
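Because AMP can change the order of output rows, a row-by-row comparison of the two outputs may report differences even when the data is the same. One way to check for equal results is an order-insensitive comparison, sketched below. It assumes both workflow copies write their output to CSV files. The function name is hypothetical, and values are compared as exact text, so differences in number formatting would count as mismatches.

```python
import csv
from collections import Counter


def same_rows_ignoring_order(path_a: str, path_b: str) -> bool:
    """Return True if two CSV files have the same header and the same rows,
    counting duplicates, regardless of row order."""

    def load(path: str):
        with open(path, newline="") as handle:
            reader = csv.reader(handle)
            header = next(reader, None)  # first line is treated as the header
            # Counter keeps duplicate rows distinct from single occurrences.
            return header, Counter(tuple(row) for row in reader)

    return load(path_a) == load(path_b)
```

For example, with hypothetical output files named original_engine_output.csv and amp_output.csv, calling same_rows_ignoring_order("original_engine_output.csv", "amp_output.csv") returns True when the 2 runs produced the same records in any order.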

Quality of Service Concerns

Alteryx’s original Engine uses memory differently than AMP does. How does memory management work with the 2 engines together?

The Engine Configuration Memory Limit applies to the engine, regardless of whether it’s the original Engine or AMP. The difference is in how each engine handles that memory limit:

  • Original Engine will pre-allocate the entire limit.
  • AMP will allocate what it needs up to the memory limit. 
Some customers have set up multiple workers that run in parallel to maximize thread and core usage. If they already have more than sufficient resources, what is the appeal of turning on AMP?

It’s a more efficient use of resources. Original Engine is multithreaded, but not highly multithreaded. AMP is much more proficient at running jobs in serial. The advantages are in total throughput, which is higher with AMP than it is when you use the original Engine.

Does Alteryx have a resource manager?

Server is now capable of analyzing your hardware and allocating the appropriate resources per engine. It’s not the same level of resource manager as found on an OS, but with this new feature we’ve implemented the ability for Server to manage your resources.

How are workers increased or decreased based on capacity?

We auto-allocate the number of jobs allowed to run simultaneously based on available hardware resources, if the admin configures Server to do so. For more information about the worker configuration, go to the Worker help page.

Will this cause resource contention and processor swapping? If yes, does that negatively impact performance?

No, resources are allocated per engine job. Each job would have its own resources available to it, which means there shouldn’t be resource contention between jobs.
