
AMP Engine Best Practices

The purpose of this document is to answer questions about the new AMP Engine. It is intended for both existing admins and new users.

For more information about AMP Engine, go to the Alteryx AMP Engine and Engine help pages.

For more information about Server system requirements, visit the System Requirements help page.

General Topics

  • For most use cases, the AMP engine provides significant performance and efficiency improvements over the original Engine when it has sufficient system resources. For more information on system resource requirements and recommendations, see the sections How to manage system resources with AMP? and What are the AMP engine system requirements?

  • AMP is designed to work with larger volumes of data at a higher velocity and typically executes workflows faster, with more complete usage of machine resources compared to the original Engine.

  • The original Engine architecture allows for mostly single-threaded processing, where data is processed sequentially, record by record. The AMP architecture, by contrast, allows for massively multi-threaded processing: AMP processes records in packets to improve run times, and tools can run in parallel. AMP also uses more performant algorithms when grouping and sorting records, which can affect the output record order.

  • The AMPlify Your Workflows article describes some of the performance benefits of using the AMP engine:

    • The most commonly used tools will perform best on AMP.

    • The benefit of AMP typically increases as data sizes become larger.

    • Performance varies based on data sizes, underlying hardware, data center and network infrastructure, Alteryx Server configuration, and workflow construction.

Designer has enabled AMP by default for new workflows. For new Server installations or new workers on existing Servers, the default will be to allow both AMP and original Engine execution. Workflow settings determine which engine is used.

Note

When you upgrade to Server version 2022.1, we recommend validating your engine choice settings and resource allocations. The new 'Allow Server to manage workflows running simultaneously' functionality, and the change to enable AMP by default, can result in settings changing in your environment.

If you have an existing Server and want to maintain your current system settings, please read these instructions before upgrading.

1. Controller > General > Enable AMP Engine

  • Before upgrading, note your current settings.

  • After upgrading, restore the selection to your desired value.

2. Worker > General > Allow Server to manage workflows running simultaneously

  • Before upgrading, note the number set for ‘workflows allowed to run simultaneously’.

  • After upgrading, deselect Allow Server to manage workflows running simultaneously.

  • Input the number you saved for ‘workflows allowed to run simultaneously’.

3. Engine > General > Engine

  • Before upgrading, note your current settings.

  • After upgrading, restore the selection to your desired value.

4. Engine > General > Run engine at a lower priority

  • Before upgrading, note your current settings.

  • After upgrading, restore the selection to your desired value.

We recommend you use the new options Allow Server to manage workflows running simultaneously and Allow Server to manage engine resources. We've added logic to keep each engine instance working within the memory and logical CPU constraints defined in the System Settings. Admins must take care not to over-allocate if they set these values manually instead of allowing Server to manage them.

Calculations

When you enable the options to Allow Server to manage workflows running simultaneously and Allow Server to manage engine resources, Server calculates the number of simultaneous jobs, as well as the CPU threads (cores) and amount of memory to allocate per job at service startup. These calculations are based on total available CPU cores and total system memory resources on the host machine. They are also designed to optimize AMP performance for the available hardware based on our benchmarking results. The formulas for those calculations are as follows:

Calculation Formulas

Simultaneous Jobs

number of simultaneous jobs = floor(physical processor count/2)

Memory Limit

memory limit = IF(MongoDB Enabled On Node, ((Total Physical RAM / 2) - 4096) / Number of Simultaneous Jobs, Total Physical RAM / (Number of Simultaneous Jobs + 2))

  • For Server machines that act as both a worker and a controller with the embedded MongoDB, the Memory Limit (MB) is automatically calculated based on this formula:

    (((Total Physical RAM/2) - 4096) / Number of Simultaneous Jobs)

  • For standalone workers, more memory is allocated to run workflows based on this formula:

    (Total Physical RAM / (Number of Simultaneous Jobs +2))

  • If the formulas result in less than 2 GB, set the Memory Limit to the minimum of 2 GB to ensure the engine is able to execute.

Processing Threads

Default Num Processing Threads = [LogicalCores]
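The formulas above can be sketched in Python as follows. This is a minimal illustration, not Server's actual implementation: the function and constant names are hypothetical, RAM is assumed to be expressed in MB (matching the 4096 term and the Memory Limit (MB) field), and the document does not state how Server rounds the result, so the sketch returns an unrounded value.

```python
import math

# Values taken from the formulas in this section.
MONGODB_RESERVE_MB = 4096      # subtracted when the embedded MongoDB runs on the node
MIN_MEMORY_LIMIT_MB = 2048     # 2 GB floor so the engine is able to execute


def simultaneous_jobs(physical_cores: int) -> int:
    """number of simultaneous jobs = floor(physical processor count / 2)."""
    return math.floor(physical_cores / 2)


def memory_limit_mb(total_ram_mb: float, jobs: int, mongodb_on_node: bool) -> float:
    """Memory limit per job in MB, applying the 2 GB minimum."""
    if jobs < 1:
        # Guard added for this sketch only; the source does not cover this case.
        raise ValueError("at least one simultaneous job is required")
    if mongodb_on_node:
        limit = ((total_ram_mb / 2) - MONGODB_RESERVE_MB) / jobs
    else:
        limit = total_ram_mb / (jobs + 2)
    return max(limit, MIN_MEMORY_LIMIT_MB)


def processing_threads(logical_cores: int) -> int:
    """Default Num Processing Threads = [LogicalCores]."""
    return logical_cores


if __name__ == "__main__":
    # Hypothetical node: 8 physical cores, 16 logical cores, 64 GB of RAM.
    jobs = simultaneous_jobs(8)
    print("jobs:", jobs)
    print("limit with MongoDB (MB):", memory_limit_mb(65536, jobs, True))
    print("limit standalone (MB):", memory_limit_mb(65536, jobs, False))
    print("threads:", processing_threads(16))
```

For the hypothetical 8-core, 64 GB node, the sketch yields 4 simultaneous jobs, and the standalone worker formula gives each job more memory than the combined worker and controller formula.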

Recommendations for Manual Value Setting

We recommend you follow these guidelines for optimal performance when setting these values manually:

  • Memory per running workflow: 8 GB per physical core is the recommendation for optimal performance with AMP.

  • CPU per running workflow: 1 simultaneous running workflow per 2 physical cores.

  • Number of physical cores per node: For optimal performance, we recommend 8 physical cores per node and horizontal scaling to additional nodes. Typically, this means 4 simultaneous running workflows per node.

  • Maximum number of AMP engines run in parallel: This is entirely hardware dependent. In theory, you could run 16 AMP or mixed AMP and original Engine jobs at once if you had a worker with 128 logical cores and 160 GB of RAM. At that scale, however, disk I/O and network bandwidth are more likely to be the bottleneck. Both the original Engine and AMP are limited by disk I/O and network bandwidth, depending on the size of the data, where it comes from, and where it is written.

  • Maximum number of AMP engines run while also running the original Engine (E1): Server doesn’t differentiate between AMP and original Engine jobs. Server sends the workflow to Engine, and Engine determines whether to run it via AMP or the original Engine. As such, Server assumes all jobs are AMP if the AMP engine is enabled.

Our formulas to calculate resources already take these recommendations into consideration. For more information, go to Engine.
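When setting values manually, the memory and CPU guidelines above can be turned into a quick sanity check. The following is a minimal sketch under stated assumptions: `WorkerSettings` and `review_manual_settings` are hypothetical names used for illustration only, and the check covers just the two numeric guidelines (8 GB per physical core, 1 simultaneous workflow per 2 physical cores).

```python
from dataclasses import dataclass
from typing import List

# Guidelines quoted from the recommendations above.
GB_RAM_PER_PHYSICAL_CORE = 8
PHYSICAL_CORES_PER_WORKFLOW = 2


@dataclass
class WorkerSettings:
    """Hypothetical description of one worker node and its manual setting."""
    physical_cores: int
    ram_gb: float
    simultaneous_workflows: int


def review_manual_settings(settings: WorkerSettings) -> List[str]:
    """Return a list of findings; an empty list means both guidelines are met."""
    findings = []

    max_workflows = settings.physical_cores // PHYSICAL_CORES_PER_WORKFLOW
    if settings.simultaneous_workflows > max_workflows:
        findings.append(
            f"{settings.simultaneous_workflows} simultaneous workflows exceeds "
            f"the guideline of {max_workflows} for {settings.physical_cores} physical cores"
        )

    recommended_ram_gb = settings.physical_cores * GB_RAM_PER_PHYSICAL_CORE
    if settings.ram_gb < recommended_ram_gb:
        findings.append(
            f"{settings.ram_gb} GB RAM is below the guideline of "
            f"{recommended_ram_gb} GB for {settings.physical_cores} physical cores"
        )

    return findings


if __name__ == "__main__":
    # An 8-core, 64 GB node running 4 workflows matches the guidelines.
    print(review_manual_settings(WorkerSettings(8, 64, 4)))
    # The same core count with less RAM and more workflows is over-allocated.
    for finding in review_manual_settings(WorkerSettings(8, 32, 6)):
        print(finding)
```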

For the latest Server System Requirements, see the Server System Requirements help page. We’ve separated our recommendations into 2 different categories: Minimum Hardware Requirements and Recommended Hardware for Computational Intensive Workloads.

Minimum Hardware Requirements

Server minimum hardware requirements are defined as the minimum hardware needed to run a stable installation of Alteryx Server. If you don't meet the minimum requirements, you risk poor performance and random service shutdown on any node where the engine runs.

The following minimum hardware requirements are recommended for the desired number of concurrent workflows:

Desired # Concurrent Workflows    Minimum Memory (GB RAM)    Minimum Physical Cores
2                                 32                         4
3                                 48                         6
4                                 64                         8
5                                 80                         10
6                                 96                         12
7                                 112                        14
8                                 128                        16
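The minimum requirements above can be encoded as a small lookup for capacity planning. This is an illustrative sketch only: the names `MINIMUM_REQUIREMENTS`, `minimum_requirements`, and `meets_minimum` are hypothetical, and the lookup deliberately covers only the workflow counts listed in the table rather than extrapolating beyond them.

```python
from typing import Dict, Tuple

# concurrent workflows -> (minimum memory in GB RAM, minimum physical cores),
# copied from the minimum hardware requirements listed above.
MINIMUM_REQUIREMENTS: Dict[int, Tuple[int, int]] = {
    2: (32, 4),
    3: (48, 6),
    4: (64, 8),
    5: (80, 10),
    6: (96, 12),
    7: (112, 14),
    8: (128, 16),
}


def minimum_requirements(concurrent_workflows: int) -> Tuple[int, int]:
    """Look up (memory_gb, physical_cores) for a listed workflow count."""
    try:
        return MINIMUM_REQUIREMENTS[concurrent_workflows]
    except KeyError:
        raise ValueError(
            f"no listed minimum for {concurrent_workflows} concurrent workflows"
        ) from None


def meets_minimum(concurrent_workflows: int, ram_gb: float, physical_cores: int) -> bool:
    """True if the node meets both the memory and the core minimum."""
    min_ram_gb, min_cores = minimum_requirements(concurrent_workflows)
    return ram_gb >= min_ram_gb and physical_cores >= min_cores


if __name__ == "__main__":
    print(minimum_requirements(4))
    print(meets_minimum(4, ram_gb=64, physical_cores=8))
    print(meets_minimum(4, ram_gb=48, physical_cores=8))
```

Every listed row works out to 16 GB of RAM and 2 physical cores per concurrent workflow, which agrees with the guideline of 8 GB per physical core.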

  1. Prior to 2022.1, AMP was available on Server, but disabled by default.

  2. For release 2022.1 and beyond, for new Server installations, the controller and worker have the default set to allow both AMP and original Engine execution. For existing Servers, the existing controller and worker settings might change, and new workers might have both AMP and original Engine execution enabled by default. If you wish to avoid this, see our note on how to preserve current settings.

  3. The settings for enabling AMP engine in both the controller and worker nodes exist in the Alteryx System Settings. There are now additional settings for managing hardware allocations for each engine. There is also a recommended setting to Allow Server to Manage Engine Resources.

  4. System Settings for existing Server Installations:

    1. Controller > General > Enable AMP Engine: If you have ever changed this value, no matter what you changed it to, that value persists after the upgrade. If you have never changed this setting and always left it in the default unselected state, the checkbox is now selected, which means AMP is enabled by default.

    2. Worker > General > Allow Server to manage workflows running simultaneously defaults to True for all workers. When this setting is set to True, you are not able to set the number of workflows allowed to run simultaneously.

      1. Workflows allowed to run simultaneously is automatically calculated at service start-up based on the total CPU and memory on this node.

        Worker configuration in Alteryx System Settings.

    3. Engine > General > Engine: If you have ever changed this value, no matter what you changed it to, that value persists after the upgrade. If you never changed this setting and always used the default original Engine option, the new default is Both Engines.

    4. Engine > General > Allow Server to manage engine resources is a new setting that defaults to False.

      Both Engines configuration in Alteryx System Settings.

    5. Engine > General > Memory Limit: The formula used to calculate the default value has changed.

    6. Engine > General > Default number of processing threads: The formula used to calculate the default value has changed.

    7. Engine > General > Run engine at a lower priority: If you have ever changed this value, no matter what you changed it to, that value persists after the upgrade. If you have always used the default value False, the new default after the upgrade is True.

  5. System Settings for new Server Installations:

    1. Controller > General > Enable AMP Engine checkbox defaults to True.

    2. Worker > General > Allow Server to manage workflows running simultaneously defaults to True for all workers.

      1. Workflows allowed to run simultaneously is automatically calculated at service start-up based on the total CPU and memory on this node.

    3. Engine > General > Engine dropdown defaults to Both Engines.

    4. Engine > General > Allow Server to manage engine resources defaults to False.

    5. Engine > General > Memory Limit: The formula used to calculate the default value has changed.

    6. Engine > General > Default number of processing threads: The formula used to calculate the default value has changed.

    7. Engine > General > Run engine at a lower priority defaults to True.

  6. Can users change the settings back (that is, turn off AMP in Server)?

    1. If the admin doesn’t want to use AMP, they will have to turn it off manually. See the image below for the Engine setting in the Controller General Configuration section of System Settings.

      Controller configuration in Alteryx System Settings.

    2. If the admin wants to turn off AMP on some worker nodes, they can do so with the Engine dropdown in the Engine Configuration section of System Settings. In the image below, it’s set to Both Engines, but it can be changed to Original Engine only. Both Engines is the default in a new Server environment.

      Engine configuration in Alteryx System Settings.

  • There is not a separate memory limit. In System Settings, the Engine > General > Memory Limit field applies to the engine wherever it runs, in both Designer and Server.

  • Designer only runs one workflow at a time, so limitations are different and more tolerant.

AMP is enabled automatically, and all of the relevant settings are configured by default. Admins only need to make changes if they want to turn it off. Refer to the answers in What changes were made in Designer and Server?

Ensure that the minimum hardware requirements are met to maintain a stable Server environment.

  • When a workflow is saved in Designer, its Runtime settings record whether Use AMP Engine is selected. Whichever option is saved in Designer is honored when the workflow runs in Server; Server never overrides the workflow’s engine option. Therefore, allowing both AMP and original Engine execution on Server does not cause workflows saved as original Engine to run with the AMP engine. If a workflow is saved with AMP as the engine option and AMP is not enabled in Server, the workflow does NOT fall back to the original Engine; it fails.

  • For a workflow that was previously saved as original Engine to run as AMP, the workflow must be re-saved in Designer with the Use AMP Engine setting selected.

  • Running workflows in AMP can change the order of the resulting output rows because some things are now done in parallel. Keep that in mind and verify if your processes rely on output ordering. If so, there are adjustments you can make to your workflows to ensure the original Engine ordering. For more information, go to Engine Compatibility Mode.

  • You can expect changes in the run time of each workflow.

  • In general, AMP workflow jobs will run significantly faster with the right number of processing cores.

  • In some cases, AMP jobs can take longer than they did as original Engine jobs, especially if the workflows are CPU-intensive and the number of threads per workflow is low.

  • Quality of Service (QoS) will continue to operate the same way as it always has.
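The engine-selection rules described in the first bullets of this list can be summarized as a small decision function. This is a sketch of the behavior as described in this document, not Server's actual code: `Engine` and `resolve_engine` are hypothetical names, and only the cases the text covers are modelled.

```python
from enum import Enum


class Engine(Enum):
    ORIGINAL = "original"
    AMP = "amp"


def resolve_engine(saved_with_amp: bool, amp_enabled_on_server: bool) -> Engine:
    """Decide which engine runs a workflow, following the rules in the text.

    - Server honors the engine option saved with the workflow in Designer.
    - A workflow saved as original Engine never runs with AMP.
    - A workflow saved with AMP fails if AMP is not enabled in Server;
      it does not fall back to the original Engine.
    """
    if not saved_with_amp:
        return Engine.ORIGINAL
    if not amp_enabled_on_server:
        raise RuntimeError(
            "workflow was saved with Use AMP Engine, but AMP is not enabled in Server"
        )
    return Engine.AMP


if __name__ == "__main__":
    print(resolve_engine(saved_with_amp=False, amp_enabled_on_server=True))
    print(resolve_engine(saved_with_amp=True, amp_enabled_on_server=True))
    try:
        resolve_engine(saved_with_amp=True, amp_enabled_on_server=False)
    except RuntimeError as err:
        print("job fails:", err)
```

The function only switches a workflow to AMP when its saved option says so, which is why a workflow saved as original Engine must be re-saved in Designer before it can run with AMP.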

If Server is properly resourced, both AMP and original Engine workflows should run predictably, provided you measure against a new baseline rather than historic original Engine results alone. Run times only become unpredictable if a worker does not have enough hardware resources for its jobs, so that original Engine and AMP jobs compete for resources.

It’s possible to save the workflow to Server with AMP enabled, and then also save a copy of it with AMP disabled. Then run each workflow a few times to see which performs better. Also note that AMP workflows tend to run faster when running concurrently with other AMP workflows.

Quality of Service Concerns

The Engine Configuration Memory Limit applies to the engine, regardless of whether it’s the original Engine or AMP. The difference is in how each engine handles that memory limit:

  • Original Engine will pre-allocate the entire limit.

  • AMP will allocate what it needs up to the memory limit.

AMP’s approach is a more efficient use of resources. The original Engine is multithreaded, but not highly multithreaded; AMP is much more proficient at running work in parallel. The advantage is in total throughput, which is higher with AMP than with the original Engine.
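The difference in how the two engines treat the memory limit can be illustrated with a toy model. This is a simplified sketch for intuition only: `memory_reserved_mb` is a hypothetical function, and real engine memory management is more involved than a single number per job.

```python
def memory_reserved_mb(engine: str, needed_mb: int, limit_mb: int) -> int:
    """Toy model of memory taken by one job under the Engine Memory Limit.

    - "original": pre-allocates the entire limit, whatever the job needs.
    - "amp": allocates what the job needs, up to the limit.
    """
    if engine == "original":
        return limit_mb
    if engine == "amp":
        return min(needed_mb, limit_mb)
    raise ValueError(f"unknown engine: {engine!r}")


if __name__ == "__main__":
    limit_mb = 8192
    for needed_mb in (1024, 8192, 16384):
        print(
            needed_mb,
            memory_reserved_mb("original", needed_mb, limit_mb),
            memory_reserved_mb("amp", needed_mb, limit_mb),
        )
```

In this model a small job holds the full limit under the original Engine but only what it needs under AMP, and neither engine goes past the limit.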

Server is now capable of analyzing your hardware and allocating the appropriate resources per engine. It’s not the same level of resource manager as found on an OS, but with this new feature we’ve implemented the ability for Server to manage your resources.

If the admin enables this option, Server automatically sets the number of jobs allowed to run simultaneously based on total hardware resources. For more information about the worker configuration, go to the Worker help page.

No, resources are allocated per engine job. Each job would have its own resources available to it, which means there shouldn’t be resource contention between jobs.

AMP Articles

AMP Engine Webinar (32 Minutes)

Alteryx AMP Engine

Alteryx Engine and AMP: Main Differences

AMP Memory Use

Tool Use with AMP

Accelerate Your Analytic Processes with the New AMP Engine

AMPlify Your Workflows

The Alteryx AMP Engine: Explained

AMP Engine Technical Deep Dive | Part 1 | Why AMP?

AMP Engine Technical Deep Dive | Part 2 | Key concepts of the AMP Engine