AMP Engine Best Practices

The purpose of this document is to answer questions about the new AMP Engine. It is intended for both existing admins and new users.

For more information about AMP Engine, go to the Alteryx AMP Engine and Engine help pages.

For more information about Server system requirements, visit the System Requirements help page.

General Topics

  • For most use cases, the AMP Engine provides significant performance and efficiency improvements over the original Engine when it is given sufficient system resources. For more information on system resource requirements and recommendations, see the sections How to manage system resources with AMP? and What are the AMP engine system requirements?.

  • AMP is designed to work with larger volumes of data at a higher velocity and typically executes workflows faster, with more complete usage of machine resources compared to the original Engine.

  • The original Engine architecture is mostly single-threaded: records are processed sequentially, one at a time. AMP, by contrast, is massively multi-threaded. It processes records in packets to improve run times, and tools can run in parallel. AMP also uses more performant algorithms for grouping and sorting records, which can affect the output record order.

  • The AMPlify Your Workflows article describes some of the performance benefits of using the AMP engine:

    • The most commonly used tools will perform best on AMP.

    • The benefit of AMP typically increases as data sizes become larger.

    • Performance varies based on data sizes, underlying hardware, data center and network infrastructure, Alteryx Server configuration, and workflow construction.
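The contrast between record-by-record and packet-based processing can be illustrated with a minimal Python sketch. This is a conceptual illustration only, not how AMP is implemented: the function names, the packet size, the worker count, and the doubling operation are all hypothetical. It shows records being split into packets, the packets being processed in parallel, and why the output order is not guaranteed to match the input order.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def make_packets(records, packet_size):
    """Split a list of records into consecutive packets of at most packet_size."""
    return [records[i:i + packet_size] for i in range(0, len(records), packet_size)]


def process_packet(packet):
    """Hypothetical per-record work: double every value in the packet."""
    return [value * 2 for value in packet]


def run_sequential(records):
    """Record-by-record processing; output order always matches input order."""
    return [value * 2 for value in records]


def run_in_packets(records, packet_size=4, workers=4):
    """Process packets in parallel and collect results as packets finish."""
    packets = make_packets(records, packet_size)
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process_packet, packet) for packet in packets]
        # Packets are collected in completion order, which may differ from
        # submission order, so the record order is not guaranteed.
        for future in as_completed(futures):
            results.extend(future.result())
    return results


records = list(range(10))
sequential = run_sequential(records)
parallel = run_in_packets(records)
# Same records either way; only the order may differ.
print(sorted(parallel) == sorted(sequential))
```

If a downstream step depends on record order, the safe pattern in a sketch like this is to sort explicitly rather than rely on the order in which packets finish.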

Benchmark Results

Note

We used the following measurements for benchmarking:

  • Workflow throughput = (workflows / runtime). Runtime is the total time it takes to run a workflow.

  • Engine throughput = (workflows / engine_runtime). Engine_runtime is the time the engine spends running a workflow.

  • Workflows = number of workflows.

When there is only 1 workflow, there is no significant time difference between the engine throughput and the workflow throughput. If there are 2 workflows in a queue with just 1 worker, the runtime of the first workflow impacts the runtime of the second one.
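The two measurements above can be written as a small Python sketch. The function name and the sample figures are hypothetical and are not benchmark data; the sketch assumes both times are totals measured in seconds for the same set of workflows.

```python
def throughput(workflows: int, elapsed_seconds: float) -> float:
    """Return workflows completed per second for a given elapsed time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return workflows / elapsed_seconds


# Hypothetical figures for illustration only.
workflows = 10
runtime = 250.0         # total time to run the workflows, in seconds
engine_runtime = 200.0  # time the engine spent running them, in seconds

workflow_throughput = throughput(workflows, runtime)
engine_throughput = throughput(workflows, engine_runtime)
print(workflow_throughput, engine_throughput)
```

Because engine_runtime covers only the time the engine is running, engine throughput in this sketch is never lower than workflow throughput for the same set of workflows.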

Benchmark Results for a Typical Data Prep Task

Benchmark results for a typical data prep task.

Benchmark Results for a Typical CPU Heavy Predictive or Machine Learning Task

Benchmark results for a typical CPU heavy predictive or machine learning task.

For more information, see documentation links at the bottom of this document.

Designer has enabled AMP by default for new workflows. For new Server installations or new workers on existing Servers, the default will be to allow both AMP and original Engine execution. Workflow settings determine which engine is used.

1. Controller > General > Enable AMP Engine

    1. Before upgrading, note your current settings.

    2. After upgrading, restore the selection to your desired value.

2. Worker > General > Allow Server to manage workflows running simultaneously

    1. Before upgrading, note the number set for ‘workflows allowed to run simultaneously’.

    2. After upgrading, deselect ‘Allow Server to manage workflows running simultaneously’.

    3. Input the number you saved for ‘workflows allowed to run simultaneously’.

3. Engine > General > Engine

    1. Before upgrading, note your current settings.

    2. After upgrading, restore the selection to your desired value.

4. Engine > General > Run engine at a lower priority

    1. Before upgrading, note your current settings.

    2. After upgrading, restore the selection to your desired value.

We recommend you use the new options Allow Server to manage workflows running simultaneously and Allow Server to manage engine resources. We've added logic to keep each engine instance working within the memory and logical CPU constraints defined in the System Settings. Admins must take care not to over-allocate if they set these values manually instead of allowing Server to manage them.

Calculations

When you enable the options to Allow Server to manage workflows running simultaneously and Allow Server to manage engine resources, Server calculates the number of simultaneous jobs, as well as the CPU threads (cores) and amount of memory to allocate per job at service startup. These calculations are based on total available logical CPU cores and total system memory resources on the host machine. They are also designed to optimize AMP performance for the available hardware based on our benchmarking results. The formulas for those calculations are as follows:

Note

Simultaneous Jobs

number of simultaneous jobs = min(floor(logical processors / 8), (floor(total memory / 8000) - 2))

Memory Limit

memory limit = floor(total memory / (number of simultaneous jobs +2))

Processing Threads

threads = floor(total logical processors / number of simultaneous jobs)
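The three formulas above can be sketched in Python as follows. This is a minimal illustration, not Server's actual implementation. The function and field names are hypothetical. The divisor of 8000 suggests that total memory is expressed in megabytes, but the source does not state the unit, so treat that as an assumption. The source also does not say what happens when the formula yields fewer than 1 job, so the sketch simply raises an error in that case.

```python
from typing import NamedTuple


class Allocation(NamedTuple):
    simultaneous_jobs: int
    memory_limit: int  # same unit as total_memory (assumed to be MB)
    threads: int


def server_managed_allocation(logical_processors: int, total_memory: int) -> Allocation:
    """Apply the three formulas; total_memory is assumed to be in MB."""
    # number of simultaneous jobs =
    #   min(floor(logical processors / 8), floor(total memory / 8000) - 2)
    jobs = min(logical_processors // 8, total_memory // 8000 - 2)
    if jobs < 1:
        # Assumption of this sketch only: the source does not describe this case.
        raise ValueError("formulas yield fewer than 1 simultaneous job")
    # memory limit = floor(total memory / (number of simultaneous jobs + 2))
    memory_limit = total_memory // (jobs + 2)
    # threads = floor(total logical processors / number of simultaneous jobs)
    threads = logical_processors // jobs
    return Allocation(jobs, memory_limit, threads)


# A worker with 128 logical cores and 160 GB of RAM, entered as 160,000 MB.
print(server_managed_allocation(128, 160_000))
```

With 128 logical processors and 160,000 units of memory, the formulas give 16 simultaneous jobs, a memory limit of 8888 per job, and 8 threads per job.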

Recommendations for Manual Value Setting

We recommend you follow these guidelines for optimal performance when setting these values manually:

  • Memory per running workflow: 8GB per workflow is the minimum recommendation for optimal performance with AMP.

  • CPU per running workflow: 1 simultaneous running workflow per 6-8 logical CPU cores.

  • Maximum number of AMP engines run in parallel: This is entirely hardware dependent. In theory, you could run 16 AMP jobs, or a mix of AMP and original Engine jobs, at once on a worker with 128 logical cores and 160GB of RAM. At that scale, however, disk I/O and network bandwidth are more likely to be the bottleneck. Both the original Engine and AMP are limited by disk I/O and network bandwidth, depending on the size of the data, where it comes from, and where it is output to.

  • Maximum number of AMP engines run while also running the original Engine (E1): Server doesn’t differentiate between an AMP engine job and an original Engine job. Server just sends the workflow to Engine, and Engine determines whether it needs to run via AMP or the original Engine. As such, Server assumes all jobs are AMP if AMP engine is enabled.

We recommend that you reserve 1 job's worth of memory for the operating system and an additional job's worth of memory to avoid a negative impact if you manually run a workflow from Designer running on the Server. Our formulas to calculate resources already take these recommendations into consideration.
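The manual-setting guidelines above can be turned into a quick sanity check. The following Python sketch is hypothetical: the function name, parameters, and warning messages are illustrative and are not part of Alteryx Server. It checks a proposed configuration against the 8GB-per-workflow minimum, the lower bound of the 6-8 logical cores per workflow guideline, and the recommendation to reserve 2 jobs' worth of memory.

```python
def check_manual_settings(logical_cores, total_memory_gb,
                          simultaneous_workflows, memory_per_workflow_gb):
    """Return a list of warnings for a proposed manual configuration."""
    if simultaneous_workflows < 1:
        raise ValueError("simultaneous_workflows must be at least 1")
    warnings = []
    # Guideline: 8GB per workflow is the minimum for optimal AMP performance.
    if memory_per_workflow_gb < 8:
        warnings.append("memory per workflow is below 8GB")
    # Guideline: 1 running workflow per 6-8 logical cores (checked at 6 here).
    if logical_cores / simultaneous_workflows < 6:
        warnings.append("fewer than 6 logical cores per workflow")
    # Guideline: reserve 1 job's worth of memory for the operating system
    # and 1 more for manual runs from Designer on the Server.
    needed_gb = (simultaneous_workflows + 2) * memory_per_workflow_gb
    if needed_gb > total_memory_gb:
        warnings.append("memory is over-allocated once 2 jobs' worth is reserved")
    return warnings


# The 128-core, 160GB worker from the guidelines, with 16 jobs at 8GB each.
print(check_manual_settings(128, 160, 16, 8))
```

An empty list means the proposed values stay within the guidelines as this sketch interprets them. It does not replace testing against your own workloads.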

For the latest Server System Requirements, see the Server System Requirements help page. We’ve separated our recommendations into 2 different categories: Minimum Hardware Requirements and Recommended Hardware for Optimal Performance.

Minimum Hardware Requirements

Server minimum hardware requirements are defined as the minimum hardware needed to run a stable installation of Alteryx Server. If you don't meet the minimum requirements, you risk poor performance and random service shutdown on any node where the engine runs.

The following minimum hardware requirements are recommended for the desired number of concurrent workflows:

Minimum hardware requirements for Server

Note

The green highlighted line is the minimum recommended configuration. The line showing information for 1 concurrent workflow is helpful for you to understand how much you need to increase resources to add 1 additional job to the existing configuration.

Quality of Service Concerns

The Engine Configuration Memory Limit applies to the engine, regardless of whether it is the original Engine or AMP. The difference is in how each engine handles that memory limit:

  • Original Engine will pre-allocate the entire limit.

  • AMP will allocate what it needs up to the memory limit.

AMP's approach is a more efficient use of resources. The original Engine is multithreaded, but not highly multithreaded. AMP is much more proficient at running jobs in serial. The advantage is in total throughput, which is higher with AMP than with the original Engine.

Server is now capable of analyzing your hardware and allocating the appropriate resources per engine. It’s not the same level of resource manager as found on an OS, but with this new feature we’ve implemented the ability for Server to manage your resources.

Server automatically allocates the number of jobs allowed to run simultaneously based on available hardware resources, if the admin configures it to do so. For more information about the worker configuration, go to the Worker help page.

Resources are allocated per engine job. Each job has its own resources available to it, which means there shouldn’t be resource contention between jobs.

AMP Articles

AMP Engine Webinar (32 Minutes)

Alteryx AMP Engine

Alteryx Engine and AMP: Main Differences

AMP Memory Use

Tool Use with AMP

Accelerate Your Analytic Processes with the New AMP Engine

AMPlify Your Workflows

The Alteryx AMP Engine: Explained

AMP Engine Technical Deep Dive | Part 1 | Why AMP?

AMP Engine Technical Deep Dive | Part 2 | Key concepts of the AMP Engine