Understanding AWS Lambda Concurrency

Concurrency is the number of in-flight requests that your AWS Lambda function is handling at the same time.

What is AWS Lambda Concurrency?

Concurrency in AWS Lambda means how many instances of your function can run at the same time. For each concurrent request, Lambda provisions a separate instance of your execution environment. As your functions receive more requests, Lambda automatically handles scaling the number of execution environments until you reach your account’s concurrency limit.

By default, Lambda provides your account with a total concurrency limit of 1,000 concurrent executions across all functions in an AWS Region.
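To get a feel for how close a workload might come to that limit, a common rule of thumb (also given in the Lambda documentation) is that concurrency is roughly the average requests per second multiplied by the average request duration in seconds. The sketch below applies that formula. The function name and the sample workload numbers are illustrative assumptions, not part of any AWS API.

```python
def estimate_concurrency(requests_per_second: float, avg_duration_seconds: float) -> float:
    """Estimate steady-state concurrency as request rate times average duration."""
    if requests_per_second < 0 or avg_duration_seconds < 0:
        raise ValueError("inputs must be non-negative")
    return requests_per_second * avg_duration_seconds


# Hypothetical workload: 200 requests per second, each taking 0.5 s on average.
estimated = estimate_concurrency(200, 0.5)
print(estimated)  # 100.0
```

Under these assumed numbers the function would need about 100 concurrent execution environments, well inside the default quota. Halving the average duration would halve the estimate.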

Types of Concurrency:

1. Reserved Concurrency:

Sets a limit on how many times a specific function can run at once.
Ensures your critical functions always have capacity.

2. Provisioned Concurrency:

Keeps a certain number of function instances ready to handle requests instantly.
Reduces start-up time (cold starts) for your function.

Why it Matters:

Reserved Concurrency: Prevents other functions from using up all the available instances.
Provisioned Concurrency: Improves response times for important functions.

To check your current account-level concurrency quota, run the following AWS Command Line Interface (AWS CLI) command:

aws lambda get-account-settings

{
    "AccountLimit": {
        "TotalCodeSize": 80530636800,
        "CodeSizeUnzipped": 262144000,
        "CodeSizeZipped": 52428800,
        "ConcurrentExecutions": 1000,
        "UnreservedConcurrentExecutions": 900
    },
    "AccountUsage": {
        "TotalCodeSize": 410759889,
        "FunctionCount": 8
    }
}

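ConcurrentExecutions is your total account-level concurrency quota. UnreservedConcurrentExecutions is the concurrency that no function has reserved yet. In the output above, 100 of the 1,000 units are already reserved, which leaves 900. Lambda always keeps at least 100 units unreserved for functions without a reserved setting, so the amount you can still reserve is this value minus 100.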
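As a rough illustration of how to read these two fields, the sketch below parses a trimmed copy of the output and works out how much concurrency is already reserved and how much could still be reserved. The helper name `summarize_concurrency` is hypothetical, and the 100-unit floor reflects Lambda's documented requirement to keep at least 100 units unreserved.

```python
import json

# Lambda keeps at least this much concurrency unreserved (documented behavior).
MIN_UNRESERVED = 100


def summarize_concurrency(settings: dict) -> dict:
    """Summarize account concurrency from get-account-settings output."""
    limits = settings["AccountLimit"]
    total = limits["ConcurrentExecutions"]
    unreserved = limits["UnreservedConcurrentExecutions"]
    return {
        "total": total,
        "already_reserved": total - unreserved,
        "still_reservable": max(unreserved - MIN_UNRESERVED, 0),
    }


# Trimmed copy of the CLI output, keeping only the two concurrency fields.
sample = json.loads("""
{
    "AccountLimit": {
        "ConcurrentExecutions": 1000,
        "UnreservedConcurrentExecutions": 900
    }
}
""")

summary = summarize_concurrency(sample)
print(summary)  # {'total': 1000, 'already_reserved': 100, 'still_reservable': 800}
```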

How can you configure it?

Go to your function in the AWS Console.
Under “Configuration,” find “Concurrency.”
Set your desired limits for reserved or provisioned concurrency.
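The same settings can also be applied from code. The sketch below assumes a boto3 Lambda client is passed in and uses its `put_function_concurrency` and `put_provisioned_concurrency_config` calls. The wrapper name `configure_concurrency`, the function name `my-function`, and the alias `live` are hypothetical. Provisioned concurrency applies to a published version or alias, which is why a qualifier is required.

```python
def configure_concurrency(lambda_client, function_name, reserved=None,
                          provisioned=None, qualifier=None):
    """Apply reserved and/or provisioned concurrency via a boto3 Lambda client."""
    if reserved is not None:
        # Caps the function at `reserved` and sets that capacity aside for it.
        lambda_client.put_function_concurrency(
            FunctionName=function_name,
            ReservedConcurrentExecutions=reserved,
        )
    if provisioned is not None:
        if not qualifier:
            raise ValueError("provisioned concurrency needs a version or alias")
        # Keeps `provisioned` environments initialized for the given qualifier.
        lambda_client.put_provisioned_concurrency_config(
            FunctionName=function_name,
            Qualifier=qualifier,
            ProvisionedConcurrentExecutions=provisioned,
        )


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   configure_concurrency(boto3.client("lambda"), "my-function",
#                         reserved=100, provisioned=10, qualifier="live")
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before pointing it at a real account.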

Limits:

AWS limits the total number of concurrent executions per account, which can be increased if needed.
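If your function hits its limit, extra requests are throttled. Synchronous invocations are rejected with a 429 error (TooManyRequestsException), while asynchronous invocations are queued and retried by Lambda automatically.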

Tips:

Monitor Usage: Use Amazon CloudWatch to track how much concurrency your functions use, for example through the ConcurrentExecutions and Throttles metrics.
Optimize Functions: Make your functions run faster to avoid hitting limits.
Handle Throttling: Set up a queue or retry mechanism for requests that get throttled.
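For the monitoring tip, one possible starting point is to query the `ConcurrentExecutions` metric that Lambda publishes in the `AWS/Lambda` CloudWatch namespace. The sketch below only builds the arguments for the CloudWatch `get_metric_statistics` call. The helper name `concurrency_metric_query`, the one-hour window, and the function name `my-function` are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def concurrency_metric_query(function_name, minutes=60, now=None):
    """Build get_metric_statistics arguments for a function's peak concurrency."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Lambda",
        "MetricName": "ConcurrentExecutions",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,               # one data point per minute
        "Statistics": ["Maximum"],  # peak concurrency within each minute
    }


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   response = cloudwatch.get_metric_statistics(**concurrency_metric_query("my-function"))
#   peaks = [point["Maximum"] for point in response["Datapoints"]]
```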

Handling Throttling

When a Lambda function exceeds its concurrency limit, additional invocation requests are throttled. Throttled requests can be retried by the caller, buffered in front of the function with an Amazon SQS queue, or, for asynchronous invocations that exhaust their retries, sent to a dead-letter queue (DLQ).
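For synchronous callers, retrying usually means backing off before trying again. The following is a minimal sketch of exponential backoff with jitter, not a definitive implementation. The helper name `invoke_with_backoff`, its parameters, and the default delays are all assumptions. The caller supplies both the invocation and the test for what counts as a throttle.

```python
import random
import time


def invoke_with_backoff(invoke, is_throttle, max_attempts=5,
                        base_delay=0.2, sleep=time.sleep):
    """Call invoke(), retrying with exponential backoff and jitter while throttled."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return invoke()
        except Exception as exc:
            # Re-raise non-throttle errors, and the throttle from the last attempt.
            if not is_throttle(exc) or attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            # Random jitter spreads out retries from many callers.
            sleep(delay + random.uniform(0, delay))


# Usage, assuming boto3 is installed and "my-function" is a hypothetical name:
#   import boto3
#   client = boto3.client("lambda")
#   result = invoke_with_backoff(
#       lambda: client.invoke(FunctionName="my-function", Payload=b"{}"),
#       lambda exc: isinstance(exc, client.exceptions.TooManyRequestsException),
#   )
```

The `sleep` parameter is injectable so the retry logic can be checked without real delays.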

Best Practices:

Monitor Concurrency: Use Amazon CloudWatch to monitor Lambda function concurrency and adjust settings as needed.
Optimize Code: Reduce execution time to minimize the chance of hitting concurrency limits.
Use DLQs: Configure a dead-letter queue so that asynchronous events that still fail after Lambda's retries are captured instead of lost.
Leverage Provisioned Concurrency: For critical, latency-sensitive functions, use provisioned concurrency to reduce cold starts.

In simple terms, AWS Lambda concurrency helps you control how many times your function can run simultaneously, making sure critical functions always have the resources they need and improving response times for important tasks.