A runaway AWS Lambda function can consume all of your account's available concurrency and cause unexpected charges on your AWS bill. Lambda can now detect and stop certain types of recursive or infinite loops.

In this post, we'll explore this feature with a video demo.

🖋️
Currently, this feature is available for Lambda integrations with Amazon SQS and Amazon SNS, and for direct invocation through the Lambda Invoke API. If any other service, such as Amazon S3 or Amazon DynamoDB, forms part of the loop, AWS Lambda cannot detect and stop it.

Reproduce:

For the purpose of the demo, we'll introduce an infinite loop in a Lambda function (e.g. LambdaRecursiveLoopDetectionTest). A message in Amazon SQS (e.g. RecursiveLoopDetectionTestQueue) will trigger our sample Lambda function. The function will read the new message from SQS and send another message to the same queue, which will trigger the same function again. The result is a recursive loop.

Lambda function code:

const AWS = require('aws-sdk'); // AWS SDK for JavaScript v2
const sqs = new AWS.SQS();

exports.handler = async (event) => {
    // Handle the first record of the incoming SQS batch
    const message = event.Records[0];

    // The trace header carries the Lineage counter used for loop detection
    console.log('traceId --> %s', process.env._X_AMZN_TRACE_ID);
    console.log('SQS message received %s: %j', message.messageId, message.body);

    // Send the same body back to the queue that triggers this function
    const params = {
        MessageBody: message.body,
        QueueUrl: "https://sqs.us-west-2.amazonaws.com/123456942166/LambdaRecursiveTestQueue"
    };

    const queueRes = await sqs.sendMessage(params).promise();
    const response = {
        statusCode: 200,
        body: queueRes,
    };

    return response;
};

Lambda execution role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": "arn:aws:logs:us-east-1:123456942166:*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:us-east-1:123456942166:log-group:/aws/lambda/LambdaRecursiveLoopDetectionTest:*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "sqs:DeleteMessage",
                "sqs:GetQueueAttributes",
                "sqs:ReceiveMessage",
                "sqs:SendMessage"
            ],
            "Resource": "arn:aws:sqs:*"
        }
    ]
}

Let's drop the first message to initiate the recursive loop.
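
If you would rather send that first message from a script than from the console, a minimal sketch could look like the one below. The helper names (`buildSeedMessage`, `sendSeedMessage`) are made up for this post, and the client is passed in as an argument on the assumption that it behaves like an AWS SDK for JavaScript v2 `AWS.SQS` instance, as used in the handler above.

```javascript
// Build the SendMessage parameters for the first message that starts the loop.
// queueUrl and body are placeholders supplied by the caller.
function buildSeedMessage(queueUrl, body) {
    return { QueueUrl: queueUrl, MessageBody: body };
}

// sqsClient is assumed to look like an aws-sdk v2 `new AWS.SQS()` instance.
async function sendSeedMessage(sqsClient, queueUrl, body) {
    const result = await sqsClient.sendMessage(buildSeedMessage(queueUrl, body)).promise();
    return result.MessageId;
}

// Usage from a script with AWS credentials configured (not executed here):
// const AWS = require('aws-sdk');
// const sqs = new AWS.SQS({ region: 'us-west-2' });
// sendSeedMessage(sqs, '<your queue URL>', 'hello loop').then(console.log);
```

Passing the client in keeps the helper easy to try with a stub before pointing it at a real queue.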

Detection:

To track the count of recursive invocations, Lambda uses AWS X-Ray tracing headers (X-Amzn-Trace-Id) with a Lineage field appended (e.g. Lineage=5799cab8:0, a hash of the Lambda function followed by a counter starting from 0). For each invocation, the counter is incremented by 1, and Lambda terminates the loop after 16 invocations.

2023-07-20T03:36:04.059Z	eac2df47-5066-5027-a024-8275fdb247a5	INFO	traceId --> Root=1-64b8aba4-7decb90c422647122a9bc4f1;Parent=0a17ef1c0457a573;Sampled=0;Lineage=5799cab8:0
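
To watch the counter climb from inside the function, you could pull the Lineage field out of the trace header yourself. The sketch below is only an illustration: the function names are made up, and it assumes the Lineage value has the simple `hash:counter` shape seen in the log line above.

```javascript
// Split a trace header of the form "Key=Value;Key=Value;..." into an object.
function parseTraceHeader(header) {
    const fields = {};
    for (const part of header.split(';')) {
        const idx = part.indexOf('=');
        if (idx === -1) continue; // skip anything that is not a key=value pair
        fields[part.slice(0, idx).trim()] = part.slice(idx + 1).trim();
    }
    return fields;
}

// Extract the function hash and invocation counter from the Lineage field.
// Assumption: Lineage looks like "<hash>:<counter>", as in the log line above.
function parseLineage(header) {
    const lineage = parseTraceHeader(header).Lineage;
    if (!lineage) return null; // header without a Lineage field
    const [hash, counter] = lineage.split(':');
    return { hash, counter: Number.parseInt(counter, 10) };
}

const sample = 'Root=1-64b8aba4-7decb90c422647122a9bc4f1;Parent=0a17ef1c0457a573;Sampled=0;Lineage=5799cab8:0';
console.log(parseLineage(sample)); // { hash: '5799cab8', counter: 0 }
```

Inside the handler, the same function could be called with `process.env._X_AMZN_TRACE_ID` to log the counter on every invocation.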

Monitoring:

In the Amazon CloudWatch console, search for the metric name 'RecursiveInvocationsDropped' for the Lambda function. It shows how many recursive invocations AWS Lambda dropped.

We can see the same CloudWatch metric directly in the "Monitoring" tab of the Lambda function.
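
The metric can also be read from code rather than the console. The following is a rough sketch, not a definitive implementation: the helper names are hypothetical, the client is assumed to behave like an AWS SDK for JavaScript v2 `AWS.CloudWatch` instance, and it assumes the metric lives in the `AWS/Lambda` namespace with a `FunctionName` dimension.

```javascript
// Build GetMetricStatistics parameters for the dropped-invocations metric.
// Assumption: namespace "AWS/Lambda" with a "FunctionName" dimension.
function buildDroppedInvocationsQuery(functionName, endTime, lookbackMinutes) {
    const startTime = new Date(endTime.getTime() - lookbackMinutes * 60 * 1000);
    return {
        Namespace: 'AWS/Lambda',
        MetricName: 'RecursiveInvocationsDropped',
        Dimensions: [{ Name: 'FunctionName', Value: functionName }],
        StartTime: startTime,
        EndTime: endTime,
        Period: 60, // one datapoint per minute
        Statistics: ['Sum'],
    };
}

// cloudWatchClient is assumed to look like an aws-sdk v2 `new AWS.CloudWatch()` instance.
async function countDroppedInvocations(cloudWatchClient, functionName, lookbackMinutes = 60) {
    const params = buildDroppedInvocationsQuery(functionName, new Date(), lookbackMinutes);
    const data = await cloudWatchClient.getMetricStatistics(params).promise();
    // Add up the per-minute sums over the lookback window
    return data.Datapoints.reduce((total, point) => total + point.Sum, 0);
}

// Usage (not executed here):
// const AWS = require('aws-sdk');
// countDroppedInvocations(new AWS.CloudWatch(), 'LambdaRecursiveLoopDetectionTest').then(console.log);
```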

Notification:

Once the recursive loop is detected and stopped, AWS Lambda notifies you by email and through an AWS Health Dashboard notification ("Lambda runaway termination notification"), along with troubleshooting steps.

Email Notification:

AWS Health Dashboard Notification:

References