AWS Lambda's new feature, Response Streaming, can enhance user experience, responsiveness, and search engine rankings of your web applications by lowering Time to First Byte (TTFB). Additionally, it supports a larger payload (soft limit of 20 MB) compared to a traditional buffered response (max 6 MB).


What is Time to First Byte (TTFB)?

TTFB is a foundational web performance metric that measures the time elapsed between sending a request and receiving the first byte of the response. A low TTFB means the web application can start rendering, and become usable, sooner.

ℹ️
"TTFB is often used by web search engines like Google and Yahoo to improve search rankings since a website will respond to the request faster and be usable before other websites would be able to." - Wikipedia
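To make the metric concrete, here is a minimal sketch of how I would measure it from Node.js 18+ (where fetch is built in). The function name measureTtfb and the fetchImpl parameter are my own, hypothetical choices; fetchImpl only exists so the function can be exercised without a network. Note that fetch resolves once the response headers arrive, so the sketch reports both that moment and the arrival of the first body chunk.

```javascript
// Sketch: time from issuing a request until the response starts arriving.
// `fetchImpl` is a hypothetical injection point (defaults to the global fetch).
async function measureTtfb(url, fetchImpl = fetch) {
  const start = performance.now();

  // fetch() resolves as soon as the status line and headers are received.
  const response = await fetchImpl(url);
  const headersMs = performance.now() - start;

  // Read only the first chunk of the body, then stop downloading.
  const reader = response.body.getReader();
  const { value, done } = await reader.read();
  const firstChunkMs = performance.now() - start;
  await reader.cancel();

  return {
    headersMs,
    firstChunkMs,
    firstChunkBytes: done || !value ? 0 : value.byteLength,
  };
}

// Example (replace the placeholder with a real URL):
// measureTtfb("https://<your-function-url>/?count=10").then(console.log);
```

Running it against a buffered and a streaming endpoint should show the difference in firstChunkMs; the exact numbers depend on network conditions.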

In a typical request-response model, the client must wait until a complete response gets generated and buffered. This wait time is longer for larger payloads.

Response streaming, a new invocation pattern, aids in the progressive delivery of response payloads to the client. The client will begin to receive partial and incremental responses as they become available.

Response streaming is currently supported in Node.js 14.x and subsequent managed runtimes. For other languages, you can stream responses from a custom runtime by using a custom Runtime API integration. Responses can be streamed through a Lambda function URL (including when it is used as an Amazon CloudFront origin) and through Lambda's InvokeWithResponseStream API via the AWS SDK. Amazon API Gateway and Application Load Balancer cannot stream responses progressively, but with API Gateway the feature can still be used to return larger payloads (up to API Gateway's 10 MB limit).
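For the SDK route, the AWS SDK for JavaScript v3 returns the streamed payload as an asynchronous event stream. The sketch below shows one way to consume such a stream; the helper name readLambdaEventStream and the function name my-streaming-function are hypothetical, and the SDK call itself is left in comments so the snippet runs without the @aws-sdk/client-lambda package installed.

```javascript
// Sketch: consume the event stream returned by InvokeWithResponseStream.
// Each event carries either a PayloadChunk (bytes) or an InvokeComplete marker.
async function readLambdaEventStream(eventStream, onChunk = () => {}) {
  const decoder = new TextDecoder();
  let text = "";

  for await (const event of eventStream) {
    if (event.PayloadChunk?.Payload) {
      // stream: true keeps multi-byte characters intact across chunk boundaries.
      const piece = decoder.decode(event.PayloadChunk.Payload, { stream: true });
      text += piece;
      onChunk(piece);
    }
    if (event.InvokeComplete?.ErrorCode) {
      const { ErrorCode, ErrorDetails } = event.InvokeComplete;
      throw new Error(`${ErrorCode}: ${ErrorDetails}`);
    }
  }

  text += decoder.decode(); // flush any buffered bytes
  return text;
}

// Usage with AWS SDK v3 (assumes the package is installed and credentials are configured):
// import { LambdaClient, InvokeWithResponseStreamCommand } from "@aws-sdk/client-lambda";
// const client = new LambdaClient({ region: "us-west-2" });
// const { EventStream } = await client.send(
//   new InvokeWithResponseStreamCommand({ FunctionName: "my-streaming-function" })
// );
// await readLambdaEventStream(EventStream, (piece) => process.stdout.write(piece));
```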

ℹ️
Refer to Lambda HTTP Response Stream Pricing for the additional cost.

In this post, I'll use a Lambda function URL for the demo.

Let's assume a use case where the Lambda function invokes a downstream REST endpoint (e.g. https://fakestoreapi.com/products/id) a few times, aggregates the responses, and finally sends the complete response payload (a list of products) to the client.

The Lambda implementation is shown below. The invoke mode of this Lambda's function URL is "BUFFERED (default)".

export const handler = async(event) => {
    // Count of products requested by client.
    const count = event?.queryStringParameters?.count;
    let responseBody = `\nProduct Count:${count}\n`;
       
    // Aggregate all products.
    for(let i=1; i <= count; i++) {
        const product = await getProduct(i);
        responseBody += product.title + "\n\n";
    }
   
    const response = {
        statusCode: 200,
        body: responseBody,
    };
    
    // Return full response.
    return response;
};

/**
 * Get product by id
 */
async function getProduct(id) {
    const url = 'https://fakestoreapi.com/products/' + id;
    const dummyResponse = await fetch(url);
    // Artificial 0.5 s delay to simulate a slow downstream call.
    await new Promise(resolve => setTimeout(resolve, 500));
    return await dummyResponse.json();
}
Invoke mode: BUFFERED (default)
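One detail the handler above glosses over: count arrives as a string from the query string and may be missing altogether. A hedged sketch of how it could be validated first is below; the helper name parseCount, the default of 1, and the cap of 20 are my own assumptions and not part of the original demo.

```javascript
// Sketch: turn the `count` query parameter into a safe positive integer.
// defaultCount and maxCount are hypothetical values chosen for illustration.
function parseCount(event, { defaultCount = 1, maxCount = 20 } = {}) {
  const raw = event?.queryStringParameters?.count;
  if (raw === undefined) {
    return defaultCount; // parameter not supplied
  }

  const n = Number(raw);
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError(`count must be a positive integer, got "${raw}"`);
  }

  // Cap the value so one request cannot trigger an unbounded number of downstream calls.
  return Math.min(n, maxCount);
}

console.log(parseCount({ queryStringParameters: { count: "10" } })); // 10
console.log(parseCount({}));                                         // 1
```

In the handler, const count = parseCount(event); would replace the direct read from event.queryStringParameters.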

Increase the Lambda function's timeout from the default 3 seconds to something larger (e.g. 30 seconds), since we know this function will take longer than the default.


Now, call the Lambda function URL using curl.

abhijit@AwsJunkie:~$ curl 'https://5iu75oqiny3t4atepr5df5goha0mrqbb.lambda-url.us-west-2.on.aws/?count=10' --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" --aws-sigv4 'aws:amz:us-west-2:lambda'

The whole product list appears only at the end, after roughly 5 seconds (10 products × 0.5 s delay, plus the time taken by the downstream calls themselves).

Product Count:10
Fjallraven - Foldsack No. 1 Backpack, Fits 15 Laptops

Mens Casual Premium Slim Fit T-Shirts

Mens Cotton Jacket

Mens Casual Slim Fit

John Hardy Women's Legends Naga Gold & Silver Dragon Station Chain Bracelet

Solid Gold Petite Micropave

White Gold Plated Princess

Pierced Owl Rose Gold Plated Stainless Steel Double

WD 2TB Elements Portable External Hard Drive - USB 3.0

SanDisk SSD PLUS 1TB Internal SSD - SATA III 6 Gb/s

Now, we'll implement the same use case using Response Streaming to deliver each response progressively as it becomes available from downstream.

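A response streaming-enabled handler has to be wrapped with the awslambda.streamifyResponse() decorator. The handler receives a responseStream to write to, and calling responseStream.end() signals that no more data will be written to it. awslambda.HttpResponseStream.from() attaches the HTTP status code and headers to the stream. Configure the invoke mode as "RESPONSE_STREAM" for this Lambda's function URL.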

export const handler = awslambda.streamifyResponse(
    async (event, responseStream, context) => {
        const responseMetadata = {
            statusCode: 200,
            headers: {
                "Content-Type": "text/plain"
            }
        };

        responseStream = awslambda.HttpResponseStream.from(responseStream, responseMetadata);
       
        // Count of products requested by client.
        const count = event?.queryStringParameters?.count;
        responseStream.write(`\nProduct Count:${count}\n`);
       
        for(let i=1; i <= count; i++) {
            const product = await getProduct(i);
            
            // Write to the stream
            responseStream.write(product.title + "\n\n");
        }
       
        // Properly end the stream
        responseStream.end();
    }
);

/**
 * Get product by id
 */
async function getProduct(id) {
    const url = 'https://fakestoreapi.com/products/' + id;
    const dummyResponse = await fetch(url);
    // Artificial 0.5 s delay to simulate a slow downstream call.
    await new Promise(resolve => setTimeout(resolve, 500));
    return await dummyResponse.json();
}
Invoke mode: RESPONSE_STREAM
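The handler above calls responseStream.write() without checking its return value and only reaches responseStream.end() if nothing throws. For larger payloads, a common Node.js streams pattern is to wait for the "drain" event whenever write() returns false, and to end the stream in a finally block. The sketch below is one way to do that, not part of the original demo; writeChunk and streamProductTitles are hypothetical helper names, and getProduct is passed in so the logic can be tried outside Lambda.

```javascript
// Write one chunk; if the stream's internal buffer is full, wait for "drain".
function writeChunk(stream, chunk) {
  if (stream.write(chunk)) {
    return Promise.resolve();
  }
  return new Promise((resolve) => stream.once("drain", resolve));
}

// Stream `count` product titles, always closing the stream at the end.
async function streamProductTitles(count, stream, getProduct) {
  try {
    await writeChunk(stream, `\nProduct Count:${count}\n`);
    for (let i = 1; i <= count; i++) {
      const product = await getProduct(i);
      await writeChunk(stream, product.title + "\n\n");
    }
  } finally {
    // Runs on success and on failure, so the client is never left hanging.
    stream.end();
  }
}

// Wiring inside Lambda (assumption: same wrapper and getProduct as the handler above):
// export const handler = awslambda.streamifyResponse(async (event, responseStream) => {
//   const count = Number(event?.queryStringParameters?.count);
//   await streamProductTitles(count, responseStream, getProduct);
// });
```

One caveat of this design: if a downstream call fails midway, the client has already received a 200 status and simply sees a truncated body, so the error still needs to be logged or signalled in the payload.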

Increase the timeout to 30 seconds here as well.

To test, call the new function URL.

abhijit@AwsJunkie:~$ curl 'https://erx4rj7n77vgfirt3kci7upkty0jgdik.lambda-url.us-west-2.on.aws/?count=10' --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" --aws-sigv4 'aws:amz:us-west-2:lambda' --no-buffer

Product Count:10
Fjallraven - Foldsack No. 1 Backpack, Fits 15 Laptops

Mens Casual Premium Slim Fit T-Shirts

Mens Cotton Jacket
:
:

Almost immediately, we start receiving the streamed response. A new product title appears roughly every 0.5 seconds (the delay added in the Lambda function). That is a far better user experience than staring at a blank screen or a "Loading..." GIF.
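The timing difference between the two invoke modes can be reproduced offline with a small simulation. This is only a model under stated assumptions: the virtual clock, fakeGetProduct, and the other names are hypothetical, the downstream call is reduced to the 0.5 s delay, and only product titles are counted as output (the real streaming handler also writes the "Product Count" line before the first downstream call).

```javascript
// Virtual clock so the simulation runs instantly and deterministically.
function createVirtualClock() {
  let nowMs = 0;
  return {
    now: () => nowMs,
    sleep: async (ms) => { nowMs += ms; },
  };
}

// Stand-in for the downstream call: only the delay is modelled.
async function fakeGetProduct(id, clock, delayMs) {
  await clock.sleep(delayMs);
  return { title: `Product ${id}` };
}

// BUFFERED: nothing is sent until every product has been fetched.
async function simulateBuffered(count, delayMs) {
  const clock = createVirtualClock();
  let body = "";
  for (let i = 1; i <= count; i++) {
    body += (await fakeGetProduct(i, clock, delayMs)).title + "\n";
  }
  return [{ atMs: clock.now(), data: body }];
}

// RESPONSE_STREAM: each product is sent as soon as it is available.
async function simulateStreamed(count, delayMs) {
  const clock = createVirtualClock();
  const writes = [];
  for (let i = 1; i <= count; i++) {
    const product = await fakeGetProduct(i, clock, delayMs);
    writes.push({ atMs: clock.now(), data: product.title + "\n" });
  }
  return writes;
}

async function compare(count = 10, delayMs = 500) {
  const buffered = await simulateBuffered(count, delayMs);
  const streamed = await simulateStreamed(count, delayMs);
  return {
    bufferedFirstWriteMs: buffered[0].atMs,
    streamedFirstWriteMs: streamed[0].atMs,
    streamedLastWriteMs: streamed[streamed.length - 1].atMs,
  };
}

compare().then((result) => console.log(result));
// { bufferedFirstWriteMs: 5000, streamedFirstWriteMs: 500, streamedLastWriteMs: 5000 }
```

Both modes finish at the same time in this model; what streaming changes is when the first useful output reaches the client.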
