A year of running a hotel booking application on AWS Serverless services for $0.8/month
I built a hotel booking web application serving 500k requests/month, 3000 daily users, and 100-200 bookings/day using AWS Serverless services. Over the last year it cost just ~$0.8/month to operate.
Disclaimer: The purpose of this article is to show how designing applications with a Serverless architecture helps us solve business problems and minimize costs, while still allowing high availability, scalability, ease of development and extensibility. It does not mean you should migrate your architecture to Serverless services right away. Whatever solution you’re using now still works for your business. This article just shows how far we can go in optimizing cloud costs when designing applications with Serverless in mind.
TLDR:
The main billing costs were AWS API Gateway ($1/million requests), SES ($0.10/1000 emails) and S3 ($0.004 per 10,000 GET requests).
We serve 500k requests/month via API Gateway ($0.5), send > 3000 booking confirmation emails/month via SES ($0.3, without free tier) and serve 100k GET requests/month on S3 ($0.04).
We have a public booking website (the main application) and an admin website (internal application). We used Wix for CMS and a local PMS provider to manage hotel operations.
We use React/TypeScript for both the booking website and admin website
We use @vendia/serverless-express, NodeJS, TypeScript and AWS Lambda for API services, and puppeteer with Chromium on Lambda Layers for SSO integration with our PMS provider.
We added NewRelic in Apr 2023 for real-time monitoring of AWS Lambda invocation, Browser SPA, error reporting and alerts, and optimized NewRelic costs after a few months.
We monitor Lambda cold start durations using CloudWatch Lambda Insights (NewRelic does not monitor cold start durations itself, but the NewRelic metrics stream integration can also forward the Lambda Insights init duration).
Our cold start durations are ~400ms without NewRelic, and ~800ms when using the NewRelic Lambda Layer.
[Update 2023/11/08] It seems the awful cold start metrics are due to our code with Serverless Express, and NOT related to NewRelic extensions.
[Update 2023/11/09] I could confirm again that NewRelic did increase cold start duration (benchmarked with popular Lambda middlewares: @vendia/serverless-express, middy.js, serverless/http).
We deploy and manage our infrastructure using AWS CDK for JavaScript.
Our entire code base is managed in a single Git monorepo using NX and Yarn Workspaces
Currently, 30% of our booking revenue comes from our website, which is a great return for just ~$0.8/month operation cost.
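The monthly figure in the TLDR can be reproduced with a few lines of arithmetic. The sketch below uses only the unit prices and volumes quoted above; the type and function names are illustrative.

```typescript
// Unit prices and monthly volumes as quoted in the TLDR.
interface LineItem {
  service: string;
  units: number;         // monthly usage
  pricePerBatch: number; // USD per priced batch
  batchSize: number;     // units per priced batch
}

const items: LineItem[] = [
  { service: "API Gateway", units: 500000, pricePerBatch: 1.0, batchSize: 1000000 },
  { service: "SES", units: 3000, pricePerBatch: 0.1, batchSize: 1000 },
  { service: "S3 GET", units: 100000, pricePerBatch: 0.004, batchSize: 10000 },
];

// Cost of one line item for the month, in USD.
function monthlyCost(item: LineItem): number {
  return (item.units / item.batchSize) * item.pricePerBatch;
}

const total = items.reduce((sum, item) => sum + monthlyCost(item), 0);
console.log(total.toFixed(2)); // 0.84
```

Lambda and DynamoDB do not appear here because, as described later, their usage stays inside the Free Tier.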
Business Overview
We have a hotel business that is currently managed via a Wix website and a 3rd-party PMS (Property Management System) provider. The business problems we're having are:
We need to be able to book our hotel rooms at hourly, overnight or daily rates. Most booking engines only support daily rates. Our PMS supports setting hourly check-in/check-out times, but it provides no booking engine.
Our hotel receptionists do a lot of manual work checking room availability and hotel prices to answer potential customers via chat, especially at weekends when rooms are fully booked.
We need to build our own booking service in order to let customers book on our Wix website, integrate the booking with our 3rd party PMS service, and notify our receptionists of new bookings.
Application Architecture
Below is the current architecture of our hotel booking application using AWS Serverless services
Disclaimer: In the beginning, we only had a single Booking API Lambda and Booking Website, with no CI/CD (similar to a monolithic app). As we added more features and optimizations, the architecture evolved into what it is today (more similar to micro-services). We didn’t design the whole platform from the beginning, but we always prefer using Serverless services whenever we add new features and components.
API Services (Lambda, DynamoDB, API Gateway)
Our booking application uses AWS Lambda integrations with API Gateway HTTP APIs to handle HTTP requests, which cost only ~$1/million requests, so in the end it costs just $0.5/month to serve our entire user base.
We use NodeJS, TypeScript and @vendia/serverless-express, which offers easy integration with API Gateway HTTP APIs and a seamless local development experience using ExpressJS, as opposed to alternatives such as AWS SAM, Serverless Framework or LocalStack.
For CI/CD, swagger-autogen is used to auto-generate route definitions which can be mapped into AWS CDK Stack for API Gateway Lambda Integrations.
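The mapping from generated route definitions to API Gateway routes might look roughly like this. It is a sketch that assumes the generated document exposes an OpenAPI-style `paths` object; `toRouteKeys` and the sample paths are hypothetical names, not taken from our code base. HTTP API route keys take the form `METHOD /path`.

```typescript
// Minimal subset of an OpenAPI document: path -> method -> operation.
type OpenApiPaths = Record<string, Record<string, unknown>>;

const HTTP_METHODS = ["get", "post", "put", "patch", "delete"];

// Convert OpenAPI paths into API Gateway HTTP API route keys ("METHOD /path").
// Non-method keys such as "parameters" are skipped.
function toRouteKeys(paths: OpenApiPaths): string[] {
  const keys: string[] = [];
  for (const path of Object.keys(paths)) {
    for (const method of Object.keys(paths[path])) {
      if (HTTP_METHODS.indexOf(method.toLowerCase()) !== -1) {
        keys.push(method.toUpperCase() + " " + path);
      }
    }
  }
  return keys;
}

// Hypothetical routes, for illustration only.
const routeKeys = toRouteKeys({
  "/bookings": { get: {}, post: {} },
  "/bookings/{id}": { get: {}, parameters: [] },
});
console.log(routeKeys); // [ 'GET /bookings', 'POST /bookings', 'GET /bookings/{id}' ]
```

Each resulting key can then be registered as one route pointing at the Lambda integration when the CDK stack is synthesized.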
AWS Lambda costs are measured in GB-seconds (allocated memory multiplied by execution duration), but we found that at 500k requests/month, we only reach 33% of the AWS Lambda Free Tier offer of 400k GB-seconds and 1 million requests (~$3 without Free Tier).
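To see how a workload of this size lands at roughly a third of the Free Tier, here is a small GB-seconds estimate. The memory size (512 MB) and average duration (500 ms) are assumptions chosen for illustration, not our measured configuration.

```typescript
// Lambda Free Tier compute allowance per month, as cited above.
const FREE_GB_SECONDS = 400000;

// Compute usage = requests x duration (seconds) x memory (GB).
function gbSeconds(requests: number, avgDurationMs: number, memoryMb: number): number {
  return requests * (avgDurationMs / 1000) * (memoryMb / 1024);
}

// ASSUMPTION: 512 MB and a 500 ms average duration are illustrative values.
const usage = gbSeconds(500000, 500, 512);
const freeTierShare = usage / FREE_GB_SECONDS;
console.log(usage, (freeTierShare * 100).toFixed(0) + "%"); // 125000 31%
```

Halving either the memory size or the average duration halves the GB-seconds, which is why right-sizing memory matters more than request count at this scale.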
We use DynamoDB for storing booking reservations and application settings. The DynamoDB Free Tier offers 25 RCU and 25 WCU, which can handle up to roughly 200 million read/write requests per month.
We probably use no more than 5 RCU/WCU across 3 DynamoDB tables, since we don’t have many concurrent users, and our requests per second are mostly in the single digits.
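The Free Tier capacity figure follows from how capacity units are defined. The sketch below computes the theoretical monthly ceiling, assuming a 30-day month, small items (reads up to 4 KB, writes up to 1 KB) and capacity consumed evenly every second; the function name is illustrative.

```typescript
const SECONDS_PER_MONTH = 30 * 24 * 60 * 60; // 2,592,000 in a 30-day month

// 1 WCU = one write per second (item up to 1 KB).
// 1 RCU = one strongly consistent read per second (item up to 4 KB),
//         or two eventually consistent reads per second.
function monthlyCapacity(rcu: number, wcu: number, eventuallyConsistent: boolean) {
  const readsPerSecond = rcu * (eventuallyConsistent ? 2 : 1);
  return {
    reads: readsPerSecond * SECONDS_PER_MONTH,
    writes: wcu * SECONDS_PER_MONTH,
  };
}

const freeTier = monthlyCapacity(25, 25, true);
console.log(freeTier.reads + freeTier.writes); // 194400000
```

At single-digit requests per second, our real usage is a small fraction of that ceiling.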
Booking Websites & CMS (React, S3, CloudFront)
Integrating our booking APIs with Wix CMS is not quite straightforward. We ended up using Wix embed components to embed our booking website in iframes inside Wix pages. Static assets and website settings are managed in a separate admin website. Both the booking website and admin website are built using ReactJS and TypeScript, and served via CloudFront with S3 origins.
An interesting problem with iframes is that these embedded documents are cached like static assets in browsers, so we must define a Cache-Control policy on S3/CloudFront for the index.html file to allow automatic cache busting.
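One way to express such a policy is a small helper that picks the Cache-Control value per uploaded object. This is a sketch, not our exact settings: it assumes asset file names contain a content hash, and the specific header values are common choices rather than the ones we deployed.

```typescript
// Pick a Cache-Control value for an object uploaded to the S3 origin.
// ASSUMPTION: asset file names contain a content hash (e.g. main.3f2a1b.js),
// so they can be cached for a long time, while HTML entry points such as
// index.html must be revalidated on every load.
function cacheControlFor(key: string): string {
  if (/\.html$/i.test(key)) {
    // "no-cache" lets browsers store the file but forces revalidation
    // with the origin before each use.
    return "no-cache";
  }
  // Hashed assets never change under the same name, so cache them for a year.
  return "public, max-age=31536000, immutable";
}

console.log(cacheControlFor("index.html"));               // no-cache
console.log(cacheControlFor("static/js/main.3f2a1b.js")); // public, max-age=31536000, immutable
```

With this split, a new deployment only needs a fresh index.html to reach users, because it references newly named asset files.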
Browser Automation with Lambda
Integrating with our local PMS provider is also not straightforward, since it does not offer 3rd-party API endpoints. We only have a PMS website that our hotel receptionists log in to in order to manage booking reservations.
So we got creative and automated this manual process in a browser using puppeteer and Chromium on AWS Lambda Layers. To optimize performance, only the login process uses puppeteer and Chromium.
Once the SSO access token is retrieved and securely stored in SSM Parameter Store, the remaining integrations with our PMS provider are done via plain HTTP requests/responses. We only need to refresh the SSO access token once a month, using Amazon EventBridge scheduled rules.
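The split between the browser-based login and the plain HTTP calls can be sketched as two small functions around a token store. Everything here is hypothetical: `TokenStore`, `refreshToken`, `pmsAuthHeaders` and the `Bearer` header scheme are illustrative assumptions, and the in-memory store stands in for SSM Parameter Store.

```typescript
// Hypothetical interfaces: real code would back TokenStore with SSM
// Parameter Store and implement `login` with puppeteer.
interface TokenStore {
  get(): Promise<string | undefined>;
  put(token: string): Promise<void>;
}

type Login = () => Promise<string>;

// Invoked by the monthly schedule: the only step that needs a browser.
async function refreshToken(login: Login, store: TokenStore): Promise<void> {
  const token = await login();
  await store.put(token);
}

// Invoked per request: builds headers for a plain HTTP call to the PMS.
// ASSUMPTION: the PMS accepts the token as a Bearer credential.
async function pmsAuthHeaders(store: TokenStore): Promise<Record<string, string>> {
  const token = await store.get();
  if (token === undefined) {
    throw new Error("No PMS access token stored; run refreshToken first");
  }
  return { Authorization: "Bearer " + token };
}

// In-memory stand-in for the parameter store, for local experiments.
function memoryStore(): TokenStore {
  let value: string | undefined;
  return {
    get: async () => value,
    put: async (token: string) => { value = token; },
  };
}
```

Keeping the login behind a function argument means the heavy Chromium layer is only attached to the scheduled refresh function, while the request-path functions stay small.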
Application Performance Monitoring
We use New Relic Serverless Monitoring to monitor AWS Lambda performance. We do not need long data retention since logs and metrics are already stored in CloudWatch Logs & Metrics, so we only use the New Relic Standard Plan with 7-day data retention for real-time monitoring. We also use New Relic Browser SPA monitoring to monitor our React websites, as well as real-time error reporting and alerts.
Initially we forwarded CloudWatch metrics to New Relic for metrics integration, which incurs costs for polling CloudWatch metrics. However, these costs increased as we added more AWS Lambda functions for more integrations, and the metrics already exist in CloudWatch. We ended up using NewRelic environment variables and NewRelic Lambda Layers to integrate with AWS Lambda, avoiding unnecessary CloudWatch & Secrets Manager costs.
After some months of experimenting with New Relic, we were able to get back below $1/month in billing costs with this extra cost optimization. While this small cost optimization might not be necessary for most applications, it was interesting to see how far we can go in cutting unnecessary spend on cloud services that we don’t need.
AWS Lambda Cold Start Metrics
NewRelic does not monitor Cold Start time (Init Duration), so we use good old CloudWatch Lambda Insights to monitor Cold Start times, using a custom CloudWatch metric and dashboard for init durations (the NewRelic metrics stream integration can also forward the Lambda Insights init duration).
With our package sizes averaging ~1-3MB (including source maps), we measured the following cold start durations:
With NewRelic Lambda Layer: ~800ms
Without NewRelic Lambda Layer: ~400ms
[Update 2023/11/08] It seems the awful cold start metrics are due to our code with Serverless Express, and NOT related to NewRelic extensions.
[Update 2023/11/09] I could confirm again that NewRelic did increase cold start duration (benchmarked with popular Lambda middlewares: @vendia/serverless-express, middy.js, serverless/http). I will share benchmark details and a GitHub repo later.
Since our applications are not sensitive to cold start time, and most of the performance problems come from our local PMS provider and not cold starts, we did not optimize cold start time further. In the future, we will add an EventBridge rule to warm up our Booking API Lambda functions to reduce the impact of cold starts on our users.
[Update 2023/11/10] Lambda also supports Provisioned Concurrency, which pre-warms Lambda functions at a fixed cost (e.g. ~$6/month per 1 concurrency of 512MB Lambda) and can also be managed with AWS Auto Scaling to only run on a fixed schedule. Detailed explanation about Lambda Concurrency
[Update 2023/11/23] Using a warmup Lambda function with an EventBridge schedule is very simple and much more cost effective than Provisioned Concurrency, especially for functions with low concurrent executions (<100).
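One common way to handle warm-up pings is to have the API function itself recognize the scheduled event and return early. The sketch below shows that pattern; it is not necessarily how our warmup function is wired, and `withWarmup` and the `{ warmed: true }` response are illustrative names. Events from EventBridge scheduled rules carry `source: "aws.events"` and `detail-type: "Scheduled Event"`.

```typescript
type LambdaEvent = Record<string, unknown>;
type Handler<R> = (event: LambdaEvent) => Promise<R>;

// EventBridge scheduled rules deliver events with these two fields set.
function isWarmupEvent(event: LambdaEvent): boolean {
  return event["source"] === "aws.events" && event["detail-type"] === "Scheduled Event";
}

// Wrap a handler so warm-up pings return immediately, before any
// request parsing or downstream calls happen.
function withWarmup<R>(handler: Handler<R>): Handler<R | { warmed: true }> {
  return async (event: LambdaEvent) => {
    if (isWarmupEvent(event)) {
      return { warmed: true as const };
    }
    return handler(event);
  };
}

// Usage with a hypothetical API handler.
export const handler = withWarmup(async (event) => {
  return { statusCode: 200, body: "handled " + String(event["rawPath"]) };
});
```

Note that a single scheduled ping keeps only one execution environment warm; concurrent requests beyond that can still hit a cold start.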
CI/CD using AWS CDK
We use AWS CDK for JavaScript to manage our infrastructure as code, written in TypeScript. Each Lambda function is built separately to optimize AWS CDK hash evaluation when deploying a CDK Stack and its dependencies.
We bundle Lambda functions into zip packages using Webpack and zip-webpack-plugin, using a single Webpack config for all of our Lambda functions (organized as multiple entries and outputs).
AWS CDK offers a first-class experience for building AWS infrastructure, with easy-to-learn construct APIs. Although AWS CDK supports most AWS services through its level 2 constructs API (service oriented), some newly released services are only supported at level 1 (CloudFormation constructs), such as EventBridge schedules, so I ended up using alternatives until a level 2 API is supported.
One disadvantage of AWS CDK compared to Terraform & Terraform CDK is its import functionality.
While Terraform imports work by generating Terraform code based on the imported AWS resources and synchronizing those resources into Terraform state, AWS CDK imports simply ignore the imported resources when deploying CloudFormation stacks and only pass them as resource references.
I would recommend AWS CDK when starting new projects, but not for existing projects with lots of existing infrastructure to import.
Summary
In retrospect, it was a great 12-month exercise in cloud cost optimization, designing serverless application architectures, CI/CD and digital transformation. We truly believe AWS Serverless services are especially suitable for small & medium sized businesses, such as our local hotel business. AWS Lambda and API Gateway offer a cost-optimized and highly scalable solution, without the complexity of managing traditional web applications and micro-services. We managed to operate our entire cloud infrastructure at only $0.8/month for the last 12 months, and the website contributed ~30% of our hotel's booking revenue.
I just watched your podcast with Yan. You seemed very familiar, and it turns out you are Hiếu from Lê Lợi. Thank you for the article, I hope I can contact you to discuss more :D
Thanks for the write up, but those are awful cold start times. You'd be better off with Middy as opposed to Express and using Powertools for Lambda instead of New Relic.