Serverless Cloud Architecture – Secure data distribution with AWS S3, CloudFront and Lambda@Edge
While searching for a stand-alone data distribution platform that scales easily and needs little maintenance once set up, we found an interesting solution within Amazon Web Services. This short article introduces the interplay of Amazon's S3 service with CloudFront and its Lambda@Edge feature.
Simple Storage Service (S3)
This is an object-based storage solution from Amazon that allows users to make files available over the internet. Managed centrally by Amazon, the service offers a highly scalable, fast and cost-effective data storage infrastructure. In addition to the central AWS user interface, S3 also offers a dedicated API for managing the service and the files, which can be addressed by any system.
S3 will be the central file store for our project. It will be configured as a private storage location that only allows connections from a specific CloudFront distribution. In this case, we are using a private S3 bucket in the AWS region eu-central-1 (Frankfurt).
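As a rough sketch of what "only allow connections from a specific CloudFront distribution" can look like, the following function builds a bucket policy document. It assumes the distribution uses origin access control (OAC); a setup based on an origin access identity would use a different principal. The function name, bucket name, account ID and distribution ID are hypothetical placeholders.

```javascript
'use strict';

// Builds an S3 bucket policy that lets one CloudFront distribution read objects.
// Assumption: the distribution is configured with origin access control (OAC).
function buildBucketPolicy(bucketName, accountId, distributionId) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Sid: 'AllowCloudFrontReadOnly',
        Effect: 'Allow',
        Principal: { Service: 'cloudfront.amazonaws.com' },
        Action: 's3:GetObject',
        Resource: `arn:aws:s3:::${bucketName}/*`,
        Condition: {
          StringEquals: {
            'AWS:SourceArn': `arn:aws:cloudfront::${accountId}:distribution/${distributionId}`,
          },
        },
      },
    ],
  };
}

// Hypothetical values, for illustration only.
const policy = buildBucketPolicy('example-private-bucket', '111122223333', 'EXAMPLEDISTID');
console.log(JSON.stringify(policy, null, 2));
```

The printed JSON could be attached to the bucket as its policy. As long as public access remains blocked, direct requests to the bucket are rejected.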
Content Delivery Network (CloudFront)
With CloudFront, Amazon offers a service that quickly distributes any type of data from a central location (in our case S3) to all common regions worldwide. Amazon calls these locations "edge locations". This makes it possible to retrieve files with low latency from anywhere in the world. CloudFront will take over the global distribution of the files from the S3 bucket for us and add an SSL-secured intermediate layer between the client and the actual storage location of the data to be distributed.
Lambda@Edge runs custom program code that is triggered by events on every request to a CloudFront resource. From a developer's perspective, this is a simple Node.js function that can analyse and modify individual requests as they arrive at a CloudFront edge location.
Lambda@Edge will provide us with worldwide access to files in S3 via CloudFront, protected by additional HTTP basic authentication. To do this, it is necessary to create a Lambda@Edge function and associate it with the appropriate CloudFront distribution, using the event type viewer-request as the trigger. Note that Lambda@Edge functions currently have to be created in the AWS region "us-east-1".
The actual program code that provides access control may look something like this.
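A minimal sketch of such a viewer-request function is given here. It is not the original implementation. The user name and password are hypothetical and hard-coded only for illustration, and the helper name authorize is our own.

```javascript
'use strict';

// Hypothetical credentials, hard-coded only for illustration.
const USER = 'demo-user';
const PASSWORD = 'demo-password';
const EXPECTED = 'Basic ' + Buffer.from(`${USER}:${PASSWORD}`).toString('base64');

// Returns the untouched request when the credentials match,
// otherwise a 401 response that asks the client to authenticate.
function authorize(request) {
  // CloudFront passes header names in lower case, each as an array of { key, value }.
  const header = (request.headers || {}).authorization;
  const supplied = header && header.length > 0 ? header[0].value : undefined;

  if (supplied === EXPECTED) {
    return request; // CloudFront continues processing the request
  }

  return {
    status: '401',
    statusDescription: 'Unauthorized',
    headers: {
      'www-authenticate': [{ key: 'WWW-Authenticate', value: 'Basic realm="Restricted"' }],
    },
    body: 'Unauthorized',
  };
}

// Entry point invoked by CloudFront for the viewer-request event.
exports.handler = async (event) => authorize(event.Records[0].cf.request);
```

Returning the request object lets CloudFront continue as usual, while returning a response object short-circuits the request at the edge. In a real deployment the credentials would not be stored in the source code.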
The following diagram shows the high-level architecture and interaction of the individual services. With each viewer request to a CloudFront resource, for example https://d2w3doht2h3uzp.cloudfront.net/file_sab42bff7.zip, a Lambda@Edge function is executed which requests authentication from the client.
If this is successful, the CloudFront edge server on which the request arrives will provide the corresponding file. If the requested file is not yet cached there because it has never been requested from that region, it is retrieved from the central S3 storage location, cached, and then delivered.
In addition to the well-known approach of securing file access to S3 via SSL-secured CloudFront connections, the Lambda@Edge function allows us to insert custom program code that extends the existing standard functions. Without having to worry about exhausted computing capacity or geographically related latency problems, many more use cases can be realised for worldwide file access.
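The cache behaviour described above can be illustrated with a deliberately simplified model of a single edge location. This is not how CloudFront is implemented internally. The class name EdgeCache and the fetchFromOrigin callback are hypothetical, and cache expiry is ignored.

```javascript
'use strict';

// Simplified model of one edge location's cache (illustration only).
class EdgeCache {
  // fetchFromOrigin: hypothetical function that loads a file from the central S3 bucket.
  constructor(fetchFromOrigin) {
    this.fetchFromOrigin = fetchFromOrigin;
    this.store = new Map();
  }

  get(path) {
    if (this.store.has(path)) {
      // File was requested here before: deliver it directly from the edge.
      return { body: this.store.get(path), cache: 'Hit' };
    }
    // First request at this edge: fetch from the origin, cache, then deliver.
    const body = this.fetchFromOrigin(path);
    this.store.set(path, body);
    return { body, cache: 'Miss' };
  }
}

// Usage with a stand-in origin that just echoes the path.
const edge = new EdgeCache((path) => `content of ${path}`);
console.log(edge.get('/file.zip').cache); // first request goes to the origin
console.log(edge.get('/file.zip').cache); // second request is served from the cache
```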
Other conceivable use cases
- Automated communication and protection via JWT/OAuth tokens
- Routing/URL rewrites for A/B testing or staging
- Extension of the standard CloudFront HTTP status codes
- Add, discard or modify headers
- Querying AWS-external resources, to further enhance the request
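To give an impression of one of these use cases, here is a minimal sketch of a URL rewrite for A/B testing in a viewer-request function. The cookie name variant, the /b path prefix and the helper name rewriteForVariant are assumptions made for this example.

```javascript
'use strict';

// Rewrites the request URI to the "/b" variant when the client carries
// the (hypothetical) cookie "variant=b"; otherwise the request is unchanged.
function rewriteForVariant(request) {
  // Cookie headers arrive as an array of { key, value } objects.
  const cookies = ((request.headers || {}).cookie || [])
    .map((header) => header.value)
    .join('; ');

  const useVariantB = /(?:^|;\s*)variant=b(?:;|$)/.test(cookies);
  if (useVariantB) {
    request.uri = '/b' + request.uri;
  }
  return request;
}

// Entry point invoked by CloudFront for the viewer-request event.
exports.handler = async (event) => rewriteForVariant(event.Records[0].cf.request);
```

Because the rewrite happens before the cache lookup, both variants are cached separately under their own paths.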