Sunday, August 9, 2015

How AWS Lambda works - if I'd implement it

Disclaimer: I don't work at AWS or Amazon, nor do I have access to any privileged information. This is purely an exercise in imagination. If Lambda actually works like this, then yay me :)

AWS Lambda is a new toy from AWS which promises that you won't have to care about infrastructure anymore when running code. It's the next level after Elastic Beanstalk and even App Engine, and the closest thing to "Code as a Service" that we have to date.

Since AWS is unlikely to release anything about Lambda's inner workings anytime soon, everyone is left guessing what's under the hood. So I asked myself: how would I implement it?

The description says:
AWS Lambda is a compute service that runs your code in response to events and automatically manages the compute resources for you, making it easy to build applications that respond quickly to new information. 
AWS Lambda starts running your code within milliseconds of an event such as an image upload, in-app activity, website click, or output from a connected device. 
You can also use AWS Lambda to create new back-end services where compute resources are automatically triggered based on custom requests. 
With AWS Lambda you pay only for the requests served and the compute time required to run your code. Billing is metered in increments of 100 milliseconds, making it cost-effective and easy to scale automatically from a few requests per day to thousands per second.
The developer in me is happy with this. I only get to write code, and deployment is as simple as uploading a zip file to S3. Things run, people get their data, I don't manage the infrastructure, and everyone's happy and sleeps at night.

The only things I have to worry about are getting my code to run as fast as possible, since billing is done in increments of 100 ms, and using as little memory as possible, since that's billed as well.
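To make the billing point concrete, here is a minimal sketch of how such a bill could be estimated. It assumes that run time is rounded up to the next 100 ms increment and that cost scales with memory multiplied by time. Both are my assumptions, as the description only says billing is "metered in increments of 100 milliseconds". The function names and the `price_per_gb_second` parameter are hypothetical placeholders, not real AWS prices.

```python
import math


def billed_duration_ms(duration_ms: float, increment_ms: int = 100) -> int:
    """Round a run time up to the next billing increment.

    Assumption: rounding is upwards and at least one increment is charged.
    """
    increments = max(1, math.ceil(duration_ms / increment_ms))
    return increments * increment_ms


def compute_cost(duration_ms: float, memory_mb: int, invocations: int,
                 price_per_gb_second: float) -> float:
    """Estimate the compute cost for a number of identical invocations.

    price_per_gb_second is a placeholder supplied by the caller.
    """
    seconds = billed_duration_ms(duration_ms) / 1000
    gb_seconds = seconds * (memory_mb / 1024) * invocations
    return gb_seconds * price_per_gb_second


if __name__ == "__main__":
    # A 101 ms run is billed like a 200 ms run under these assumptions.
    print(billed_duration_ms(101))
    print(billed_duration_ms(100))
```

Under these assumptions, shaving a few milliseconds only pays off when it moves the run across a 100 ms boundary, while lowering the memory setting reduces the bill proportionally.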

Can we work out the specifications for Lambda based on the information we have?

What do we know about Lambda?
  • it runs in an isolated environment
  • it runs quickly in response to events
  • it scales in response to events
  • it has constraints for memory (which also controls how much CPU you get)
  • it has a timeout of 60 seconds for the function it runs
  • it has a "temporary" space you can use but you are not supposed to rely on it
  • you should not rely on the fact that subsequent requests will run on the same machine

What can we dig up about Lambda?
  • first run is a little bit slower
  • subsequent runs will be fast if they arrive within a certain interval
  • CloudWatch logs seem to indicate that, once it has scaled enough to accommodate the throughput, it favors reusing existing instances over spawning new ones, regardless of the throughput level
  • it scales horizontally
  • it's between t2.micro and t2.small in terms of maximum memory (or t1.micro and m1.small if you prefer older instance types)
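The observations above (slow first run, fast subsequent runs within an interval, reuse preferred over spawning, horizontal scaling) can be sketched as a toy warm pool. This is purely my guess at the behaviour, not how Lambda is actually built. The names `WarmPool` and `Container` and the `idle_ttl_seconds` value are hypothetical.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Container:
    container_id: int
    function_name: str
    last_used: float
    busy: bool = False


class WarmPool:
    """Toy scheduler: reuse an idle warm container, otherwise cold start."""

    def __init__(self, idle_ttl_seconds: float = 300.0):
        # Hypothetical interval after which an idle container is discarded.
        self.idle_ttl = idle_ttl_seconds
        self.containers = []
        self._ids = itertools.count(1)

    def acquire(self, function_name: str, now: float):
        """Return (container, cold_start) for one invocation."""
        # Drop containers that have been idle for longer than the TTL.
        self.containers = [
            c for c in self.containers
            if c.busy or now - c.last_used <= self.idle_ttl
        ]
        # Prefer reusing an idle container for the same function.
        for c in self.containers:
            if c.function_name == function_name and not c.busy:
                c.busy = True
                c.last_used = now
                return c, False
        # Nothing warm is free: scale horizontally with a new container.
        c = Container(next(self._ids), function_name, now, busy=True)
        self.containers.append(c)
        return c, True

    def release(self, container: Container, now: float) -> None:
        """Mark a container idle again once its invocation has finished."""
        container.busy = False
        container.last_used = now
```

The first call for a function is always a cold start, a second call shortly after reuses the same container, and a concurrent call while the first is busy creates a second container.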

So where does this get us?
Based on the above capabilities and requirements, if I were to implement Lambda on top of AWS I'd use:
  • ECS
  • some smart container scheduler, tailored for the workload type that Lambda has
  • communication adapter: this one acts as a "translator" from the various input sources to RPC (or HTTP?) and back
    • definitely with ELB in front of it for mediating the traffic
    • SNS for anything that can react to events inside AWS
  • one dedicated container, built for every function that you upload, which bundles the communication adapter
  • CloudWatch: monitoring is needed, no? No?
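As a rough illustration of the "communication adapter" idea from the list above, here is a minimal sketch that translates two input sources into one common invocation and sends the result back as JSON. The event shapes are simplified assumptions, and `Invocation`, `from_http`, `from_sns` and `dispatch` are hypothetical names I made up for this sketch.

```python
import json
from dataclasses import dataclass


@dataclass
class Invocation:
    """Common envelope every input source is translated into."""
    function_name: str
    source: str
    payload: dict


def from_http(function_name: str, body: str) -> Invocation:
    # Assumed shape: the HTTP request body is a JSON document.
    return Invocation(function_name, "http", json.loads(body))


def from_sns(function_name: str, notification: dict) -> Invocation:
    # Assumed, simplified shape: {"Message": "<JSON string>"}.
    return Invocation(function_name, "sns", json.loads(notification["Message"]))


ADAPTERS = {"http": from_http, "sns": from_sns}


def dispatch(source: str, function_name: str, raw, handlers: dict) -> str:
    """Translate the raw event, call the user's function, serialize the result."""
    invocation = ADAPTERS[source](function_name, raw)
    handler = handlers[invocation.function_name]
    return json.dumps(handler(invocation.payload))


if __name__ == "__main__":
    handlers = {"resize": lambda event: {"ok": True, "key": event["key"]}}
    print(dispatch("http", "resize", '{"key": "a.png"}', handlers))
```

The point of the envelope is that the user's function only ever sees a plain payload, no matter which source triggered it.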

Let's look at the information we have from a different perspective.
Aside from the dedicated instances, Lambda potentially has the whole AWS infrastructure to run on.

It's not hard to imagine that, even with what is probably an extremely intelligent scheduler for placing instances, there are still empty "spots" in the infrastructure. Those spots mean servers are not used to their full capacity, and given the design described above, they would be a perfect place to run Lambda as well.

Looking at how AWS rolled out ECS and Lambda, I think it only supports the idea that they went for an approach similar to the one described above.

What is it missing?
Lambda would be beautiful to put in front of medium or even large workloads that need quick response times, but there's a critical part missing. Because functions are short-lived and may not run on the same machine twice, there's no way yet to keep persistent connections to RDS. If I were to implement it, I would definitely put an AWS DB connection pool somewhere in the middle between Lambda and RDS (AWS, if you read this, *wink* *wink*).
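A connection pool in the middle could look roughly like the sketch below: a fixed set of long-lived connections that short-lived callers borrow and return. This is only an illustration of the idea, with a caller-supplied `connect` factory standing in for a real database driver. `ConnectionPool` is a hypothetical name, not an AWS service or API.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool of long-lived connections shared by short-lived callers."""

    def __init__(self, connect, size: int):
        # `connect` is any zero-argument factory returning a connection object.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout: float = 1.0):
        # Raises queue.Empty if no connection frees up within the timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller raised.
            self._idle.put(conn)
```

Each function invocation would borrow a connection with `with pool.connection() as conn:` and give it back when done, so the database only ever sees `size` connections no matter how many functions are running.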

To sum it up:
I can totally build it from the building blocks AWS provides, or from those of any other cloud provider. I hope whoever had the idea got a fat bonus for it, since I didn't have the idea first. Lambda is also a good way for AWS to better utilize its hardware, and for people who don't need a complicated backend to just care about the code.

I think this might be the future for small businesses, very young or incubation-stage startups, and hackers and tinkerers who just want a quick way to make their code run on the web without headaches. I'm definitely going to use it in production pretty soon.

I'd be happy to hear what other people think, if someone digs this one up.

A thank you goes to +Onur for providing suggestions on how to improve the readability of my post: Thank you!

Reddit thread: https://www.reddit.com/r/aws/comments/3gih4a/how_aws_lambda_works_if_id_implement_it/
HackerNews: https://news.ycombinator.com/item?id=10037476
Tweet: https://twitter.com/dlsniper/status/630853292768251904
