First, a callback to an older post: Itamar Turner-Trauring did a neat writeup on using Cachegrind for deterministic performance analysis, inspired by my post on the challenges of stable benchmarking.
Last week, I wrote about gg, a cool research paper that focused on using Amazon Lambda as a compute engine. I mentioned that I’ve been playing with Lambda myself; this week I want to preview what I’ve been hacking on, and share some thoughts on Amazon Lambda from that experience.
I’ve been building a tool I call “llama,” which aims to make it easy to drive compute in Lambda from the UNIX shell, which is where I spend most of my time. Among other pieces, you can think of llama as a Lambda runtime for UNIX commands — it bridges between the Lambda Invoke API and running UNIX command lines. It also includes a frontend driver that lets you invoke those commands from your local shell.
Suppose we have a lot of PNG files we want to optimize for the web using optipng. OptiPNG is computationally expensive, so it can take a while to do this locally. With llama, we can build a Docker container containing optipng alongside the llama runtime, and create a Lambda function for it. We can then invoke that function from the command-line using:
    llama invoke -file squirtle.png -output squirtle-optimized.png \
        optipng optipng squirtle.png -out squirtle-optimized.png
The -file and -output flags instruct llama to copy the local squirtle.png into the Lambda environment before running the command, and to copy squirtle-optimized.png back afterwards.
Of course, the advantage of Lambda isn’t running single commands; optimizing a single PNG file is much faster locally in most cases. For this purpose, llama includes a llama xargs command, which lets us invoke a Lambda over multiple inputs in parallel. We can optimize an entire directory of PNGs like so:
    ls -1 *.png | llama xargs optipng optipng \
        '{{.I .Line}}' -out '{{.O (printf "optimized/%s" .Line)}}'
llama xargs uses the Go template language and some helper methods to let your command line interpolate input, and annotate various arguments as referring to files flowing to or from the Lambda. llama xargs dispatches requests in parallel; Lambda supports thousands of concurrent function executions, which lets us scale our optimization across thousands of cores, and pay only for precisely the compute we consume.
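To make the fan-out shape concrete, here is a purely local sketch of the same pattern using GNU xargs -P, with echo standing in for the per-file Lambda invocation (llama is not involved here, and the filenames are made up):

```shell
# Fan out over a list of inputs, up to 4 jobs at a time; `echo`
# stands in for what llama would run in a Lambda invocation.
# `sort` makes the (otherwise nondeterministically ordered) output
# deterministic.
printf 'a.png\nb.png\nc.png\n' \
  | xargs -P 4 -I{} echo "optimize {} -> optimized/{}" \
  | sort
```

The difference is that llama xargs fans each job out to Lambda’s thousands-wide concurrency pool, rather than to the handful of cores on one machine.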
I consider llama very much a demo/experiment right now. If it continues to go well, I might polish it and “release” it in some form, but for now you can find the git repository, and some basic documentation in the README, here: https://github.com/nelhage/llama/
llamacc: Compiling code in Lambda
I originally built llama to support self-play experiments in Taktician, my AI for the game of Tak. I wanted to be able to run thousands of games between different versions of Taktician, which is an embarrassingly parallel, data-light, shared-nothing problem that Lambda and llama xargs are perfect for.
Once I had llama working, though, I realized it would be trivial to build a distcc-like driver that does compilation in the cloud on top of it. So I also built llamacc, which runs compilation in Lambda using the traditional distcc strategy of running cpp locally. This limits it somewhat in how much concurrency it can extract, but it is still worth at least a 2x speedup compiling large projects on my underpowered Pixelbook, and I believe there is more gain to be had with some additional optimizations.
gg is a much more sophisticated tool than llama. It models the entire compute graph, while llama only outsources single commands. While I think that llama does have some neat practical advantages, if I’m honest I built it in large part out of a sense of NIH, and a desire to better understand the AWS and Lambda APIs.
However, I do think it has some nice properties. The sheer simplicity makes it pleasant to work with for simple tasks, like the Taktician examples above. It builds on top of Amazon Lambda’s newly-released Docker container support, which makes packaging code simple and uses technology familiar to virtually every developer these days.
llama xargs fits well into shell pipelines and is really easy for interactive use or experiments from the UNIX command-line.
llamacc currently struggles to achieve impressive speedups, and gg will probably always have more impressive results on large builds. However, if I can make it work, I think the “drop-in” nature holds a lot of promise.
gg needs to fully understand a build process in order to run it, meaning that custom tools need special support in some way. With llamacc, my goal is that you can build an arbitrary project using CC=llamacc make -j60 and get a successful build and a decent — if not necessarily overwhelming — speedup.
I came into this project with a vaguely negative impression of Lambda. I found the docs quite confusing, and was frustrated by the lack of clarity about the underlying execution model. The only time I had actually tried to use Lambda for a project had been in conjunction with Amazon’s API Gateway, which is one of their most confusing and poorly-documented products. An extreme lack of clarity about all the different abstractions involved infuriated and confused me.
However, after digging in, understanding the platform more deeply, and building llama, I’m deeply impressed by Lambda, and believe I’d been underestimating it.
Almost-infinitely scalable compute, with minimal setup and no need to provision standing infrastructure, really feels like the dream of the cloud come true. The first time I ran a build with llamacc and saw a speedup relative to local compilation felt magical; I had successfully outsourced computation to The Cloud™, billed by the millisecond, and there was no infrastructure I was responsible for. I had previously done things like spinning up a 64-core behemoth in EC2 in order to build LLVM, but that always felt clunky and high-overhead, and ran the risk of sizeable surprise bills if I forgot to suspend it when I was done. Lambda really is the dream of “cloud computing as fungible utility,” and it’s basically here.
I’m not the first to notice this, of course. In addition to the gg paper last week, I’ve since come across the work of Eric Jonas et al. at Berkeley; they have papers where they lay out their vision of Lambda as a platform for “Distributed Computing for the 99%” and demonstrate efficient implementations of linear algebra kernels on top of Lambda. The vision is there, and while the actual implementations are in the early days, I’m really excited about the prospect (and really excited about the prospect of contributing to the experiment with llama).
I think there’s also the potential for Lambda to grow to become suitable for a much wider range of use cases. My current assessment of Lambda as a platform for building applications or data infrastructure is something like “Really valuable for jobs where you need fine-grained burst compute, but if you have a steady baseline of volume or need a standing service, you’re better off provisioning (virtual) infrastructure, using EC2 or EMR or whatever.” However, on reflection, I realized that that also almost exactly described the consensus view of EC2 itself in its early days when I first used it, circa 2009 or so (replacing “virtual infrastructure” with “dedicated servers in colo”). Now, of course, EC2 has matured and cost has come down, and it’s completely uncontroversial to build your web app or any other infrastructure on EC2 directly, at nearly any scale of business.
The parallel, of course, doesn’t guarantee that Lambda will become cheaper and viable as a platform for larger classes of applications over time, but it’s enough to give me pause. And as I stop to think about the vision of Lambda as application platform, of data pipelines that run in Lambda, and HTTP applications that run as Lambda functions backed by API Gateway, there’s a lot to like in the vision. The strict shared-nothing model has very real costs, but it also simplifies reasoning immensely, and the promise of rapid, fine-grained scale-out with granular billing and no infrastructure to maintain is pretty appealing. Right now, building data pipelines or HTTP applications on Lambda is a bit of a pain in the ass, in part because the tooling is immature and it requires a lot of glue. I can definitely imagine a future where the tooling improves, Lambda gets cheaper and continues to gain features, and where it just becomes totally normal to deploy new products on top of Lambda. We’ll see!
The future of Lambda can only be as bright as Amazon’s support for the product. If they let it languish, if they don’t invest in it and in integrations built around it, in continuing to migrate it to newer hardware, and so on, it will decay, although there is always the potential for another product, or a competitor, like Google Cloud Functions, to pick up the slack.
However, as best I can tell, Amazon is very invested in the Lambda product, and sees it as a major part of the future of the AWS ecosystem, which gives me optimism in its future.
For one, a week after I started working on llama, Amazon announced a new batch of Lambda features at this year’s (virtual) re:Invent. Most excitingly, to me, they increased the maximum size of Lambda functions to 10GB of memory and 6 cores, as well as adding support for building Lambda functions from Docker container images. The larger limits open up the possibility of running larger jobs in Lambda and increasing efficiency for operations on larger-sized problems, if we can run larger chunks in individual functions. And Docker container support is absolutely huge for solving the problem of distributing code into Lambda, something that was previously a bit of a confusing dark art involving manually preparing zip files and spelunking dependencies a bit by hand. Both of these changes feel like huge steps forward for Lambda usability to me.
Finally, I find it really interesting how it seems like Amazon has concentrated effort on Lambda as the preferred solution for solving the integration problems between AWS’ billion different poorly-integrated product offerings. It seems clear to me that the future of a lot of application development involves stitching together a dozen different AWS services that sort-of-but-not-quite integrate together, with a lot of glue around the edges. And Amazon, it seems to me, has decided to embrace that pattern not by integrating their products closer together, but by pushing “bring your own glue inside Lambda functions” as the generic solution. Their awslabs GitHub account is full of example service-to-service integrations using Lambda, for instance.
As distasteful as I find this state of affairs personally, in a bunch of ways, it’s clearly very successful for a lot of customers, and I think it will continue to accelerate in popularity. And insofar as I understand Lambda as a core part of AWS’ product strategy in this way, it seems likely to me they will continue to invest in it heavily.
As you can perhaps tell, I’ve gotten inadvertently pretty excited about Lambda! I’m really optimistic about it as a tool to bring, to paraphrase one of the papers above, “distributed computing to the masses.” And, while I still wouldn’t try to build a new, say, REST API on top of Lambda just yet, I’m much more open to the prospect than I was a few months ago, and I’m going to be keeping a close eye on that landscape in the future.
Are there any other really cool papers or demos built on Lambda I should see, or other applications or comparable programs to llama? I’d love to hear about them!