Battle of the Serverless — Part 2: AWS Lambda Cold Start Times
This experiment continues the work on our pretend suite of microservices, exposed via API Gateway to form an API code-named Slipspace at a mock company called STG. Slipspace drives are how ships in the Halo universe travel so quickly between sectors of the galaxy, through something called Slipstream Space, so we thought it was a cool name for an API that needs awesome warp speeds.
Rust wins, with Go, TypeScript/Node.js and Python close behind, all with cold start durations of roughly 1–2 seconds for Lambda functions. Kotlin is decent. Stay away from C# and F# if cold start times matter for your use case.
What is a cold start?
Serverless is awesome. You don’t have to worry at all about scaling, right? Wrong. Functions in Lambda are run on demand when they are called and thrown away when no longer required. This cycle of “spin up and destroy” leads to an event called a cold start, and it is one of the considerations a serverless architecture needs to address.
A cold start is the first time your function has executed in a while (~10 minutes as of October 2019). In a cold start, your function has to be downloaded, containerized and bootstrapped before your code can run. Once it has been bootstrapped, your function is considered “warm” until it is destroyed.
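One common way to see the cold/warm distinction from inside a function is a module-level flag: module-level code runs once when the environment is bootstrapped, and later invocations in the same environment reuse it. The sketch below is only an illustration and not how the numbers in this post were measured (those were timed from the client side). The handler name and the response fields `coldStart` and `environmentAgeSeconds` are made up for this example.

```python
import json
import time

# Module-level code runs once per execution environment, i.e. during bootstrap.
_ENV_STARTED_AT = time.time()
_cold = True  # flipped to False after the first invocation


def handler(event, context):
    """Hypothetical handler returning an API Gateway style proxy response."""
    global _cold
    was_cold = _cold
    _cold = False  # every later invocation in this environment is warm

    body = {
        "coldStart": was_cold,
        "environmentAgeSeconds": round(time.time() - _ENV_STARTED_AT, 3),
    }
    return {"statusCode": 200, "body": json.dumps(body)}
```

The first call after a bootstrap reports `coldStart` as true; every call after that in the same environment reports false until the environment is destroyed.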
Testing for this round of cold start times is based on Lambda functions written in Rust, Go, Kotlin, F#, C#, Python, and TypeScript/Node.js. In “worst to best” order, here are the average cold start times as measured with Charles Proxy, based on ~200K requests hitting each function over the period of a week, every 12 minutes. The sweet spot for triggering a cold start was right around 10 minutes (±1 minute), meaning our function containers tended to stay warm for at most about 10 minutes without requests hitting them.
For the top four performers in cold starts (Rust, Go, TypeScript/Node.js and Python), here are more details showing where durations tended to cluster. Note that these durations cover the entire HTTP request as seen by the client, including DNS, the TLS handshake, etc.
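The measurement idea described above (hit a function at a fixed interval longer than the warm window, and time the whole request from the client side) can be sketched roughly as follows. This is not the harness used for this post, which relied on Charles Proxy. The function names `time_call`, `probe` and `summarize` and the URL in the usage comment are assumptions made up for this example.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_call(call: Callable[[], object]) -> float:
    """Return the wall-clock seconds taken by one call, as the client sees it."""
    start = time.perf_counter()
    call()
    return time.perf_counter() - start


def probe(
    call: Callable[[], object],
    rounds: int,
    interval_s: float,
    sleep: Callable[[float], object] = time.sleep,
) -> List[float]:
    """Time `call` once per round, waiting `interval_s` seconds between rounds."""
    durations = []
    for i in range(rounds):
        durations.append(time_call(call))
        if i < rounds - 1:  # no need to wait after the final round
            sleep(interval_s)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Collapse a list of durations into a few headline numbers."""
    return {
        "count": len(durations),
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


# Usage sketch (hypothetical URL, 12 minutes between requests):
#   import urllib.request
#   url = "https://example.invalid/slipspace/rust"
#   times = probe(lambda: urllib.request.urlopen(url, timeout=30).read(),
#                 rounds=5, interval_s=12 * 60)
#   print(summarize(times))
```

Because the timer wraps the whole request, a probe like this opens a fresh connection each round, so each sample includes DNS resolution and the TLS handshake as well as the function's own startup.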
This benchmarking experiment was a blast. I learned a ton about cold starts in the process: how often they occur, when they typically occur, and how to mitigate them. Fairly obviously, the interpreted languages (Python and TypeScript/Node.js) have quick spin-up times. Less obviously, the compiled languages (Rust and Go) beat them by milliseconds. I love that compiled language runtimes are improving so rapidly, and I LOVE that BYOR (“bring your own runtime”) languages like Rust are competitive.