Azure Functions: Cold Starts in Numbers
Auto-provisioning and auto-scalability are the killer features of Function-as-a-Service cloud offerings, and Azure Functions in particular.
One drawback of such dynamic provisioning is a phenomenon called "Cold Start". Basically, applications that haven't been used for a while take longer to startup and to handle the first request.
The problem is nicely described in Understanding Serverless Cold Start, so I won't repeat it here. I'll just copy a picture from that article:
Based on the 4 actions which happen during a cold start, we may guess that the following factors might affect the cold start duration:
- Language / execution runtime
- Azure Functions runtime version
- Application size including dependencies
I ran several sample functions and tried to analyze the impact of these factors on cold start time.
All tests were run against HTTP Functions, because that's where cold start matters the most.
All the functions were just returning "Hello, World" taking the "World" value from the query string. Some functions were also loading extra dependencies, see below.
I did not rely on execution time reported by Azure. Instead, I measured end-to-end duration from client perspective. All calls were made from within the same Azure region, so network latency should have minimal impact:
When Does Cold Start Happen?
Obviously, cold start happens when the very first request comes in. After that request is processed, the instance is kept alive in case subsequent requests arrive. But for how long?
The following chart gives the answer. It shows values of normalized request durations across different languages and runtime versions (Y axis) depending on the time since the previous request in minutes (X axis):
Clearly, an idle instance lives for 20 minutes and then gets recycled. All requests after 20 minutes threshold hit another cold start.
How Do Languages Compare?
I'll start with version 1 of Functions runtime, which is the production-ready GA version as of today.
The following chart shows some intuition about the cold start duration per language. The languages are ordered based on mean response time, from lowest to highest. 65% of request durations are inside the vertical bar (1-sigma interval) and 95% are inside the vertical line (2-sigma):
C# Script is slower but somewhat comparable. F# Script presented a really negative surprise though: it's much slower. It's even slower than experimental Python support where no performance optimization would be expected at all!
Functions Runtime: V1 vs V2
Version 2 of Functions runtime is currently in preview and not suitable for production load. That probably means they haven't done too much performance optimization, especially from cold start standpoint.
Can we see this on the chart? We sure can:
V2 is massively slower. The fastest cold starts are around 6 seconds, but the slowest can come up to 40-50 seconds.
Java is noticeably slower, even though the deployment package is just 33kB, so I assume I didn't overblow it.
Does Size Matter?
OK, enough of Hello World. A real-life function might be more heavy, mainly because it would depend on other third-party libraries.
To simulate such scenario, I've measured cold starts for a .NET function with references to Entity Framework, Automapper, Polly and Serilog.
Here are the results:
As expected, the dependencies slow the loading down. You should keep your Functions lean, otherwise you will pay in seconds for every cold start.
Funcpack preprocessor. The package
contained the single
js file with Webpack-ed dependency tree. Without that, the mean
cold start time of the same function is 20 seconds!
Here are some lessons learned from all the experiments above:
- Be prepared for 1-3 seconds cold starts even for the smallest Functions
- Stay on V1 of runtime until V2 goes GA unless you don't care about perf
- Minimize the amount of dependencies, only bring what's needed
Do you see anything weird or unexpected in my results? Do you need me to dig deeper on other aspects? Please leave a comment below or ping me on twitter, and let's sort it all out.
Like this post? Please share it!
|Tweet||Vote on HN|