Reducing Cold Start Duration in Azure Functions
Back in February, I published the first version of Cold Starts in Azure Functions, a detailed analysis of cold start durations in serverless Azure. The article showed the following numbers for C# and JavaScript functions:
Typical cold start durations per language (February 2019)
Note that I amended the format of the original chart: the range shows the most common 67% of values, and the dot shows the median value. This change makes the visual comparison easier for the rest of today’s post.
My numbers triggered several discussions on Twitter. In one of them, Jeff Hollan (a program manager on the Azure Functions team) responded:
The numbers in @MikhailShilkov blog are significantly higher than what I see in our automated reports. I'm following up as well with @MikhailShilkov to validate all settings. But the rest of convo is valid :)
— Jeff Hollan (@jeffhollan) February 26, 2019
The team collects cold start statistics internally, and their numbers were lower than mine. We started an e-mail thread to reconcile the results. I won’t publish any messages from the private thread, but the main findings are below.
Check the deployment method
For my tests, I’ve been using the “Run from external package” deployment method, where the function deployment artifact is stored as a zip file in blob storage. This method is the most friendly to automation and infrastructure-as-code pipelines.
Apparently, it also increases the cold start duration. I believe the situation has already improved since my original article, but here are the current numbers from mid-March.
.NET:
Cold start durations per deployment method for C# functions (March 2019)
Node.js:
Cold start durations per deployment method for JavaScript functions (March 2019)
Run-from-external-zip deployment increases the cold start by approximately 1 second.
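For reference, here is roughly how the external-package deployment is wired up, sketched with the Azure CLI. The resource group, app name, and blob URL are placeholders, and a SAS token is normally required on the URL for a private container:

```shell
# Point the Function App at a zip package hosted in blob storage
# ("Run from external package"). All names below are placeholders.
az functionapp config appsettings set \
  --resource-group my-rg \
  --name my-function-app \
  --settings "WEBSITE_RUN_FROM_PACKAGE=https://mystorage.blob.core.windows.net/packages/app.zip?<sas-token>"
```
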
Application Insights
I always configured my Function Apps to write telemetry to Application Insights. However, this adds a second to the cold start:
Cold start durations with and without Application Insights integration
I can’t really recommend “don’t use Application Insights,” because the telemetry service is vital in most scenarios. Still, keep this overhead in mind and watch the corresponding GitHub issue.
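For context, the Application Insights integration is typically enabled through an app setting as well. A sketch with the Azure CLI; the resource group, app name, and instrumentation key are placeholders:

```shell
# Connect the Function App to an Application Insights resource.
# The instrumentation key value is a placeholder.
az functionapp config appsettings set \
  --resource-group my-rg \
  --name my-function-app \
  --settings "APPINSIGHTS_INSTRUMENTATIONKEY=<your-instrumentation-key>"
```

Removing this setting is what disables the integration for the comparison above.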
Keep bugging the team
While the information above can help, the effect is still going to be limited. Obviously, the power to reduce cold starts lies with the Azure Functions engineering team.
Coincidence or not, the numbers have already improved significantly since early February, and more work is in progress.
I consider this to be a part of my mission: spotlighting the issues in public gives that nudge to prioritize performance improvements over other backlog items.
Meanwhile, the data in Cold Starts in Azure Functions and Comparison of Cold Starts in Serverless Functions across AWS, Azure, and GCP have been updated: I’m keeping them current, as promised.
P.S.
Nice post and appreciate you circling back and helping keep us honest.
— Jeff Hollan (@jeffhollan) March 27, 2019
Responses
Great work Mikhail, this stuff is important and your in depth analysis really shines a light on it - thank you.
Thanks for this article. I must confess I'm still slightly confused by the various deployment options and which one you are talking about.
When I previously used JS Functions on Windows (before the Linux option), I found the node_modules folder had to be bundled to get a sensible cold start (there's a Microsoft repo for this somewhere).
I recently asked if this still applied, or is perhaps different on Linux-hosted functions, as the slow Windows file share seemed to be the problem: https://github.com/MicrosoftDocs/azure-docs/issues/27181. The answer suggested I should use the package deployment, which appears to be different from the old zip method in Kudu, and is perhaps the same as what you mention as zip — or might be different. Can you shed any light? Thanks
I describe these 3 deployment methods in the main article.
Basically, Local Zip and External Zip are both "Run from Package" deployments, as described here. I believe that's what was recommended to you. As I mentioned, in theory it should have reduced the cold start, but it hasn't.
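In terms of app settings, the two flavors differ only in the value of a single setting. A sketch (the URL is a placeholder):

```shell
# Local Zip: the package sits on the app's own file system.
WEBSITE_RUN_FROM_PACKAGE=1

# External Zip: the package is pulled from a blob storage URL at startup.
WEBSITE_RUN_FROM_PACKAGE=https://mystorage.blob.core.windows.net/packages/app.zip?<sas-token>
```
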
All my Node.js tests run on Windows. I actually believe you can't get Node.js on the Consumption plan on Linux as of today.
Thanks for clarifying, and it's good to hear there aren't even more options. I guess your 'local' is
WEBSITE_RUN_FROM_PACKAGE=1
. My next concern was, as you found out, that it was not optimal when using blob storage. I guess that as well as the extra time loading the zip, it has to unpack at cold start, and perhaps 'local' only does that at deploy time. I really should check the code though :) Re Linux, I'm sure I read an announcement of GA but now can't find it. However, this docs article specifically mentions JS on Linux (which is only via Node.js AFAIK). I would assume it's available on Consumption, as that is the entire point of Functions. But to be honest, I don't care whether it's Windows- or Linux-hosted, unless I have native modules or bash npm scripts (which are a pain with npm on Windows as it assumes cmd).
Update: Ah, Linux on the Consumption plan is a limited preview right now.