100ms delays with Rust on Lambda
As a beginner to deploying Rust code on AWS Lambda, I hit a speed bump using Rusoto to call AWS APIs during the function run. Calling AWS APIs from Lambda functions is of course not required, just typical.
Why was it taking so long to call anything? With both the client and the server in AWS datacenters, it should have been very fast. Calls to DynamoDB (DDB) were taking well over 100ms, not the typical single-digit milliseconds. It didn't matter whether it was a cold start or not; every invocation took this long.
I was able to zero in on the time being spent in the creation of HttpClient.
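For anyone wanting to take a similar measurement, here is a minimal sketch of a timing wrapper using only the standard library. The names timed, FakeClient, and build_client are hypothetical stand-ins of mine, not Rusoto APIs; in a real handler you would wrap the actual client construction call instead of the sleep.

```rust
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the elapsed wall-clock time.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = f();
    (value, start.elapsed())
}

/// Hypothetical stand-in for a client that is expensive to construct.
struct FakeClient;

/// Simulates a slow constructor; replace the sleep with the real construction call.
fn build_client() -> FakeClient {
    std::thread::sleep(Duration::from_millis(20));
    FakeClient
}

fn main() {
    let (_client, elapsed) = timed(build_client);
    // Logging this per invocation shows whether the cost recurs on warm starts too.
    println!("client construction took {:?}", elapsed);
}
```

Wrapping each suspect step this way and logging the durations is usually enough to tell setup cost apart from the network call itself.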
My first thought, unfortunately, was to try to cache the creation costs. This would be akin to a database connection pool, a pattern that shows up quite often in Lambda functions. Because I knew that pattern from Lambda and databases, I was quick to jump into making that happen. But it was quite painful (I will spare you the details of that tangent).
Digging deeper into why it was even happening, thanks to open-source 💥, I was able to find the separate native-tls and rustls paths in the HttpClient source. Ahh.
The Rusoto docs have a whole page on configuring rustls. I wasn't experiencing any of the errors mentioned there, just a performance problem, which that page does not mention.
Making the required changes didn't work. There were compiler failures ("previous import of the module tls here").
Often when configuring feature flags in Rust, you forget to add default_features = false and the default flags complicate what you're doing. No, that was configured correctly. Some docs on the internet distinguish between default_features and default-features, but that wasn't the problem either; Cargo handles both variations.
You can use the excellent cargo tree tool to investigate feature flags. The -e features flag is helpful, e.g. cargo tree -e features -i rusoto_core.
And that made things clear. There was a different package that depended on Rusoto that was forcing the compilation to continue to use the native-tls feature instead of rustls. This was serde_dynamodb (a great library, btw). After looking at its Cargo.toml file, it also has a rustls feature. Great.
Configuring that to also use rustls did the trick (default-features = false, features = ["rustls"]). Now native-tls was eradicated from the build.
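Put together, the dependency section ends up looking roughly like the sketch below. The version numbers are placeholders I picked for illustration, not the ones from my project, and the exact set of Rusoto crates depends on which services you call. The point is that every crate that pulls in Rusoto needs both settings, since Cargo unifies features across the dependency graph.

```toml
[dependencies]
# Version numbers are placeholders; use whichever releases your project already pins.
rusoto_core = { version = "0.47", default-features = false, features = ["rustls"] }
rusoto_dynamodb = { version = "0.47", default-features = false, features = ["rustls"] }
# Without these two settings, serde_dynamodb's defaults re-enable native-tls in Rusoto.
serde_dynamodb = { version = "0.9", default-features = false, features = ["rustls"] }
```

Re-running cargo tree -e features -i rusoto_core afterwards is a quick way to check that nothing else in the graph still asks for native-tls.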
And that worked. The 100ms was gone.