05/05/2023

Kitsune on Shuttle

A few months ago I was approached via a GitHub issue about making Kitsune run on Shuttle, and I thought it might be interesting to write down what I did to get it running there.

The Shuttle team was really kind throughout the whole process and answered all my questions. Big shout-out to them!

What on what?

I’m gonna give some context for people unfamiliar with Kitsune and/or Shuttle.

Kitsune is one of my current pet projects: a Mastodon API-compatible ActivityPub implementation written in Rust. Our main features are:

  1. Ease of deployment
  2. Speed
  3. Fuzzy full-text search engine for public posts

Especially our first point (ease of deployment) made Shuttle compatibility an interesting proposition.

Shuttle is a service that allows you to easily deploy your Rust projects within a few minutes.

Their self-proclaimed goal is to, quote, “Make Rust the next language for cloud-native”. They are essentially “Vercel for backends” (another quote from their landing page).

They manage hosting, provision databases (PostgreSQL and MongoDB), and offer a free tier.
This would make free deployments of Kitsune very accessible, which is part of the reason why I decided to give it a try.

Starting out

When I first read the issue, I was intrigued and decided to skim their docs first.

Their example of running an Axum server seemed pretty straightforward.
You put an annotation macro over the main function, and to get Axum running you basically just return the router.
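
Roughly, that looks like this (a sketch adapted from their docs; the route and handler are just placeholders):

use axum::{routing::get, Router};

#[shuttle_runtime::main]
async fn main() -> shuttle_axum::ShuttleAxum {
    // Build a regular Axum router...
    let router = Router::new().route("/", get(|| async { "Hello, Shuttle!" }));

    // ...and hand it back to Shuttle, which takes care of actually serving it.
    Ok(router.into())
}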

And off we go.

Configuration

The configuration is one of the uglier parts of this whole port.

Kitsune is configured via a configuration language called Dhall. It is basically Haskell minus Turing-completeness.

Shuttle has a way to provide secrets to your deployment via their shuttle-secrets crate. Accessing variables via this crate works very similarly to accessing environment variables.

Here is an example:

use shuttle_secrets::SecretStore;

#[shuttle_runtime::main]
async fn main(
    #[shuttle_secrets::Secrets] secrets: SecretStore,
) {
    // Secrets are read from the deployment's Secrets.toml file
    let s3_token = secrets.get("S3_TOKEN").expect("no token? :(");
}

What I did to get it off the ground quickly was to put the entire configuration file into a secret with the name config and then parse it as a Dhall file.

This unfortunately makes configuration a bit more involved, since the user has to install the dhall command-line utility to resolve the configuration file first, but not by much.
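
As a rough sketch of the idea (the Configuration type and its fields here stand in for Kitsune's actual config type, and serde_dhall is just one way to deserialize Dhall in Rust), the startup code boils down to something like this:

use serde::Deserialize;
use shuttle_secrets::SecretStore;

// Sketch only; the fields are illustrative.
#[derive(Deserialize)]
struct Configuration {
    domain: String,
    // ...
}

fn load_config(secrets: &SecretStore) -> Configuration {
    // The `config` secret holds the already resolved Dhall document verbatim,
    // so we can hand it straight to a Dhall deserializer.
    let raw_config = secrets.get("config").expect("missing `config` secret");

    serde_dhall::from_str(&raw_config)
        .parse()
        .expect("failed to parse Dhall configuration")
}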

An example of how the Secrets.toml could look can be found on GitHub.

Database connection

As I mentioned in the beginning, Shuttle provisions PostgreSQL databases and provides a handy resource implementation that returns an SQLx connection pool.

Kitsune uses SeaORM, which is an abstraction over SQLx and offers utilities to convert an SQLx connection pool into the SeaORM DatabaseConnection type.
Still, I ideally wanted to obtain a SeaORM connection from the get-go and have the database already migrated.

Another reason why I wanted to avoid shuttle-shared-db is that it forces you to use the native-tls feature of SQLx.
Kitsune already uses rustls which means that this would cause a conflict (SQLx only lets you use one or the other).

This could have been worked around by using crate features throughout all of Kitsune to switch between native-tls and rustls, but I would prefer to not have to do that.

There is an issue about making the TLS backend for shuttle-shared-db configurable.

So I decided to dig a bit further into Shuttle.

The shuttle-service crate

Shuttle enables you to do something akin to dependency injection in your annotated main function.

An example would look like this:

#[shuttle_runtime::main]
async fn main(
    #[shuttle_shared_db::Postgres] pool: PgPool, // Magic!
)

Pretty magic, right?

Well, when taking a closer look at the example, it turns out that the shuttle_shared_db::Postgres type isn’t actually a proc-macro (as the syntax might imply); it is a regular old struct.

The thing that gives it its dependency-injecting superpowers is the shuttle_service::ResourceBuilder trait (you can read more about it in their docs.rs Documentation).

This trait essentially describes how to build a resource. To aid it, Shuttle hands it a trait object through which you can request all sorts of things; most importantly for us, a database connection string.

Using this connection string, we established a connection to the PostgreSQL database directly via SeaORM and immediately ran the migrations.

Our custom resource ended up looking like this:

#[derive(Serialize)]
pub struct PostgresResource;

#[async_trait]
impl ResourceBuilder<DatabaseConnection> for PostgresResource {
    // Tell Shuttle which kind of resource we want provisioned:
    // the shared PostgreSQL database.
    const TYPE: shuttle_service::Type =
        shuttle_service::Type::Database(Type::Shared(SharedEngine::Postgres));

    // The connection info Shuttle hands back once the database is ready.
    type Output = DatabaseReadyInfo;

    fn new() -> Self {
        Self
    }

    async fn build(conn_data: &Self::Output) -> Result<DatabaseConnection, shuttle_service::Error> {
        // Connect via SeaORM instead of SQLx...
        let db_conn = Database::connect(conn_data.connection_string_public())
            .await
            .map_err(|err| shuttle_service::Error::Custom(err.into()))?;

        // ...and run the migrations before handing the connection to the app.
        Migrator::up(&db_conn, None)
            .await
            .map_err(|err| shuttle_service::Error::Custom(err.into()))?;

        Ok(db_conn)
    }

    async fn output(
        self,
        factory: &mut dyn Factory,
    ) -> Result<Self::Output, shuttle_service::Error> {
        // Ask Shuttle's factory for the connection info of the shared database.
        let conn_data = factory
            .get_db_connection(Type::Shared(SharedEngine::Postgres))
            .await?;

        Ok(conn_data)
    }
}

And we were able to use it just like you’d use the SQLx pool resource:

#[shuttle_runtime::main]
async fn main(
    #[PostgresResource] pool: DatabaseConnection, // Our custom magic!
)

Caching

Kitsune makes use of caching to alleviate some load from the database, and since it is highly configurable, we have multiple modes for caching.

  1. Redis-backed caching
  2. In-memory caching
  3. No caching at all

We don’t recommend disabling caching entirely, since we currently also use the cache constructs for things that aren’t strictly caching (such as storing OIDC-specific data; this will probably change).

On Shuttle, if you don’t use a Redis-as-a-Service host (such as Upstash), you can use the in-memory caching.
Don’t worry, it has a maximum size, so it won’t crash your deployment.

Searching

Kitsune is highly configurable in this area as well, and has multiple search backends/configurations.

  1. Custom search service based on Tantivy
  2. Meilisearch
  3. SQL-based (simple LIKE queries)
  4. No search at all

So this was one part we didn’t have to modify, since people deploying on Shuttle could either use the SQL-based search or get themselves a free Meilisearch Cloud account and use their free tier.

tonic-build and protoc

We use gRPC to call out to our custom search service. Since Kitsune builds with pretty much all of its features by default, the gRPC portion of the code has to be compiled as well.

We use tonic-build to generate Rust code from our protobuf definitions in a build script.
Tonic (or rather prost-build) used to ship its own protobuf compiler binaries. They dropped support for that a while ago though, so if you want to generate code from protobufs at compile time, the build machine needs to have a protobuf compiler installed.

Unfortunately Shuttle’s build container does not have a protobuf compiler installed. There is a tracking issue on their project board for adding “native dependencies”, but this is only in the “Considering” stage.

To work around this, I added a vendored feature to the crate that contains the proto definitions, utilising the protoc-bin-vendored crate. This crate bundles protoc for a bunch of major platforms.
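
A build script wiring this up might look roughly like the following (the proto path is illustrative; the vendored feature is the one mentioned above):

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // With the vendored feature enabled, point the build at the bundled
    // protoc binary instead of a system-wide installation.
    #[cfg(feature = "vendored")]
    std::env::set_var("PROTOC", protoc_bin_vendored::protoc_bin_path()?);

    // Generate the Rust gRPC definitions via tonic-build.
    tonic_build::compile_protos("proto/search.proto")?;

    Ok(())
}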

This is a nice solution for both Shuttle deployments and people that don’t want to install the protobuf compiler.

UUIDs and Cargo configurations

We use UUID v7 for our database primary keys, because they give us rough temporal sortability, which is really great for stuff like timelines.
Version 7 is still considered a “draft” but is most likely going to get stabilised without changes as the official standards-track RFC.

But since it’s still technically a draft, the uuid crate only supports it behind an unstable cfg flag.
We use a Cargo configuration file to pass the required flag to the compiler so it builds the crate with the unstable features enabled.
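
Concretely, the per-project Cargo config boils down to something like this (the uuid crate gates the unstable versions behind the uuid_unstable cfg):

# .cargo/config.toml (sketch)
[build]
rustflags = ["--cfg", "uuid_unstable"]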

This is where the problems started.

Shuttle is using a “global” Cargo configuration (i.e. a configuration at $CARGO_HOME/config.toml) to add their mirror and patch some crates.

Unfortunately Cargo has a limitation where this global configuration has precedence over the per-project configurations, and the per-project configurations will be ignored completely!

For the time being, I forked the uuid crate and patched it into the project using [patch.crates-io].
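
In Cargo.toml terms that amounts to something like this (the git URL is just a placeholder, not the actual fork):

# Cargo.toml (sketch)
[patch.crates-io]
uuid = { git = "https://github.com/<your-fork>/uuid" }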

Tracing

Just a small side-note for people who are porting an existing application to Shuttle, like I did.

Shuttle installs a global tracing subscriber. If you try to install a global subscriber via something like tracing_subscriber::fmt::init(), your deployment will panic with a really bad error message.
To be fair, this is mentioned in the docs. I just overlooked it.

So this section is just a small heads up!

Memory leaks

When I finally deployed, it worked! It ran, people could register, all fun and good.
Then the next day I received a DM from Shuttle’s engineering team on Discord: apparently Kitsune had exhausted its allocated memory resources (6GB!).

So what was going on here?

My guess was that the container was, for some reason, not reclaiming memory correctly (especially since we haven’t hit these issues on bare metal, even with deployments that have run longer than the Shuttle deployment).

What could we try in this case? Well, just throw jemalloc at it (or, in our case, mimalloc). Switching allocators sometimes works wonders.
And, after all, we already use that allocator in the standalone deployment, so why not on Shuttle?
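
With the mimalloc crate, the switch is essentially a one-liner:

use mimalloc::MiMalloc;

// Route all heap allocations through mimalloc instead of the system allocator.
#[global_allocator]
static GLOBAL: MiMalloc = MiMalloc;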

And that fixed it! No more containers going haywire and hogging up all the resources!

Finishing things up

At this point, Kitsune was starting up successfully. The database got migrated, I could create an account and log in with Pinafore. If you are interested in seeing the code, check out Kitsune’s shuttle project.

The changes outlined here were basically it; they were all pretty trivial.

I personally think Kitsune deployments via Shuttle could be really interesting, especially when combined with something like Backblaze B2 (with their free storage offering).

Personally I feel like this is a great way for people to just “try out” Kitsune for themselves.
And if they ever hit the point where the 500MB database Shuttle provisions for free isn’t enough, they can either contact Shuttle and work things out with them or pull themselves a backup and deploy it on their own infrastructure!

Shuttle has no vendor lock-in in that regard. You can connect to your PostgreSQL database from your local machine.

Thanks

Thanks to @erlend-sh for opening the issue regarding Shuttle compatibility, and @joshua-mo-143 and @oddgrd for answering my questions!