Running Wiki.js on Fly.io with Litestream backups

I need to run a wiki for some personal stuff. I created a Docker container that runs Wiki.js under Litestream’s supervision, and I deploy it on Fly.io. Here is how I did it.

The stack

Fly.io

Fly runs Docker containers on Firecracker VMs for you.

There is a bunch of cool magic you get from them on top:

  • A nice flyctl command with a really well-designed CLI
  • No host maintenance; just keep your app up to date, don’t worry about the host OS
  • Direct access to each container with flyctl ssh
  • Fast logs with flyctl logs that feel like just running Docker locally
  • Let’s Encrypt certificates with just a CNAME and an extra command
  • Persistent volumes for your app

There is a free tier that lets you run a few apps and store some data. Note that Wiki.js requires 1GB of RAM, which I think means this particular service can’t be deployed to Fly for free. That is ok for me, though, as my other option is deploying to Digital Ocean or similar, which also would cost money. See their pricing page for details.

Litestream

Litestream continuously replicates a SQLite database to S3. This provides copies of the database at arbitrary checkpoints, which Litestream can restore to a regular SQLite database file on demand.

Fly.io is now employing the creator of Litestream, and appears to be planning for tight integration in future products.

Wiki.js

Wiki.js is perfect for Fly.io + Litestream, because all data is stored in the database, even attachments like photos and PDFs.

Backing up the data is a one step task with Litestream and Wiki.js.

How to deploy Wiki.js to Fly.io under Litestream

Prerequisites

This assumes you already have an S3 bucket created. (Some notes on that later.) You’ll need:

  • The S3 bucket itself
  • An IAM user with permission to write to the bucket
  • Access keys for the IAM user (an “access key ID” and its corresponding “secret access key” in AWS IAM parlance)

You may also want to use your own custom domain name. I am using wiki.micahrl.com for mine. (If not, you can just use your-unique-app-name.fly.dev for your wiki.)

I also recommend keeping your configuration in a git repository. This will likely be different for everyone; I keep mine in the repo for my psyops project.

Assuming you have those requirements, let’s get started.

Create a Dockerfile

To run Wiki.js on Fly.io without Litestream, you can use the official requarks/wiki Docker image. However, to use Litestream, you must add Litestream to that image. Litestream can run as a wrapper around another command, which makes adding it to an existing image very easy – just have your container run litestream replicate -exec 'the full command to wrap'.

My Dockerfile looks like this:

# Dockerfile for running Litestream + Wiki.js on Fly.io

FROM litestream/litestream AS litestream
FROM ghcr.io/requarks/wiki:2

COPY --from=litestream /usr/local/bin/litestream /usr/local/bin/litestream
COPY litestream.yml /etc/litestream.yml
COPY start.sh /usr/local/bin/start.sh

USER root
RUN true \
    && chmod 755 /usr/local/bin/litestream \
    && chmod 755 /usr/local/bin/start.sh \
    && chown node /etc/litestream.yml \
    && touch /testfile \
    && true

USER node

ENTRYPOINT []
CMD ["/usr/local/bin/start.sh"]

Some notes about that:

  • We use the official Litestream Docker container just so we can copy the litestream binary from it. The binary is statically linked and it just works!
  • The ENTRYPOINT and CMD are modified from the values defined in requarks/wiki. I based this value on the Litestream documentation mentioned earlier combined with the command from the official Wiki.js Docker image; see below for how I found that.

Our Dockerfile copies a litestream.yml file from its directory, which looks like this:

dbs:
  - path: ${DB_FILEPATH}
    replicas:
      - type: s3
        bucket: ${LITESTREAM_S3_BUCKET}
        path: ${LITESTREAM_S3_PATH}
        region: ${LITESTREAM_S3_REGION}

All of those are environment variable names, which we will tell Fly to provide when it deploys, and which Litestream will read at runtime.
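Because the placeholders are only filled in at runtime, a missing variable does not show up until the container boots. As a rough sketch, start.sh could check for them up front. The missing_env helper is a hypothetical name of my own, not something Litestream or Fly provides.

```shell
#!/bin/sh
# Hypothetical helper: print the names of any variables that are unset or empty.
# Only pass trusted, hard-coded variable names; the names are fed to eval.
missing_env() {
    missing=""
    for name in "$@"; do
        # Indirect lookup: copy the value of the variable called $name into $value
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    # Strip the leading space left by the loop; prints an empty line if nothing is missing
    printf '%s\n' "${missing# }"
}
```

Near the top of start.sh you could call missing_env DB_FILEPATH LITESTREAM_S3_BUCKET LITESTREAM_S3_PATH LITESTREAM_S3_REGION and exit with a message if it prints anything.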

Finally, we run start.sh from the Dockerfile:

#!/bin/sh
set -eu

# Restore the database from S3 if and only if there is no local copy of the database
/usr/local/bin/litestream restore -if-db-not-exists "$DB_FILEPATH"

# Run the Wiki.js docker-entrypoint.sh script, supervised by Litestream
/usr/local/bin/litestream replicate -exec "/usr/local/bin/docker-entrypoint.sh node server"

Deploy to Fly.io

First, decide on an app name. I chose com-micahrl-wiki for mine, because I like reverse DNS style names, and names cannot contain a dot.
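If you want to derive the name mechanically, here is a small sketch of the reverse DNS convention. The reverse_dns_name function is a hypothetical helper; Fly does not require this naming scheme.

```shell
#!/bin/sh
# Hypothetical helper: turn a hostname into a reverse DNS style app name
# by reversing the dot-separated labels and joining them with hyphens.
reverse_dns_name() {
    result=""
    old_ifs=$IFS
    IFS=.
    # With IFS set to ".", the unquoted $1 splits into its labels
    for label in $1; do
        if [ -z "$result" ]; then
            result=$label
        else
            result="$label-$result"
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$result"
}
```

For example, reverse_dns_name wiki.micahrl.com prints com-micahrl-wiki.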

Log in and create your application, data volume, and secrets

# Create a directory to hold your configuration
# (Can be the root of a new git repo, or a subfolder of an existing one, whatever)
mkdir wiki.micahrl.com
cd wiki.micahrl.com

# Log in
flyctl auth login

# Create fly.toml
flyctl launch --no-deploy --name com-micahrl-wiki

# Make a 1GB volume
flyctl volumes create data --app com-micahrl-wiki -s 1

# The S3 access key you created in advance
flyctl secrets set \
    LITESTREAM_ACCESS_KEY_ID=XXX \
    LITESTREAM_SECRET_ACCESS_KEY=YYY

The flyctl launch command will have created a fly.toml file. You’ll need to edit this by making the env section look like this:

[env]
  # Use https://your-app-name.fly.dev for now - you can change it to a custom domain later
  url = "https://com-micahrl-wiki.fly.dev"

  # Wiki.js variables
  DB_TYPE = "sqlite"
  DB_FILEPATH = "/mrldata/wikijs.sqlite"

  # Handled by Litestream itself
  LITESTREAM_S3_BUCKET = "com-micahrl-wiki-litestream-bucket"
  LITESTREAM_S3_PATH = "wikijs.sqlite"
  LITESTREAM_S3_REGION = "us-east-2"

And setting the internal port to 3000 (the port that Wiki.js uses):

[[services]]
  internal_port = 3000

And mounting the data volume you created:

[mounts]
  source = "data"
  destination = "/mrldata"

The full version of my fly.toml is on GitHub.

Now deploy your app:

# Deploy the app itself
flyctl deploy

# 1GB RAM is the Wiki.js minimum requirement
flyctl scale memory 1024

At this stage, you should be able to visit https://your-app-name.fly.dev and log in to the wiki, but if you want to use a custom domain name, don’t do the first-run wiki configuration until we set the domain name and get certificates working.

  • Create a CNAME for your custom domain name (I used wiki.micahrl.com) to your fly.dev hostname (mine is com-micahrl-wiki.fly.dev).
  • Run flyctl certs add wiki.micahrl.com (using your own name) to provision certificates
  • Change the url to https://wiki.micahrl.com (using your own name) in fly.toml
  • Run flyctl deploy again to pick up the change

Now you can log in to the wiki using your custom domain name and do the first-run configuration.

That’s it! Your wiki is now up and running.

Maintenance

A few tasks you will need to know how to do over time.

Restoring the database on the commandline

This is very easy – just set up your AWS credentials and run a single command.

export AWS_ACCESS_KEY_ID=AKIAxxxxxxxxxxxxxxxx
export AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/xxxxxxxxx
litestream restore -o wikijs.sqlite s3://com-micahrl-wiki-litestream-bucket/wikijs.sqlite

(Of course, substitute your own bucket name and backup path.)
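Once the restore finishes, it may be worth running SQLite’s built-in integrity check before trusting the file. This is a minimal sketch, assuming sqlite3 is installed locally; verify_restore is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the restored file exists, is non-empty,
# and passes SQLite's integrity check.
verify_restore() {
    db=$1
    if [ ! -s "$db" ]; then
        echo "missing or empty database file: $db" >&2
        return 1
    fi
    # PRAGMA integrity_check prints the single word "ok" for a healthy database
    result=$(sqlite3 "$db" 'PRAGMA integrity_check;') || return 1
    [ "$result" = "ok" ]
}
```

Usage would be something like verify_restore wikijs.sqlite && echo "restore looks good".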

You can examine the result with sqlite:

bash> sqlite3 wikijs.sqlite
sqlite> .tables
_litestream_lock  commentProviders  pageHistory       settings
_litestream_seq   comments          pageHistoryTags   storage
analytics         editors           pageLinks         tags
apiKeys           groups            pageTags          userAvatars
assetData         locales           pageTree          userGroups
assetFolders      loggers           pages             userKeys
assets            migrations        renderers         users
authentication    migrations_lock   searchEngines
brute             navigation        sessions
sqlite> select * from pages;
1|home|b29b5d2ce62e55412776ab98f05631e0aa96597b|Wiki Home||0|1||||# Wiki Welcome

See also <https://me.micahrl.com>
|<h1 class="toc-header" id="wiki-welcome"><a href="#wiki-welcome" class="toc-anchor">_</a> Wiki Welcome</h1>
<p>See also <a class="is-external-link" href="https://me.micahrl.com">https://me.micahrl.com</a></p>
|[{"title":"Wiki Welcome","anchor":"#wiki-welcome","children":[]}]|markdown|2022-05-21T23:29:32.844Z|2022-05-21T23:43:25.915Z|markdown|en|1|1|{"js":"","css":""}
2|sandbox|385919f3575186b3410ad6e08ca821b49496413f|Sandbox|Stuff in here is just for fucking around|0|1||||# Sandbox

Stuff in here is just for fucking around
|<h1 class="toc-header" id="sandbox"><a href="#sandbox" class="toc-anchor">_</a> Sandbox</h1>
<p>Stuff in here is just for fucking around</p>
|[{"title":"Sandbox","anchor":"#sandbox","children":[]}]|markdown|2022-05-21T23:51:09.803Z|2022-05-21T23:51:11.761Z|markdown|en|1|1|{"js":"","css":""}

Restoring the database after losing your data volume

Let’s say that something went wrong with the data volume. Perhaps your application and its data volume are accidentally deleted from Fly. How do you recover?

This container restores for you automatically. start.sh runs /usr/local/bin/litestream restore -if-db-not-exists "$DB_FILEPATH" before starting replication and running your app. This means that if the database is not present in the data volume but there is a replica in S3, Litestream copies it from S3 first. If it’s already on the data volume, Litestream just manages your app as normal.

You can test this by deleting your application and then re-deploying, but in my testing the Let’s Encrypt certificate management system got confused when I deleted and re-deployed my app with the same hostname. Instead, it’s less error-prone to make a copy of your data and use it to deploy a temporary copy of your app.

Of course, make sure to change the wiki homepage or make some other obvious edit, so that you can tell it gets restored properly.

  • Copy the data on S3 to a new folder. I used Cyberduck to do this in a GUI. I called the new folder wikijs-copy.sqlite in S3.
  • Copy all the files in your directory to a new temporary location – mkdir testing-app/ && cp Dockerfile fly.toml litestream.yml start.sh testing-app/
  • Change to that new temporary directory – cd testing-app
  • Edit fly.toml inside the temporary directory
    • Change LITESTREAM_S3_PATH = "wikijs-copy.sqlite"
    • Change the URL to a new application name like url = "https://com-micahrl-wiki-2.fly.dev"
  • Deploy the new temporary copy of the app
flyctl launch --no-deploy --name com-micahrl-wiki-2
flyctl volumes create data --app com-micahrl-wiki-2 -s 1
flyctl secrets set \
    LITESTREAM_ACCESS_KEY_ID=XXX \
    LITESTREAM_SECRET_ACCESS_KEY=YYY
flyctl scale memory 1024
flyctl deploy

If you do this it will come up with a distinct copy of your wiki’s data! Try editing both copies; note that they are now independent of each other.
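The two fly.toml edits from the steps above can also be scripted rather than made by hand. Here is a rough sketch using sed; retarget_fly_toml is a hypothetical helper, and it assumes the url and LITESTREAM_S3_PATH keys each sit on their own line, as in the [env] section shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of fly.toml retargeted at a temporary app.
#   $1 = path to the original fly.toml
#   $2 = new app name (used to build the fly.dev URL)
#   $3 = new LITESTREAM_S3_PATH value
# Assumes the values contain no "|" or "&" characters, which are special to sed here.
retarget_fly_toml() {
    file=$1
    new_app=$2
    new_path=$3
    sed \
        -e "s|^\\([[:space:]]*LITESTREAM_S3_PATH[[:space:]]*=\\).*|\\1 \"$new_path\"|" \
        -e "s|^\\([[:space:]]*url[[:space:]]*=\\).*|\\1 \"https://$new_app.fly.dev\"|" \
        "$file"
}
```

For example, retarget_fly_toml fly.toml com-micahrl-wiki-2 wikijs-copy.sqlite > testing-app/fly.toml writes the edited copy without touching the original.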

When you’re done testing, destroy your temporary app. This will also delete the secrets and the data volume.

flyctl destroy com-micahrl-wiki-2

Upgrading Wiki.js

I am living somewhat dangerously and using FROM ghcr.io/requarks/wiki:2 in my Dockerfile. This means that every time I run flyctl deploy, it will get the latest version of the Wiki.js container in the 2.x series and base my wiki’s container on that. The upside is that I don’t have to think about upgrades; the downside is that a new version may break something.

You could instead set a specific version like FROM ghcr.io/requarks/wiki:2.5.283, and it would always use that version. You can see a list of all tags on Docker Hub, and be in full control of when to upgrade. If you did this, you could also test in a staging environment before upgrading the main site.
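If you go the pinned route, a small check can catch a floating tag before you deploy. This is only a sketch: final_base_image and is_pinned_tag are hypothetical helpers, and “pinned” here simply means a numeric major.minor.patch tag.

```shell
#!/bin/sh
# Hypothetical helper: print the image named by the last FROM line of a Dockerfile.
# Assumes plain "FROM image[:tag]" lines without options such as --platform.
final_base_image() {
    awk 'toupper($1) == "FROM" { image = $2 } END { print image }' "$1"
}

# Hypothetical helper: succeed only if the image reference ends in a
# numeric major.minor.patch tag such as 2.5.283.
is_pinned_tag() {
    tag=${1##*:}
    case $tag in
        '' | *[!0-9.]*) return 1 ;;  # no tag, or a non-numeric tag such as "latest"
        *.*.*) return 0 ;;           # looks like major.minor.patch
        *) return 1 ;;               # floating tag such as "2" or "2.5"
    esac
}
```

Something like is_pinned_tag "$(final_base_image Dockerfile)" || echo "warning: base image tag is floating" could run before flyctl deploy.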

Growing the data volume size

At the time of this writing, Fly does not let you grow your own data volumes :(. They can do it for you if you open a support ticket.

However, since Litestream restores are automatic, you can destroy your application and re-deploy it. If you are using a custom domain name, this may cause temporary (up to 24 hours) problems with the HTTPS certificate for your site, so this is best kept as a last resort.

Other notes

A few auxiliary things that might be helpful.

Creating an S3 bucket and an IAM user with Terraform

Litestream backs up to S3 (and several other data stores). To use this, you’ll need to create an S3 bucket, a user that can write to it, and security credentials for that user.

I wanted to use Terraform to do most of that, so I wrote this Terraform configuration that creates:

  • The S3 bucket
  • An IAM policy that allows writing to the bucket
  • An IAM group that the policy is attached to
  • An IAM user, in that group

Then I went to the AWS web console to create the security credentials for the user I created.

If Terraform isn’t your thing, you could do this with CloudFormation or just in the AWS web console directly.
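Whichever tool you use, the policy itself is short. The sketch below prints a policy document for a given bucket; litestream_s3_policy is a hypothetical helper, and the action list is my understanding of what the Litestream S3 guide asks for, so check it against the current Litestream documentation before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: print an IAM policy document for the given bucket name.
# The actions listed are an assumption based on the Litestream S3 guide:
# bucket-level listing plus object-level read, write, and delete.
litestream_s3_policy() {
    bucket=$1
    cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
      "Resource": "arn:aws:s3:::$bucket"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::$bucket/*"
    }
  ]
}
EOF
}
```

For example, litestream_s3_policy com-micahrl-wiki-litestream-bucket > policy.json gives you a file you can paste into the AWS console. Note that it includes read access, which restores need.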

Require authentication in Wiki.js for viewing pages

I want my wiki to be private by default, but allow users to mark some pages for public viewing. To do that:

  • Log in as a user with admin privileges to Wiki.js
  • Navigate to the admin area -> Groups -> Guest -> Page Rules tab
  • Allow read:pages and read:assets but not read:comments, and only if the tag matches public
  • Go to your home wiki page (and/or any other page you want to be publicly visible) and tag it as public

Unfortunately, you must prohibit reading comments, or else logged-out users will get an error on every page load that says “An unexpected error occurred”. Perhaps the Wiki.js team will fix this in a future release.

How to find the correct CMD for the Wiki.js Docker container?

Our Dockerfile’s CMD runs start.sh, which runs litestream replicate -exec '...'. How do we know the right value to pass to -exec?

In general, to find this, look in the Dockerfile and combine the ENTRYPOINT and CMD values. (Some containers have only one or the other; if a container has both, as Wiki.js does, ENTRYPOINT comes first and then CMD.)

I could not find the production Dockerfile for Wiki.js though; unless I just missed it somewhere, I figure it is probably built as part of their build system. Rather than figure that out, I cheated by running this command:

docker pull requarks/wiki
docker inspect --format='{{range $e := .Config.Env}}
ENV {{$e}}
{{end}}{{range $e,$v := .Config.ExposedPorts}}
EXPOSE {{$e}}
{{end}}{{range $e,$v := .Config.Volumes}}
VOLUME {{$e}}
{{end}}{{with .Config.User}}USER {{.}}{{end}}
{{with .Config.WorkingDir}}WORKDIR {{.}}{{end}}
{{with .Config.Entrypoint}}ENTRYPOINT {{json .}}{{end}}
{{with .Config.Cmd}}CMD {{json .}}{{end}}
{{with .Config.OnBuild}}ONBUILD {{json .}}{{end}}' requarks/wiki

Which returned these results:

ENV PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
ENV NODE_VERSION=16.15.0
ENV YARN_VERSION=1.22.18
EXPOSE 3000/tcp
EXPOSE 3443/tcp
VOLUME /wiki/data/content
USER node
WORKDIR /wiki
ENTRYPOINT ["docker-entrypoint.sh"]
CMD ["node","server"]

Note that this is not an exact copy of the input Dockerfile, which might have been much more complicated. However, it has enough for our purposes here, telling us:

  • What ports it uses
  • Where wiki content is by default (although I override that with DB_FILEPATH anyway)
  • What user is running the server
  • The ENTRYPOINT and CMD

Our Dockerfile unsets ENTRYPOINT and points CMD at start.sh, which in turn passes the upstream ENTRYPOINT + CMD to litestream replicate -exec. If you don’t need the restore step in start.sh, the equivalent directly in the Dockerfile would be:

ENTRYPOINT []
CMD ["/usr/local/bin/litestream", "replicate", "--exec", "/usr/local/bin/docker-entrypoint.sh node server"]
