Lessons Learned from Writing a Webhook
I recently wanted to set up a small webhook for a project that I am working on, and in the process I learned a few things. So I decided to write down my little journey, both to remind myself of the new knowledge and to share it with you.
The Goal: Automated Documentation Building
The starting position is relatively simple: I have a small Python project, hosted on GitLab, whose documentation is built with Sphinx. The documentation is hosted on Read the Docs; however, since I also enjoy self-hosting my things, I wanted a copy on my own server. [1]
Previously, I kept the documentation up to date manually, by running a local tox -e sphinx followed by an rsync. This process is not too bad, but it does mean that I have to think about it every time I change the docs, and that I need to be making the changes on a machine that can build and upload them.
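Concretely, the manual process boiled down to something like the following two commands (the output directory, target host and webroot path are placeholders for illustration, not my actual setup):

# Build the Sphinx documentation locally
tox -e sphinx
# Copy the generated HTML into the webroot on the server
rsync -av --delete docs/_build/html/ myserver:/var/www/docs/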
The common way to keep the code repository and the documentation in sync is to use webhooks: whenever a push is made to a git repository, the hosting platform sends an HTTP request to a configured endpoint, which can then trigger a remote action. Thus, I wanted to implement a webhook "receiver" that would re-build the docs from the latest master branch and put them into the webroot.
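For GitLab's push events, that request looks roughly like this (the payload is trimmed down to the parts that matter later, and the host is made up):

POST /webhook HTTP/1.1
Host: example.org
Content-Type: application/json
X-Gitlab-Event: Push Hook
X-Gitlab-Token: secret sauce

{
    "object_kind": "push",
    "ref": "refs/heads/master",
    "checkout_sha": "..."
}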
CGI and Its Limitations
The first approach that I followed was to use CGI, for the simple reason that I already have a web server running which can execute CGI scripts and handles the SSL stuff. I expected this to be the easiest approach, given that I only wanted a single webhook.
The resulting first try looked a bit like this:
#!/usr/bin/python3
import os
import sys

# Verify that the request is genuine
if os.getenv("HTTP_X_GITLAB_TOKEN") != "secret sauce":
    print("Content-type: text/plain")
    print("")
    print("Forbidden")
    sys.exit(0)

def generate_docs():
    ...

def send_mail(exc):
    ...

# Send an early response to the client
print("Content-type: text/plain")
print("")
print("Job queued")

# Make ourselves independent of the web server
if os.fork() == 0:
    os.setsid()
    if os.fork() == 0:
        # Here we can do the actual work
        try:
            generate_docs()
        except Exception as exc:
            send_mail(exc)
We can see multiple concepts at play here already:
- The webhook guidelines state that the response should be returned fast, so the actual generation of the documentation is done in the background, using the double fork trick. A small response is sent to the client to make it happy.
- Since the HTTP response does not indicate whether the generation succeeded, I added a quick-and-dirty email notification in case there were any errors.
- The X-Gitlab-Token header is checked against a secret value to verify that the request actually originated from GitLab, so that random visitors cannot trigger builds or DoS the server.
I tested my script locally, and it seemed to work. But when I threw it onto my server and set up the webhook, it didn't produce any new documentation. This brings me to the limitations that I ran into:
- The CGI specification does not really mandate how custom headers are passed from the server to the script. Apache does provide them as HTTP_* environment variables, but relying on that already left a bit of a sour taste when it came to verifying the X-Gitlab-Token header.
- The version of tox that I have on my server does not match the version that I use locally. This caused issues, which I ended up working around by installing the matching tox version into a virtual environment (see the sketch after this list).
- The script runs as the www-data user, which is not a regular user account with a home directory. This caused poetry to fail, since it could not create its cache.
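The virtual environment workaround from the second point looked roughly like this (the version number and paths are placeholders):

# Create a dedicated virtual environment for the build tooling
python3 -m venv /opt/docs-venv
# Pin tox to the same version as on my local machine (placeholder version)
/opt/docs-venv/bin/pip install "tox==3.24"
# Build the docs with the pinned tox
/opt/docs-venv/bin/tox -e sphinx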
The last two points indicate that the build should probably be done inside a container, so that the build environment can be kept identical locally and on the server, and I don't have to fiddle around with virtual environments or differing Python/tox versions. That, however, brought a new problem: the www-data user is not allowed to start Docker containers.
Now while I am sure that there would be a way to work around this, it also became clear that a CGI script is not the right tool for the job, and the pile of workarounds that have stacked up screamed that the right solution would look different — especially if it should be robust against future changes or extensions.
Sketching a "Documentation Build Daemon"
The idea thus transformed into a split design: one daemon running in the background, responsible for building the documentation, and one small script to trigger it. The daemon would run with all the privileges it needs, while the triggering script could be kept very simple. Communication between the two would happen via a Unix socket, or any other IPC method of your choosing.
At this point, however, I decided to check the documentation of Apache's mod_proxy again, to see how much effort it would be to proxy a single endpoint to a different server, and it turns out that it is quite easy:
ProxyPass "/webhook" http://localhost:1337/
And not just that, it turns out that mod_proxy can even use Unix sockets:
ProxyPass "/webhook" unix:/var/run/webhook.socket|http://localhost/
Now the pieces start falling into place: instead of a custom-made "triggering" protocol and a CGI script, I could simply treat the backend as an HTTP server, which is how most modern web frameworks operate anyway. This would also free me from the "header inaccessibility" of CGI, as I would get access to the whole HTTP request, and I could implement proper queueing of the incoming build requests in the build daemon. A win-win situation!
The only thing left to do then was to implement the daemon. I chose Rust as the language to do that with, as I am very familiar with it, and the safety that it provides is good for a daemon that has root access. I picked the following crates to build upon:
- hyper as the base for the HTTP server. I didn't need a full-fledged web framework, and I've used hyper before.
- hyperlocal to keep the HTTP server on a Unix socket, which gives it a nice name and doesn't use up a TCP port.
- lettre to send the email notification if the build fails.
With those, I came up with the following code:
#![feature(try_blocks)]

use std::{
    error::Error, fs, path::Path, thread, process::Command,
    ffi::OsStr, os::unix::prelude::PermissionsExt,
};

use color_eyre::eyre::{bail, eyre, Result};
use hyper::{
    service::{make_service_fn, service_fn},
    Body, Request, Response, Server, http::HeaderValue, StatusCode,
};
use hyperlocal::UnixServerExt;
use lettre::{Message, SmtpTransport, Transport};
use tokio::sync::mpsc::{self, Sender, Receiver};

// The configuration is hard-coded into the binary, for the sake of
// simplicity:
const SOCKET_PATH: &str = "/var/run/webhook.sock";
const BACKLOG: usize = 10;
const TOKEN: &str = "SECRET GITLAB TOKEN";

// Email notifications
const RECIPIENT: &str = "me@localhost";
const SENDER: &str = "Webhook <webhook@localhost>";

// The "job" is a simple trigger for now, later this can be extended to
// more (like containing the branch name to build).
type Job = ();

/// Runs external commands and collects their output for error reporting.
#[derive(Debug, Clone, Default)]
struct Runner {
    stdout: Vec<u8>,
    stderr: Vec<u8>,
}

impl Runner {
    fn run<I, S>(&mut self, name: S, args: I) -> Result<()>
    where
        S: AsRef<OsStr>,
        I: IntoIterator<Item = S>,
    {
        let name = name.as_ref();
        let output = Command::new(name)
            .args(args)
            .output()?;
        self.stdout.extend(format!("\ncmd: {name:?}\n").bytes());
        self.stdout.extend(output.stdout);
        self.stderr.extend(format!("\ncmd: {name:?}\n").bytes());
        self.stderr.extend(output.stderr);
        if !output.status.success() {
            bail!("Command {name:?} was not successful");
        }
        Ok(())
    }
}

/// Sends an email notification containing the error and the collected
/// command output.
fn send_mail(err: Box<dyn Error>, runner: Runner) -> Result<()> {
    let body = format!(
        "Error building documentation:\n{}\n\nSTDOUT:\n{}\n\nSTDERR:\n{}",
        err,
        String::from_utf8_lossy(&runner.stdout),
        String::from_utf8_lossy(&runner.stderr),
    );
    let msg = Message::builder()
        .from(SENDER.parse()?)
        .to(RECIPIENT.parse()?)
        .subject("Documentation Build Failure")
        .body(body)?;
    let smtp = SmtpTransport::unencrypted_localhost();
    smtp.send(&msg)?;
    Ok(())
}

/// The worker runs on its own thread and processes one job at a time.
fn worker(mut receiver: Receiver<Job>) -> Result<()> {
    loop {
        let _ = receiver.blocking_recv()
            .ok_or_else(|| eyre!("Channel empty, quitting"))?;
        let mut runner = Runner::default();
        let result = try {
            runner.run("docker", ["run", "..."])?;
            runner.run("rsync", ["..."])?;
            runner.run("chown", ["..."])?;
        };
        if let Err(e) = result {
            if let Err(e) = send_mail(e, runner) {
                eprintln!("{e}");
            }
        }
    }
}

/// Handles an incoming HTTP request: verify the token, then enqueue a job.
async fn handle(enqueuer: Sender<Job>, req: Request<Body>) -> Result<Response<Body>> {
    let headers = req.headers();
    if headers.get("X-Gitlab-Token") != Some(&HeaderValue::from_static(TOKEN)) {
        let response = Response::builder()
            .status(StatusCode::FORBIDDEN)
            .header("Content-type", "text/plain")
            .body("Access denied".into())?;
        return Ok(response);
    }
    // Not much to do here, except for signalling the worker
    enqueuer.send(()).await?;
    Ok(Response::builder()
        .status(StatusCode::OK)
        .header("Content-type", "text/plain")
        .body("Job enqueued".into())?)
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
    let path = Path::new(SOCKET_PATH);
    if path.exists() {
        fs::remove_file(path)?;
    }
    let (sender, receiver) = mpsc::channel::<Job>(BACKLOG);
    thread::spawn(move || {
        worker(receiver)
    });
    let make_service = make_service_fn(move |_| {
        let sender = sender.clone();
        let service = service_fn(move |req| {
            handle(sender.clone(), req)
        });
        async move { Ok::<_, eyre::Error>(service) }
    });
    let server = Server::bind_unix(path)?;
    // Make the socket accessible for the web server
    let mut permissions = fs::metadata(SOCKET_PATH)?.permissions();
    permissions.set_mode(0o666);
    fs::set_permissions(SOCKET_PATH, permissions)?;
    server.serve(make_service).await?;
    Ok(())
}
The daemon was a joy to write, and with a quick systemd unit file it was armed and ready to go.
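For completeness, such a unit file can be as short as the following sketch (the binary path is an assumption; the daemon runs as root here, since it needs to start Docker containers and chown files):

# /etc/systemd/system/webhook.service (sketch)
[Unit]
Description=Documentation build webhook daemon
After=network.target

[Service]
ExecStart=/usr/local/bin/webhook
Restart=on-failure

[Install]
WantedBy=multi-user.target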
Using musl To Avoid glibc Version Issues
The final roadblock to my up-to-date documentation came when I tried to run my daemon on the server. Usually, Rust programs are easy to deploy, as they are statically linked and therefore do not require any additional libraries to be installed. You can simply copy the binary to the target system and run it.
This, however, is only true for the Rust dependencies, and only if they don't themselves introduce dynamic links. There is one dependency that usually stays dynamically linked, and that is glibc. Sadly, my server still ships glibc 2.28, which was a bit too old for the program. Luckily, Rust can build a program with a statically linked libc by using musl:
rustup target add x86_64-unknown-linux-musl
cargo build --release --target=x86_64-unknown-linux-musl
I should note that this "easy" way only works if no native dependencies are used. For my simple documentation generator, that is the case. For more complex applications that link to native libraries, however, you might have to use something like rust-musl-builder, which statically links not only libc but also the other native dependencies.
I also turned on Link Time Optimization for the release build to reduce the binary size:
# In Cargo.toml
[profile.release]
lto = true
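If binary size matters even more, the release profile can be squeezed further with standard Cargo options; treat the following as optional extras that I did not need here:

# In Cargo.toml (optional additions)
[profile.release]
lto = true
codegen-units = 1  # better optimization at the cost of compile time
strip = true       # strip debug symbols from the binary (Cargo 1.59+)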
The result was a (completely) statically linked 4 MiB binary that contains my webhook executor:
$ du -h webhook
4.0M    webhook
$ ldd webhook
        statically linked
Granted, that size does exclude the additional Docker image that I used to build the documentation, but that is a four-liner that just sets up tox and the right entrypoint.
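That image looks roughly like the following sketch; the base image and tox invocation are assumptions rather than the exact file, which really only has to provide tox and the entrypoint:

# Sketch of the documentation build image
FROM python:3.10-slim
RUN pip install tox
WORKDIR /build
ENTRYPOINT ["tox", "-e", "sphinx"]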
Conclusion
The webhook is working great and I feel good that the end product is not a house of cards that would collapse with any small change in the pipeline. There are basically no wonky workarounds in the final product, and the containerization makes it easy to adjust to new build processes (such as a new Python version) or to extend the setup to run other tasks (such as tests).
When it comes to learning things, I can say the following:
- CGI, as simple as it is, may quickly run into limitations when you try to do more than just dynamically generate some content. System-level administrative tasks should probably not be done as part of the script, and if you have to use double-fork tricks, you are probably in too deep.
- mod_proxy is better suited for this use case than I first expected. Adding proxies for paths and even doing so through Unix sockets is really useful, and could be the key to replacing more of my "quick and dirty" CGI scripts with "proper" solutions. So far, I had only used it to proxy whole subdomains.
- Nothing new, but Rust continues to be an amazing language with a great ecosystem:
- hyper and hyperlocal make for a nice combination, and I've also added lettre to the crates that I should keep in mind for future projects.
- rustup and the ability to easily link against musl are also great and make deployment easy, even if it results in bigger binaries in the end.
- curl supports sending HTTP requests over Unix sockets: curl --unix-socket /var/run/webhook.sock http://localhost.
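The curl trick combines nicely with the token check in the daemon; a full local test, without going through Apache, looks something like this (using the placeholder token from the hard-coded configuration above):

curl --unix-socket /var/run/webhook.sock \
    -H "X-Gitlab-Token: SECRET GITLAB TOKEN" \
    http://localhost/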
For the future, there are still a few things that can be done:
- The daemon could support proper configuration (via toml files) and proper logging (via tracing).
- The daemon could be socket-activated, so that it doesn't need to linger around in the background.
- The daemon could do better queuing of incoming jobs, especially for pushes in quick succession.
- We could use the payload of the webhook to select the right git commit, instead of always building the latest master. This could also help in building versioned docs, similar to how Read the Docs can keep multiple versions around (see the sketch below).
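As a sketch of that last point: the handler could deserialize the JSON payload with serde and pass the interesting fields along as the Job, instead of a bare (). The field names below are taken from GitLab's push event payload as I understand it, so treat them as an assumption:

use serde::Deserialize;

/// The parts of GitLab's push event payload that we care about
/// (field names assumed from the GitLab webhook documentation).
#[derive(Debug, Deserialize)]
struct PushEvent {
    /// The pushed ref, e.g. "refs/heads/master".
    #[serde(rename = "ref")]
    git_ref: String,
    /// The commit that the ref points to after the push.
    checkout_sha: Option<String>,
}

/// A richer job description that could replace the `()` job type.
#[derive(Debug)]
struct BuildJob {
    git_ref: String,
    commit: Option<String>,
}

fn parse_job(body: &[u8]) -> serde_json::Result<BuildJob> {
    let event: PushEvent = serde_json::from_slice(body)?;
    Ok(BuildJob {
        git_ref: event.git_ref,
        commit: event.checkout_sha,
    })
}

In the handler, the request body would first have to be collected (for example with hyper::body::to_bytes) before it can be fed to parse_job.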
[1] This also explains why I didn't go for an existing tool like webhook, even if it might have worked just as well and with less effort: I simply like to hack on my own tools and setup.