Clustering Let's Encrypt with Apache

2019-06-17 - Progress

A few months ago I wrote about bootstrapping Let's Encrypt on Debian. I am now using Let's Encrypt certificates on the live DNS web servers.

Clustering

I have a smallish number of web servers (currently 3) and a smallish number of web sites (also about 3). I would like any web server to be able to serve any site, and dynamically change which site is on which server for failover, deployment canaries, etc.

If server 1 asks Let's Encrypt for a certificate for site A, but site A is currently hosted on server 0, the validation request will go to server 0 rather than server 1, so it won't get the correct response. The request will fail unless server 0 helps server 1 to answer validation requests from Let's Encrypt.

Validation servers

I considered various ways that my servers could co-operate to get certificates, but they all required extra machinery for authentication and access control that I don't currently have, and which would be tricky and important to get right.

However, there is a simpler option based on HTTP redirects. Thanks to Malcolm Scott for reminding me that ACME http-01 validation requests follow redirects! The Let's Encrypt integration guide mentions this under "picking a challenge type" and "central validation servers".

Decentralized validation

Instead of redirecting to a central validation server, a small web server cluster can co-operate to validate certificates. It goes like this:

  • server 1 requests a cert for site A

  • Let's Encrypt asks site A for the validation response, but this request goes to server 0

  • server 0 discovers it has no response, so it speculatively replies with a 302 redirect to one of the other servers

  • Let's Encrypt asks the other server for the validation response; after one or two redirects it will hit server 1 which does have the response

This is kind of gross, because a request for a challenge response that none of the servers has gets a 302 redirect loop instead of a 404 "not found" error. But that should not happen in practice.
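To make the redirect chain concrete, here is a rough Python sketch that simulates the steps above. It is only a model, not part of the real setup: the function name follow_validation, the tokens_on and next_host dictionaries, and the max_redirects limit are all made up for illustration.

```python
from typing import Dict, List, Set


def follow_validation(
    start: str,
    token: str,
    tokens_on: Dict[str, Set[str]],
    next_host: Dict[str, str],
    max_redirects: int = 10,  # illustrative limit, an assumption of this sketch
) -> List[str]:
    """Return the servers visited until one of them holds the token.

    tokens_on maps each server to the challenge tokens it can serve;
    next_host maps each server to the one it redirects to.
    Raises RuntimeError if the redirect limit is reached first.
    """
    visited = [start]
    host = start
    while token not in tokens_on[host]:
        # One redirect has been followed for every hop after the first.
        if len(visited) - 1 >= max_redirects:
            raise RuntimeError("too many redirects")
        host = next_host[host]
        visited.append(host)
    return visited


ring = {"web0": "web1", "web1": "web2", "web2": "web0"}
tokens = {"web0": set(), "web1": {"tok"}, "web2": set()}
print(follow_validation("web0", "tok", tokens, ring))
```

With three servers, a token that exists somewhere is found after at most two redirects. A token that exists nowhere goes round the loop until the client gives up, which is the gross case.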

Apache mod_rewrite

My configuration to do this is a few lines of mod_rewrite. Yes, this doesn't help with the "kind of gross" aspect of this setup, sorry!

The rewrite runes live in a catch-all port 80 <VirtualHost> which redirects everything (except for Let's Encrypt) to https. I am not using the dehydrated-apache2 package any more; instead I have copied its <Directory> section that tells Apache it is OK to serve dehydrated's challenge responses.

I use Ansible's Jinja2 template module to install the configuration and fill in a couple of variables: as usual, {{inventory_hostname}} is the server the file is installed on, and in each server's host_vars file I set {{next_acme_host}} to the next server in the loop. The last server redirects to the first one, like web0 -> web1 -> web2 -> web0. These are all server host names, not virtual hosts or web site names.
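I set next_acme_host by hand, but the loop is easy to derive from a list of servers. As a minimal sketch (the helper name ring_next_hosts is hypothetical, and it is not something my Ansible setup uses), the mapping could be computed like this:

```python
from typing import Dict, List


def ring_next_hosts(hosts: List[str]) -> Dict[str, str]:
    """Map each host to its successor, wrapping the last host back to the first."""
    return {
        host: hosts[(i + 1) % len(hosts)]
        for i, host in enumerate(hosts)
    }


# Each value is what would go in that server's host_vars as next_acme_host.
for host, nxt in ring_next_hosts(["web0", "web1", "web2"]).items():
    print(f"{host}: next_acme_host = {nxt}")
```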

Code

<VirtualHost *:80>
 ServerName {{inventory_hostname}}

 RewriteEngine on
 # Send everything except ACME challenges to https
 RewriteCond %{REQUEST_URI} !^/\.well-known/acme-challenge/
 RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [L,R=301]
 # Serve challenge responses that exist on this server
 RewriteCond /var/lib/dehydrated/acme-challenges/$1 -f
 RewriteRule ^/\.well-known/acme-challenge/(.*) \
             /var/lib/dehydrated/acme-challenges/$1 [L]
 # Otherwise, redirect to the next server in the loop
 RewriteRule ^ http://{{next_acme_host}}%{REQUEST_URI} [L,R=302]

</VirtualHost>

<Directory /var/lib/dehydrated/acme-challenges/>
 Options FollowSymlinks
 Options -Indexes
 AllowOverride None
 Require all granted
</Directory>
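For anyone who finds mod_rewrite hard to read, the three rules amount to the following decision, written here as a small Python model. This is a sketch under assumptions, not something Apache runs: the decide function and its parameters are made up for illustration, the set of file names stands in for the "-f" file test, and query strings are ignored.

```python
import re
from typing import Optional, Set, Tuple

# Same prefix the rewrite rules match; the group captures the token file name.
ACME_PATH = re.compile(r"^/\.well-known/acme-challenge/(.*)")


def decide(
    server_name: str,
    uri: str,
    challenge_files: Set[str],
    next_acme_host: str,
) -> Tuple[int, Optional[str]]:
    """Return (status, redirect target) for a plain http request."""
    match = ACME_PATH.match(uri)
    if match is None:
        # Rule 1: everything except ACME challenges goes to https.
        return 301, f"https://{server_name}{uri}"
    if match.group(1) in challenge_files:
        # Rule 2: the challenge response exists here, so serve it.
        return 200, None
    # Rule 3: otherwise try the next server in the loop.
    return 302, f"http://{next_acme_host}{uri}"


print(decide("site-a.example", "/.well-known/acme-challenge/abc", set(), "web1"))
```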