Last month I received an email from Let's Encrypt warning me that my certificate would expire in 20 days. That immediately struck me as odd, because I had set up automatic renewal with a crontab job that runs every night with the following command (suggested by the official guide):
certbot-auto renew --quiet --no-self-upgrade --pre-hook "service nginx stop" --post-hook "service nginx start"
With this job in place, the certificate is renewed 30 days before it expires, and nginx is restarted to load the new certificate.
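As a rough way to check by hand whether a certificate has entered that 30-day window, a sketch like the following could be used. It relies on openssl's -checkend option. The function name renewal_due is made up for illustration, and the path in the usage note assumes certbot's default /etc/letsencrypt/live layout.

```shell
#!/bin/sh
# renewal_due CERT_FILE [DAYS]
# Hypothetical helper: succeeds (exit 0) when the certificate in CERT_FILE
# expires within DAYS days (default 30), fails (exit 1) otherwise.
renewal_due() {
    cert_file=$1
    days=${2:-30}
    # An unreadable file is reported separately (exit 2) so that it is
    # not mistaken for a certificate that is due for renewal.
    [ -r "$cert_file" ] || return 2
    # openssl's -checkend exits 0 when the certificate is still valid
    # N seconds from now, and non-zero when it will have expired by then.
    if openssl x509 -checkend "$((days * 86400))" -noout -in "$cert_file" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}
```

For example, assuming the default layout: renewal_due /etc/letsencrypt/live/runningcodes.net/cert.pem && echo "renewal is due"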
The job had worked well for months, but not last month. Why?
The first step is always to check the logs, since they are our source of truth, so I did. The relevant part was:
2017-05-14 03:00:24,785:INFO:certbot.reporter:Reporting to user: The following errors were reported by the server:

Domain: runningcodes.net
Type: connection
Detail: Could not connect to runningcodes.net

To fix these errors, please make sure that your domain name was entered correctly and the DNS A record(s) for that domain contain(s) the right IP address. Additionally, please check that your computer has a publicly routable IP address and that no firewalls are preventing the server from communicating with the client. If you're using the webroot plugin, you should also verify that you are serving files from the webroot path you provided.

[...]

FailedChallenges: Failed authorization procedure. runningcodes.net (http-01): urn:acme:error:connection :: The server could not connect to the client to verify the domain :: Could not connect to runningcodes.net
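When the log is long, a small helper can pull out just the domains the server complained about. This is only a sketch: the function name failed_domains is invented here, and the default path assumes certbot's usual log location, /var/log/letsencrypt/letsencrypt.log.

```shell
#!/bin/sh
# failed_domains [LOG_FILE]
# Hypothetical helper: prints each domain named in a "Domain: <name>"
# error entry of the given certbot log, one per line, without duplicates.
failed_domains() {
    # Assumption: certbot writes its log to the default location.
    log_file=${1:-/var/log/letsencrypt/letsencrypt.log}
    # "Domain: <name>" is how the error report names each failing domain.
    grep -o 'Domain: [^ ]*' "$log_file" | sed 's/^Domain: //' | sort -u
}
```

Run against the excerpt above, it would print only the domain name from the "Domain:" entry.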
In a nutshell, Let's Encrypt couldn't connect to my website. As the message suggests, I checked that my DNS A record was set up correctly (it was), and of course I have a public IP address with no firewall in between. Moreover, I'm not using Cloudflare, my nginx configuration is correct, and running with the --dry-run option did not report any error.
After some investigation, I found that the problem lay in my crontab command itself: --pre-hook stops nginx, which makes my website unreachable for the Let's Encrypt server.
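Since the log shows an http-01 challenge, the Let's Encrypt server needs an answer over plain HTTP. A minimal sketch for checking whether anything answers at all, assuming curl is installed (the function name http_reachable is hypothetical):

```shell
#!/bin/sh
# http_reachable HOST[:PORT]
# Hypothetical helper: succeeds when something answers HTTP on the host,
# whatever the status code; fails on connection errors or timeouts.
http_reachable() {
    # Without --fail, curl exits 0 for any HTTP response and non-zero
    # when the connection is refused or times out.
    curl --silent --output /dev/null --max-time 5 "http://$1/"
}
```

For example: http_reachable runningcodes.net || echo "nothing is answering over HTTP". This only tests from the machine it runs on, so it is best run from outside the server; while nginx is stopped, it should fail.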
The most straightforward solution is to restart nginx only after certbot has finished:
certbot-auto renew --quiet --no-self-upgrade --post-hook "service nginx restart"
After this small modification, everything worked again! The solution is trivial, but at the same time really counterintuitive because, as I said, certificates had been renewed successfully for months.
In the same documentation section, I found another interesting hook parameter, --renew-hook, which executes its command only after a certificate has actually been renewed (and not after every attempt). This limits your downtime even further (we're talking about milliseconds!):
certbot-auto renew --quiet --no-self-upgrade --renew-hook "service nginx restart"
If someone wants to shed some light on why this happened, please let me know.