Let's Encrypt DNS challenge fails

We have a few servers using Let's Encrypt for testing, and one of them failed to renew its certificate today, with an error much like the one below:

An error occurred requesting a new certificate for domain.com from Let's Encrypt : Failed to request certificate :

Parsing account key...  
Parsing CSR...  
Registering account...  
Already registered!  
Verifying domain.com...  
Traceback (most recent call last):  
  File "/usr/share/webmin/webmin/acme_tiny.py", line 235, in <module>  
    main(sys.argv[1:])  
  File "/usr/share/webmin/webmin/acme_tiny.py", line 231, in main  
    signed_crt = get_crt(args.account_key, args.csr, args.acme_dir, args.dns_hook, args.cleanup_hook, log=LOGGER, CA=args.ca)  
  File "/usr/share/webmin/webmin/acme_tiny.py", line 184, in get_crt  
    domain, challenge_status))  
ValueError: domain.com challenge did not pass: {u'status': u'invalid', u'keyAuthorization': u'HdSP6peLcMNfomcCVO6ZDnvcDFtd3OebZ-F7GZ9U6Bk.U9mfm7ctbcH8vOR_gc5B8Nh0gg0UvQnRzTISBBkDYxM', u'uri': u'https://acme-v01.api.letsencrypt.org/acme/challenge/FvOF_rgAU_-ePEqnHXPIy06Or4cAp9LImtFQK4vWpdw/1288167328', u'token': u'HdSP6peLcMNfomcCVO6ZDnvcDFtd3OebZ-F7GZ9U6Bk', u'error': {u'status': 400, u'type': u'urn:acme:error:connection', u'detail': u'DNS problem: NXDOMAIN looking up TXT for _acme-challenge.domain.com'}, u'type': u'dns-01'}  

This is on Virtualmin 5.07, since no further updates have been issued so far.
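For anyone debugging the same NXDOMAIN error: the dns-01 challenge expects a TXT record at _acme-challenge.<domain> whose value is the unpadded URL-safe base64 of the SHA-256 digest of the keyAuthorization string shown in the error. The following is a minimal sketch for working out what that record should contain, so it can be compared with what the zone actually serves. The function name dns01_txt_record is made up for illustration and is not part of acme_tiny.py or Virtualmin.

```python
import base64
import hashlib


def dns01_txt_record(domain, key_authorization):
    """Return (record_name, record_value) that a dns-01 challenge expects.

    Hypothetical helper for illustration; not part of acme_tiny.py.
    """
    # The TXT value is base64url(SHA-256(keyAuthorization)) with "=" padding removed.
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    value = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return "_acme-challenge." + domain, value


if __name__ == "__main__":
    # keyAuthorization copied from the error message above
    key_auth = (
        "HdSP6peLcMNfomcCVO6ZDnvcDFtd3OebZ-F7GZ9U6Bk."
        "U9mfm7ctbcH8vOR_gc5B8Nh0gg0UvQnRzTISBBkDYxM"
    )
    name, value = dns01_txt_record("domain.com", key_auth)
    print(name, "TXT", value)
```

The printed name and value can then be compared with the output of something like `dig +short TXT _acme-challenge.domain.com` while a renewal attempt is in progress. NXDOMAIN, as in the error above, means the name did not resolve at all from the validator's point of view.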

Status: 
Active

Comments

We just released version 5.99 - give that a try, and see if it helps.

Submitted by ronnikc on Wed, 06/07/2017 - 02:36 Pro Licensee

I am unable to download 5.99 from your Ubuntu 14.04.5 repositories. Is it available there yet?

I'm receiving an identical error on automatic renewals that have previously (once every month) renewed automatically without any issues. Updating to 5.99 did not help.

Virtualmin attempts renewal every five minutes for all the failed domains, even when I set the renewal to manual only. It seems like the only way to stop the failed-renewal emails and the temporary Let's Encrypt throttle/ban is to stop Webmin/Virtualmin.

@mandresen - is the DNS domain hosted on your system?

Yes, Virtualmin handles DNS for all my domains on the server.

Are you using DNSSEC there? One issue that I ran into recently was the Let's Encrypt service being very sensitive to misconfiguration with DS records.

Yep, DNSSEC is enabled on my domains, but there's been no manual editing of the records or zones. (It's a bit difficult to test disabling DNSSEC on a single domain before the hailstorm of "renewal failed" emails starts ticking in across all domains.)

Is it possible to configure Virtualmin to continue to use the previous way of renewing the certificates that worked fine?

Virtualmin will try both the web and DNS-based validation. Unfortunately in the current release, there is a bug that prevents the details of the web-based failure from being displayed. Make sure that you don't have any http to https redirect enabled on your domain, as this is the most common cause of Let's Encrypt failures.
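A rough way to check whether a domain has such a redirect in place is to request the standard web-validation path over plain HTTP and look at the response without following it. This is only a sketch: the token "probe" and the function names are placeholders, and a 404 for a made-up token is expected and still counts as "direct" here.

```python
import http.client

REDIRECT_CODES = (301, 302, 303, 307, 308)


def classify_response(status, location):
    """Label a plain-HTTP response as 'https-redirect', 'other-redirect' or 'direct'."""
    if status in REDIRECT_CODES:
        if location and location.lower().startswith("https://"):
            return "https-redirect"
        return "other-redirect"
    return "direct"


def probe(domain, path="/.well-known/acme-challenge/probe", timeout=10):
    """Fetch the challenge path over port 80 and classify the answer.

    http.client does not follow redirects, so the first response is what we see.
    """
    conn = http.client.HTTPConnection(domain, 80, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return classify_response(resp.status, resp.getheader("Location"))
    finally:
        conn.close()
```

Calling `probe("domain.com")` on an affected server and getting "https-redirect" back would suggest the redirect also covers the challenge path.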

Thanks for the tip regarding http to https redirect. Removing these allowed the renewals to take place.

However, removing this redirection kind of negates the purpose of the certificates, which is to encrypt all traffic by default. My domains performed multiple renewals without issue with this redirection turned on, before failing in May. Is this part of the bug you mention in the current release, or are there other steps that can be taken to fix this issue?

Unfortunately, it's the Let's Encrypt service that has trouble with the redirects, which limits what we can do about this in Virtualmin. However, the 6.00 release will temporarily disable the redirect during the cert refresh, as a workaround for this issue.

OK – thanks for the feedback. As long as the renewals can hum along at their own pace without causing problems that sounds good :)

Would you consider lengthening the 5-minute interval between reattempts at certificate renewal when problems are encountered? Troubleshooting is difficult when the throttle/ban is reached so quickly. It might also help to stagger when certificates are requested for each domain on the server, so they don't all clobber the Let's Encrypt server at the same time and then produce identical failure emails. If we're renewing once every month and the validity of a cert is three months, there's really no rush :D

Yes, I'm going to re-do the logic for retrying renewals in the next release to ensure that retrying every 5 minutes doesn't happen.
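To make the request above concrete, here is a small sketch of what capped exponential backoff plus per-domain staggering could look like. It is not Virtualmin's actual retry logic (which is not shown in this thread); the function names, the 5-minute base, the one-day cap and the one-hour stagger window are all assumptions picked for illustration.

```python
import hashlib


def retry_delay(failures, base=300, cap=86400):
    """Seconds to wait after `failures` consecutive failed renewals.

    Doubles from `base` (5 minutes) on each failure, never exceeding `cap` (1 day).
    """
    if failures <= 0:
        return 0
    return min(cap, base * 2 ** (failures - 1))


def stagger_offset(domain, window=3600):
    """Deterministic per-domain offset in [0, window) seconds.

    Hashing the name spreads domains across the window so they do not all
    contact the CA at the same moment, while each domain keeps a stable slot.
    """
    digest = hashlib.sha256(domain.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % window


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, "failure(s): wait", retry_delay(n), "seconds")
    print("offset for domain.com:", stagger_offset("domain.com"), "seconds")
```

With these numbers the waits after the first five failures are 300, 600, 1200, 2400 and 4800 seconds, so a persistently failing domain backs off to hours rather than retrying every five minutes.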