[SRU] Backport letsencrypt 0.14.2

Bug #1640978 reported by Peter Eckersley on 2016-11-11
64
This bug affects 7 people
Affects Status Importance Assigned to Milestone
python-acme (Ubuntu)
Undecided
Unassigned
Xenial
Undecided
Unassigned
Yakkety
Undecided
Unassigned
Zesty
Undecided
Unassigned
python-certbot (Ubuntu)
Undecided
Unassigned
Xenial
Undecided
Unassigned
Yakkety
Undecided
Unassigned
Zesty
Undecided
Unassigned
python-certbot-apache (Ubuntu)
Undecided
Unassigned
Xenial
Undecided
Unassigned
Yakkety
Undecided
Unassigned
Zesty
Undecided
Unassigned
python-certbot-nginx (Ubuntu)
Undecided
Unassigned
Xenial
Undecided
Unassigned
Yakkety
Undecided
Unassigned
Zesty
Undecided
Unassigned
python-letsencrypt (Ubuntu)
Undecided
Unassigned
Xenial
Undecided
Unassigned
Yakkety
Undecided
Unassigned
python-letsencrypt-apache (Ubuntu)
Undecided
Unassigned
Xenial
Undecided
Unassigned
Yakkety
Undecided
Unassigned

Bug Description

This bug contains a list of known major and other issues fixed between upstream letsencrypt 0.4.1 and the latest version, certbot 0.9.3 (the project has also been renamed to avoid confusion between the python client software and the Let's Encrypt CA service).

[Impact]

MAJOR BUGS FIXED

https://github.com/certbot/certbot/issues/2750
letsencrypt < 0.5.0 was not compatible with future configuration files, so users who run certbot-auto then downgrade to the Xenial packages will encounter errors.

https://github.com/certbot/certbot/issues/2709
Failure to remember choices of authenticator plugins for renewal operation. This would essentially make "letsencrypt renew" useless on Xenial. Numerous less severe automated renewal-related bugs fixed in subsequent releases:
https://github.com/certbot/certbot/issues?utf8=%E2%9C%93&q=is%3Aissue%20milestone%3A0.5.0%20is%3Aclosed%20label%3Arenewal%20
https://github.com/certbot/certbot/issues?q=is%3Aissue+milestone%3A0.7.0+is%3Aclosed+label%3Arenewal
https://github.com/certbot/certbot/issues?utf8=%E2%9C%93&q=is%3Aissue%20milestone%3A0.6.0%20is%3Aclosed%20label%3Arenewal%20
https://github.com/certbot/certbot/issues?utf8=%E2%9C%93&q=is%3Aissue%20milestone%3A0.8.1%20is%3Aclosed%20label%3Arenewal%20
https://github.com/certbot/certbot/issues?utf8=%E2%9C%93&q=is%3Aissue%20milestone%3A0.9.0%20is%3Aclosed%20label%3Arenewal%20

https://github.com/certbot/certbot/issues/2613
Failure to handle IPv6 Virtual hosts in Apache configurations

https://github.com/certbot/certbot/issues/2320
Erroneous behaviour with Apache configs that have multiple vhosts in a single file (these are still not supported for cert installation in 0.9.3, but at least produce clear error messages)

https://github.com/certbot/certbot/issues/2768
Incompatibility with the specified version of the ACME protocol, preventing the Let's Encrypt serverside code from following it correctly

https://github.com/certbot/certbot/issues/2731
Failure to parse Plesk's apache config files

https://github.com/certbot/certbot/issues/1243
Apache plugin errors out when transformations to a configuration turn out to be a no-op.

https://github.com/certbot/certbot/issues/3210
Incorrect handling of RewriteCond directives when trying to avoid Apache inifinite redirect loops

https://github.com/certbot/certbot/issues/1833
Problems running Apache renewal in cron due to cron's default PATH

UX: fail to re-ask for email address if the first one seems invalid:
https://github.com/certbot/certbot/issues/2675

UX: when re-running is a NOOP (due to renewal not being needed yet), print an explanation:
https://github.com/certbot/certbot/issues/1918

OTHER BUGS FIXED

Reduce the risk of incorrect or corrupt state in case of control-C interrupts:
https://github.com/certbot/certbot/issues/3219

Failure to correctly parse certain rewrite directives in Apache configs:
https://github.com/certbot/certbot/issues/2735

Failure to correctly enable HTTP -> HTTPS redirects in some Apache configs:
https://github.com/certbot/certbot/issues/3003

Failure to provide a sensible error if the user requests a Unicode domain:
(support for those is being added in 0.10.0)
https://github.com/certbot/certbot/issues/2661

Directory deletion permission errors are fatal when using the webroot plugin for non-root users (but shouldn't be):
https://github.com/certbot/certbot/issues/2678

UX: provide helpful guidance for people who want to run Certbot as a non-root user:
https://github.com/certbot/certbot/issues/2306

SIGNIFICANT NEW FEATURES WARRANTING AN SRU:

Support --quiet / -q

https://github.com/certbot/certbot/issues/2512

User interface for requesting certificates for multiple domain names with the
webroot plugin:
https://github.com/certbot/certbot/issues/1393

Support for DNS based authentication:
https://github.com/certbot/certbot/issues/1826

[Test Case]

All or almost all of the pull requests for the bugs above include unit test coverage.
Some also include integration or compatibility test coverage.

[Regression Potential]

The Certbot team has viewed breakage of existing workflows (especially ones that may be automated) as a serious issue, has strived to avoid them, and has treated workflow changes as regressions where it has occurred.

We have the following test suites in place for Certbot:

* Nosetest unit tests with coverage for each module between 97% and 100%; *test.py in the relevant tree.
* Integration tests that run Certbot against the current copy of Let's Encrypt's serverside boulder codebase. These require docker and are a little more involved to run. See tests/boulder_integration.sh for instructions.
* "Compatibility tests" that run the Apache and Nginx plugins against corpora of configuration files for those webservers; these live in certbot-compatibility-test/
* Test farm tests, which we use to check that our releases run correctly on a wide range of platforms. These spin up Amazon EC2 instances for numerous OSes and run various tests on them. They live in tests/letstest

We recommend that Ubuntu run the first of these test suites during build (but we believe the Debian packages already do that).

All of these tests mitigate the risk of regressions in our releases; nonetheless, some regressions do slip past. Because many of our users auto-update, these tend to be reported and fixed quickly in point releases. For instance, regressions in 0.9.0 were fixed in 0.9.1, 0.9.2 and 0.9.3. Certbot 0.9.3 has been used to issue hundreds of thousands of Certs in the field, so we are fairly confident that no further significant regressions exist in it, and that release is likely to be safe as a Xenial SRU.

At least two changes in functionality between 0.4.1 and 0.9.3 do bear specific consideration for Xenial though:

Debian has added a "certbot renew" twice-daily cron job to their packages between 0.4.1 and 0.9.3; we believe this is low regression risk (having secondary renewal mechanisms in place is a NOOP) but Xenial packages may want to increase the debconf verbosity to get consent for this from Xenial users who are upgrading?

We had a custom log rotation scheme (rotate logs after every run), we now act like a more typical daemon, so packages need to be rotating our logs:
https://github.com/certbot/certbot/issues/3382

[Other Info]

RAOF has offered to sponsor 0.9.3 into Xenial.

Peter Eckersley (pde-lists) wrote :

Also fixes: Launchpad bug #1608214

Peter Eckersley (pde-lists) wrote :

On IRC, rbasak asked:

<rbasak> From an SRU review perspective, what I most want spelled out the exact list of any behaviour changes you expect Xenial users to receive, and your confirmation that you don't believe that there are any other changes.
<rbasak> Your word as upstream carries a great deal of weight on this.
<rbasak> What you've written is great and sound like it covers most of this already - it's just a clear summary of what the final upload to Xenial will do is what I feel is missing - though of course I know you haven't finalised that yet.

Debian's certbot 0.9.3 packages are a close approximation to what we expect users to get. They Provide: and Replace: letsencrypt, and include aliases for all previous "letsencrypt" commands.

With the exception of the two (an auto-renewal cron job and slightly different log rotation) points noted above, we believe that the functionality and workflows supported by certbot 0.9.3 should be an but less buggy, super-set of the functionality and workflows supported by letsencrypt 0.4.1.

Peter Eckersley (pde-lists) wrote :

* "a compatible but less buggy"

I should also note that most of our users have been running the certbot-auto script which automatically upgrades them to the latest release of Certbot as soon as it's public, so the odds of us getting bug reports about workflow changes is fairly high.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in python-letsencrypt (Ubuntu):
status: New → Confirmed

Hello everyone -- I'm (the primary, but not only) Debian maintainer of the certbot suite of packages. pde asked me to weigh in here to help answer some of rbasak's questions in a definitive manner.

Since the 0.4.1 release, there have been a handful of user-facing changes, but only a few are workflow changing ones.

The largest is the change in package name from letsencrypt to certbot. This was done for trademark reasons, however, we continue to ship both a transitional package and a symlink for the binary that should preserve users' workflows unmodified.

The second largest change is the addition of an automatic renewal of any certificates, through both a cronjob installed into cron.d and a systemd timer. The certbot binary will only act to renew the certificate, however, when it is "near" to expiring (<15 days). Users who have manually setup a cron already shouldn't experience any failures.

Other than that, there is new functionality available in upstream, but pde has already mentioned all of those. I'm happy to answer any questions or concerns the stable release managers have.

Peter Eckersley (pde-lists) wrote :

@hlieberman My understanding is that the package name change shouldn't affect any workflows, because "sudo apt-get install letsencrypt" will still work and the "letsencrypt" command will still work. Anything I might have missed?

Hi Harlan,

Thank you for taking the time to help us out!

On Sun, Nov 20, 2016 at 02:47:14AM -0000, Harlan Lieberman-Berg wrote:
> The second largest change is the addition of an automatic renewal of any
> certificates, through both a cronjob installed into cron.d and a systemd
> timer.

Does this mean that users who currently won't get auto-renewed will
start getting auto-renewed after this proposed update? If so, will that
also include users who currently have expired (languishing) old
certificates?

All,

Is there any progress on the preparation of the update itself? I hope
you're not waiting on my approval. To be clear, my opinion is that
we should update letsencrypt/certbot in 16.04, and I appreciate all the
interest in helping us do this; it's just a matter of the details
(documenting, understanding and mitigating any behavioural changes as
appropriate, and agreeing on the QA plan).

Robie

> Does this mean that users who currently won't get auto-renewed will
> start getting auto-renewed after this proposed update? If so, will that
> also include users who currently have expired (languishing) old
> certificates?

Yes. Each certificate (or more precisely, each lineage of certificates, where a lineage is a series of certificates that replace each other with new validity dates and possibly new domains added) gets a renewal configuration file in /etc/letsencrypt/renewal/ ; the "certbot renew" command walks through those and tries to renew any that are within 30 days of expiry. The Debian packages run that task twice a day out of the box.

I think we've concluded that we'll add a note-upon-installation telling the sys admin that that's going to start happening, and point to where it can be turned off or tweaked.

On Mon, Nov 28, 2016 at 09:22:28PM -0000, Peter Eckersley wrote:
> I think we've concluded that we'll add a note-upon-installation telling
> the sys admin that that's going to start happening, and point to where
> it can be turned off or tweaked.

I'm afraid that users who automatically update (most of them in 16.04
since it's default) will never see such a message.

While you do this for v16.04, could you also publish certbot v0.9.x for v16.10 at the same time? Or need a new ticket for that?

Hello everyone.

I'm not sure exactly what form you wanted the SRU in, but... I've included a tarball of the source changes files and dsc's, signed, along with the files included in them.

This upload has two changes from the version in sid. First, I tweaked the versions in python-certbot to reduce the dependency on pyopenssl. The version requirement was to work around an undeclared Breaks relationship between a version of pyopenssl and python-cryptography in a later version (16.0.0) than is in Xenial.

Second, I added a new dummy package to assist the upgrade of python-letsencrypt-apache => python-certbot-apache. The dummy package was never used in Debian, as letsencrypt didn't make it to stable before the name-change occurred.

Please let me know if you have any questions, or if you would like the SRU in some other manner.

Robie Basak (racb) wrote :

The format sounds fine, thanks. We need one uploader and SRU team member to review. I suggest we figure out who is doing what and both people review before uploading to avoid confusion.

Since you mentioned that RAOF could sponsor, I subscribed him to the bug. Since both of us are on the SRU team, we should figure out who will wear which hat. RAOF, any comments?

>>> import random
>>> x = ["RAOF", "rbasak"]
>>> random.shuffle(x)
>>> print "{0} should upload and {1} should review".format(x)
RAOF should upload and rbasak should review

Peter Eckersley (pde-lists) wrote :

.format(*x), sorry :)

Will it be reasonable to wait for certbot 0.10.0 (due on Jan 4, according to https://github.com/certbot/certbot/milestones), and go ahead then?

Peter Eckersley (pde-lists) wrote :

I don't think so. We would want 0.10.0 to have at least several weeks of field testing before stable distributions include it as an update, and any regressions in that release would mean we do 0.10.x bugfix releases that would stretch the timeline further.

Those are good arguments, thanks for explaining!

Changed in python-letsencrypt (Ubuntu):
status: Confirmed → Fix Released
Changed in python-acme (Ubuntu):
status: New → Fix Released
Changed in python-letsencrypt-apache (Ubuntu):
status: New → Fix Released
Chris Halse Rogers (raof) wrote :

Hm. It was unclear to me that this requires updating python-acme, python library expected to be used by 3rd parties.

Have any python-acme APIs changed? What are the chances that the update will break downstream users of python-acme?

Changed in python-certbot-nginx (Ubuntu):
status: New → Fix Released
Peter Eckersley (pde-lists) wrote :

The changes in python-acme were comparatively minor. A few bugs were fixed; one new feature was added to the API (support for DNS-01 challenges); some protocol messages were removed (because they were believed to have security problems, were never used by Let's Encrypt, and were removed from the IETF ACME draft).

None of those changes should negatively impact non-Certbot ACME clients that use python-acme. It also does not appear that any of those ACME clients are packaged in Debian unstable:

apt-cache rdepends python-acme
python-acme
Reverse Depends:
  python-certbot
  python-certbot-nginx
  python-certbot-apache

It's the clients that are *unpackaged* that are the most concerning. Packaged clients we can update if necessary; random ad-hoc clients not so much ๐Ÿ˜€

* nod. We haven't changed any of the preexisting API calls or semantics, so even unpackaged clients should be fine.

Packages uploaded to the unapproved queue, awaiting review.

I changed the version numbers (so that the version in Xenial < Yakkety < Zesty), and added bug links.

I'm confused on a couple of points. Here's my understanding:

AFAICT, there are three source packages in play here: src:python-acme, src:python-letsencrypt, and src:python-certbot.

src:python-acme appears relatively straightforward. A version bump is needed for new features used by this proposed update, but there is no other package renaming going on.

src:python-certbot supersedes src:python-letsencrypt, though of course apt doesn't see that.

src:python-letsencrypt created binaries letsencrypt, python-letsencrypt and python-letsencrypt-doc in Xenial. It doesn't exist in Yakkety or Zesty.

src:python-certbot creates binaries letsencrypt (transitional to certbot), certbot, python-certbot and python-certbot-doc from Yakkety onwards, and doesn't currently exist in Xenial.

The proposed backport of src:python-certbot to Xenial makes some backport-related dependency changes (so it seems) but I don't see anything that changes the structure of package names or transitional packages from what is in sid (and Yakkety, etc).

So to my questions:

1) What happens to a Xenial user who is using the library provided by python-letsencrypt after this update? Is it intentional that src:python-letsencrypt in Xenial will continue to provide it?

2) This is more of a question for an archive admin. Is it acceptable in Xenial for src:python-certbot to take over the letsencrypt binary name, but for us to retain src:python-letsencrypt that also produces it?

3) What happens if a future update (say a security update) is needed against src:python-letsencrypt for users still using the python-letsencrypt binary package in Xenial (from question 1)? This update would still produce the letsencrypt binary, but we'd have a higher version from src:python-certbot in the archive. So would the subsequent binary upload fail?

Does this mean that we also need to SRU src:python-letsencrypt at the same time to stop producing the letsencrypt binary package at the same time?

This is for Xenial only. I haven't considered Yakkety yet, nor have I completed my review since I want to understand and fix this point before considering any other further interactions (which may change if we change how we're doing this here).

Robie Basak (racb) wrote :

Based on https://irclogs.ubuntu.com/2017/01/12/%23ubuntu-release.html#t13:47, can you upload a python-letsencrypt for Xenial that drops the letsencrypt package please? We'll need to accept these all together.

Please remember to run the update-maintainer field when preparing these source packages. See https://wiki.ubuntu.com/DebianMaintainerField for the background. I think this needs doing on some of the existing uploads in the queue also, but let's wait to see if anything else needs fixing.

Peter Eckersley (pde-lists) wrote :

I think it's my job to gently nudge whoever it is that's making the slightly tweaked packages here, but I'm not sure whether it should be Chris or Harlan... if all else is equal perhaps it should be Chris, since we also have a new 0.10.1 release that needs to be packaged for sid and that would be in Harlan's wheelhouse.

Robie Basak (racb) wrote :

Any news? I feel that changing python-letsencrypt as well exceeds the bounds of the SRU hat I'm wearing - it should have review from a different uploader as well.

Sorry; LCA and then email backlog. I'll upload new packages tomorrow.

That's great Chris!

We had some discussions yesterday about whether this has taken long enough that 0.10.x should be considered for the SRU instead. 0.10.1 has been in the field for 13 days; people found a couple of bugs in it, but they only affected new 0.10.x functionality rather than being regressions.

0.10.2 was released yesterday to fix those things and Harlan is working to get it into Debian in time to make the stretch freeze 11 days from now.

Our conclusion was the best thing to do is probably make the 0.9.3 SRU now, and start work on a 0.10.2+ SRU in around 3 weeks, once the 0.10.2 debian packages have received some testing and we've had time to sweep for any final bugs.

Robie Basak (racb) wrote :

Can I suggest that you maintain these packages in a git tree somewhere (if you aren't doing so already), create branches for the Ubuntu stable releases you're targeting, and then get reviews straight from there? I'll be happy to do an SRU review straight from a git branch. Then review changes could get committed there presumably by more people, so we could move faster. Chris will just need to review new commits that land since his last review, and upload when everyone is happy. And I can allow through any upload after just verifying that it is identical to the thing I already reviewed and accepted in git.

It'll save confusion with multiple versions in the queue this way, too.

Peter Eckersley (pde-lists) wrote :

Harlan and the other Debian developers have git trees here:

https://alioth.debian.org/plugins/scmgit/cgi-bin/gitweb.cgi?a=project_list;pf=letsencrypt

Maybe Chris should also get access to those repos?

Chris Halse Rogers (raof) wrote :

python-letsencrypt dropping the letsencrypt binary uploaded.

If you'd like me to have access to the git repos you'll need to add raof-guest to the letsencrypt group (or, alternatively, make the git repository accessible to collab-maint).

Robie Basak (racb) wrote :

python-letsencrypt and python-letsencrypt-apache don't exist in Yakkety or Zesty, so let's mark those Invalid as there is no action to take for these packages neither in Yakkety nor Zesty (only Xenial).

Changed in python-letsencrypt-apache (Ubuntu):
status: Fix Released → Invalid
Changed in python-letsencrypt-apache (Ubuntu Yakkety):
status: New → Invalid
Changed in python-letsencrypt (Ubuntu Yakkety):
status: New → Invalid
Changed in python-letsencrypt (Ubuntu):
status: Fix Released → Invalid
Robie Basak (racb) wrote :

Setting all tasks for which I see an upload in the queue as "In Progress" so I can more easily see what's what. Here's my current view of the queues:

python-acme
queue/xenial/unapproved/cb94bae
queue/yakkety/unapproved/15d7282

python-certbot
queue/xenial/new/e67c69a
queue/yakkety/unapproved/c6e7c0b

python-certbot-apache
queue/xenial/new/a42f913
queue/yakkety/unapproved/b496663

python-certbot-nginx
queue/xenial/new/b15997a
queue/yakkety/new/480f5d3

python-letsencrypt
queue/xenial/unapproved/21d1a2f

python-letsencrypt-apache
[empty]

Changed in python-acme (Ubuntu Xenial):
status: New → In Progress
Changed in python-acme (Ubuntu Yakkety):
status: New → In Progress
Changed in python-certbot (Ubuntu Xenial):
status: New → In Progress
Changed in python-certbot-apache (Ubuntu Xenial):
status: New → In Progress
Changed in python-certbot-apache (Ubuntu Yakkety):
status: New → In Progress
Changed in python-certbot-nginx (Ubuntu Xenial):
status: New → In Progress
Changed in python-certbot-nginx (Ubuntu Yakkety):
status: New → In Progress
Changed in python-letsencrypt (Ubuntu Xenial):
status: New → In Progress
Robie Basak (racb) wrote :

No changes needed to python-letsencrypt-apache: this will be left as-is in Xenial, will use python-letsencrypt which is only being changed to drop the letsencrypt package for the transition (libraries stay all the same).

Changed in python-certbot (Ubuntu Yakkety):
status: New → In Progress
Changed in python-certbot (Ubuntu):
status: New → Fix Released
Changed in python-certbot-apache (Ubuntu):
status: New → Fix Released
Changed in python-letsencrypt-apache (Ubuntu Xenial):
status: New → Invalid
Robie Basak (racb) wrote :

I've verified all the versions of the queue assuming we are backporting in SRUs <= 0.10.2-1 of python-{acme,certbot{,-apache,-nginx}} only and leaving behind python-letsencrypt and python-letsencrypt-apache. So let's retitle for exactly this.

summary: - letsencrypt 0.4.1 contains numerous bugs fixed upstream
+ [SRU] Backport letsencrypt 0.9.3

I spent some time on reviewing this on Friday, but I got interrupted. There was also some discussion on IRC (https://irclogs.ubuntu.com/2017/01/27/%23ubuntu-release.html#t12:31)

My notes:

Can we put a note that there will be a new cronjob active now in debian/NEWS and debian/changelog as suggested on IRC?

What happens if user has installed letsencrypt outside of packaging (for example, to get a newer version while leaving the system one installed)? Will adding a cronjob in the SRU cause any issues?

Can we nominate a person or a team to handle bugs as a consequence of this SRU, and subscribe to package bugs to make sure?

update-maintainer needs running (but I can do this from the queue if no other changes are needed).

Changed in python-certbot-apache (Ubuntu Yakkety):
status: In Progress → Incomplete
Changed in python-certbot (Ubuntu Yakkety):
status: In Progress → Incomplete
Changed in python-acme (Ubuntu Yakkety):
status: In Progress → Incomplete
Peter Eckersley (pde-lists) wrote :

This has been stuck for a while, I suspect because it hasn't been clearly on anyone's plate :(.

Let's fix that:

* Brad Warren on the Certbot team is going to construct a retrospective changelog.txt, and post a link here.
* RAOF should probably revise the packages to include that and the news file.
* Anything else?

Separately, this has taken so long that it might be advisable to switch to an 0.10.2 SRU, since that version is now well-tested, and is also the one that's frozen into Debian Stretch. Thoughts for and against? What extra steps would we need for an 0.10.2 SRU?

Robie Basak (racb) wrote :

I'd like to suggest again that everyone involved does all packaging changes in git, and gets all approvals (from a suitable Ubuntu uploader and the Ubuntu SRU team) in git, before uploading. I really think this will speed things up.

I'm not sure I see everything in https://alioth.debian.org/plugins/scmgit/cgi-bin/gitweb.cgi?pf=letsencrypt, though that may be because repositories have not been renamed following source package renames.

I'm not too bothered about who can push where. Since git is nicely distributed, as long as we can all get to the "current" branch, it shouldn't matter as long as we all agree on where that "current" branch is (as it changes).

RAOF, if you're struggling for time, let me know and I'll see if a colleague on the server team is available to help (since I can only wear the one hat).

Chris Halse Rogers (raof) wrote :

OK. So, my understanding of the current state is that:
a) I shall update the SRU packages (in git and upload to the queue):
 *) Run update-maintainer
 *) Add a NEWS entry for the new cron job
 *) Possibly update to 0.10.2
b) Robie will then do the SRU-review for them.

That leaves the question: โ€œWhat happens if a user has installed letsencrypt outside of packaging? Will the new cron job cause any problems?โ€. Someone want to answer that? :)

Chris Halse Rogers (raof) wrote :

Aaargh. Time!

Robie, if someone on the server team has some spare cycles to prepare the packages, that'd be great. Sorry!

There's still the question - โ€œWhat happens if a user has installed letsencrypt outside of packaging? Will the new cron job cause any problems?โ€ to be resolved, too.

Peter Eckersley (pde-lists) wrote :

Hi Chris!

I think your todo list looks accurate.

On the question of cron jobs, here are the answers as we understand them upstream:

What happens if the user runs two multiple cron jobs?

Answer 0: probably nothing. "certbot renew" is designed to be run as often as you like, and is normally a no-op.

Answer 1: with some small probability, the user might have two "certbot renew" commands that are executed at the same time. In that case, it would be fairly common for one or both of those to fail with an error, that would produce cron email. The baseline probability of this is collision is about once per 5,000 cert renewals if the hour in the user's cron job is uniform-random, and one in 200 cert renewals if they picked the same two hours (noon and midnight) that are baked into Debian's cron job.

Answer 2: with some much smaller probability, two overlapping "certbot renew" commands could experience a race condition in writing cert lineage files in /etc/letsencrypt/archive, or symlinks in /etc/letsencrypt/live. These would cause a configuration problem (certs and privkeys don't match) about 75 - 80% of the time. I just measured this race condition window on a AWS tiny instance, and estimate that a cert writing race might happen about once every 36,000 cert renewals if the cron job hours line up, or once every 864,000 renewals if users have cron jobs at uniform-random hours.

It is however possible that there are other race conditions in some of our plugins (apache, nginx) that are more likely to occur.

We have a few mitigation options:

Mitigation 0: write a patch to add locking to Certbot 0.10.2 / 0.10.3. This would add a new dependency on python-filelock, and we'd have to make a choice about how much field testing we want for this patch before SRUing it.

Mitigation 1: change the cron job, which picks random times in the hours after noon and midnight, to a systemd timer that runs at two uniform-random hours, or a cron job that has two hours that are less likely to be chosen by sys admins. We can probably use LE serverside data to pick the two least common hours.

Mitigation 2: study the plugin code to ensure that problematic race conditions are really as rare as we think. We could probably tolerate a temporary risk of failure that's one in a million cert renewals on the subset of systems which have two cron processes and where the admin ignored the notice about it -- hard disks fail faster than that.

I think the upstream team favours mitigation 0 :)

Brad Warren (bradmwarren) wrote :

Small update: If we go with mitigation 0, we won't be using python-filelock. We'll either use an existing Python file locking module packaged in Ubuntu or add our own code that implements lockfiles to Certbot.

Nish Aravamudan (nacc) wrote :

From a fear of breaking existing deployments in an update -- I would also prefer #0.

Robie is out this week, so maybe RAOF, you can you provide your input here?

Mitigation 0 gets a +1 from me. That seems the sensible path.

We have the mitigation in our git master tree (https://github.com/certbot/certbot/pull/4394/files) and are shipping it in an 0.12.1 release today to get field testing. Once that patch has been used to issue ~100K certs I'd be okay with it going into an SRU.

Peter Eckersley (pde-lists) wrote :

Here's a slightly better link: https://github.com/certbot/certbot/pull/4369

Peter Eckersley (pde-lists) wrote :

Apologies for the delay here :(

The Certbot locking patch turned out to be more subtle to implement correctly than we had expected, but we finalised and version and shipped it in Certbot 0.14.0 last week. The patch is here: https://github.com/certbot/certbot/pull/4449#issuecomment-299802507

Since that release, there have around 200,000 certificates issued with Certbot 0.14.0. We have had one user report that the locking patch caused a problem for them; that user was intentionally running multiple Certbot instances in parallel for performance reasons. There are probably no great solutions for such users, since their current practices are subject to race conditions that might eventually cause corruption of cert files or even webserver configs.

My instinct is that we should apply the locking patch (perhaps augmenting the error message to explain that users who want to run multiple Certbots safely in parallel should supply --config-dir, --work-dir and --log-dir arguments to each instance), and ship Certbot 0.10.2 to Xenial users ASAP.

summary: - [SRU] Backport letsencrypt 0.9.3
+ [SRU] Backport letsencrypt 0.14.1

The Certbot team has had a couple of calls with rbasak to sort out progress on this.

Lask week we met and concluded that we should SRU 0.14.1 because it's well-tested and has additional bugfixes, as well as creating a streamlined process for the Certbot team to get well-tested releases SRU'd quickly in the future. Rbasak took the task of preparing packages; he would work with Nish to get them reviewed and uploaded; Peter will write a draft/template SRU document for Certbot updates; and Brad will see if he can recruit a member of the upstream Certbot team to be more involved in Ubuntu packaging in the future.

This week, we met again. Rbasak had packages for us to test: https://launchpad.net/~racb/+archive/ubuntu/experimental/+packages ; he's going to ping Harlan and offer those to Debian experimental, as well as updating them to include a notice about the renewal cron job.

We'll meet again in a week to coordinate the actual SRU.

Robie Basak (racb) wrote :

I've prepared backport branches for 0.14.2 for Xenial, Yakkety and Zesty against python-acme, python-certbot, python-certbot-apache and python-certbot-nginx. That's 12 branches. There are an additional 2 branches that handle the source package renames in Xenial. I've pushed all of these to Alioth git prefixed "rbasak-guest/".

Next step: review.

Managing the branches by hand was a nightmare, so I ended up writing a script that creates all the branches. This allowed me to iterate. It's probably easier to review my generator script than review each branch individually. I've pushed the generator scripts (and support files) to a branch called "rbasak-guest/backport-tools" in the Alioth python-certbot repository: https://anonscm.debian.org/cgit/letsencrypt/python-certbot.git/log/?h=rbasak-guest/backport-tools

You can find the repositories for the other source packages on Alioth at: https://anonscm.debian.org/cgit/letsencrypt/

Nish, could you review these please?

Separately, I'll look at Peter's SRU documentation.

On Fri, Jul 7, 2017 at 7:06 AM, Robie Basak <email address hidden> wrote:
> I've prepared backport branches for 0.14.2 for Xenial, Yakkety and Zesty
> against python-acme, python-certbot, python-certbot-apache and python-
> certbot-nginx. That's 12 branches. There are an additional 2 branches
> that handle the source package renames in Xenial. I've pushed all of
> these to Alioth git prefixed "rbasak-guest/".
>
> Next step: review.
>
> Managing the branches by hand was a nightmare, so I ended up writing a
> script that creates all the branches. This allowed me to iterate. It's
> probably easier to review my generator script than review each branch
> individually. I've pushed the generator scripts (and support files) to a
> branch called "rbasak-guest/backport-tools" in the Alioth python-certbot
> repository: https://anonscm.debian.org/cgit/letsencrypt/python-
> certbot.git/log/?h=rbasak-guest/backport-tools

The logic of the scripts seems sound to me. Do you want me to do the
individual reviews of each branch as well in this case?

Robie Basak (racb) wrote :

On Wed, Jul 12, 2017 at 11:30:14PM -0000, Nish Aravamudan wrote:
> The logic of the scripts seems sound to me. Do you want me to do the
> individual reviews of each branch as well in this case?

Could you review what I put in the branches, please? That is: that the
backport contents are suitable in your opinion, and that I haven't
missed anything that might make it inappropriate for SRU - IOW, would
you be happy to sponsor these?

I don't think it's necessary for you to review that the branches were in
fact produced by my script, so I think it'd be sufficient for you to
examine just my scripts.

Note though that there are two branches not produced by my scripts
xenial-transitional for python-certbot and python-certbot-apache
(python-letsencrypt-apache repo). Please check these look good to you
too.

Thanks!

Nish Aravamudan (nacc) wrote :

On Wed, Jul 12, 2017 at 5:43 PM, Robie Basak <email address hidden> wrote:
> On Wed, Jul 12, 2017 at 11:30:14PM -0000, Nish Aravamudan wrote:
>> The logic of the scripts seems sound to me. Do you want me to do the
>> individual reviews of each branch as well in this case?
>
> Could you review what I put in the branches, please? That is: that the
> backport contents are suitable in your opinion, and that I haven't
> missed anything that might make it inappropriate for SRU - IOW, would
> you be happy to sponsor these?

Reviewed:

- python-acme/rbasak-guest/xenial
- python-acme/rbasak-guest/yakkety
- python-acme/rbasak-guest/zesty
- python-certbot/rbasak-guest/xenial
- python-certbot/rbasak-guest/yakkety
- python-certbot/rbasak-guest/zesty
- python-certbot/rbasak-guest/xenial-transitional
- python-letsencrypt-apache/rbasak-guest/xenial
- python-letsencrypt-apache/rbasak-guest/yakkety
- python-letsencrypt-apache/rbasak-guest/zesty
- python-letsencrypt-apache/rbasak-guest/xenial-transitional
- python-letsencrypt-nginx/rbasak-guest/xenial
- python-letsencrypt-nginx/rbasak-guest/yakkety
- python-letsencrypt-nginx/rbasak-guest/zesty

These all look good to me. Excellent work on a very complicated series
of changes that all need to land together and I think they are
suitable for uploading.

Below is a draft of the instructions for how to run integration tests on the proposed Certbot packages. Two comments about this script:

1. I'm not sure if there's a better way to get the Ubuntu codename or how reliable the method I'm using is.
2. I don't know if the way I'm installing the proposed packages will work because I don't know what will happen when apt tries to resolve python-certbot-apache's dependencies.

Please feel free to make/suggest edits if anyone knows how to make these areas work (more robustly).

== Running integration tests ==

Integration tests should be run on the proposed updates for each modified Ubuntu release. You can do this using the script below in a virtual machine for the release of Ubuntu you would like to test.

```
if test "$(id -u)" -ne "0"; then
    echo "Must run as root to install packages and use Docker." >&2
    exit 1
fi

apt update
apt install docker.io git lsb-release net-tools wget -y

wget https://github.com/docker/compose/releases/download/1.15.0-rc1/docker-compose-Linux-x86_64 -O /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose

codename=$(lsb_release -a 2>/dev/null | grep Codename | cut -f2)
echo "Package: *" > /etc/apt/preferences.d/proposed-updates
echo "Pin: release a=$codename-proposed" >> /etc/apt/preferences.d/proposed-updates
echo "Pin-Priority: 400" >> /etc/apt/preferences.d/proposed-updates

apt update
apt install python-certbot-apache/$codename-proposed -y
service apache2 stop
apt install python-certbot-nginx/$codename-proposed -y
service nginx stop
certbot_version=$(certbot --version 2>&1 | grep "^certbot" | cut -d " " -f 2)

cd ~
git clone https://github.com/certbot/certbot --branch v$certbot_version --depth 1
cd certbot
tests/boulder-fetch.sh
until curl http://localhost:4000/directory 2>/dev/null; do
  echo waiting for boulder
  sleep 1
done
tests/boulder-integration.sh
echo "Success!"
```

Brad Warren (bradmwarren) wrote :

Sorry for the 2nd message.

I just noticed I missed the shebang line when I copied my script and having `set -e` is important. I was using `#!/bin/bash -xe`.

Robie Basak (racb) wrote :

Thanks Brad!

Based on this, and our conversations on IRC, I've prepared documentation for an SRU policy exception at https://wiki.ubuntu.com/StableReleaseUpdates/Certbot

Rather than mandate use of your script, I tried to write it so that it's the outcomes that are mandated, and linked to the script as one potential way of achieving that.

I'd like to seek an opinion from another SRU team member that what I am doing here is in principle reasonable, and then I think we're ready to upload into the proposed pockets.

Robie Basak (racb) wrote :

I also welcome anyone to comment on this plan or my proposed exception documentation at https://wiki.ubuntu.com/StableReleaseUpdates/Certbot.

Brian Murray (brian-murray) wrote :

@brad - lsb_release -c returns just the codename so if you use that you could remove the pipe to grep.

Peter Eckersley (pde-lists) wrote :

@racb the exception document looks good to me.

Hello Peter, or anyone else affected,

Accepted python-acme into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-acme/0.14.2-0ubuntu0.17.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-zesty to verification-done-zesty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-zesty. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in python-acme (Ubuntu Zesty):
status: New → Fix Committed
tags: added: verification-needed verification-needed-zesty
Changed in python-acme (Ubuntu Yakkety):
status: Incomplete → Won't Fix
Changed in python-certbot (Ubuntu Yakkety):
status: Incomplete → Won't Fix
Brian Murray (brian-murray) wrote :

Hello Peter, or anyone else affected,

Accepted python-certbot into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-certbot/0.14.2-0ubuntu0.17.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-zesty to verification-done-zesty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-zesty. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in python-certbot (Ubuntu Zesty):
status: New → Fix Committed
Changed in python-certbot-nginx (Ubuntu Yakkety):
status: In Progress → Won't Fix
Changed in python-certbot-nginx (Ubuntu Zesty):
status: New → Fix Committed
Brian Murray (brian-murray) wrote :

Hello Peter, or anyone else affected,

Accepted python-certbot-nginx into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-certbot-nginx/0.14.2-0ubuntu0.17.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-zesty to verification-done-zesty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-zesty. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in python-certbot-apache (Ubuntu Zesty):
status: New → Fix Committed
Brian Murray (brian-murray) wrote :

Hello Peter, or anyone else affected,

Accepted python-certbot-apache into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-certbot-apache/0.14.2-0ubuntu0.17.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-zesty to verification-done-zesty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-zesty. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in python-certbot-apache (Ubuntu Yakkety):
status: Incomplete → Won't Fix
summary: - [SRU] Backport letsencrypt 0.14.1
+ [SRU] Backport letsencrypt 0.14.2
Brian Murray (brian-murray) wrote :

Hello Peter, or anyone else affected,

Accepted python-acme into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-acme/0.14.2-0ubuntu0.16.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed.Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in python-acme (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed-xenial
Brian Murray (brian-murray) wrote :

I don't see an upload of the python-certbot package in the -proposed queue for Xenial, am I missing something?

+python-letsencrypt (0.4.1-1ubuntu0.16.04.1) xenial; urgency=medium
+
+ * Drop letsencrypt binary package; this is now produced by the
+ python-certbot package (LP: #1640978)

Brad Warren (bradmwarren) wrote :

I just successfully ran the test script I provided earlier on the proposed Ubuntu Zesty packages. I had to make a couple modifications to properly install the packages from the proposed archive. An updated version of the script is attached.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers