Upload integrity: jenkins should publish expected MD5

Bug #1118469 reported by Thierry Carrez on 2013-02-07
This bug affects 1 person
Affects: OpenStack Core Infrastructure
Status: Confirmed
Importance: Medium
Assigned to: Jeremy Stanley

Bug Description

My upload_release script[1] downloads the tarball built after a tag is pushed, prompts me to GPG-sign it and uploads it to LP. It then checks that the MD5 reported by LP matches the MD5 of the tarball it downloaded.

That leaves the possibility of corruption in the first steps of the process (artefact copy to tarballs.o.o, and download from there). Ideally the Jenkins job which produces the tarball would compute its MD5 before copying anything and publish that value, and my script would check that the end result (the uploaded file as seen from LP) still has the same MD5.
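A minimal sketch of the check described above, assuming the expected MD5 is already available as a hex string. The function names file_md5 and matches_expected are illustrative and are not taken from the upload_release script.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Compare a file's MD5 with the value published by the build job."""
    return file_md5(path) == expected_md5.strip().lower()
```

The same comparison could be run against the downloaded tarball and against the file fetched back from LP, so that corruption at either stage is detected.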

To that effect, I need to be able to query Jenkins with $project and $tag as parameters, and get an MD5 in return.

Proposed implementation (by mordred): have the Jenkins job set two metadata values as a result of its run, tag and md5. I could then use the API to search recent $project-tarball builds for one with tag=$tag, and retrieve the corresponding md5.
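The lookup half of that proposal might look like the sketch below. It deliberately leaves out the Jenkins API call itself: the input is a hypothetical list of build records (dicts with "tag" and "md5" keys, newest first), and how those records would be fetched is an assumption left open here.

```python
def md5_for_tag(builds, tag):
    """Return the md5 recorded for the newest build whose tag matches.

    `builds` is a hypothetical list of dicts, newest first, each carrying
    the two metadata values proposed above: "tag" and "md5".
    """
    for build in builds:
        if build.get("tag") == tag:
            return build.get("md5")
    # No recent build was recorded for this tag.
    return None
```

Returning None when no build matches lets the calling script refuse to upload rather than silently skip the check.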

James E. Blair (corvus) on 2013-04-23
Changed in openstack-ci:
status: New → Triaged
importance: Undecided → Medium
milestone: none → havana
Clark Boylan (cboylan) wrote :

I think a simple way to publish the md5 would be to upload a $PACKAGE_NAME.md5 file along with the actual sdist upload. For this to be useful we will need to put the items we publish behind HTTPS to avoid a MitM attack. This may be overkill though. Releases currently go from tarballs to Launchpad via ttx, so we need only make the path from tarballs to ttx trusted. In any case this could probably use a little more discussion.
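As a rough illustration of the sidecar-file idea, the sketch below writes the checksum next to the tarball. The helper name write_md5_sidecar and the choice to append ".md5" to the full tarball file name are assumptions, not something the upload job is known to do.

```python
import hashlib
import os


def write_md5_sidecar(tarball_path):
    """Write `<tarball>.md5` next to the tarball and return its path."""
    with open(tarball_path, "rb") as handle:
        digest = hashlib.md5(handle.read()).hexdigest()
    sidecar_path = tarball_path + ".md5"
    with open(sidecar_path, "w") as handle:
        # "<digest>  <name>" is the line format that `md5sum -c` reads.
        handle.write("%s  %s\n" % (digest, os.path.basename(tarball_path)))
    return sidecar_path
```

Using the md5sum line format means anyone who downloads both files into one directory can check them with the standard tool, without any project-specific script.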

Jeremy Stanley (fungi) wrote :

I think it wouldn't be too hard to push this a little further. We already agree that we don't trust workers which run code from arbitrary projects, and building an sdist tarball requires running such code. By extension, we should not implicitly trust these tarballs. I think an ideal solution would be:

0. untrusted worker builds the sdist tarball as it does currently, based on the usual triggers (git tag, et cetera)

1. trusted worker checks out the associated git tag, retrieves and untars the tarball, copies the .git directory from the checkout into the directory where the tarball was unpacked and confirms that git indicates no changes (note this means we may need to track a couple of sdist-related files in .gitignore which we currently do not, but that seems like a good idea anyway)

2. if the validation succeeds, use gpg to generate a detached signature of the tarball with a locally-installed key specific to that trusted worker

3. upload the detached signature into the same directory as the tarball is being served from, attesting to its validity

This mechanism provides additional assurance beyond a simple checksum file. If someone is going to compromise a tarball, they're most likely to do it by MitM'ing the download (in which case they can also modify the checksum file in flight) or by altering it at rest where it's published (where they can again do the same to the checksum published with it). Signing the file not only guards against these (because the signature cannot be forged), but also against compromise of the aforementioned untrusted worker which built the tarball and generated the checksum.
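Steps 1 and 2 could be sketched roughly as follows, assuming git and gpg are installed on the trusted worker. The function names, the key_id parameter and the ".asc" output name are illustrative. Pointing --git-dir at the checkout is used here as a stand-in for copying the .git directory into the unpacked tree, which is a substitution rather than the procedure exactly as described.

```python
import subprocess


def git_status_command(checkout_dir, unpacked_dir):
    """Build the git command comparing an unpacked tarball to a checkout."""
    return [
        "git",
        "--git-dir=%s/.git" % checkout_dir,  # repository data from the tag checkout
        "--work-tree=%s" % unpacked_dir,     # files as unpacked from the tarball
        "status",
        "--porcelain",
    ]


def detached_sign_command(tarball_path, key_id):
    """Build the gpg command writing an armored detached signature."""
    return [
        "gpg", "--batch", "--local-user", key_id, "--armor",
        "--output", tarball_path + ".asc",
        "--detach-sign", tarball_path,
    ]


def validate_and_sign(checkout_dir, unpacked_dir, tarball_path, key_id,
                      run=subprocess.run):
    """Sign the tarball only if git reports no changes; return True if signed."""
    status = run(git_status_command(checkout_dir, unpacked_dir),
                 capture_output=True, text=True, check=True)
    if status.stdout.strip():
        # Any porcelain output means the tarball differs from the tag.
        return False
    run(detached_sign_command(tarball_path, key_id), check=True)
    return True
```

The run parameter exists only so the flow can be exercised without invoking git or gpg. Step 3 would then upload the resulting signature file next to the tarball.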

If we still want to provide checksums for convenience (inclusion in release announcements and the like), we can do that in step 2 and sign the checksum list with the same key while we're at it, then upload it in step 3 along with the signature of the tarball. This provides similar assurances that a checksum has not been falsified.

To expand on this, if we want to protect against compromise of the trusted worker, the script it uses to vet tarballs will be published and easy to run independently. Further, we could create a release-sigs branch in each project where individuals are expected to upload detached signatures of tarballs they've vetted, and then code review those such that CI automation can run check/gate tests to confirm the signature does actually verify the tarball correctly, uploading it to be served alongside that tarball once approved in Gerrit. Going a step further, our upload jobs (for example PyPI uploads) could be triggered not off the signed git tag, but instead off an approved signature upload matching a predetermined project-specific keyring, and then include that signature in the upload (also revalidating it at upload time for sanity).

Jeremy Stanley (fungi) wrote :

I forgot to mention that the validation job running at step #1 should also verify the OpenPGP signature of the git tag it checks out, and optionally check the key which signed that tag against a local keyring of authorized signing keys for that particular project, to ensure it is not getting a git repository which has been tampered with. These could also be the same keyrings used as whitelists to trigger upload jobs if/when we decide to take it to that level.
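One way to approach the keyring check is sketched below. It assumes the GnuPG status output for the tag has already been captured as text (for example from git verify-tag --raw, which prints it to standard error) and that the authorized keys are available as a set of fingerprints. Both function names are illustrative.

```python
def signing_fingerprints(status_output):
    """Extract fingerprints from GnuPG `VALIDSIG` status lines."""
    fingerprints = []
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[:2] == ["[GNUPG:]", "VALIDSIG"]:
            # The first value after VALIDSIG is the fingerprint of the key
            # that made the signature, which may be a subkey.
            fingerprints.append(fields[2])
    return fingerprints


def tag_signed_by_authorized_key(status_output, authorized_fingerprints):
    """True if at least one valid signature exists and all are authorized."""
    found = signing_fingerprints(status_output)
    return bool(found) and all(fp in authorized_fingerprints for fp in found)
```

A tag with no valid signature yields an empty list and is rejected, so an unsigned tag cannot pass by default.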

Jeremy Stanley (fungi) wrote :

One more tidbit: during step #1 we should also copy the .gitignore from the checkout, to be sure it has not been tampered with in the tarball in order to hide a compromised file.

Khai Do (zaro0508) on 2013-12-10
Changed in openstack-ci:
assignee: nobody → Khai Do (zaro0508)
Khai Do (zaro0508) on 2014-02-18
Changed in openstack-ci:
assignee: Khai Do (zaro0508) → nobody
Jeremy Stanley (fungi) on 2014-03-11
Changed in openstack-ci:
status: Triaged → Confirmed
assignee: nobody → Jeremy Stanley (fungi)
Khai Do (zaro0508) on 2014-05-29
tags: added: jenkins
tags: added: jjb
Jeremy Stanley (fungi) on 2014-10-26
Changed in openstack-ci:
milestone: havana → kilo