Comment 70 for bug 1267393

Revision history for this message
Seth Arnold (seth-arnold) wrote :

I reviewed juju version ff791983cd1a186e2e09878a37cf243f7f9eb734. The
review covered significantly less portion of the codebase than usual, and
should not be considered a security audit.

Juju 1.18.1 is in trusty-release
Juju 1.22.6 is in trusty-updates
Juju 1.22.1 is in vivid-release, meaning upgrades from updated trusty to vivid fail
Juju 1.22.6 is in wily

When were 1339770 1389326 1391276 fixed? Are they fixed in all supported
releases? There's no mention of any of these bug numbers in the published
changelogs:
https://launchpad.net/ubuntu/+source/juju-core/+changelog

These bugs were known to be dangerous in 2014 yet still caused extensive
damage in May, 2015. What allowed them to persist so long? What steps have
been taken to ensure future bugs of similar severity don't last unpatched
in production for so long?

In 1339770, in May 2015, it was mentioned that 1.18 was end-of-life and no
further updates could be prepared for it. 1.18.0 was released just 13
months earlier and 1.18.1 had been included in 14.04 LTS. Why was the 1.18
infrastructure torn down so shortly after including 1.18 in a release with
five-year support? Have there been any similar changes in process that
would prevent or delay issuing an update to the currently supported
versions of juju already in the archive?

It is currently impossible to upgrade from 14.04 LTS to 15.04 due to
incorrect version numbers. Has anyone else noticed this yet? When will
this be fixed? Are there any changes in process needed to ensure this
doesn't happen in the future?

Will the juju team be asking for an MRE? Is it anticipated that new series
(e.g., the 1.18 to 1.22 change) would be included as an MRE? What
processes are in place to test updates before including updates into the
archive? What processes are available to the security team to test
updates that we would prepare?

I had more trouble reading the Juju code this review cycle than last
review cycle -- the Facade indirection mechanism makes code navigating
harder. I'm worried about it for a few reasons:
- Strings to reference method names are brittle and can't be checked at
  compile time. What methods are in place to ensure that these aren't
  typoed?
- Generic args and return types defeat type checking. What ensures types
  being returned or accepted have the desired properties?
- Java has had significant problems with their Reflection mechanism,
  probably dozens of issues per year. At what points of a process
  lifetimes is the Facade mechanism dynamic?

Here's a few issues I found:

- ./apiserver/apiserver.go logs passwords when tracing is enabled -- this
  is fine IFF this is loudly documented somewhere obvious. Is it? It'd be
  best to filter out passwords regardless.

- Chown() doesn't quote the user or group

- ./api/client.go WatchDebugLog() claims to read a line but looks like it
  may read up to 4096 bytes -- is this correct?

- significant number of TODO comments; is there a method in place to find
  unowned comments and assign them somewhere? is there a process in place
  to ensure they get revisited?

- Which versions of the client work with which versions of the servers?
  Where's that described?

- ./api/keyupdater/authorisedkeys.go AuthorisedKeys(),
  WatchAuthorisedKeys() expects exactly one authorized key, this seems
  fragile

- Is -static-libgo still being used?

- Perhaps redundant to say it, the embedded code copies mostly need to be
  packaged separately. I don't know to what extent they deserve review,
  but they do represent a significant amount of code not written here that
  will run as root in many environments.

There's a lot to like about the Juju codebase; error checking is rigorous,
the coding style is consistent, the shellscript quoting infrastructure is
awesome, it's inspired clever new Go packages that cleanly solve problems.
I didn't review as much as I would have liked, but what I did see looked
like rigorous work.

Juju has been growing new features at an incredible pace. Will development
of new features impede supporting deployed environments? The security
team cannot support Juju alone -- there is far too much domain-specific
knowledge required to properly maintain Juju. We will need the Juju
team's help to address practically every issue for all stages of future
security-relevant bugs: proper diagnosis, proper fix preperation,
proper backporting to all supported releases, proper test development,
and proper testing.

I'm concerned with how previous issues have been handled -- the three
referenced bug reports have combined to represent the single most
expensive consequence I've personally seen and all were known issues five
months earlier. So I need reassurance that the Juju team will help the
security team maintain Juju in our supported releases:

- Ask for an MRE, if that's the most appropriate mechanism to update Juju.
- Ask for special treatment that allow more frequent full-version updates,
  if that's the most appropriate mechanism to update Juju.
- Commitment to help the security team diagnose and prepare fixes.
- Commitment to help the security team backport fixes to all supported
  versions, as necessary.
- Commitment to help the security team prepare test cases and use testing
  infrastructures.

This may represent enough work that dedicated Juju headcount may
eventually be necessary. LTS releases every two years and enthusiast
releases every six months add up quickly.

Thanks