Comment 8 for bug 978127

Julian Edwards (julian-edwards) wrote :

Via Scott:

I think a reasonable and SRU-able solution is below. Note that in order
to deliver this, we have to deliver updated ephemeral images. That was
always expected; I am just pointing out that this fix comes in a ~600M
download.

The way this works right now is the following:
 A. ephemeral instance is booted with a 'url=' parameter on the kernel
   command line something like this:
      url=http://maasserver/cblr/svc/op/ks/system/node-XXXX
 B. as described at [1], cloud-init pulls that un-authed url, and stores it
   as local configuration. Currently the payload looks like this:
      #cloud-config
      datasource:
       MAAS:
         metadata_url: http://mass-host.localdomain/source
         consumer_key: Xh234sdkljf
         token_key: kjfhgb3n
         token_secret: 24uysdfx1w4
 C. cloud-init then continues on and uses that maas datasource as if it
    were locally configured to do so. It pulls user-data from
    the derivative url, and then executes it.
 D. The user-data provided is read from
    /etc/maas/commissioning-user-data [2]. cloud-init executes this code
    which makes api calls back to the configured maas server in 'B' to
    post commissioning status.
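
As a rough illustration of step 'A', the sketch below pulls the 'url='
value out of a kernel command line string. The function name and the
sample command line are made up for illustration; this is not
cloud-init's actual code.

```python
def find_url_param(cmdline):
    """Return the value of the first 'url=' token on a kernel command
    line, or None if there is no such token."""
    for token in cmdline.split():
        if token.startswith("url="):
            return token[len("url="):]
    return None


# Hypothetical command line; only the 'url=' part comes from the text above.
sample = ("root=/dev/ram0 ro "
          "url=http://maasserver/cblr/svc/op/ks/system/node-XXXX quiet")
print(find_url_param(sample))
```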

The issue that we see in bug 978127 is that the oauthed http requests
done in 'C' fail because the clock on the ephemeral node is out of sync
with the clock on the maas server.
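
To illustrate why a skewed clock breaks 'C': OAuth 1.0 requests carry
an 'oauth_timestamp', and servers commonly reject timestamps that are
too far from their own clock. Below is a minimal sketch of such a check.
The 300 second window and the function name are assumptions for
illustration, not what the maas server actually uses.

```python
ALLOWED_SKEW_SECONDS = 300  # assumed tolerance, for illustration only


def timestamp_acceptable(oauth_timestamp, server_now,
                         allowed_skew=ALLOWED_SKEW_SECONDS):
    """Return True if the client's timestamp is within the allowed
    window around the server's current time (both in epoch seconds)."""
    return abs(server_now - oauth_timestamp) <= allowed_skew


server_now = 1340806409        # server clock, mid-2012
node_clock_ok = server_now - 20
node_clock_bad = 946684800     # a node whose clock is years behind

print(timestamp_acceptable(node_clock_ok, server_now))
print(timestamp_acceptable(node_clock_bad, server_now))
```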

The solution that I suggest is:
 i.) modify 'B' above to include 'time_sync_url' field under 'MAAS'
 ii.) Before cloud-init does oauthed requests for user-data in 'C'
    above, it will first do an un-authed request to the value of
    'time_sync_url' which will return data like:
      Wed, 27 Jun 2012 10:13:29 -0400
 iii.) cloud-init will then set the system clock (not the hardware clock)
    to the given date. The subsequent oauth requests will succeed as
    they'll have a reasonable system clock at that point.
 iv.) if possible, make cloud-init report the failure in 'C' above more
      prominently on the console. I believe this is less than
      straightforward, unfortunately, due to the console switching
      that is done on boot.
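
Steps 'ii' and 'iii' could look roughly like the sketch below. It
assumes the 'time_sync_url' response body is a single RFC 2822 style
date, as in the example above. The function names are hypothetical and
this is not cloud-init code. Setting the clock needs root, so that
function is defined here but not called.

```python
import time
import urllib.request
from email.utils import parsedate_to_datetime


def parse_time_sync(body):
    """Parse a date like 'Wed, 27 Jun 2012 10:13:29 -0400' and return
    seconds since the epoch (UTC)."""
    return parsedate_to_datetime(body.strip()).timestamp()


def fetch_time_sync(url, timeout=10):
    """Do an un-authed GET of the time sync url and parse the result."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_time_sync(resp.read().decode("ascii"))


def set_system_clock(epoch_seconds):
    """Set the system clock only; the hardware clock is not touched.
    Requires root on a Unix system."""
    time.clock_settime(time.CLOCK_REALTIME, epoch_seconds)


print(parse_time_sync("Wed, 27 Jun 2012 10:13:29 -0400"))
```

The oauthed requests in 'C' would then only be attempted after
set_system_clock() has been called with the fetched value.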

Note, I considered skipping the 'time_sync_url' and simply providing a
'time_sync' value directly, like:
  time_sync: Wed, 27 Jun 2012 10:13:29 -0400
If that is seen as desirable it could probably be accommodated. The thing
that I do not like about it is that it writes that data to a local config
file, and obviously the stored time stamp very quickly becomes incorrect.
Hiding it behind a url that has dynamic and correct content removes that
problem.