[FFE] ec2-init should check for image upgrades

Reported by Gustavo Niemeyer on 2009-09-03
This bug affects 1 person
Affects              Status        Importance  Assigned to  Milestone
Ubuntu on EC2        Invalid       Wishlist    Unassigned
cloud-init (Ubuntu)  Fix Released  Wishlist    Scott Moser

Bug Description

Binary package hint: ec2-init

Amazon's EC2 and Eucalyptus do not offer a way for the user to know if the image he's using is out of date or not. With a simple mechanism which queries an API server at Canonical, and a hook inside the image itself, we'd be able to tell users at boot time that their images are out-of-date.

For this to work, we have to improve ec2-init with a simple feature which does the actual querying.

Changed in ec2-init (Ubuntu):
assignee: nobody → Dustin Kirkland (kirkland)
status: New → In Progress
Eric Hammond (esh) wrote :

I believe I have heard rather strong statements from Canonical that Ubuntu will never have "phone home" type data collection hooks to find out how many people are running Ubuntu. Though the overt intent of this proposal may be philanthropic, this sounds suspiciously like such a feature.

I would have loved to put such a hook in the Ubuntu images I've built for EC2 to find out the amount of usage they get, but did not do so because of the Ubuntu philosophy in this regard.

Gustavo Niemeyer (niemeyer) wrote :

Can you suggest a different mechanism that would work both in Eucalyptus and in Amazon, and that would allow people running customized images to be notified as well? Or are you saying that notifying people about new images being available isn't valuable?

Note that this isn't much different from the usual package upgrade checks which are performed by Ubuntu desktops already.

Changed in ec2-init (Ubuntu):
importance: Undecided → Wishlist
milestone: none → karmic-alpha-6
Dustin Kirkland  (kirkland) wrote :

I'm attaching a patch that solves this problem, but I'll await resolution of the concerns that Eric raises.

Personally, I don't see much difference between ec2-init curl'ing an image-upgrade-notification file and update-notifier's daily cronjob pulling information about available security updates on every Ubuntu desktop/server installed.

:-Dustin

Eric Hammond (esh) wrote :

I'm not saying it's not valuable. I'm not saying I personally oppose it.

I am saying that it appears to violate what I understood to be a promise from Ubuntu and that the implications should be considered carefully and the policy should be discussed publicly before the feature is implemented.

Gustavo Niemeyer (niemeyer) wrote :

Right, agreed.

What I find particularly interesting in this case is the concern that Canonical might be having some benefit in accessing *anonymous* data about upgrade inquiries in a public API. This really puzzles me, because EC2 is a proprietary system from Amazon, and Amazon is able to tell *exactly* how much disk space, disk I/O, network, CPU usage, which images, which kernels, etc., you are using. Please read that again: how much *you* are using. It surprises me that such a concern is being raised in this context.

Eric Hammond (esh) wrote :

When the issue of package updates was raised, I believe the answer included things like:
- The update is not immediately run on every Ubuntu installation
- The user has the ability to turn it off
- The user has the ability to decide which archives to use
[Again, I'm not fighting against this, but I think those who might oppose it should be given a chance to voice their concerns before it launches, rather than creating a ruckus afterwards.]

Eric Hammond (esh) wrote :

Gustavo: Note that the current enhancement proposal goes beyond just EC2, but appears to also apply to Ubuntu images users run with Eucalyptus on their own hardware at their own company.

Gustavo Niemeyer (niemeyer) wrote :

> Gustavo: Note that the current enhancement proposal goes beyond just
> EC2, but appears to also apply to Ubuntu images users run with
> Eucalyptus on their own hardware at their own company.

Eric, your point about the "Ubuntu philosophy" was specific to EC2.

About working on Eucalyptus, yes, we want this mechanism to work in Eucalyptus too, because we want people to know that there are new images of Ubuntu available with new fixes and features and they might want to upgrade (or perhaps they *really should* upgrade, in case of kernel issues). If you have a proposal on how to offer this feature on both EC2 and Eucalyptus, including simple image customizations, I'd like to know about it.

> I am saying that it appears to violate what I understood to be a promise
> from Ubuntu and that the implications should be considered carefully and
> the policy should be discussed publicly before the feature is implemented.

Again, as far as I'm aware, the context you've been working in was EC2, and as such I can't see what promise Ubuntu would offer you when Amazon is monitoring everything you are doing.

If your concern now is about Eucalyptus, there are many trivial ways to prevent this mechanism from working: simply disable it by removing the configuration file, block the connections to the public API, etc.

> Again, I'm not fighting against this, but I think those who might oppose it
> should be given a chance to voice their concerns before it launches,
> rather than creating a ruckus afterwards

I believe you are already exercising this chance. What else would you suggest?

Martin Pitt (pitti) wrote :

For the FF exception part, approved. It seems rather important to me to not keep outdated instances around with potentially unapplied security updates, and this is just a simple local notification which should help admins to keep their system current.

I agree that this is just the counterpart of the apt-get update cronjob. It ought not to send any system-specific data, of course, just ask the server about the most recent version. But I'll let you discuss/decide that.

Soren Hansen (soren) wrote :

> About working on Eucalyptus, yes, we want this mechanism to work in
> Eucalyptus too, because we want people to know that there are new images of
> Ubuntu available with new fixes and features and they might want to upgrade
> (or perhaps they *really should* upgrade, in case of kernel issues). If you
> have a proposal on how to offer this feature on both EC2 and Eucalyptus,
> including simple image customizations, I'd like to know about it.

It would seem to me that any decision we can make server side could be made
client side. It's little different from what apt-get does. It fetches a list of
what's available, compares it to what it's got installed and decides what it
wants to do. Compare this to sending the complete list of installed packages to
a server somewhere which then decides which upgraded packages it wants to give
you.

>> I am saying that it appears to violate what I understood to be a promise
>> from Ubuntu and that the implications should be considered carefully and
>> the policy should be discussed publicly before the feature is implemented.
> Again, as far as I'm aware, the context you've been working in was EC2,
> and as such I can't see what promise Ubuntu would offer you when Amazon
> is monitoring everything you are doing.

Just because I trust my data to Amazon does not mean that I'm comfortable
sharing my data with the rest of the world. That's the "trust" part.

I don't see how this is relevant at all, though. Regardless of what Amazon may
or may not be doing with your data, we should hold ourselves and Ubuntu to the
highest possible standard and as far as possible respect our users' privacy.

>> Again, I'm not fighting against this, but I think those who might oppose it
>> should be given a chance to voice their concerns before it launches,
>> rather than creating a ruckus afterwards
> I believe you are already exercising this chance. What else would you suggest?

I'm sure I don't understand what you're suggesting here. Are you suggesting
that this is uncontroversial enough that this bug (which /extremely/ few of the
target users will ever see) is enough exposure, and also that now that Eric has
seen it, there couldn't possibly be more to be said?

Dustin Kirkland  (kirkland) wrote :

It should *not* send any user- or system-specific data. I will be opposed to any utility that does this without user consent.

The ec2-upgrade-available utility will:

 1) Source /etc/ubuntu-image (if it exists) to obtain a URI.
  * Admins who want to disable this feature would simply remove this file.
  * Alternatively, admins could set this URI to a different or locally mirrored URI, if they desire.

 2) Use curl to retrieve $URI, and cache it locally in /var/run/ec2-upgrade-available, which is cleared on reboot

 3) URI would look something like:
URI=http://imagestore.canonical.com/api/images/ubuntu-9.10-amd64

 4) On login, if /var/run/ec2-upgrade-available exists, pam_motd will insert friendly, helpful text toward the bottom of the system MOTD (same as updates-available, release-upgrade, reboot-required on all other Ubuntu installations)

 5) The URI would be pulled once at system boot (in the background), and then every 24 hours, by a cronjob dynamically created at boot, to randomize the load on targeted servers.

I believe that this design provides a system that allows both EC2 and UEC users of Ubuntu images to be informed of updates in a manner that:
 * does not impact system boot (the first retrieval occurs in the background)
 * does not impact overall system performance (24 hour caching cronjob seems reasonable)
 * does not impact login (only prints cached message)
 * mimics the design of apt-get-update retrieval of security notifications, and update-motd notification of such information
 * can be disabled by removing /etc/ubuntu-image
 * can be customized to point to a different URI
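
A minimal sketch of steps 1, 2 and 5 above, written in Python rather than the shell-plus-curl utility described in this comment. It is not the attached patch: the function names, the `fetch` parameter, the temporary-file write and the /etc/cron.d-style line format are assumptions made for illustration. Only the two file paths and the `URI=` variable come from the description.

```python
import os
import random
import urllib.request


def read_image_uri(config_path="/etc/ubuntu-image"):
    """Return the URI from a shell-style `URI=...` line, or None.

    A missing file means the admin has disabled the check (step 1).
    """
    try:
        with open(config_path) as fh:
            for line in fh:
                line = line.strip()
                if line.startswith("URI="):
                    return line[len("URI="):].strip("'\"") or None
    except FileNotFoundError:
        return None
    return None


def refresh_cache(uri, cache_path="/var/run/ec2-upgrade-available", fetch=None):
    """Fetch `uri` and cache the response body (step 2). Return True on success.

    `fetch` is a hypothetical hook so the download can be replaced in tests.
    """
    if fetch is None:
        def fetch(u):
            with urllib.request.urlopen(u, timeout=10) as resp:
                return resp.read()
    try:
        body = fetch(uri)
    except OSError:
        # URLError and socket timeouts are OSError subclasses; on failure
        # any previously cached notice is left untouched.
        return False
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "wb") as fh:
        fh.write(body)
    os.replace(tmp_path, cache_path)  # swap in the new cache in one step
    return True


def daily_cron_line(command, rng=random):
    """Build a once-a-day cron entry at a random time of day (step 5).

    The six-field layout with a user column is the /etc/cron.d format;
    using it here is an assumption.
    """
    minute = rng.randrange(60)
    hour = rng.randrange(24)
    return f"{minute} {hour} * * * root {command}"
```

Picking the minute and hour at boot is what spreads requests from many instances across the day instead of having them all hit the server at the same time.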

Are there still any objections?

:-Dustin

Gustavo Niemeyer (niemeyer) wrote :

> Just because I trust my data to Amazon does not mean that I'm comfortable
> sharing my data with the rest of the world. That's the "trust" part.

This is a pretty misleading statement.

The mechanism does not send *your data* anywhere, and the query is a read-only query made against an API in Canonical's data center specifically, including just the image version to see if it's out-of-date or not, and this image version is the same for all users of the same image. This is very different from "sharing my data with the rest of the world".

The comparison with EC2 was made specifically because it sounds silly to be worried about such a mechanism when Amazon knows so much about exactly what is being done and by whom.

>> I believe you are already exercising this chance. What else would you suggest?
> I'm sure I don't understand what you're suggesting here. Are you suggesting
> that this is uncontroversial enough that this bug (which /extremely/ few of the
> target users will ever see) is enough exposure, and also that now that Eric has
> seen it, there couldn't possibly be more to be said?

I'm merely suggesting you two could help me a little bit, so that we can actually get something done together.

If you want to publish this elsewhere and ask for opinions, please do so. If you want to suggest a different approach that would enable people to know there are new images (customized or not) in EC2 and Eucalyptus which might be realistically put in place, please do so as well. If you think this is a terrible idea and that it should be shot down and replaced by nothing because no one cares about image upgrades, speak up.

Soren Hansen (soren) wrote :

>> Just because I trust my data to Amazon does not mean that I'm comfortable
>> sharing my data with the rest of the world. That's the "trust" part.
> This is a pretty misleading statement.

Without the context, yes.

> The mechanism does not send *your data* anywhere, and the query is a
> read-only query made against an API in Canonical's data center specifically,
> including just the image version to see if it's out-of-date or not, and this
> image version is the same for all users of the same image. This is very
> different from "sharing my data with the rest of the world".

I did not make the comparison.

> The comparison with EC2 was made specifically because it sounds silly to be
> worried about such a mechanism when Amazon knows so much about exactly what
> is being done and by whom.

Right. There's no way they could provide the service they do without having a
way of knowing these things. That's the nature of running virtual machines on
someone else's hardware. There most definitely is a way to run a free
operating system on said hardware without feeding information back to the
provider of said operating system.

> If you want to publish this elsewhere and ask for opinions, please do so.

I'd prefer it if there was nothing to discuss.

> If you want to suggest a different approach that would enable people to know
> there are new images (customized or not) in EC2 and Eucalyptus which might be
> realistically put in place, please do so as well.

A simple directory structure holding information about the current image,
kernel, and ramdisk IDs, mirrored onto our release or archive mirrors.
This would provide instances information about what is current. They can fetch
this, compare to themselves and decide that they're out of date.

Something like this:

  http://people.canonical.com/~soren/ec2-version-query/current/

...only with different identifiers. The mirroring would provide the same level of
privacy as apt-get.
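
The client-side comparison described above could look roughly like this in Python. The listing format (one `name value` pair per line) and the function names are hypothetical; the directory linked above may be laid out differently. The point is only that the instance downloads the published values and does the comparison locally.

```python
def parse_current(listing):
    """Parse a hypothetical 'name value' listing of current identifiers."""
    current = {}
    for line in listing.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition(" ")
        value = value.strip()
        if value:
            current[name] = value
    return current


def outdated_fields(local, current):
    """Return the names whose local value differs from the published one.

    Names the instance does not know about are ignored rather than
    reported as out of date.
    """
    return sorted(
        name for name, value in current.items()
        if name in local and local[name] != value
    )
```

For example, with made-up identifiers, an instance holding `{"ami": "ami-00000001"}` compared against a listing containing `ami ami-00000002` would report `["ami"]` as out of date.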

> If you think this is a terrible idea and that it should be shot down and
> replaced by nothing because no one cares about image upgrades, speak up.

Can we keep the false dichotomies to a minimum, please?

Gustavo Niemeyer (niemeyer) wrote :

> Something like this:
>
> http://people.canonical.com/~soren/ec2-version-query/current/

That's really not much different in terms of privacy than what is being suggested in this bug. To find out whether it's out of date, the image will have to look at a URI like this:

   http://people.canonical.com/~soren/ec2-version-query/current/eu-west-1/i386/hardy/ami

Then whoever is maintaining the service will see that *someone* has a hardy-i386 machine. I don't really think that disclosing that this someone is running version A or B of the given image is that much different.

> Can we keep the false dichotomies to a minimum, please?

I'm just trying to move the discussion into something constructive, Soren. Sometimes it's hard to word things in a way that pleases everyone.

Dustin Kirkland (kirkland) wrote :

On Fri, Sep 4, 2009 at 12:50 PM, Gustavo Niemeyer <email address hidden> wrote:
> Then whoever is maintaining the service will see that *someone* has a
> hardy-i386 machine. I don't really think that disclosing that this
> someone is running version A or B of the given image is that much
> different.

I don't see where "hardy-i386" should bother anyone too much.

That information is usually included in HTTP_USER_AGENT, which is
transmitted in most any browser request. Your hardy box will transmit
this information to any server you request information from.

 * Firefox:
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.5) Gecko/2008121621
Ubuntu/8.04 (hardy) Firefox/3.0.5

 * Curl:
curl/7.18.0 (i686-pc-linux-gnulp) libcurl/7.18.0 OpenSSL/0.9.8g
zlib/1.2.3.3 libidn/1.1

 * Wget
Wget/1.10.2

That said, Soren's suggestion of something even more generic such as
"current" works for me too...

:-Dustin

Matt Zimmerman (mdz) wrote :

So long as it is anonymous, and is intended to provide a useful service (i.e. notifying the user of updates), I don't see a problem with this feature, or with using the data from it to estimate how many instances are running in the wild. We measure HTTP traffic on the update servers to estimate how many Ubuntu systems are running as well.

Eric Hammond (esh) on 2009-09-04
Changed in ubuntu-on-ec2:
importance: Undecided → Wishlist
status: New → In Progress
Eric Hammond (esh) wrote :

Random response points:

- My main statement is that I think this feature should be discussed on a more public forum. I don't know if this should be ubuntu-ec2, ec2ubuntu, a technical board meeting, or whatever. I'm simply reporting a conflict in what I've heard in the past and what seems to be proposed here.

- I would rather not be put in the position of arguing the view points of others, though I think they deserve a chance to be heard. Besides caring about the community's needs, I also care about a backlash against Ubuntu if the feature is launched without appropriate discussion and community support.

- It is possible that I entirely misunderstood what I thought was an Ubuntu promise not to implement dial-home features like you find in software from other vendors. (I also thought the name "Ubuntu" was not going to be used for commercial software products and it turns out I was wrong there ;-) )

- When you ask a Canonical web server for currency information, you are not just passing a release number and user agent. You are also informing Canonical of the IP address where you are running the image (or gateway). I imagine that would be the biggest privacy concern.

- I'm not just talking about EC2. I think folks who run Eucalyptus on their own servers would be even more likely to object to having their instances contact Canonical and tell them they are running Eucalyptus in their company, and how often.

- It sounded to me like this feature would immediately contact Canonical on boot, making it difficult to disable even for people who knew about it and wanted to.

- It sounds like Soren has a great idea for integrating the check and notification features into a standard Ubuntu package so there is nothing happening beyond a normal upgrade to get the information through the normal channels. This would allow the user to decide who and when to notify about the fact that they are running Ubuntu just as they do today.

- See my first point. That's my point. All the rest of these points are just distractions.

Scott Moser (smoser) on 2009-09-08
tags: added: ec2-images uec-images
Scott Moser (smoser) on 2009-09-08
Changed in ubuntu-on-ec2:
status: In Progress → Invalid
Changed in ec2-init (Ubuntu):
milestone: karmic-alpha-6 → none
Dustin Kirkland  (kirkland) wrote :

Patch exists, has been tested, and I believe it meets the requirements raised by all parties, for the sake of anonymity, usefulness, etc.

However, seeing as we do not have a consensus at this point, I'm downgrading this bug from "In Progress" to "Triaged", unassigning myself, and un-targeting it from Alpha6.

:-Dustin

Changed in ec2-init (Ubuntu):
assignee: Dustin Kirkland (kirkland) → nobody
status: In Progress → Triaged
Scott Moser (smoser) wrote :

I'm marking this fix-released. I just logged into a daily build from 20100315, and it says in the MOTD:

A newer build of the Ubuntu lucid server image is available.
It is named 'beta1' and has build serial '20100317'.
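
How cloud-init produces this notice is not shown in this report. As a rough Python illustration only, assuming build serials are date-like strings with an optional dotted suffix (for example `20100317` or `20100317.1`), a check that yields the message above might look like this. The function names and the suffix handling are assumptions.

```python
def serial_key(serial):
    """Turn a build serial such as '20100317' or '20100317.1' into a sortable tuple."""
    return tuple(int(part) for part in serial.split("."))


def newer_build_notice(local_serial, latest_serial, release, build_name):
    """Return the MOTD text if the latest build is newer than the local one, else None."""
    if serial_key(latest_serial) <= serial_key(local_serial):
        return None
    return (
        f"A newer build of the Ubuntu {release} server image is available.\n"
        f"It is named '{build_name}' and has build serial '{latest_serial}'."
    )
```

Called as `newer_build_notice("20100315", "20100317", "lucid", "beta1")`, this returns the two lines quoted above; with equal serials it returns `None`, so nothing would be added to the MOTD.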

affects: ec2-init (Ubuntu) → cloud-init (Ubuntu)
Changed in cloud-init (Ubuntu):
assignee: nobody → Scott Moser (smoser)
status: Triaged → Fix Released