ec2-init selects us-east-1 mirror when running in us-west-1 region

Reported by Scott Moser on 2009-12-08
This bug affects 2 people
Affects            Importance  Assigned to
ec2-init (Ubuntu)  Medium      Scott Moser
Karmic             High        Scott Moser
Lucid              Medium      Scott Moser

Bug Description

Binary package hint: ec2-init

ec2-init has the following code:

| def get_mirror_from_availability_zone(self, availability_zone):
|     if availability_zone.startswith("us"):
|         return 'http://us.ec2.archive.ubuntu.com/ubuntu/'
|     elif availability_zone.startswith("eu"):
|         return 'http://eu.ec2.archive.ubuntu.com/ubuntu/'
|
|     return 'http://archive.ubuntu.com/ubuntu/'

For instances in us-west-1, the above incorrectly sets the archive to us.ec2.archive.ubuntu.com, which is in the us-east-1 region.

Currently that host is not accessible from inside the us-west-1 region.
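
To make the failure concrete, here is a minimal standalone sketch of the prefix check quoted above, rewritten as a plain function. The name `old_mirror_for_zone` is made up for illustration; the real code is a method. Any zone beginning with 'us', including those in us-west-1, maps to the us-east-1 mirror:

```python
def old_mirror_for_zone(availability_zone):
    """Standalone copy of the prefix-based selection quoted above."""
    if availability_zone.startswith("us"):
        return "http://us.ec2.archive.ubuntu.com/ubuntu/"
    elif availability_zone.startswith("eu"):
        return "http://eu.ec2.archive.ubuntu.com/ubuntu/"
    return "http://archive.ubuntu.com/ubuntu/"


# Both US regions collapse onto the same mirror.
for zone in ("us-east-1a", "us-west-1b", "eu-west-1a"):
    print(zone, "->", old_mirror_for_zone(zone))
```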

ProblemType: Bug
Architecture: amd64
Date: Tue Dec 8 20:24:36 2009
DistroRelease: Ubuntu 10.04
Ec2AMI: ami-133c6d56
Ec2AMIManifest: ubuntu-images-testing-us-west-1/ubuntu-lucid-daily-amd64-server-20091207.manifest.xml
Ec2Kernel: aki-0d3c6d48
Package: ec2-init 0.4.999-0ubuntu7
PackageArchitecture: all
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: User Name 2.6.32-300.1-ec2
SourcePackage: ec2-init
Uname: Linux 2.6.32-300-ec2 x86_64

================
SRU Report (ec2-init):

Background: The UEC images have code that runs at "first boot" and customizes an image to the region in which it is being run. One of the customizations is to attempt to select an archive mirror in the same region that the instance is running in. Canonical runs 3 archive mirrors in ec2, {us-west-1,us,eu}.ec2.archive.ubuntu.com. The us and eu mirrors have aliases "us-east-1" and "eu-west-1". The short names are largely historic now. To limit cost and access, the ec2 archive mirrors are configured to only allow systems inside their region to access them.

  An instance is assigned an 'availability zone' within a region. These availability zones are currently consistently named '<region>[a-z]' (example: us-east-1a, eu-west-1d). The previous logic was to select 'us.ec2.archive.ubuntu.com' if the availability zone started with 'us' and 'eu.ec2.archive.ubuntu.com' if it started with 'eu'. The failure is that availability zones in both us-east-1 and us-west-1 start with 'us', so both would select the 'us.ec2.archive.ubuntu.com' mirror.
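
As an illustration of the '<region>[a-z]' convention, the sketch below derives a region from an availability zone name and rejects names that do not fit. The regular expression and the function name `region_from_zone` are assumptions made for this example, not part of ec2-init; they only encode the two example shapes given above.

```python
import re

# Assumed shape of a zone name: '<region>[a-z]', where <region> looks like
# 'us-east-1' or 'eu-west-1'. This pattern is illustrative only.
ZONE_RE = re.compile(r"^([a-z]+-[a-z]+-[0-9]+)[a-z]$")


def region_from_zone(availability_zone):
    """Return the region part of a zone name, or None if it does not match."""
    match = ZONE_RE.match(availability_zone)
    return match.group(1) if match else None
```

With this, 'us-east-1a' and 'us-west-1b' yield different regions, while a user-defined zone name such as 'myzone' yields None and can be sent to the default archive.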

Impact: Instances started in the us-west-1 region incorrectly select the 'us.ec2.archive.ubuntu.com' mirror. That mirror is not accessible outside of the us-east-1 region, and 'apt-get update' or 'install' cannot be run without manual modification of /etc/apt/sources.list.

Changes:
 Please see the attachment [http://launchpadlibrarian.net/36703733/bug-494185.diff]. The code change is the final hunk of that patch. We modify the 'get_mirror_from_availability_zone' method. The new code selects a mirror based on the naming convention of availability zones. A failure or exception in the logic will select 'archive.ubuntu.com' as the mirror. An exception in the previous code would end up with no /etc/apt/sources.list.

Test Case:
- Start an instance in us-west-1 region
- ssh to instance, run 'apt-get update'.
- grep "us-west-1" /etc/apt/sources.list
  # you *should* see entries for us-west-1. Currently, the list shows 'us.ec2.archive.ubuntu.com'
- run 'apt-get update' to verify that the mirrors are functional.
- To verify there is no regression, we should a.) test multiple instances b.) verify that 'apt-get update' does not regress on instances in us-east-1 and eu-west-1.

Regression potential:
- The biggest cause for regression is that we are changing logic inside the image. Previously it was all self contained. The change makes the mirror selection depend on dns resolution of a hostname that is dependent upon meta-data available to the image.
This leaves two possible failure paths:
a.) False positive: If the availability zone is named such that a DNS entry *does* exist in the .ec2.archive.ubuntu.com subdomain, but there is no mirror running there, the code will falsely write /etc/apt/sources.list to connect to that mirror. This is somewhat unlikely as the availability zones are currently consistently named, and Canonical has control over the ec2.archive.ubuntu.com domain.
b.) False negative: Failure in the logic (dns resolution timeout, or temporary un-availability) could result in failure to select the correct mirror. This is mitigated by a selection of 'archive.ubuntu.com' on Exception or failure.

Notes:
- The changes suggested here also reduce the likelihood that images run in UEC incorrectly select an ec2 mirror. There is still a possibility of that, but it has been reduced.
Previously the code in UEC would fail if the user-defined availability zone started with 'us' or 'eu'. Now, the possibility for error is reduced to availability zone names where the following is a valid hostname:
   "%s.ec2.archive.ubuntu.com" % availability_zone[:-1]
=====

Scott Moser (smoser) wrote :
Changed in ec2-init (Ubuntu):
importance: Undecided → Medium
Scott Moser (smoser) wrote :

The workaround is to run the following on the booted image:
sudo sed -i 's,http://us.ec2.archive.ubuntu.com/ubuntu/,http://archive.ubuntu.com/ubuntu/,g' /etc/apt/sources.list

Eric Hammond (esh) wrote :

Scott: While we wait for (or work on) new images which do not have this problem, the major pain could be eliminated by asking the IS team to open up the restrictions on us.ec2.archive.ubuntu.com so that it can be accessed from us-west-1. I don't expect it would cost that much given the fairly low cost of network traffic on EC2, the fairly small volume of each update, and the fairly low usage of us-west-1 at this point.

The alternative is that every user who tries an official Ubuntu AMI on us-west-1 is unable to upgrade unless they happen to find this bug report with the workaround.

Eric Hammond (esh) wrote :

Note that this is going to be an issue every time that Amazon opens up a new region and images are migrated there. The solution in the images and in the IS supported servers should take this into account.

On Wed, 9 Dec 2009, Eric Hammond wrote:

> Note that this is going to be an issue every time that Amazon opens up a
> new region and images are migrated there. The solution in the images
> and in the IS supported servers should take this into account.

Well, yes, but we can definitely do this better.
a.) I could have realized this was going to happen and modified the images
bound for us-west-1 before uploading.
b.) the logic could be:
  if host_exists(region + ".ec2.archive.ubuntu.com")
     use region + ".ec2.archive.ubuntu.com"
  else
     use archive.ubuntu.com

I'll ping IS again to see if we can't open it up at least until images are
refreshed.

Scott

Scott Moser (smoser) on 2009-12-09
Changed in ec2-init (Ubuntu):
assignee: nobody → Scott Moser (smoser)
Scott Moser (smoser) wrote :

I'm thinking right now to replace get_mirror_from_availability_zone with:

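| import socket  # required at module level
|
| def get_mirror_from_availability_zone(self, availability_zone):
|     # availability_zone is like 'us-west-1b' or 'eu-west-1a'
|     try:
|         host = "%s.ec2.archive.ubuntu.com" % availability_zone[:-1]
|         # raises if the regional mirror hostname does not resolve
|         socket.getaddrinfo(host, None, 0, socket.SOCK_STREAM)
|         return 'http://%s/ubuntu/' % host
|     except Exception:
|         # any failure falls back to the main archive
|         return 'http://archive.ubuntu.com/ubuntu/'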

Overall, it takes a much more specific "hit" to select an ec2 mirror. That should greatly reduce the chance for false positives.

The chance for error is then:
a.) availability zone names change form (ie, no longer '<region>[a-z]')
b.) there are images in a new region before a mirror is up, but the
    dns entry already exists

I think it's reasonably good; the only thing that concerns me is the possibility of getaddrinfo hanging.
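
On the getaddrinfo hang concern: one possible way to bound the wait is to run the lookup in a daemon thread and give up after a timeout. This is only a sketch of that idea, not what the patch does. The names `resolves_within` and `pick_mirror`, the 5-second default and the injectable `resolver` parameter are all assumptions added for illustration and testing, and the code is written for Python 3.

```python
import socket
import threading

FALLBACK = "http://archive.ubuntu.com/ubuntu/"


def resolves_within(host, timeout, resolver=socket.getaddrinfo):
    """Return True only if `host` resolves before `timeout` seconds pass."""
    outcome = []

    def worker():
        try:
            resolver(host, None, 0, socket.SOCK_STREAM)
            outcome.append(True)
        except Exception:
            outcome.append(False)

    # Daemon thread: a lookup that never returns cannot block shutdown.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    # An empty list means the lookup is still running, so treat it as a miss.
    return bool(outcome) and outcome[0]


def pick_mirror(availability_zone, timeout=5.0, resolver=socket.getaddrinfo):
    """Pick the regional mirror if its name resolves in time, else fall back."""
    try:
        host = "%s.ec2.archive.ubuntu.com" % availability_zone[:-1]
    except TypeError:
        # e.g. availability_zone is None because meta-data was unavailable
        return FALLBACK
    if resolves_within(host, timeout, resolver):
        return "http://%s/ubuntu/" % host
    return FALLBACK
```

The trade-off is that a slow but eventually successful lookup ends up on archive.ubuntu.com, which is the same "false negative" path that is already accepted for resolution failures.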

tags: added: iso-testing
Jos Boumans (jib) on 2009-12-11
description: updated
Jonathan Davies (jpds) on 2009-12-11
Changed in ec2-init (Ubuntu):
status: New → In Progress
Thierry Carrez (ttx) on 2009-12-11
Changed in ec2-init (Ubuntu Lucid):
milestone: none → lucid-alpha-2
Scott Moser (smoser) wrote :

The lucid debdiff attached in comment 7 works for karmic. We should get this into karmic proposed.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ec2-init - 0.4.999-0ubuntu8

---------------
ec2-init (0.4.999-0ubuntu8) lucid; urgency=low

  * fix mirror selection for us-west-1 (LP: #494185)
 -- Scott Moser <email address hidden> Fri, 11 Dec 2009 15:12:19 -0500

Changed in ec2-init (Ubuntu Lucid):
status: In Progress → Fix Released
Jonathan Davies (jpds) on 2009-12-14
Changed in ec2-init (Ubuntu Karmic):
milestone: none → karmic-updates
status: New → Triaged
importance: Undecided → Medium
Scott Moser (smoser) wrote :

Marking this in-progress as the lucid patch will apply cleanly to karmic deb. We just need to get it uploaded.

I'm marking it as 'High' importance here as it is present in released images.

Changed in ec2-init (Ubuntu Karmic):
assignee: nobody → Scott Moser (smoser)
importance: Medium → High
status: Triaged → In Progress
Scott Moser (smoser) on 2009-12-14
description: updated
Scott Moser (smoser) wrote :

I verified that the following images have the 0ubuntu8 version of ec2-init and that 'apt-get update' functions properly on all of them (tested in each region):

ami-7bedc60f ubuntu-images-testing-eu/ubuntu-lucid-daily-amd64-server-20091215.manifest.xml
ami-fc41a395 ubuntu-images-testing-us/ubuntu-lucid-daily-i386-server-20091215.manifest.xml
ami-832170c6 ubuntu-images-testing-us-west-1/ubuntu-lucid-daily-i386-server-20091215.manifest.xml

So, fixed-verified in lucid.

Chuck Short (zulcss) wrote :

I have attached karmic-proposed debdiff that has been uploaded.

Regards
chuck

Martin Pitt (pitti) wrote :

Any reason why you added simple-patchsys.mk to debian/rules and then patched it inline?

No harm done, anyway, just looks a bit curious.

Changed in ec2-init (Ubuntu Karmic):
status: In Progress → Fix Committed
tags: added: verification-needed

Accepted ec2-init into karmic-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

> Any reason why you added simple-patchsys.mk to debian/rules and then
> patched it inline?

The addition of 'simple-patchsys.mk' was more for the hardy backport than the
karmic version. Previously (as you can see) there was no patch system in
place.

I added the simple-patchsys.mk so that I could easily base the hardy
version off [1] of the karmic version.

--
[1] https://launchpad.net/~ubuntu-on-ec2/+archive/ppa/+sourcepub/898230/+listing-archive-extra

Eric Hammond (esh) wrote :

It would be nice to get new Karmic AMIs published with this fix. Users are still unable to apt-get update on the Karmic AMI for us-west-1.

Scott Moser (smoser) wrote :

From a help-ticket:
"The us-east-1 mirror will now accept connections from 204.236.128.0/18,
which is the address space currently assigned to us-west-1. Resolving."

This should now be "fixed" for karmic images in us-west-1. I will still work on getting the Karmic AMIs refreshed though.

Scott Moser (smoser) wrote :

Martin,
  I've tested the -proposed ec2-init (0.4.999-0ubuntu7.1) and it correctly fixes this bug.
  Please move this to updates.

Scott Moser (smoser) on 2010-01-19
tags: added: verification-done
removed: verification-needed
Steve Langasek (vorlon) wrote :

SRU verification process doesn't allow for the uploader of the SRU being the sole verifier; setting back to verification-needed. Scott, can someone else test out the fix too?

tags: added: verification-needed
removed: verification-done
Scott Moser (smoser) wrote :

On Tue, 19 Jan 2010, Steve Langasek wrote:

> SRU verification process doesn't allow for the uploader of the SRU being
> the sole verifier; setting back to verification-needed. Scott, can
> someone else test out the fix too?

Can someone else please verify this for me? Here's what I did to verify:
- Boot an instance of karmic 'testing' ami on ec2.
  I booted the 20100118 i386 build ubuntu-karmic-daily-i386-server-20100118:
  - eu-west-1 ami-47cfe433
  - us-east-1 ami-0312ff6a
  - us-west-1 ami-3b2c7d7e
  For example:
  $ ec2-run-instances --region eu-west-1 ami-47cfe433
- ssh to system and enable proposed
  $ printf "%s %s %s %s\n" \
   deb http://archive.ubuntu.com/ubuntu/ karmic-proposed main |
   sudo tee -a /etc/apt/sources.list
- get new ec2-init
  $ sudo apt-get update && sudo apt-get install ec2-init
- remove the lock file that says setting of 'defaults' has already run
  and then reboot
  $ sudo rm -f /var/lib/ec2/ec2-defaults.ever
  $ sudo reboot
- reboot and re-connect
  verify that /etc/apt/sources.list has a valid mirror
  the region should be present in the mirror domain name, and
  'apt-get update' should function.
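
The "region should be present in the mirror domain name" check can also be scripted. Below is a minimal sketch, assuming simple 'deb URI suite components' lines without bracketed options; the helper names `archive_hosts` and `uses_regional_mirror` are hypothetical and not part of ec2-init.

```python
from urllib.parse import urlparse


def archive_hosts(sources_list_text):
    """Return the set of hostnames used by deb / deb-src lines."""
    hosts = set()
    for line in sources_list_text.splitlines():
        fields = line.split()
        # Skips blanks and comments; fields[1] is assumed to be the URI.
        if len(fields) >= 2 and fields[0] in ("deb", "deb-src"):
            hosts.add(urlparse(fields[1]).hostname)
    return hosts


def uses_regional_mirror(sources_list_text, region):
    """True if some entry points at '<region>.ec2.archive.ubuntu.com'."""
    expected = "%s.ec2.archive.ubuntu.com" % region
    return expected in archive_hosts(sources_list_text)
```

On an instance this could be called as `uses_regional_mirror(open("/etc/apt/sources.list").read(), "us-west-1")`. It only checks which mirror is named; running 'apt-get update' is still needed to confirm the mirror is reachable.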

Eric Hammond (esh) wrote :

I verified this fix in my personal EC2 account using the 64-bit instances in eu-west-1, us-east-1, and us-west-1.

# For each of these three groups, run the remainder of the commands

  region=eu-west-1
  amiid=ami-4fcfe43b
  type=m1.large

  region=us-west-1
  amiid=ami-c12c7d84
  type=m1.large

  region=us-east-1
  amiid=ami-bb12ffd2
  type=m1.large

# Add an ssh keypair for the region

  ec2-add-keypair \
    --region $region \
    ec2-$region \
    > ~/.ssh/ec2-$region.pem

# Run an instance and connect to it

  instanceid=$(ec2-run-instances \
    --region $region \
    --instance-type $type \
    --key ec2-$region \
    $amiid |
    egrep ^INSTANCE | cut -f2)
  echo "instanceid=$instanceid"

  while host=$(ec2-describe-instances --region $region "$instanceid" |
    egrep ^INSTANCE | cut -f4) && test -z $host; do echo -n .; sleep 1; done
  echo host=$host

  ssh \
    -i ~/.ssh/ec2-$region.pem \
    ubuntu@$host

# Enable proposed and get fixed ec2-init

  printf "%s %s %s %s\n" \
     deb http://archive.ubuntu.com/ubuntu/ karmic-proposed main |
     sudo tee -a /etc/apt/sources.list

  sudo apt-get update && sudo apt-get install ec2-init

# Let defaults be run on next reboot and reboot

  sudo rm -f /var/lib/ec2/ec2-defaults.ever

  sudo reboot

# Connect to the instance again

  ssh \
    -i ~/.ssh/ec2-$region.pem \
    ubuntu@$host

# Verify archive host matches $region and test accessibility

  cat /etc/apt/sources.list

  sudo apt-get update && sudo apt-get upgrade -y

  exit

# Kill the instance

  ec2-terminate-instances --region $region $instanceid

Here is a sample /etc/apt/sources.list from the instance in us-west-1 (which had the original problem):

deb http://eu-west-1.ec2.archive.ubuntu.com/ubuntu/ karmic main universe
deb-src http://eu-west-1.ec2.archive.ubuntu.com/ubuntu/ karmic main universe
deb http://eu-west-1.ec2.archive.ubuntu.com/ubuntu/ karmic-updates main universe
deb-src http://eu-west-1.ec2.archive.ubuntu.com/ubuntu/ karmic-updates main universe
deb http://security.ubuntu.com/ubuntu karmic-security main universe
deb-src http://security.ubuntu.com/ubuntu karmic-security main universe

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ec2-init - 0.4.999-0ubuntu7.1

---------------
ec2-init (0.4.999-0ubuntu7.1) karmic-proposed; urgency=low

  * fix mirror selection for us-west-1 (LP: #494185)
 -- Scott Moser <email address hidden> Fri, 11 Dec 2009 15:12:19 -0500

Changed in ec2-init (Ubuntu Karmic):
status: Fix Committed → Fix Released
Changed in ec2-init (Ubuntu Karmic):
assignee: Scott Moser (smoser) → mashaobing1 (mashaobing1)
Martin Pitt (pitti) on 2010-04-20
Changed in ec2-init (Ubuntu Karmic):
assignee: mashaobing1 (mashaobing1) → Scott Moser (smoser)