fsck / dirty filesystem on instance is death

Reported by Scott Moser on 2012-02-08
This bug affects 2 people
Affects               Importance   Assigned to
cloud-init (Ubuntu)   High         Ben Howard
Precise               High         Ben Howard
Quantal               High         Ben Howard

Bug Description

As we saw in bug 898373, if a filesystem needs manual intervention for an fsck, then boot completely stops. mountall will wait indefinitely on someone attending to this broken filesystem, and in a cloud (at least in EC2) with no console access, that means the system will never boot.

I discussed this in #ubuntu-devel with slangasek at http://irclogs.ubuntu.com/2012/02/02/%23ubuntu-devel.html#t16:54.

He suggested marking the disk 'noauto' or 'nobootwait', but that is not really what someone would want to do.
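
For reference, here is a minimal sketch of what that suggestion amounts to: adding 'nobootwait' to the options field of an /etc/fstab entry, so that mountall does not hold up boot for that filesystem. The helper name add_mount_option and the sample entry are hypothetical, chosen only for illustration.

```python
def add_mount_option(fstab_line: str, option: str) -> str:
    """Return fstab_line with option appended to its mount-options field."""
    stripped = fstab_line.strip()
    # Leave comments and blank lines untouched.
    if not stripped or stripped.startswith("#"):
        return fstab_line
    fields = stripped.split()
    # An fstab entry needs at least: device, mount point, type, options.
    if len(fields) < 4:
        return fstab_line
    options = fields[3].split(",")
    if option not in options:
        options.append(option)
    fields[3] = ",".join(options)
    return "\t".join(fields)


if __name__ == "__main__":
    # Hypothetical entry for a secondary data volume.
    entry = "/dev/xvdb /mnt ext3 defaults 0 2"
    print(add_mount_option(entry, "nobootwait"))
```

This only rewrites a single line of text; it does not touch /etc/fstab itself, and it does nothing for a damaged root filesystem, which is part of why the option is not a real answer here.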

The ideal fix in some sense would be to start an ssh daemon, and force the user into a screen session where they could fix it.

To that, slangasek said:
  smoser: sure, either fixing /etc/init/mountall-shell.conf to support this, or diverting it in a cloud-specific job, seems reasonable there

Related bugs:
 * bug 898373 : fsck.ext3: Device or resource busy while trying to open /dev/xvda2

tags: added: rls-p-tracking
James Page (james-page) on 2012-04-03
Changed in cloud-init (Ubuntu):
assignee: nobody → Ben Howard (utlemming)
James Page (james-page) on 2012-04-24
Changed in cloud-init (Ubuntu Precise):
status: Triaged → Won't Fix
Changed in cloud-init (Ubuntu Quantal):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Ben Howard (utlemming)
tags: added: rls-q-incoming
removed: rls-p-tracking
Changed in cloud-init (Ubuntu Quantal):
milestone: none → quantal-alpha-3
Ursula Junque (ursinha) wrote :

Removed the rls-q-incoming tag, according to the process.

tags: removed: rls-q-incoming
Ben Howard (utlemming) wrote :

For this bug, later in the cycle we'll be publishing a "rescue" EBS volume to boot from. All the other options would involve a dramatic change of behavior that would not be desirable for users, like a "rescue SSH console" or some other nonsense.

The path forward will be:
  - Canonical will publish a "rescue volume" which is a cloud image with a label of "RESCUE_VOL".
  - Cloud-Init on boot will recognize this rescue volume and switch over to it (Cloud-Init launches before the general mount)
  - Users will then rescue their system the same way

The rescue volumes should land sometime this cycle.
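
As a rough illustration of the detection step only, the sketch below looks for a filesystem labelled "RESCUE_VOL" through the /dev/disk/by-label symlinks that udev normally provides. This is not Cloud-Init's implementation: the function name find_volume_by_label is hypothetical, and how the switch to the rescue volume would actually happen is not specified here.

```python
import os
from typing import Optional

RESCUE_LABEL = "RESCUE_VOL"  # label named in the plan above


def find_volume_by_label(
    label: str, by_label_dir: str = "/dev/disk/by-label"
) -> Optional[str]:
    """Return the device path for a filesystem label, or None if absent."""
    link = os.path.join(by_label_dir, label)
    if not os.path.exists(link):
        return None
    # udev exposes labels as symlinks; resolve to the real device node.
    return os.path.realpath(link)


if __name__ == "__main__":
    device = find_volume_by_label(RESCUE_LABEL)
    if device is None:
        print("no rescue volume attached; continuing normal boot")
    else:
        print(f"rescue volume found at {device}")
```

The by_label_dir parameter exists so the lookup can be pointed at a scratch directory for testing; on a real instance the default path assumes udev has already populated it by the time the check runs.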

Changed in cloud-init (Ubuntu Quantal):
status: Triaged → Won't Fix