fsck / dirty filesystem on instance is death

Bug #928990 reported by Scott Moser
This bug affects 2 people
Affects                 Status     Importance  Assigned to  Milestone
cloud-init (Ubuntu)     Won't Fix  High        Unassigned
  Precise               Won't Fix  High        Unassigned
  Quantal               Won't Fix  High        Unassigned

Bug Description

As we saw in bug 898373, if a filesystem needs manual intervention for an fsck, then boot completely stops. mountall will wait indefinitely on someone attending to this broken filesystem, and in a cloud (at least in EC2) with no console access, that means the system will never boot.

I discussed this in #ubuntu-devel with slangasek at http://irclogs.ubuntu.com/2012/02/02/%23ubuntu-devel.html#t16:54.

He suggested marking the disk 'noauto' or 'nobootwait' in /etc/fstab, but that is not really what someone would want to do.
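For reference, the 'nobootwait' workaround amounts to adding that option to the fourth field of the relevant /etc/fstab entry. Below is a minimal sketch of that edit in Python. The helper name `add_mount_option` and the sample entries are hypothetical, made up for illustration; this is not something cloud-init or mountall ships.

```python
def add_mount_option(fstab_text, mount_point, option="nobootwait"):
    """Return fstab_text with `option` appended to the entry for mount_point.

    Hypothetical helper for illustration only. Comments, blank lines and
    entries for other mount points are passed through unchanged.
    """
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_comment = line.lstrip().startswith("#")
        # An fstab entry has at least: device, mount point, fs type, options.
        if is_comment or len(fields) < 4 or fields[1] != mount_point:
            out.append(line)
            continue
        opts = fields[3].split(",")
        if option not in opts:
            opts.append(option)
        fields[3] = ",".join(opts)
        # The rewritten entry is re-joined with tabs.
        out.append("\t".join(fields))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Sample entries are assumptions, not taken from a real instance.
    sample = (
        "# <device> <mount> <type> <options> <dump> <pass>\n"
        "/dev/xvda1 / ext3 defaults 0 1\n"
        "/dev/xvdb /mnt ext3 defaults 0 2\n"
    )
    print(add_mount_option(sample, "/mnt"), end="")
```

With 'nobootwait' set, boot no longer blocks on that filesystem, which is why it only works around the problem: the filesystem may simply be missing once the instance is up.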

The ideal fix in some sense would be to start an ssh daemon, and force the user into a screen session where they could fix it.

To that, slangasek said:
  smoser: sure, either fixing /etc/init/mountall-shell.conf to support this, or diverting it in a cloud-specific job, seems reasonable there

Related bugs:
 * bug 898373 : fsck.ext3: Device or resource busy while trying to open /dev/xvda2

tags: added: rls-p-tracking
James Page (james-page)
Changed in cloud-init (Ubuntu):
assignee: nobody → Ben Howard (utlemming)
James Page (james-page)
Changed in cloud-init (Ubuntu Precise):
status: Triaged → Won't Fix
Changed in cloud-init (Ubuntu Quantal):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Ben Howard (utlemming)
tags: added: rls-q-incoming
removed: rls-p-tracking
Changed in cloud-init (Ubuntu Quantal):
milestone: none → quantal-alpha-3
Ursula Junque (ursinha) wrote :

Removed the rls-q-incoming tag, according to the process.

tags: removed: rls-q-incoming
Ben Howard (darkmuggle-deactivatedaccount) wrote :

For this bug, later in the cycle we'll be publishing a "rescue" EBS volume for booting from. All the other options would include a dramatic change of behavior that would not be desirable for users -- like a "rescue SSH console" or some other nonsense.

The path forward will be:
  - Canonical will publish a "rescue volume" which is a cloud image with a label of "RESCUE_VOL".
  - cloud-init will recognize this rescue volume on boot and switch over to it (cloud-init launches before the general mount)
  - Users will then rescue their system the same way

The rescue volumes should land sometime this cycle.
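As a rough illustration of the detection step in the plan above, a boot-time check could look for a block device carrying the "RESCUE_VOL" label under /dev/disk/by-label, where udev creates one symlink per filesystem label. This is only a sketch: the function name `find_labeled_device` is hypothetical, and nothing in this report describes how cloud-init would actually implement the check or the switch-over.

```python
import os


def find_labeled_device(label="RESCUE_VOL", by_label_dir="/dev/disk/by-label"):
    """Return the device path behind a filesystem label, or None if absent.

    Hypothetical helper for illustration only; it relies on the
    /dev/disk/by-label symlinks that udev maintains on Linux.
    """
    link = os.path.join(by_label_dir, label)
    # os.path.exists follows the symlink, so a dangling link counts as absent.
    if not os.path.exists(link):
        return None
    return os.path.realpath(link)


if __name__ == "__main__":
    device = find_labeled_device()
    if device is None:
        print("no rescue volume attached, continuing normal boot")
    else:
        print("rescue volume found at", device)
```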

Changed in cloud-init (Ubuntu Quantal):
status: Triaged → Won't Fix
Changed in cloud-init (Ubuntu):
status: Triaged → Won't Fix
Mathew Hodson (mhodson)
Changed in cloud-init (Ubuntu):
milestone: quantal-alpha-3 → none