rm: cannot remove `/var/lib/urandom/random-seed': Read-only file system

Bug #567592 reported by C de-Avillez
This bug affects 3 people
Affects: plymouth (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Binary package hint: cloud-init

Sometimes, on the console of a failed instance start, I see the message "rm: cannot remove <file>". On today's runs, for example, I have the following entries:

single_test.log.2010-04-20_172056:WARNING:INSTANCE i-4741093D:rm: cannot remove `/var/lib/urandom/random-seed': Read-only file system
single_test.log.2010-04-20_172056:WARNING:INSTANCE i-4741093D:rm: cannot remove `/boot/grub/grubenv': Read-only file system
single_test.log.2010-04-20_172056:WARNING:INSTANCE i-3C9E0644:rm: cannot remove `/var/lib/urandom/random-seed': Read-only file system
single_test.log.2010-04-20_172056:WARNING:INSTANCE i-3C9E0644:rm: cannot remove `/boot/grub/grubenv': Read-only file system
single_test.log.2010-04-20_172056:WARNING:INSTANCE i-4B240905:rm: cannot remove `/var/lib/urandom/random-seed': Read-only file system
single_test.log.2010-04-20_172056:WARNING:INSTANCE i-4B240905:rm: cannot remove `/boot/grub/grubenv': Read-only file system
single_test.log.2010-04-20_172056:WARNING:INSTANCE i-3D35060E:rm: cannot remove `/var/lib/urandom/random-seed': Read-only file system
single_test.log.2010-04-20_172056:WARNING:INSTANCE i-3D35060E:rm: cannot remove `/boot/grub/grubenv': Read-only file system
single_test.log.2010-04-20_175515:WARNING:INSTANCE i-41B407DB:rm: cannot remove `/var/lib/urandom/random-seed': Read-only file system
single_test.log.2010-04-20_175515:WARNING:INSTANCE i-41B407DB:rm: cannot remove `/boot/grub/grubenv': Read-only file system
single_test.log.2010-04-20_175515:WARNING:INSTANCE i-4D7A09B4:rm: cannot remove `/var/lib/urandom/random-seed': Read-only file system
single_test.log.2010-04-20_175515:WARNING:INSTANCE i-4D7A09B4:rm: cannot remove `/boot/grub/grubenv': Read-only file system
single_test.log.2010-04-20_175515:WARNING:INSTANCE i-4FB10848:rm: cannot remove `/var/lib/urandom/random-seed': Read-only file system
single_test.log.2010-04-20_175515:WARNING:INSTANCE i-4FB10848:rm: cannot remove `/boot/grub/grubenv': Read-only file system
single_test.log.2010-04-20_175515:WARNING:INSTANCE i-4E34088E:rm: cannot remove `/var/lib/urandom/random-seed': Read-only file system
single_test.log.2010-04-20_175515:WARNING:INSTANCE i-4E34088E:rm: cannot remove `/boot/grub/grubenv': Read-only file system
single_test.log.2010-04-20_175515:WARNING:INSTANCE i-4A1E0728:rm: cannot remove `/var/lib/urandom/random-seed': Read-only file system
single_test.log.2010-04-20_175515:WARNING:INSTANCE i-4A1E0728:rm: cannot remove `/boot/grub/grubenv': Read-only file system
single_test.log.2010-04-20_182930:WARNING:INSTANCE i-3C180723:rm: cannot remove `/var/lib/urandom/random-seed': Read-only file system
single_test.log.2010-04-20_182930:WARNING:INSTANCE i-3C180723:rm: cannot remove `/boot/grub/grubenv': Read-only file system
single_test.log.2010-04-20_182930:WARNING:INSTANCE i-53D40A54:rm: cannot remove `/var/lib/urandom/random-seed': Read-only file system
single_test.log.2010-04-20_182930:WARNING:INSTANCE i-53D40A54:rm: cannot remove `/boot/grub/grubenv': Read-only file system
single_test.log.2010-04-20_182930:WARNING:INSTANCE i-5039093B:rm: cannot remove `/var/lib/urandom/random-seed': Read-only file system
single_test.log.2010-04-20_182930:WARNING:INSTANCE i-5039093B:rm: cannot remove `/boot/grub/grubenv': Read-only file system

Revision history for this message
C de-Avillez (hggdh2) wrote :

I am attaching the console output for a failed instance. All of them are similar. I am pretty sure this is not cloud-init, but I really have no idea which package it should be filed under. Looking at the console output, plymouth/mountall sounds like a good bet (or linux?).

Revision history for this message
C de-Avillez (hggdh2) wrote :

For the record, total instance runs today are at about 1,400 so far.

Revision history for this message
Scott Moser (smoser) wrote :

The important part of this log is:
| [ 8.287482] FDC 0 is a S82078B
| [ 8.944041] Console: switching to colour frame buffer device 80x30
| [ 9.453002] e1000: 0000:00:03.0: e1000_probe: (PCI:33MHz:32-bit) d0:0d:3c:18:07:23
| [ 9.455306] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
| mountall: Disconnected from Plymouth
| init: plymouth main process (51) killed by SEGV signal
| init: plymouth-splash main process (277) terminated with status 2
| mountall: Skipping mounting / since Plymouth is not available
| mountall: Skipping mounting /tmp since Plymouth is not available
| init: dbus pre-start process (344) terminated with status 1
| init: plymouth-log main process (361) terminated with status 1
| init: udev-finish main process (377) terminated with status 2

Something happened to plymouth, and mountall refused to mount / (more correctly, refused to remount it read-write).
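As a hedged sketch of what manual recovery would look like on an affected instance (assuming one could log in at all): detect the read-only root from a /proc/mounts-style file, then redo the remount that mountall skipped. The helper name is illustrative, not from the bug log.

```shell
# Sketch: report whether the "/" entry in a /proc/mounts-style file is
# mounted read-only ("ro" appearing among the mount options in field 4).
is_root_ro() {
    awk '$2 == "/" && $4 ~ /(^|,)ro(,|$)/ { found = 1 } END { exit !found }' "$1"
}

# On a live instance one would then redo what mountall skipped:
# is_root_ro /proc/mounts && mount -o remount,rw /
```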

Revision history for this message
C de-Avillez (hggdh2) wrote :

Reassigning the package to plymouth, then.

affects: cloud-init (Ubuntu) → plymouth (Ubuntu)
Revision history for this message
Steve Langasek (vorlon) wrote :

What version of mountall is in the image you're testing?

mountall 2.13 includes this change:

  * Don't skip filesystems due to timeout when Plymouth not available.

The fact that mountall is skipping / when plymouth is unavailable is (was) a mountall bug.

plymouth segfaulting is also a bug, but that appears to be a general problem when running in UEC and probably going to be hard to get traction on without a usable rootfs for debugging.

Can you try booting with 'plymouth:debug' added to the commandline, and report the resulting console output?
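As a sketch of that request (the root device and console token are placeholder assumptions; only the plymouth:debug token is what is being asked for), the flag just needs to land on the kernel command line the image boots with:

```shell
# Placeholder kernel command line; substitute whatever the harness boots with.
CMDLINE="root=/dev/sda1 console=ttyS0 plymouth:debug"

# Under kvm this would be passed as, e.g.:
#   kvm -m 512 -kernel <vmlinuz> -append "$CMDLINE" -hda <image>
# Inside the guest, confirm it arrived with: grep -o 'plymouth:debug' /proc/cmdline
echo "$CMDLINE"
```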

Revision history for this message
C de-Avillez (hggdh2) wrote :

Per the manifest (http://uec-images.ubuntu.com/lucid/20100420/unpacked/lucid-server-uec-amd64.manifest) this is mountall 2.13 indeed.

FWIW, plymouth is 0.8.2-2.

I will see what I can do re. debugging plymouth. Right now the test script will terminate an instance when it fails to SSH into it, and not all instances have this issue. In fact, most don't. But I think I can hack it in.

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Could you give more details here?

Did the remount of / to rw actually *fail*?

Revision history for this message
C de-Avillez (hggdh2) wrote :

@Scott: I am sorry, I do not know yet. We only know of these issues because the test script grabs the console output for *failed* SSH sessions to the instance. I intend to leave the instance running instead of terminating it, so I can get to the specific node that is running it and try to access it locally. But if the root is really read-only, I very much doubt I will be able to log in to the instance :-(

Running the instance in debug mode for either plymouth or mountall is probably not going to help, since we cannot get the *current* console output: we can get it just after boot, at shutdown, or on "significant events" (whatever that means), and only 64K worth of output.

I understand this is a tough nail to gnaw (and I know personally, my teeth hurt). Please rest assured that we do intend to get to the bottom of it.
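One way to work around the 64K console cap, sketched under the assumption that euca2ools and UEC credentials are available on the test driver: snapshot the console output repeatedly, so early lines survive after they scroll out of the buffer. The function name and instance id are placeholders.

```shell
# Snapshot an instance's console output to a timestamped file; repeated
# calls preserve history beyond the 64K the cloud keeps around.
fetch_console() {
    euca-get-console-output "$1" > "console-$1-$(date +%Y%m%d_%H%M%S).log"
}

# Example (placeholder id; needs EC2 credentials in the environment):
# fetch_console i-00000000
```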

Revision history for this message
Colin Watson (cjwatson) wrote :

I bodged up a test case for this by running the UEC image like this:

  kvm -monitor stdio -m 512 -kernel lucid-server-uec-i386-vmlinuz-virtual -append "root=/dev/sda" -hda lucid-server-uec-i386.img -boot c

What appears to be happening is:

  1) Initially, I hadn't changed /etc/fstab to match the fact that the root image in my test happened to wind up on /dev/sda rather than /dev/sda1, so mountall stalled waiting for /dev/sda1 to appear, which was never going to happen
  2) After changing this, it still took a while for /dev/sda to appear for some reason, which exceeded mountall's boredom timeout
  3) Either way, mountall asked plymouth to display a message along the lines of "The disk drive for / is not ready yet or not present" (or, in case 2, /tmp instead of /)

  4) Because I'm booting without an initramfs, /dev/pts wasn't present before mountall ran and so plymouthd failed to attach to its session
  5) This resulted in the details plugin's show_splash_screen method being called with boot_buffer == NULL: bam, segfault

  6) Furthermore, in the process of investigating this, I found that plymouth:debug=file: doesn't work even if you hack around to make sure that it has a writable filesystem

I'm attaching a few logs, respectively:

  * plymouth-debug.log: plymouth:debug output showing the failure to attach to a session, and trace output up to the subsequent segfault
  * plymouthd.trace: strace matching the above
  * mountall.log: mountall log output after adjusting /etc/fstab as above; note that I had to run 'mount -t devtmpfs -o mode=0755 none /dev || true' before running mountall in order to make sure that I had somewhere to write the log output, since I wasn't using an initramfs

I'll also send patches upstream for 5 and 6. I think that we should consider an SRU for 5, since as far as I can tell this will break standard server boots without an initramfs.
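The sequence described above can be collected into a small script. Since it remounts /dev and runs mountall as root, it is only written to a file and syntax-checked here; the log path is an assumption, not taken from the bug.

```shell
# Write out the reproduction steps (root-only; do not run on a live system
# you care about).
cat > /tmp/repro-mountall.sh <<'EOF'
#!/bin/sh
# No initramfs, so nothing has mounted devtmpfs yet; mountall needs a
# writable place for its log output.
mount -t devtmpfs -o mode=0755 none /dev || true
# Run mountall directly, capturing debug output (log path is illustrative).
mountall --debug > /dev/mountall.log 2>&1
EOF
sh -n /tmp/repro-mountall.sh   # syntax check only
```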

Revision history for this message
Colin Watson (cjwatson) wrote :