live cd from nfsroot breaks the nfs mount during bootup
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
casper (Ubuntu) | New | Undecided | Unassigned |
network-manager (Ubuntu) | New | Undecided | Unassigned |
Bug Description
Binary package hint: network-manager
I'm half guessing that this is N-M's bug, so it may need to be reassigned.
I've been trying to boot the Intrepid Desktop i386 (alpha5) LiveCD from a NFS/TFTP/DHCP server (i.e. PXE).
I can netboot the Intrepid live CD, accessing filesystem.squashfs over NFS (mounted by the initramfs, so not nfsroot by all definitions). The problem is that after the init scripts are done, the NFS mount is dead, and all I/O hangs forever. cat is usually in the cache, so I can even cat /var/log/kern.log, and files in /proc, like /proc/mounts, but if tab-completion accesses anything from the filesystem, the shell hangs.
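For anyone trying to reproduce this, here is a minimal sketch of how the NFS entries in /proc/mounts could be listed. Reading /proc/mounts only queries the kernel's mount table, which matches the observation that it can still be read while the mount itself is hung. The names `Mount` and `nfs_mounts` are made up for illustration, and the script would have to be run from somewhere that does not depend on the hung mount.

```python
from typing import List, NamedTuple


class Mount(NamedTuple):
    source: str         # e.g. "server:/export/path"
    target: str         # local mount point
    fstype: str         # "nfs" or "nfs4"
    options: List[str]  # mount options, split on commas


def nfs_mounts(mounts_text: str) -> List[Mount]:
    """Return the NFS entries found in /proc/mounts-style text."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        source, target, fstype, options = fields[:4]
        if fstype in ("nfs", "nfs4"):
            found.append(Mount(source, target, fstype, options.split(",")))
    return found

# Usage (hypothetical):
#   with open("/proc/mounts") as f:
#       for m in nfs_mounts(f.read()):
#           print(m.source, "on", m.target, ",".join(m.options))
```

Note that this only shows what the kernel believes is mounted. As described above, a mount can be listed here and still hang on every access.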
So you only get 6 tries after switching away from X (which manages to start up far enough to show a blank orange screen and start spinning the cursor). Running things in the background in case they hang works, but tab-completion will still hang your shell. /proc/mounts doesn't indicate any problems, and the NFS mount looks up. I don't know what to cat in /proc or /sys to duplicate the info I'd get from ifconfig. (Neither it nor ip(8) is in the cache.) This is a really hard problem to debug... TORAM=yes might help, but it doesn't seem to do anything. (There is code that looks for that env var, even if toram isn't parsed by /init.) Probably the thing to do would be to mount a local filesystem somewhere and run binaries from it.
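On the question of what to read instead of running ifconfig: part of that information lives in small files under /sys/class/net/<interface>/, such as operstate, carrier, address (the MAC) and mtu. The IPv4 address is not among these files. Below is a rough sketch of reading them. The function names are made up for illustration, and `sys_root` is a parameter only so the code can be pointed at a test directory. The same files can also be read one at a time with cat.

```python
import os
from typing import Dict, Optional


def read_attr(path: str) -> Optional[str]:
    """Read one small sysfs file, returning None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        # Missing file, or e.g. 'carrier' refusing reads while the link is down.
        return None


def iface_state(name: str, sys_root: str = "/sys/class/net") -> Dict[str, Optional[str]]:
    """Collect a few link-level attributes for one interface."""
    base = os.path.join(sys_root, name)
    attrs = ("operstate", "carrier", "address", "mtu")
    return {attr: read_attr(os.path.join(base, attr)) for attr in attrs}

# Usage (hypothetical): print(iface_state("eth0"))
```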
It's definitely an init script that breaks the NFS mount, because booting with init=/bin/bash drops me to a shell after the initramfs does its thing. Then I can run find over the whole squashfs filesystem with no problems.
My client machine (holly) PXE boots, and my pxelinux.
DEFAULT menu.c32
SAY press return for menu
prompt 1
LABEL intrepid-i386
kernel intrepid-
append boot=casper netboot=nfs nfsroot=
# other useful args: text to not start gdm
...other LABELs
Everything is set up properly so it loads vmlinuz and initrd.gz that were unpacked from /casper in the intrepid iso image. I've been netbooting Debian installers and whatnot for years, and I'm sure I didn't get that part wrong. BTW, I had to search for a while and eventually read /init to figure out the right boot args, if those even are all the boot args needed. I found lots of older docs, e.g. for Feisty and Gutsy. I guess I should have looked at initramfs(8), although it doesn't say what combination you need exactly for NFS root to work.
10.0.0.17:/... is where I unpacked the whole ISO (with 7z x intrepid-
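
As an illustration of how the boot arguments fit together, here is a small sketch that assembles an append line from the three arguments named above. The append line in the config above is cut off after nfsroot=, so this only covers the part that is visible. The function name and the export path in the usage comment are hypothetical, not the ones from my setup.

```python
from typing import Sequence


def casper_nfs_append(server: str, export_path: str, extra: Sequence[str] = ()) -> str:
    """Build a pxelinux 'append' line for booting casper with its image on NFS."""
    if not export_path.startswith("/"):
        raise ValueError("export path must be absolute")
    args = ["boot=casper", "netboot=nfs", f"nfsroot={server}:{export_path}"]
    args.extend(extra)  # e.g. "text" to keep gdm from starting
    return "append " + " ".join(args)

# Usage (hypothetical path):
#   casper_nfs_append("10.0.0.17", "/srv/intrepid", extra=["text"])
```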
Unsurprisingly, I get the same results if I boot the same kernel and initrd with the same args, but from syslinux on a USB stick instead of pxelinux.
I've tried this on two different clients with identical results: a Dell PE1950 server at work (dual bnx2 gigE onboard), and an Asus K8V (Marvell Yukon gigE onboard, no other NICs) at home. So it's not an eth0/eth1 confusion problem, because my home machine only has an eth0. I use NFS all the time between my home machines, so I'm confident there's nothing wrong with the network or my NFS setup. (NFS server (tesla) is running linux Ubuntu 2.6.24-
LiveCD netbooting used to work as late as Gutsy. See https:/
I thought of a way to debug this: hit Alt+SysRq+E to send a SIGTERM to all tasks while the init scripts are running, but before NetworkManager starts.
I tried it after booting with TORAM=yes, since it seemed to be hung there. It actually works, and I'm running from a filesystem.squashfs that was loaded to tmpfs. So I manually ran some sudo /etc/rc2.d/whatever start, and after starting dbus then NetworkManager, I see that ifconfig shows eth0 go down for a short while after N-M's start script runs.
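To catch a short link drop like this without watching ifconfig by hand, something along these lines could be left running while the init scripts are started one by one. This is only a sketch: it assumes the drop is visible in /sys/class/net/eth0/operstate and lasts longer than the polling interval, and both function names are made up for illustration.

```python
import time
from typing import Iterable, Iterator, Tuple


def state_changes(samples: Iterable[Tuple[float, str]]) -> Iterator[Tuple[float, str, str]]:
    """Yield (time, old_state, new_state) whenever consecutive samples differ."""
    previous = None
    for when, state in samples:
        if previous is not None and state != previous:
            yield when, previous, state
        previous = state


def poll_operstate(name: str, interval: float = 0.1,
                   sys_root: str = "/sys/class/net") -> Iterator[Tuple[float, str]]:
    """Endlessly sample the interface's operstate file."""
    path = f"{sys_root}/{name}/operstate"
    while True:
        try:
            with open(path) as f:
                state = f.read().strip()
        except OSError:
            state = "missing"  # interface not present or file unreadable
        yield time.monotonic(), state
        time.sleep(interval)

# Usage (hypothetical), runs until interrupted:
#   for when, old, new in state_changes(poll_operstate("eth0")):
#       print(f"{when:.2f}: eth0 {old} -> {new}")
```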
For netbooting the livecd to work, we need a way to prevent N-M from running, or at least from taking the interface down. Or we need NFS mounts that survive the interface going down and coming back up. In fact I'm a bit surprised it doesn't seem to survive. (Since I booted with TORAM=yes, I didn't have any NFS mounts when I was running the init scripts.)