Comment 0 for bug 1615021

bugproxy (bugproxy) wrote :

== Comment: #7 - Guilherme Guaglianoni Piccoli <email address hidden> - 2016-08-19 10:08:07 ==
The normal procedure for a netboot installation of Ubuntu 16.04 is to download the latest available vmlinux and initrd.gz files and kexec them with no parameters (at least on ppc64el).

We're experiencing a strange issue in which the installer freezes before the menus are shown. The system hangs at the point shown below, right after the i40e driver initialization:

[ 11.052832] i40e 0002:01:00.0 enP2p1s0f0: renamed from eth0
[ 11.073976] i40e 0002:01:00.1 enP2p1s0f1: renamed from eth1
[ 11.117799] i40e 0002:01:00.2 enP2p1s0f2: renamed from eth2
[ 11.225745] i40e 0002:01:00.3 enP2p1s0f3: renamed from eth3
***HANG***

The most difficult part of this issue is that it appears to be a timing issue/race condition, and many debugging attempts end up preventing the issue from reproducing (a heisenbug).

We did manage to capture logs, though, by booting the kernel with "BOOT_DEBUG=2" on the command line and by modifying the initrd to enable systemd debug; only the files "init" and "start-udev" were changed in the initrd, and both are attached here.

We've also attached a saved screen session that shows the entire boot process up to the point where it is flooded with messages like:

"starting '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'
'/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'(err) 'failed to execute '/bin/readlink' '/bin/readlink /etc/
udev/rules.d/80-net-setup-link.rules': No such file or directory'

seq 3244 queued, 'add' 'pci_bus'
starting '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'
passed 408 byte device to netlink monitor 0x1003cfe8020seq 3236 running'/bin/readlink /etc/udev/rules.d/80-net-setup-l
ink.rules'(err) 'failed to execute '/bin/readlink' '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules': No such
file or directory'
'/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'(err) 'failed to execute '/bin/readlink' '/bin/readlink /etc/
udev/rules.d/80-net-setup-link.rules': No such file or directory'
Process '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules' failed with exit code 2.
PROGRAM '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules' /lib/udev/rules.d/73-usb-net-by-mac.rules:6
passed device to netlink monitor 0x1003d01f730
"

The system then remains hung at this stage. We re-tested after changing the file 73-usb-net-by-mac.rules in the initrd, replacing "/etc/udev/rules.d/80-net-setup-link.rules" with "/lib/udev/rules.d/80-net-setup-link.rules", since the former does not exist whereas the latter does. The same issue was observed.
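
To compare captures taken before and after a change like this, it can help to tally the failures rather than read the flood line by line. Below is a minimal Python sketch, assuming the saved screen session is available as plain text. The function name `count_failed_programs` is hypothetical, and the pattern only targets the "Process '...' failed with exit code N." line format seen in the log quoted above.

```python
import re
from collections import Counter

# Matches lines such as:
#   Process '/bin/readlink /etc/udev/rules.d/...' failed with exit code 2.
# Note: lines wrapped by the terminal are not rejoined, so a command that was
# split across two lines is counted under a key containing the line break.
FAILED = re.compile(r"Process '([^']+)' failed with exit code (\d+)")


def count_failed_programs(log_text):
    """Return a Counter keyed by (command, exit_code) for each failed helper."""
    counts = Counter()
    for match in FAILED.finditer(log_text):
        counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

Running it on both captures and comparing the counters shows whether the same helper is still failing after the rules file was edited.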

Note that if we boot the installer with "net.ifnames=0" or "net.ifnames=1" on the command line, the problem no longer reproduces.
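
To confirm which of these settings a given boot actually received, the kernel command line can be inspected on the running system. Below is a minimal Python sketch. `parse_cmdline` is a hypothetical helper, and it splits on whitespace only, so quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Parameters given without '=' (for example "quiet") map to None.
    If a parameter appears more than once, the last occurrence wins.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params
```

On Linux the string to parse can be read from /proc/cmdline. A result without a "net.ifnames" key corresponds to the failing case described here, where the kernel is kexec'd with no parameters.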

We would like to ask for Canonical's help in investigating this issue.
Thanks,

Guilherme

== Comment: #8 - Guilherme Guaglianoni Piccoli <email address hidden> - 2016-08-19 10:09:51 ==

== Comment: #9 - Guilherme Guaglianoni Piccoli <email address hidden> - 2016-08-19 10:10:31 ==

== Comment: #10 - Guilherme Guaglianoni Piccoli <email address hidden> - 2016-08-19 10:11:49 ==