== Comment: #7 - Guilherme Guaglianoni Piccoli <email address hidden> - 2016-08-19 10:08:07 ==
The normal procedure for a netboot installation of Ubuntu 16.04 is to download the latest vmlinux and initrd.gz files and kexec them with no parameters (at least on ppc64el).
We're seeing a strange issue in which the installer freezes before the menus are shown. The system hangs at the point shown below, right after the i40e driver initialization:
[ 11.052832] i40e 0002:01:00.0 enP2p1s0f0: renamed from eth0
[ 11.073976] i40e 0002:01:00.1 enP2p1s0f1: renamed from eth1
[ 11.117799] i40e 0002:01:00.2 enP2p1s0f2: renamed from eth2
[ 11.225745] i40e 0002:01:00.3 enP2p1s0f3: renamed from eth3
***HANG***
The hardest part of this issue is that it appears to be a timing issue/race condition, and many debugging attempts end up preventing the issue from reproducing (a heisenbug).
We did manage to get logs, though, by booting the kernel with "BOOT_DEBUG=2" on the command line and by changing the initrd to enable systemd debug output; only the files "init" and "start-udev" were changed in the initrd, and both are attached here.
We've also attached a saved screen session that shows the entire boot process up to the point where it gets flooded with messages like:
"starting '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'
'/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'(err) 'failed to execute '/bin/readlink' '/bin/readlink /etc/
udev/rules.d/80-net-setup-link.rules': No such file or directory'
seq 3244 queued, 'add' 'pci_bus'
starting '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'
passed 408 byte device to netlink monitor 0x1003cfe8020seq 3236 running'/bin/readlink /etc/udev/rules.d/80-net-setup-l
ink.rules'(err) 'failed to execute '/bin/readlink' '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules': No such
file or directory'
'/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'(err) 'failed to execute '/bin/readlink' '/bin/readlink /etc/
udev/rules.d/80-net-setup-link.rules': No such file or directory'
Process '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules' failed with exit code 2.
PROGRAM '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules' /lib/udev/rules.d/73-usb-net-by-mac.rules:6
passed device to netlink monitor 0x1003d01f730
"
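To get an overview of a flood like the one above, the attached log can be summarized instead of read line by line. The following is a minimal sketch, not something used in the original debugging: it assumes the log has been saved with unwrapped lines (the screen capture quoted here wraps and interleaves them, which would break the matching), and the function name summarize is made up for illustration. The two patterns are taken from the message shapes quoted above.

```python
import re
from collections import Counter

# Message shapes copied from the udev debug output quoted in this report.
FAILED = re.compile(r"Process '(?P<cmd>[^']+)' failed with exit code (?P<code>\d+)")
PROGRAM = re.compile(r"PROGRAM '(?P<cmd>[^']+)' (?P<rules>\S+):(?P<line>\d+)")


def summarize(log_text):
    """Count failing helper commands and the rules-file lines that ran them.

    Returns two Counters:
      failures: (command, exit_code) -> number of occurrences
      origins:  (rules_file, line_number) -> number of occurrences
    """
    failures = Counter()
    origins = Counter()
    for line in log_text.splitlines():
        failed = FAILED.search(line)
        if failed:
            failures[(failed.group("cmd"), int(failed.group("code")))] += 1
        program = PROGRAM.search(line)
        if program:
            origins[(program.group("rules"), int(program.group("line")))] += 1
    return failures, origins
```

Calling summarize() on the text of the saved session and printing failures.most_common() would show whether the readlink call is the only helper that fails, and origins would show which rules line keeps triggering it.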
The system then stays hung at this stage. We re-tested after changing the file 73-usb-net-by-mac.rules in the initrd, replacing "/etc/udev/rules.d/80-net-setup-link.rules" with "/lib/udev/rules.d/80-net-setup-link.rules", since the former does not exist whereas the latter does. The same issue was observed!
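For anyone repeating the rules-file experiment, the path swap can be sketched as a small text substitution. This is only an illustration under stated assumptions: the two paths come from this report, but the helper name retarget_rules is made up, the sample line in the usage note is hypothetical (the real contents of 73-usb-net-by-mac.rules are not quoted here), and unpacking and repacking the initrd around this step is not shown.

```python
# Paths as given in this report.
OLD_PATH = "/etc/udev/rules.d/80-net-setup-link.rules"
NEW_PATH = "/lib/udev/rules.d/80-net-setup-link.rules"


def retarget_rules(text, old=OLD_PATH, new=NEW_PATH):
    """Replace every occurrence of `old` with `new` in the rules text.

    Returns (new_text, count), where count is the number of replacements,
    so the caller can check that the edit actually touched something.
    """
    count = text.count(old)
    return text.replace(old, new), count
```

As a usage example, a hypothetical line such as PROGRAM="/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules" would come back pointing at the /lib/udev/rules.d copy with a count of 1. A count of 0 means the file did not contain the expected path and was left unchanged.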
Note that if we boot the installer with "net.ifnames=0" or "net.ifnames=1" on the command line, the problem no longer reproduces.
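The boot variants mentioned in this report differ only in the kernel command line, so they can be sketched as one helper that builds the kexec load command. This is an assumption-laden sketch: the report does not show the exact kexec invocation that was used, the helper name kexec_load_argv is made up, and the flags (-l, --initrd=, --append=) are those of kexec-tools as we understand them. Only the file names and command-line strings come from the report.

```python
def kexec_load_argv(kernel, initrd, cmdline=""):
    """Build the argument list for loading a kernel with kexec.

    An empty cmdline corresponds to the "no parameters" case from this
    report. The loaded kernel would then be started separately
    (with kexec-tools, typically `kexec -e`).
    """
    argv = ["kexec", "-l", kernel, "--initrd=" + initrd]
    if cmdline:
        argv.append("--append=" + cmdline)
    return argv


# Command lines mentioned in this report; outcomes are as reported above.
VARIANTS = {
    "default": "",               # installer hangs
    "debug": "BOOT_DEBUG=2",     # used to capture the logs
    "ifnames0": "net.ifnames=0", # hang not reproduced
    "ifnames1": "net.ifnames=1", # hang not reproduced
}
```

For example, kexec_load_argv("vmlinux", "initrd.gz", VARIANTS["ifnames0"]) yields a list that could be passed to subprocess.run() on the machine doing the netboot.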
We would like to ask for Canonical's help in investigating this issue.
Thanks,
Guilherme
== Comment: #8 - Guilherme Guaglianoni Piccoli <email address hidden> - 2016-08-19 10:09:51 ==
== Comment: #9 - Guilherme Guaglianoni Piccoli <email address hidden> - 2016-08-19 10:10:31 ==
== Comment: #10 - Guilherme Guaglianoni Piccoli <email address hidden> - 2016-08-19 10:11:49 ==