Comment 65 for bug 1569925

Rafael David Tinoco (rafaeldtinoco) wrote :

Hypothesis,

Test (1) - The error is NEVER propagated to upper layers:

# xfs and ext4 mounted automatically

inaddy@iscsihang:~$ mount | grep _netde
/dev/sda1 on /ext4 type ext4 (rw,relatime,stripe=32,data=ordered,_netdev)
/dev/sdb1 on /xfs type xfs (rw,relatime,attr2,inode64,noquota,_netdev)

# no error propagation

inaddy@iscsihang:~$ sudo iscsiadm -m node -o show | grep timeo.replace
node.session.timeo.replacement_timeout = -1
node.session.timeo.replacement_timeout = -1
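
The -1 above is what makes this the "no error propagation" case. As a rough sketch, assuming the semantics described in the comments of open-iscsi's iscsid.conf (negative: I/O stays queued until the session is re-established or logged out; 0: I/O is failed immediately; positive: I/O is failed after that many seconds), a small helper could label each value. The function name and the labels are hypothetical, for illustration only:

```shell
#!/bin/sh
# Hypothetical helper: classify a node.session.timeo.replacement_timeout value.
# The meaning of each range is assumed from the iscsid.conf comments.
classify_replacement_timeout() {
    value=$1
    # Reject anything that is not an optionally negative integer.
    case $value in
        ''|*[!0-9-]*|-|?*-*) echo "invalid"; return 1 ;;
    esac
    if [ "$value" -lt 0 ]; then
        echo "queue-forever"        # I/O stays queued, no error reaches upper layers
    elif [ "$value" -eq 0 ]; then
        echo "fail-immediately"     # I/O is failed back at once
    else
        echo "fail-after-${value}s" # I/O is failed once the timeout expires
    fi
}
```

For the setting used in this test, `classify_replacement_timeout -1` prints `queue-forever`.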

# on the target server: drop all incoming iSCSI traffic (TCP port 3260) from the guest:

inaddy@machete:~$ sudo iptables -A INPUT -s 192.168.49.8 -p tcp --destination-port 3260 -j DROP

# reboot can't succeed

inaddy@iscsihang:~$ sudo reboot

[ 27.596135] connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294896692, last ping 4294897944, now 4294899196
[ 27.628109] connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4294896700, last ping 4294897952, now 4294899204

Systemd hangs forever:

[ OK ] Stopped target Remote File Systems.
         Unmounting /ext4...
         Unmounting /xfs...

OBS: There is a tight relationship between the connection disappearing before the umount service runs and systemd's ability to shut the machine down completely. I would say that the no-error-propagation case is even worse, since the kernel would be locked up forever:

[ 240.132208] INFO: task systemd:1094 blocked for more than 120 seconds.
[ 240.133499] Not tainted 4.4.0-93-generic #116-Ubuntu
[ 240.134544] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.136092] INFO: task umount:1199 blocked for more than 120 seconds.
[ 240.137262] Not tainted 4.4.0-93-generic #116-Ubuntu
[ 240.138302] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.139742] INFO: task umount:1201 blocked for more than 120 seconds.
[ 240.140898] Not tainted 4.4.0-93-generic #116-Ubuntu
[ 240.141953] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Systemd is still trying...

[ OK ] Unmounted /ext4.
[ OK ] Unmounted /xfs.
[ OK ] Stopped File System Check on /dev/disk/by-label/XFS.
[ OK ] Stopped File System Check on /dev/disk/by-label/EXT4.
[ OK ] Removed slice system-systemd\x2dfsck.slice.
[ OK ] Stopped target Remote File Systems (Pre).
         Stopping Login to default iSCSI targets...

[ 360.140109] INFO: task systemd:1094 blocked for more than 120 seconds.
[ 360.141219] Not tainted 4.4.0-93-generic #116-Ubuntu
[ 360.142100] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 360.143377] INFO: task umount:1199 blocked for more than 120 seconds.
[ 360.144451] Not tainted 4.4.0-93-generic #116-Ubuntu
[ 360.145333] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 360.146576] INFO: task umount:1201 blocked for more than 120 seconds.
[ 360.147586] Not tainted 4.4.0-93-generic #116-Ubuntu
[ 360.148472] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

This will repeat forever. I still have to find a way to make systemd shut down the network and cause this hang, because the error is likely propagated only after the umount service gives up (or something like it) <-- theory.
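
To check that the same tasks stay blocked across successive hung task reports (240 s and 360 s in the logs above), something like the following could summarize the kernel log. This is only a sketch: the function name is hypothetical, and it assumes the message format shown in this comment and task names without spaces.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print every
# "name:pid" reported by the hung task detector, followed by the number
# of times it was reported.
hung_task_summary() {
    # Keep only the "name:pid" part of the "INFO: task ... blocked" lines.
    sed -n 's/.*INFO: task \([^ ]*\) blocked for more than [0-9]* seconds\..*/\1/p' |
        sort | uniq -c | awk '{ print $2, $1 }'
}
```

Usage would be along the lines of `dmesg | hung_task_summary`; a task whose count keeps growing between runs is still stuck.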