pVM:pinelp2:ubuntu 16.04: Network is unreachable when using ssh in kdump

Bug #1571590 reported by bugproxy on 2016-04-18
Affects                 Status         Importance  Assigned to       Milestone
makedumpfile (Ubuntu)   Fix Released   Medium      Taco Screen team
  Xenial                Fix Released   Medium      Louis Bouchard

Bug Description

[SRU justification]
Without this fix, network-enabled kernel dumps fail to complete.

[Impact]
Networked kernel dump functionality is broken.

[Fix]
Add a Wants target to the systemd unit.
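
A minimal sketch of what such a change could look like, written as a local drop-in rather than an edit to the packaged unit. The report does not name the target; network-online.target is assumed here, and the helper name, the drop-in file name, and the added After= ordering are illustrative, not taken from the patch. Note that Wants= on its own does not order units, which is why the sketch pairs it with After=.

```shell
#!/bin/sh
# Hypothetical helper, not part of kdump-tools: writes a drop-in for
# kdump-tools.service into the directory given as $1.
# ASSUMPTION: network-online.target is the target meant by "a Wants target".
write_kdump_network_dropin() {
    dropin_dir=${1:-/etc/systemd/system/kdump-tools.service.d}
    mkdir -p "$dropin_dir" || return 1
    # Wants= pulls the target in; After= delays the service until it is reached.
    cat > "$dropin_dir/network-online.conf" <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF
}
```

Called as root with no argument, this would write under /etc/systemd/system/kdump-tools.service.d; a "systemctl daemon-reload" is needed afterwards for systemd to pick it up.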

[Test Case]
Follow "Test to reproduce" in Original Problem description.

[Regression]
None expected; the existing systemd unit setup remains and is improved by the proposed fix.

[Description of the problem]

Original problem Description
============================
When dumping to an ssh target, kdump failed:

         Starting Kernel crash dump capture service...
[ 18.542292] kdump-tools[2748]: Starting kdump-tools: ssh: connect to host 10.33.31.113 port 22: Network is unreachable
[ 18.542894] kdump-tools[2748]: * kdump-tools: Unable to reach remote server root@10.33.31.113. No reason to continue
[ 18.613376] kdump-tools[2748]: Mon, 11 Apr 2016 01:23:55 -0500
[ 19.310893] kdump-tools[2748]: Rebooting.
[ 19.652714] reboot: Restarting system

The weird thing is that an nfs target works on the same system.

---uname output---
Linux pinelp2 4.4.0-18-generic #34-Ubuntu SMP Wed Apr 6 14:00:30 UTC 2016 ppc64le ppc64le ppc64le GNU/Linux

Steps to Reproduce
===================================
1. Install the latest Ubuntu 16.04 on pinelp2
2. Configure kdump to use ssh:
SSH="root@10.33.31.113"
3. Run "kdump-config propagate" as root
4. Trigger kdump
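
Step 2 can be scripted roughly as follows. This is only a sketch: it assumes the setting lives in /etc/default/kdump-tools (the file named in the changelog later in this report) and that GNU sed is available, and the function name is made up for illustration. Steps 3 and 4 are left as comments because step 4 crashes the machine; the sysrq trigger shown is a common way to do that, not something the report states.

```shell
#!/bin/sh
# Hypothetical helper: set the SSH dump target in a kdump-tools config file.
# $1 = config file (assumed: /etc/default/kdump-tools), $2 = user@host
set_kdump_ssh_target() {
    conf=$1
    target=$2
    if grep -q '^#\{0,1\} *SSH=' "$conf"; then
        # Replace an existing (possibly commented-out) SSH= line in place.
        sed -i "s|^#\{0,1\} *SSH=.*|SSH=\"$target\"|" "$conf"
    else
        # No SSH= line yet: append one.
        printf 'SSH="%s"\n' "$target" >> "$conf"
    fi
}

# Steps 3 and 4, to be run by hand as root on a test machine only:
#   kdump-config propagate
#   echo c > /proc/sysrq-trigger   # panics the running kernel (assumes sysrq is enabled)
```

For example: set_kdump_ssh_target /etc/default/kdump-tools root@10.33.31.113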

Userspace tool common name: kdump-tools

The userspace tool has the following bit modes: 64-bit

Userspace rpm: version 1:1.5.9-5

== Comment: #8 - Ping Tian Han <email address hidden> - 2016-04-17 21:58:52 ==
(In reply to comment #6)
> I am able to reproduce this issue as well.
> Ping, what if you try to dump to some other machine with ssh? Does it still
> complain the same? Is ssh failing to connect to the network?

I saw the same problem when dumping to medlp4 with ssh:

         Starting Kernel crash dump capture service...
[ 17.293028] kdump-tools[2733]: Starting kdump-tools: ssh: connect to host 10.33.7.181 port 22: Network is unreachable
[ 17.293692] kdump-tools[2733]: * kdump-tools: Unable to reach remote server root@10.33.7.181. No reason to continue
[ 17.405333] kdump-tools[2733]: Sun, 17 Apr 2016 20:56:59 -0500
[ 18.536587] kdump-tools[2733]: Rebooting.
[ 18.903280] reboot: Restarting system

but I can log in to medlp4 over ssh before triggering kdump.

Default Comment by Bridge

tags: added: architecture-ppc64le bugnameltc-140193 severity-high targetmilestone-inin---
Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)

Thank you for taking the time to report this bug and helping to make Ubuntu better. It seems that your bug report is not filed about a specific source package though, rather it is just filed against Ubuntu in general. It is important that bug reports be filed about source packages so that people interested in the package can find the bugs about it. You can find some hints about determining what package your bug might be about at https://wiki.ubuntu.com/Bugs/FindRightPackage. You might also ask for help in the #ubuntu-bugs irc channel on Freenode.

To change the source package that this bug is filed about visit https://bugs.launchpad.net/ubuntu/+bug/1571590/+editstatus and add the package name in the text box next to the word Package.

[This is an automated message. I apologize if it reached you inappropriately; please just reply to this message indicating so.]

tags: added: bot-comment
vaishnavi (vaishnavi) on 2016-04-19
affects: ubuntu → makedumpfile (Ubuntu)

This seems to be a timing issue between the network coming up and the kdump-tools
service trying to reach the remote target over ssh: when I get into the emergency
shell after the failure, the network is up and the target is reachable, as shown here.

---
[FAILED] Failed to start Kernel crash dump capture service.
See 'systemctl status kdump-tools.service' for details.
Welcome to emergency mode! After logging in, type "journalctl -xb" to view
system logs, "systemctl reboot" to reboot, "systemctl default" or ^D to
try again to boot into default mode.
Give root password for maintenance
(or press Control-D to continue):
root@pinelp2:~# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 56:2d:60:47:c4:02 brd ff:ff:ff:ff:ff:ff
3: enP257p80s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq portid 40f2e93188c0 state UP group default qlen 1000
    link/ether 40:f2:e9:31:88:c0 brd ff:ff:ff:ff:ff:ff
    inet 10.33.11.58/16 brd 10.33.255.255 scope global enP257p80s0f0
       valid_lft forever preferred_lft forever
    inet6 2002:903:15f:1130:89c:e5b:6d8c:753f/64 scope global temporary dynamic
       valid_lft 604746sec preferred_lft 85746sec
    inet6 2002:903:15f:1130:42f2:e9ff:fe31:88c0/64 scope global mngtmpaddr dynamic
       valid_lft 2591946sec preferred_lft 604746sec
    inet6 2002:926:3e2:1130:89c:e5b:6d8c:753f/64 scope global temporary dynamic
       valid_lft 604746sec preferred_lft 85746sec
    inet6 2002:926:3e2:1130:42f2:e9ff:fe31:88c0/64 scope global mngtmpaddr dynamic
       valid_lft 2591946sec preferred_lft 604746sec
    inet6 fe80::42f2:e9ff:fe31:88c0/64 scope link
       valid_lft forever preferred_lft forever
4: enP257p80s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop portid 40f2e93188c2 state DOWN group default qlen 1000
    link/ether 40:f2:e9:31:88:c2 brd ff:ff:ff:ff:ff:ff
root@pinelp2:~# hostname -I
10.33.11.58 2002:903:15f:1130:89c:e5b:6d8c:753f 2002:903:15f:1130:42f2:e9ff:fe31:88c0 2002:926:3e2:1130:89c:e5b:6d8c:753f 2002:926:3e2:1130:42f2:e9ff:fe31:88c0
root@pinelp2:~# ping 10.33.31.113
PING 10.33.31.113 (10.33.31.113) 56(84) bytes of data.
64 bytes from 10.33.31.113: icmp_seq=1 ttl=64 time=0.399 ms
64 bytes from 10.33.31.113: icmp_seq=2 ttl=64 time=0.239 ms
64 bytes from 10.33.31.113: icmp_seq=3 ttl=64 time=0.234 ms

--- 10.33.31.113 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.234/0.290/0.399/0.079 ms
root@pinelp2:~#
root@pinelp2:~#
root@pinelp2:~#
root@pinelp2:~# ssh -i /root/.ssh/kdump_id_rsa root@10.33.31.113 mkdir -p /var/crash/10.33.11.58-test
root@pinelp2:~#
---

Most importantly, the ssh command that creates the directory on the remote target succeeds here, although that same command is what failed earlier and produced this error message in the first place:

"Starting kdump-tools: ssh: connect to host 10.33.31.113 port 22: Network is unreachable"

Looking into ways to fix that up.

Thanks
Hari

Louis Bouchard (louis) on 2016-04-28
Changed in makedumpfile (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
Changed in makedumpfile (Ubuntu Xenial):
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → Louis Bouchard (louis-bouchard)
Louis Bouchard (louis) wrote :

Hello,

Thank you for the work in identifying the solution and the patch. I was able to reproduce the bug and will upload a fix to Debian shortly.

Once it is synchronized with Yakkety, I will proceed to SRU the solution to Xenial.

Kind regards,

...Louis

Changed in makedumpfile (Ubuntu):
status: Confirmed → In Progress
Louis Bouchard (louis) wrote :

Hello,

Just to keep you posted: I have uploaded your fix to Ubuntu Yakkety, as I need more Debian testing. Once it lands in the archive, I will do the SRU to Xenial.

Kind regards,

...Louis

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package makedumpfile - 1:1.5.9-5ubuntu1

---------------
makedumpfile (1:1.5.9-5ubuntu1) yakkety; urgency=medium

  [Hari Bathini <email address hidden>]
  * Fix networked kdump failure to reach remote server. Avoids
    "Network is unreachable" message when trying to do remote dumps on either
    SSH or NFS. (LP: #1571590)

 -- Louis Bouchard <email address hidden> Fri, 29 Apr 2016 10:43:58 +0200

Changed in makedumpfile (Ubuntu):
status: In Progress → Fix Released

(In reply to comment #24)

When will the new package be available to Xenial, please?

Thanks.

Louis Bouchard (louis) on 2016-05-09
Changed in makedumpfile (Ubuntu Xenial):
status: Confirmed → In Progress
bugproxy (bugproxy) wrote :

*** Bug 141947 has been marked as a duplicate of this bug. ***

Louis Bouchard (louis) on 2016-07-21
description: updated

Hello bugproxy, or anyone else affected,

Accepted makedumpfile into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/makedumpfile/1:1.5.9-5ubuntu0.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in makedumpfile (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed

(In reply to comment #29)

Sorry, but shouldn't this bug be fixed in kdump-tools?

bugproxy (bugproxy) wrote :

I can confirm that this bug has been fixed with kdump-tools_1.5.9-5ubuntu0.1_all.deb

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package makedumpfile - 1:1.5.9-5ubuntu0.1

---------------
makedumpfile (1:1.5.9-5ubuntu0.1) xenial; urgency=medium

  [ Hari Bathini <email address hidden> ]
  * Fix networked kdump failure to reach remote server.
    Avoids "Network is unreachable" message when trying to do remote dumps on
    either SSH or NFS. (LP: #1571590)

  * Replace maxcpus by nr_cpus
    nr_cpus is a hard limit that has an impact on the (kdump) kernel
    memory consumption, while it is not the case with maxcpus=1, as we can
    theoretically hotplug cpus with maxcpus=1 (LP: #1568952)

  * define_stampdir() : Loop on hostname -I for 5 sec to get IP address
    if HOSTTAG=ip. The network stack may not be ready when kdump-config runs.
    Give it some time before reverting HOSTTAG to hostname if an IP address
    cannot be found. (LP: #1599561)

  * Add cio_ignore result to /etc/default/kdump-tools on s390x
    In order to have crashkernel=128M to work correctly on the s390
    architecture the result of cio_ignore -u -k needs to be appended to the
    KDUMP_CMDLINE_APPEND variable in /etc/default/kdump-tools. This patch
    adds the required logic to do the proper modification. (LP: #1570775)

  * debian/rules : drop the dh_installinit override
    Uses a syntax which is no longer supported and generates an error on
    install. (LP: #1599491)

 -- Louis Bouchard <email address hidden> Fri, 22 Jul 2016 10:15:20 +0200
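
For readers unfamiliar with the define_stampdir() entry above, the idea of polling "hostname -I" for a few seconds and then falling back to the hostname can be sketched like this. It is not the packaged code: the function name wait_for_ip and the IP_CMD override (there only so the loop can be tried without a network) are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a bounded wait for an IP address.
# $1 = number of one-second attempts (default 5, matching the changelog text).
# IP_CMD is an illustrative override; by default "hostname -I" is used.
wait_for_ip() {
    tries=${1:-5}
    while [ "$tries" -gt 0 ]; do
        # Take the first address reported, if any.
        ip=$(${IP_CMD:-hostname -I} 2>/dev/null | awk '{print $1; exit}')
        if [ -n "$ip" ]; then
            printf '%s\n' "$ip"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    # No address appeared within the allotted attempts.
    return 1
}
```

A caller could then fall back to the hostname when no address shows up in time, for example: tag=$(wait_for_ip 5) || tag=$(hostname)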

Changed in makedumpfile (Ubuntu Xenial):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for makedumpfile has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

bugproxy (bugproxy) on 2016-08-08
tags: added: targetmilestone-inin1604
removed: targetmilestone-inin---
bugproxy (bugproxy) on 2016-10-06
tags: added: targetmilestone-inin16041
removed: targetmilestone-inin1604