netplan does not work on nfsroot

Bug #1767359 reported by Frank Steinberg
This bug affects 1 person
Affects                    Status       Importance  Assigned to              Milestone
Netplan                    In Progress  High        Mathieu Trudel-Lapierre
initramfs-tools (Ubuntu)   In Progress  High        Unassigned
netplan.io (Ubuntu)        In Progress  High        Mathieu Trudel-Lapierre

Bug Description

It seems like netplan is not working on nfsroot systems.

Whenever a valid config.yaml is located in /etc/netplan/ the boot procedure gets stuck.

However, with an empty /etc/netplan directory, the nfsroot system boots (the kernel fetches the IP configuration via DHCP). Then, when I store a config in /etc/netplan and run "netplan apply", it gets stuck again.

I guess netplan is (temporarily) taking down active interfaces when applying a configuration. This must never happen on nfsroot systems(!). If this assumption is true, this is probably a general mistake, in my opinion.

Regards,
 -frank

Revision history for this message
Ryan Harper (raharper) wrote :

Can you provide your netplan config that you attempted to apply? Also your /proc/cmdline with the nfsroot parameters.

Netplan apply does:

1) generates the backend config (networkd in your case) from /etc/netplan/*.yaml and renders it to /run/systemd/network/10-netplan*

2) (optionally) stops networkd; note that this does not take down active interfaces

3) for any interface that is not in the 'up' state, the driver may be "replugged" to trigger any .link changes (names, MTU)

4) restarts networkd

If possible, can we get the state of your network after booting but before applying any netplan config? (ip a is sufficient)

Changed in netplan:
status: New → Incomplete
Revision history for this message
Frank Steinberg (steinberg-9) wrote : Re: [Bug 1767359] netplan does not work on nfsroot

Hi Ryan,

thank you very much for your prompt response and sorry for my delay.
Here is the missing information. Let me know, if I could supply other
details.

root@test1:/etc/netplan# cat /proc/cmdline
root=/dev/nfs nfsroot=134.169.34.15:/ibr/vm-root/test1 ip=::::::dhcp rw

root@test1:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether de:ad:be:ef:00:07 brd ff:ff:ff:ff:ff:ff
    inet 134.169.35.218/24 brd 134.169.35.255 scope global ens3
       valid_lft forever preferred_lft forever
    inet6 2001:638:602:1183:dcad:beff:feef:7/64 scope global dynamic mngtmpaddr
       valid_lft 86394sec preferred_lft 14394sec
    inet6 fe80::dcad:beff:feef:7/64 scope link
       valid_lft forever preferred_lft forever

root@test1:/etc/netplan# scp zfs1:/tmp/config.yaml /etc/netplan/
root@test1:/etc/netplan# cat /etc/netplan/*.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      dhcp4: true

root@test1:/etc/netplan# netplan apply
root@test1:/etc/netplan#

Note that the shell prompt appears, but no more input is possible.
The host can no longer be "pinged".
The test host is a KVM.

When booting the host with this netplan config in place, the boot procedure
gets stuck when showing "Starting Wait for Network to be Configured...".
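
A tool that reconfigures networking could recognise this situation from the kernel command line before touching the boot interface. The following is only a minimal sketch under that assumption; the function name `is_nfsroot` is hypothetical and is not part of netplan or initramfs-tools.

```shell
#!/bin/sh
# Return 0 if a kernel command line indicates an NFS root, 1 otherwise.
# Hypothetical helper for illustration; it only inspects root= and nfsroot=.
is_nfsroot() {
    # Intentionally unquoted: split the command line into single arguments.
    for arg in $1; do
        case "$arg" in
            root=/dev/nfs|nfsroot=*) return 0 ;;
        esac
    done
    return 1
}

# Command line taken from the report above.
cmdline="root=/dev/nfs nfsroot=134.169.34.15:/ibr/vm-root/test1 ip=::::::dhcp rw"
if is_nfsroot "$cmdline"; then
    echo "root is on NFS: do not restart networking on the boot interface"
fi
```

On a running system the argument would be the content of /proc/cmdline, for example `is_nfsroot "$(cat /proc/cmdline)"`.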


Revision history for this message
Frank Steinberg (steinberg-9) wrote :

One thing is a little special in our network config.
We propagate a second net route (10.1/16) via DHCP:

root@test1:~# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         134.169.35.2    0.0.0.0         UG        0 0          0 ens3
10.1.0.0        0.0.0.0         255.255.0.0     U         0 0          0 ens3
134.169.35.0    0.0.0.0         255.255.255.0   U         0 0          0 ens3

We achieve this in dhcpd.conf:

    option rfc3442-classless-static-routes code 121 = array of integer 8;

    option rfc3442-classless-static-routes 16, 10, 1, 0, 0, 0, 0, 0, 134, 169, 35, 2;
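
For readers unfamiliar with the option 121 encoding: each route in RFC 3442 is a prefix length, followed by only the significant octets of the destination, followed by the four octets of the router. The sketch below decodes the byte list from the dhcpd.conf line above; the function name `decode_rfc3442` is made up for illustration, and it assumes a well-formed list.

```shell
#!/bin/sh
# Decode an RFC 3442 classless-static-routes byte list into
# "destination/prefix via router" lines. Illustrative helper only.
decode_rfc3442() {
    # Accept "16, 10, 1, ..." by turning commas into spaces, then split.
    set -- $(printf '%s' "$1" | tr ',' ' ')
    while [ "$#" -gt 0 ]; do
        len=$1; shift
        # Number of destination octets actually present: ceil(len / 8).
        n=$(( (len + 7) / 8 ))
        dest=""
        i=0
        while [ "$i" -lt 4 ]; do
            if [ "$i" -lt "$n" ]; then
                octet=$1; shift
            else
                octet=0   # omitted octets are zero
            fi
            dest="${dest}${dest:+.}${octet}"
            i=$((i + 1))
        done
        # The router is always four octets.
        gw="$1.$2.$3.$4"; shift 4
        printf '%s/%s via %s\n' "$dest" "$len" "$gw"
    done
}

decode_rfc3442 "16, 10, 1, 0, 0, 0, 0, 0, 134, 169, 35, 2"
```

This prints `10.1.0.0/16 via 0.0.0.0` and `0.0.0.0/0 via 134.169.35.2`, which corresponds to the on-link 10.1.0.0 route and the default route in the routing table above.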


Revision history for this message
Ryan Harper (raharper) wrote :

Hi,

Thanks for the reply.

I think we need a bit more info, if possible. Prior to running the
netplan apply, please do:

# enable systemd-networkd debugging
% sudo mkdir -p /etc/systemd/system/systemd-networkd.service.d/
# write the drop-in through "sudo tee" so the file is created as root
% printf '[Service]\nEnvironment=SYSTEMD_LOG_LEVEL=debug\n' | sudo tee /etc/systemd/system/systemd-networkd.service.d/10-debug.conf
% sudo systemctl daemon-reload

Then you can trigger the netplan apply. As you say, this breaks your
connection, however I'm hoping we can get some of the
logging written to the journal before the link goes down.

After resetting/removing the network when you log back in, I would be
interested in output from:

% sudo journalctl -o short-precise -u systemd-networkd.service


Revision history for this message
Frank Steinberg (steinberg-9) wrote :

Unfortunately, nothing to read after the reboot:

root@test1:~# journalctl -o short-precise -u systemd-networkd.service
-- Logs begin at Fri 2018-04-27 15:09:00 CEST, end at Wed 2018-05-02 11:23:27 CEST. --
-- No entries --


Revision history for this message
Ryan Harper (raharper) wrote :

Hrm, ok, let's try this:

1) copy in your netplan yaml
2) netplan generate (this will write the networkd files to
/run/systemd/network/*)
3) sudo systemctl restart systemd-networkd

If that keeps things up, then some part of the apply path is dropping
the connection.


Revision history for this message
Frank Steinberg (steinberg-9) wrote :

The system freezes again right after step (3).


Revision history for this message
Ryan Harper (raharper) wrote :

OK, thanks.

I think we have a bug/feature request about this: the initramfs obtains
the initial lease via the ip=dhcp parameter, and it should pass that
information to networkd so that networkd can maintain the lease rather
than take it down. I'll follow up with a link to that bug once I find it.

Thank you for helping narrow it down.


Revision history for this message
Frank Steinberg (steinberg-9) wrote :

That sounds reasonable. Thank you!

FYI: Meanwhile, I installed a second VM with an identical configuration and (FAI-based) installation procedure, except that it uses an iSCSI volume (at the KVM level) as root device instead of nfsroot. This VM behaves fine, including the secondary net route on the Ethernet interface.


Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

There's more code needed to allow this to work properly.

For one, we need to have netplan write "CriticalConnection=true" in systemd-networkd config for the remote root cases, so that networkd doesn't release the IP when it restarts...
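
As a rough illustration of what such generated config might look like, the sketch below prints a networkd .network unit with the option set. This is an assumption about the eventual output, not what netplan generated at the time; the helper name `emit_critical_network` is hypothetical, and `ens3` is simply the interface from this report. As far as I know, later systemd releases replaced this option with `KeepConfiguration=` in the `[Network]` section, so check systemd.network(5) for the version in use.

```shell
#!/bin/sh
# Sketch only: print a networkd .network unit that marks a DHCP link as
# critical, so that networkd keeps the address when it restarts.
# The unit layout is an assumption, not netplan's actual generated output.
emit_critical_network() {
    iface=$1
    cat <<EOF
[Match]
Name=${iface}

[Network]
DHCP=ipv4

[DHCP]
CriticalConnection=true
EOF
}

emit_critical_network ens3
```

For a manual experiment, the output could be redirected into a file under /etc/systemd/network/ (any name ending in .network) before restarting systemd-networkd.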

This is definitely an issue that requires changes in netplan.io and initramfs; so I'm updating the targets here on the bug to reflect that.

Changed in netplan:
status: Incomplete → In Progress
importance: Undecided → High
Changed in initramfs-tools (Ubuntu):
status: New → In Progress
Changed in netplan.io (Ubuntu):
status: New → In Progress
Changed in initramfs-tools (Ubuntu):
importance: Undecided → High
Changed in netplan.io (Ubuntu):
importance: Undecided → High
Changed in netplan:
assignee: nobody → Mathieu Trudel-Lapierre (cyphermox)
Changed in netplan.io (Ubuntu):
assignee: nobody → Mathieu Trudel-Lapierre (cyphermox)
Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

This is essentially a duplicate of bug 1769682.
