drbd not working after kernel upgrade 5.0.x -> 5.3.x

Bug #1866458 reported by a1bert
This bug affects 2 people
Affects              Status        Importance  Assigned to          Milestone
drbd-utils (Ubuntu)  Fix Released  Undecided   Unassigned
  Bionic             Fix Released  Medium      Rafael David Tinoco

Bug Description

[Impact]

 * After the kernel has been upgraded to 5.3 (the latest HWE kernel), drbd resources can no longer be managed through the drbdadm command.

[Test Case]

$ sudo dd if=/dev/zero of=/.loop bs=1M count=1024

$ sudo losetup --find --show /.loop
/dev/loop0

$ cat /etc/drbd.d/r0.res
resource r0 {
        protocol C;
        startup {
                wfc-timeout 15;
                degr-wfc-timeout 60;
        }
        net {
                cram-hmac-alg sha1;
                shared-secret "secret";
        }
        on drbdfix {
                device /dev/drbd0;
                disk /dev/loop0;
                address 10.250.99.202:7788;
                meta-disk internal;
        }
        on drbdnon {
                device /dev/drbd0;
                disk /dev/loop0;
                address 192.168.0.2:7788;
                meta-disk internal;
        }
}

 * Check that with kernel 5.0.0 drbdadm command works fine for the configured resource:

$ uname -a
Linux drbdfix 5.0.0-43-generic #47~18.04.1-Ubuntu SMP Mon Mar 2 04:28:21 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ sudo drbdadm create-md r0
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.

$ sudo drbdadm -v up r0
drbdsetup-84 new-resource r0
drbdsetup-84 new-minor r0 0 0
drbdmeta 0 v08 /dev/loop0 internal apply-al
drbdsetup-84 attach 0 /dev/loop0 /dev/loop0 internal
drbdsetup-84 connect r0 ipv4:10.250.99.202:7788 ipv4:192.168.0.2:7788 --protocol=C --cram-hmac-alg=sha1 --shared-secret=secret

 * And with kernel 5.3.0 it does not:

$ uname -a
Linux drbdfix 5.3.0-42-generic #34~18.04.1-Ubuntu SMP Fri Feb 28 13:42:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ sudo drbdadm -v up r0
drbdsetup-84 new-resource r0
r0: Invalid argument
Command 'drbdsetup-84 new-resource r0' terminated with exit code 20
drbdadm: new-minor r0: skipped due to earlier error
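
A quick way to tell whether a host is likely in the affected range is to compare the running kernel's major.minor version against 5.2, the mainline release that the upstream commit message quoted later in this report names as introducing strict NLA_F_NESTED validation. The sketch below is only an illustration and not part of the fix; the function name kernel_validates_nested is hypothetical, and distro kernels carrying backports may not follow the mainline cutoff.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the given kernel release string
# is 5.2 or newer, i.e. the range where mainline validates NLA_F_NESTED.
kernel_validates_nested() {
    release=$1                      # e.g. "5.3.0-42-generic"
    major=${release%%.*}            # text before the first dot
    rest=${release#*.}              # text after the first dot
    minor=${rest%%[!0-9]*}          # leading digits of the remainder
    [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "${minor:-0}" -ge 2 ]; }
}

if kernel_validates_nested "$(uname -r)"; then
    echo "kernel validates NLA_F_NESTED: an unpatched drbd-utils is expected to fail"
else
    echo "kernel predates the strict check"
fi
```

With the release strings from this test case, the helper reports the 5.3.0-42 kernel as affected and the 5.0.0-43 kernel as unaffected.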

[Regression Potential]

 * Very minor in this case, as the patch only adds a single flag to the attribute type argument of nla_put().
 * The change is based on the upstream patch that fixes the issue; with it applied, the test case passes.

 * [racb] Older kernels still supported in affected Ubuntu releases may not understand the new flag, causing unexpected failure or other unexpected behaviour. #ifndef doesn't mitigate this since the new flag constant would be available at build time (we only build one src:drbd-utils for all kernels and don't ship a different set of binary packages per kernel).
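
The bit arithmetic behind this concern can be sketched as follows. The constant values are those defined in the kernel's include/uapi/linux/netlink.h; the attribute type 1 is an arbitrary placeholder, not a real DRBD attribute. The sketch assumes the receiving kernel reads the attribute type through NLA_TYPE_MASK, which would hide the extra bit from kernels that do not check it. Whether that holds for every supported kernel is what the verification runs against 4.15, 5.0 and 5.3 later in this report test empirically.

```shell
#!/bin/sh
# Values as defined in include/uapi/linux/netlink.h.
NLA_F_NESTED=$((1 << 15))
NLA_F_NET_BYTEORDER=$((1 << 14))
# nla_type is a 16-bit field, so the mask is truncated to 16 bits here.
NLA_TYPE_MASK=$(( ~(NLA_F_NESTED | NLA_F_NET_BYTEORDER) & 0xFFFF ))

attr_type=1   # placeholder attribute type, not a real DRBD attribute

# What a patched sender puts in nla_type for a nested attribute.
flagged=$((attr_type | NLA_F_NESTED))

# What a receiver sees after masking the flag bits off.
masked=$((flagged & NLA_TYPE_MASK))

# With the "#define NLA_F_NESTED 0" fallback, the OR is a no-op.
fallback=$((attr_type | 0))

printf 'flagged=0x%04x masked=%d fallback=%d\n' "$flagged" "$masked" "$fallback"
```

This prints flagged=0x8001 masked=1 fallback=1: the flag occupies the top bit of the type field, and masking recovers the original type.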

[Other Info]

 * Original Case Description:

I am not able to bring a drbd resource up after a kernel upgrade (5.0 -> 5.3):

/sbin/drbdadm -v up amail
drbdsetup-84 new-resource amail
amail: Invalid argument
Command 'drbdsetup-84 new-resource amail' terminated with exit code 20
drbdadm: new-minor amail: skipped due to earlier error

It may be this issue:

https://<email address hidden>/msg64900.html

but I have not tested this.

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: drbd-utils 8.9.10-2
ProcVersionSignature: Ubuntu 5.3.0-40.32~18.04.1-generic 5.3.18
Uname: Linux 5.3.0-40-generic x86_64
ApportVersion: 2.20.9-0ubuntu7.11
Architecture: amd64
Date: Sat Mar 7 13:39:39 2020
Dependencies:
 gcc-8-base 8.3.0-6ubuntu1~18.04.1
 libc6 2.27-3ubuntu1
 libgcc1 1:8.3.0-6ubuntu1~18.04.1
 libstdc++6 8.3.0-6ubuntu1~18.04.1
 lsb-base 9.20170808ubuntu1
InstallationDate: Installed on 2019-10-22 (136 days ago)
InstallationMedia: Ubuntu-Server 18.04.3 LTS "Bionic Beaver" - Release amd64 (20190805)
SourcePackage: drbd-utils
UpgradeStatus: No upgrade log present (probably fresh install)
mtime.conffile..etc.drbd.d.global_common.conf: 2019-12-05T11:34:58.322390

Revision history for this message
a1bert (a1bert) wrote :

drbd-utils_9.5.0-1_amd64.deb does not work either.

Changed in drbd-utils (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Rafael David Tinoco (rafaeldtinoco) wrote :

Thanks for your report @a1bert!

The commit that likely fixes the issue is this one:

  commit 859151b2
  Author: He Zhe <email address hidden>
  Date: Fri Jul 12 07:07:27 2019

  netlink: Add NLA_F_NESTED flag to nested attribute

  The mainline kernel v5.2 commit b424e432e770
  ("netlink: add validation of NLA_F_NESTED flag") imposes strict validation
  against nested attribute as follow.

  "
  Add new validation flag NL_VALIDATE_NESTED which adds three consistency
  checks of NLA_F_NESTED_FLAG:

  - the flag is set on attributes with NLA_NESTED{,_ARRAY} policy
  - the flag is not set on attributes with other policies except NLA_UNSPEC
  - the flag is set on attribute passed to nla_parse_nested()
  "

  Sending messages with nested attribute without NLA_F_NESTED would cause failed
  validation. For example,

  $ drbdsetup new-resource r0
  Invalid argument

  This patch adds NLA_F_NESTED flag to all nested attributes.

  Signed-off-by: He Zhe <email address hidden>

and a simple:

+#ifndef NLA_F_NESTED
+#define NLA_F_NESTED 0
+#endif

I'll try to fix this in the next few days.

ADVICE (for other users who find this):

For now, the mitigation is to stay on the previous kernel and wait until this bug is either Fix Committed (-proposed repo) or Fix Released (-updates).

Changed in drbd-utils (Ubuntu):
status: Triaged → Confirmed
Robie Basak (racb)
tags: added: server-next
description: updated
Changed in drbd-utils (Ubuntu):
status: Confirmed → In Progress
status: In Progress → Fix Released
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
importance: Medium → Undecided
Changed in drbd-utils (Ubuntu Bionic):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Rafael David Tinoco (rafaeldtinoco) wrote :

After applying the given patch:

(k)rafaeldtinoco@drbdfix:~$ sudo dd if=/dev/zero of=/.loop bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.93099 s, 556 MB/s
(k)rafaeldtinoco@drbdfix:~$ sudo losetup --find --show /.loop
/dev/loop0
(k)rafaeldtinoco@drbdfix:~$ sudo modprobe drbd
(k)rafaeldtinoco@drbdfix:~$ sudo drbdadm create-md r0
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
(k)rafaeldtinoco@drbdfix:~$ sudo drbdadm -v up r0
drbdsetup-84 new-resource r0
drbdsetup-84 new-minor r0 0 0
drbdmeta 0 v08 /dev/loop0 internal apply-al
drbdsetup-84 attach 0 /dev/loop0 /dev/loop0 internal
drbdsetup-84 connect r0 ipv4:10.250.99.202:7788 ipv4:192.168.0.2:7788 --protocol=C --cram-hmac-alg=sha1 --shared-secret=secret
(k)rafaeldtinoco@drbdfix:~$ uname -a
Linux drbdfix 5.3.0-42-generic #34~18.04.1-Ubuntu SMP Fri Feb 28 13:42:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

description: updated
Rafael David Tinoco (rafaeldtinoco) wrote :

I'm having some issues pushing drbd-utils to my Launchpad git repo, and the Launchpad team is working on it. Meanwhile, I'm attaching the debdiff to this bug and uploading the source package to a PPA to get reviews and feedback.

Rafael David Tinoco (rafaeldtinoco) wrote :

@a1bert: if you can, please check the PPA and confirm whether the fix solves your initial issue. I'm pretty confident it will, as the test case is simple and I was able to verify it on my side... anyway =). I'll wait for Launchpad to fix this small issue so I can push the merge review into the ubuntu-server upload queue and a colleague can review it before I upload the SRU.

Will get back to this soon...

description: updated
a1bert (a1bert) wrote : Re: [Bug 1866458] Re: drbd not working after kernel upgrade 5.0.x -> 5.3.x

Hello,

I can confirm it is working...

great work thanks

jn

On 10. 03. 20 1:24, Rafael David Tinoco wrote:
> @a1bert: please check PPA and if the fix solves your initial issue if
> you can. I'm pretty confident it will as the testcase is simple and I
> was able to verify on my side... anyway =). I'll wait launchpad to fix
> this small issue to push the merge review into ubuntu-server upload
> queue so a colleague can review it before I upload the SRU.
>
> Will get back to this soon...
>

Rafael David Tinoco (rafaeldtinoco) wrote :

Thanks for the feedback. I have uploaded the fix:

rafaeldtinoco@workstation:~/.../drbd-utils$ dput ubuntu ../drbd-utils_8.9.10-2ubuntu0.1_source.changes
Checking signature on .changes
gpg: ../drbd-utils_8.9.10-2ubuntu0.1_source.changes: Valid signature from A93E0E0AD83C0D0F
Checking signature on .dsc
gpg: ../drbd-utils_8.9.10-2ubuntu0.1.dsc: Valid signature from A93E0E0AD83C0D0F
Uploading to ubuntu (via ftp to upload.ubuntu.com):
  Uploading drbd-utils_8.9.10-2ubuntu0.1.dsc: done.
  Uploading drbd-utils_8.9.10-2ubuntu0.1.debian.tar.xz: done.
  Uploading drbd-utils_8.9.10-2ubuntu0.1_source.buildinfo: done.
  Uploading drbd-utils_8.9.10-2ubuntu0.1_source.changes: done.
Successfully uploaded packages.

It should be ready in -proposed soon. Please note that I have changed the version from ubuntu1 to ubuntu0.1, so anyone upgrading a system that uses the previous PPA package will have to pay attention to that (the upgrade might have to be done manually).
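
The version ordering behind that note can be checked with dpkg --compare-versions. The sketch below assumes the PPA build carried the version 8.9.10-2ubuntu1~ppa1, which is the string shown in the 4.15 test output later in this thread; the function name is_newer_than is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 sorts after version $2
# according to Debian version ordering.
is_newer_than() {
    dpkg --compare-versions "$1" gt "$2"
}

ppa_version="8.9.10-2ubuntu1~ppa1"     # assumed PPA build version
sru_version="8.9.10-2ubuntu0.1"        # version uploaded for the SRU

if is_newer_than "$ppa_version" "$sru_version"; then
    echo "PPA build sorts higher than $sru_version, so a plain upgrade will not replace it"
    echo "one option is to request the version explicitly:"
    echo "  sudo apt install drbd-utils=$sru_version"
fi
```

Because ubuntu1 sorts after ubuntu0.1, the archive package looks like a downgrade to apt on systems that installed the PPA build, which is why the switch may need to be done by hand.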

-rafaeldtinoco

Rafael David Tinoco (rafaeldtinoco) wrote :

Fix works in kernel 5.0 as well:

(k)rafaeldtinoco@drbdfix:~$ sudo dd if=/dev/zero of=/.loop bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.71956 s, 624 MB/s
(k)rafaeldtinoco@drbdfix:~$ sudo losetup --find --show /.loop
/dev/loop0
(k)rafaeldtinoco@drbdfix:~$ sudo modprobe drbd
(k)rafaeldtinoco@drbdfix:~$ sudo drbdadm create-md r0
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
(k)rafaeldtinoco@drbdfix:~$ sudo drbdadm -v up r0
drbdsetup-84 new-resource r0
drbdsetup-84 new-minor r0 0 0
drbdmeta 0 v08 /dev/loop0 internal apply-al
drbdsetup-84 attach 0 /dev/loop0 /dev/loop0 internal
drbdsetup-84 connect r0 ipv4:10.250.99.202:7788 ipv4:192.168.0.2:7788 --protocol=C --cram-hmac-alg=sha1 --shared-secret=secret
(k)rafaeldtinoco@drbdfix:~$ uname -a
Linux drbdfix 5.0.0-43-generic #47~18.04.1-Ubuntu SMP Mon Mar 2 04:28:21 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Robie Basak (racb) wrote :

@Rafael,

What about testing against 4.15? I think that's the kernel Bionic originally shipped with, and it is still supported.

Robie Basak (racb)
description: updated
Rafael David Tinoco (rafaeldtinoco) wrote :

@rbasak,

## Before my package is installed:

$ dpkg -l drbd-utils | grep RAID
ii drbd-utils 8.9.10-2 amd64 RAID 1 over TCP/IP for Linux (user utilities)

$ sudo losetup --find --show /.loop
/dev/loop0

$ sudo drbdadm create-md r0
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.

$ sudo drbdadm -v up r0
drbdsetup-84 new-resource r0
drbdsetup-84 new-minor r0 0 0
drbdmeta 0 v08 /dev/loop0 internal apply-al
drbdsetup-84 attach 0 /dev/loop0 /dev/loop0 internal
drbdsetup-84 connect r0 ipv4:10.250.99.19:7788 ipv4:10.250.99.25:7788 --protocol=C --cram-hmac-alg=sha1 --shared-secret=secret

$ uname -a
Linux drbdtest01 4.15.0-92-generic #93-Ubuntu SMP Mon Mar 16 19:44:23 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ ls /dev/drbd0
/dev/drbd0

## And after my package is installed:

(k)rafaeldtinoco@drbdtest01:~$ uname -a
Linux drbdtest01 4.15.0-92-generic #93-Ubuntu SMP Mon Mar 16 19:44:23 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

(k)rafaeldtinoco@drbdtest01:~$ sudo losetup --find --show /.loop
/dev/loop0

(k)rafaeldtinoco@drbdtest01:~$ sudo drbdadm create-md r0
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.

(k)rafaeldtinoco@drbdtest01:~$ sudo drbdadm -v up r0
drbdsetup-84 new-resource r0
drbdsetup-84 new-minor r0 0 0
drbdmeta 0 v08 /dev/loop0 internal apply-al
drbdsetup-84 attach 0 /dev/loop0 /dev/loop0 internal
drbdsetup-84 connect r0 ipv4:10.250.99.19:7788 ipv4:10.250.99.25:7788 --protocol=C --cram-hmac-alg=sha1 --shared-secret=secret

(k)rafaeldtinoco@drbdtest01:~$ dpkg -l drbd-utils | grep RAID
ii drbd-utils 8.9.10-2ubuntu1~ppa1 amd64 RAID 1 over TCP/IP for Linux (user utilities)

Robie Basak (racb) wrote :

Thank you for checking! Unfortunately we're going to need to do this again once the final proposed binaries are built. When they are, please could you do the normal SRU verification but against each currently supported major kernel version?
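
One way to make the per-kernel verification repeatable is to classify the output of drbdadm -v up automatically. The sketch below is a hypothetical helper and not part of the SRU process: it reads the command's output on stdin and reports failure if it contains either of the error strings seen in the failing 5.3 run in the bug description.

```shell
#!/bin/sh
# Hypothetical helper: reads `drbdadm -v up <resource>` output on stdin and
# succeeds only if none of the failure markers from this bug appear.
up_output_ok() {
    ! grep -q -e 'Invalid argument' -e 'skipped due to earlier error'
}

# Usage on a test host, once per booted kernel (requires drbd-utils):
#   sudo drbdadm -v up r0 2>&1 | up_output_ok \
#       && echo "PASS $(uname -r)" || echo "FAIL $(uname -r)"
```

The pipeline discards drbdadm's own exit status, so a fuller check would also record that status; the sketch only looks at the text.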

Changed in drbd-utils (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic
Robie Basak (racb) wrote : Please test proposed package

Hello a1bert, or anyone else affected,

Accepted drbd-utils into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/drbd-utils/8.9.10-2ubuntu0.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Rafael David Tinoco (rafaeldtinoco) wrote :

# verification with v4.15 kernel:

(k)rafaeldtinoco@drbdtest01:~$ sudo dd if=/dev/zero of=/.loop bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.25576 s, 476 MB/s

(k)rafaeldtinoco@drbdtest01:~$ sudo losetup --find --show /.loop
/dev/loop0

(k)rafaeldtinoco@drbdtest01:~$ sudo drbdadm create-md r0
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.

(k)rafaeldtinoco@drbdtest01:~$ sudo drbdadm -v up r0
drbdsetup-84 new-resource r0
drbdsetup-84 new-minor r0 0 0
drbdmeta 0 v08 /dev/loop0 internal apply-al
drbdsetup-84 attach 0 /dev/loop0 /dev/loop0 internal
drbdsetup-84 connect r0 ipv4:10.250.99.19:7788 ipv4:10.250.99.25:7788 --protocol=C --cram-hmac-alg=sha1 --shared-secret=secret

(k)rafaeldtinoco@drbdtest01:~$ uname -a
Linux drbdtest01 4.15.0-92-generic #93-Ubuntu SMP Mon Mar 16 19:44:23 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Rafael David Tinoco (rafaeldtinoco) wrote :

# verification with 5.0 kernel

(k)rafaeldtinoco@drbdtest01:~$ sudo dd if=/dev/zero of=/.loop bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.25077 s, 477 MB/s
(k)rafaeldtinoco@drbdtest01:~$ sudo losetup --find --show /.loop
/dev/loop0
(k)rafaeldtinoco@drbdtest01:~$ sudo drbdadm create-md r0
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
(k)rafaeldtinoco@drbdtest01:~$ sudo drbdadm -v up r0
drbdsetup-84 new-resource r0
drbdsetup-84 new-minor r0 0 0
drbdmeta 0 v08 /dev/loop0 internal apply-al
drbdsetup-84 attach 0 /dev/loop0 /dev/loop0 internal
drbdsetup-84 connect r0 ipv4:10.250.99.19:7788 ipv4:10.250.99.25:7788 --protocol=C --cram-hmac-alg=sha1 --shared-secret=secret
(k)rafaeldtinoco@drbdtest01:~$ uname -a
Linux drbdtest01 5.0.0-44-generic #48~18.04.1-Ubuntu SMP Wed Mar 18 09:11:43 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Rafael David Tinoco (rafaeldtinoco) wrote :

# verification with 5.3 kernel

(k)rafaeldtinoco@drbdtest01:~$ sudo dd if=/dev/zero of=/.loop bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.35899 s, 455 MB/s
(k)rafaeldtinoco@drbdtest01:~$ sudo losetup --find --show /.loop
/dev/loop0
(k)rafaeldtinoco@drbdtest01:~$ sudo drbdadm create-md r0
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
(k)rafaeldtinoco@drbdtest01:~$ sudo drbdadm -v up r0
drbdsetup-84 new-resource r0
drbdsetup-84 new-minor r0 0 0
drbdmeta 0 v08 /dev/loop0 internal apply-al
drbdsetup-84 attach 0 /dev/loop0 /dev/loop0 internal
drbdsetup-84 connect r0 ipv4:10.250.99.19:7788 ipv4:10.250.99.25:7788 --protocol=C --cram-hmac-alg=sha1 --shared-secret=secret
(k)rafaeldtinoco@drbdtest01:~$ uname -a
Linux drbdtest01 5.3.0-43-generic #36~18.04.2-Ubuntu SMP Thu Mar 19 16:03:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

tags: added: verification-done verification-done-bionic
removed: verification-needed verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package drbd-utils - 8.9.10-2ubuntu0.1

---------------
drbd-utils (8.9.10-2ubuntu0.1) bionic; urgency=medium

  * d/p/lp1866458-add-NLA_F_NESTED-flag.patch (LP: #1866458):
    - Fix missing flag for HWE kernels >= v5.3.0

 -- Rafael David Tinoco <email address hidden> Mon, 09 Mar 2020 23:59:37 +0000

Changed in drbd-utils (Ubuntu Bionic):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for drbd-utils has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
