python-rtslib-fb needs to handle new attribute cpus_allowed_list

Bug #1988366 reported by Sergio Durigan Junior
This bug affects 2 people
Affects                    Status        Importance  Assigned to                 Milestone
python-rtslib-fb (Ubuntu)  Fix Released  Critical    Sergio Durigan Junior
  Jammy                    Fix Released  Medium      Mauricio Faria de Oliveira

Bug Description

[ Impact ]

* Getting information about "attached_luns" via python3-rtslib-fb fails when running the HWE kernel on jammy, due to the new kernel module attribute cpus_allowed_list.

* As a consequence, the following operations on jammy fail:

  - creating an iSCSI target with Ceph-iSCSI service
   https://docs.ceph.com/en/quincy/rbd/iscsi-target-cli/

(LUN.allocate) created test-iscsi-pool/disk_1 successfully
(LUN.add_dev_to_lio) Adding image 'test-iscsi-pool/disk_1' to LIO backstore user:rbd
tcmu-runner: tcmu_rbd_open:1162 rbd/test-iscsi-pool.disk_1: address: {172.16.12.185:0/2337103748}
(LUN.add_dev_to_lio) Successfully added test-iscsi-pool/disk_1 to LIO
LUN alloc problem - Delete from LIO/backstores failed - [Errno 20] Not a directory: '/sys/kernel/config/target/iscsi/cpus_allowed_list'

  - targetcli clearconfig confirm=True

[Errno 20] Not a directory: '/sys/kernel/config/target/iscsi/cpus_allowed_list'

  - targetctl clear

$ sudo targetctl clear
Traceback (most recent call last):
  File "/usr/bin/targetctl", line 82, in <module>
    main()
  File "/usr/bin/targetctl", line 79, in main
    funcs[sys.argv[1]](savefile)
  File "/usr/bin/targetctl", line 57, in clear
    RTSRoot().clear_existing(confirm=True)
  File "/usr/lib/python3/dist-packages/rtslib_fb/root.py", line 318, in clear_existing
    so.delete()
  File "/usr/lib/python3/dist-packages/rtslib_fb/tcm.py", line 269, in delete
    for lun in self._gen_attached_luns():
  File "/usr/lib/python3/dist-packages/rtslib_fb/tcm.py", line 215, in _gen_attached_luns
    for tpgt_dir in listdir(tpgts_base):
NotADirectoryError: [Errno 20] Not a directory: '/sys/kernel/config/target/iscsi/cpus_allowed_list'
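
A minimal sketch of the failure mode, not the actual rtslib-fb code: the traceback suggests that every entry under the fabric directory that is not explicitly excluded gets treated as a target directory and passed to listdir(). The temporary directory below stands in for /sys/kernel/config/target/iscsi; `list_tpgt_dirs`, `make_fake_fabric`, `demo`, the target name and the attribute file content are all hypothetical and used for illustration only.

```python
import os
import tempfile

# Names skipped without and with the new attribute.
OLD_EXCLUDES = {"lio_version", "version", "discovery_auth"}
NEW_EXCLUDES = OLD_EXCLUDES | {"cpus_allowed_list"}


def list_tpgt_dirs(fabric_dir, excludes):
    """Hypothetical helper: list (target, tpgt) pairs under a fabric directory.

    Every entry not named in `excludes` is assumed to be a target directory.
    """
    pairs = []
    for target in sorted(os.listdir(fabric_dir)):
        if target in excludes:
            continue
        tpgts_base = os.path.join(fabric_dir, target)
        # os.listdir() raises NotADirectoryError when tpgts_base is a plain file.
        for tpgt_dir in sorted(os.listdir(tpgts_base)):
            pairs.append((target, tpgt_dir))
    return pairs


def make_fake_fabric(root):
    """Create a fake fabric tree: one target directory and one attribute file."""
    os.makedirs(os.path.join(root, "iqn.2006-04.com.example.test-target", "tpgt_1"))
    with open(os.path.join(root, "cpus_allowed_list"), "w") as attr:
        attr.write("0-3\n")  # arbitrary placeholder content


def demo():
    """Return (error raised with the old set, pairs found with the new set)."""
    with tempfile.TemporaryDirectory() as root:
        make_fake_fabric(root)
        try:
            list_tpgt_dirs(root, OLD_EXCLUDES)
            error = None
        except NotADirectoryError as exc:
            error = exc
        return error, list_tpgt_dirs(root, NEW_EXCLUDES)
```

With the old exclusion set the walk reaches the plain file and fails in the same way as the traceback above; with cpus_allowed_list excluded it only visits the target directory.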

[ Test Plan ]

## create two VMs, one for the GA kernel and the other for the HWE kernel
for kernel in ga hwe; do
    uvt-kvm create \
        --cpu=4 --memory=4096 \
        rtslib-fb-sru-testing-$kernel \
        release=jammy

    uvt-kvm wait rtslib-fb-sru-testing-$kernel
    uvt-kvm ssh rtslib-fb-sru-testing-$kernel 'sudo apt-get update && sudo apt-get upgrade -y'
    uvt-kvm ssh rtslib-fb-sru-testing-$kernel 'sudo apt-get install -y python3-rtslib-fb targetcli-fb'
done

## Install the HWE kernel and reboot
uvt-kvm ssh rtslib-fb-sru-testing-hwe 'sudo apt-get install -y linux-generic-hwe-22.04 && sudo reboot'

## Upgrade python3-rtslib-fb to the -proposed one

## create the test iSCSI target based on the quickstart guide in targetcli(8)
## https://manpages.ubuntu.com/manpages/jammy/en/man8/targetcli.8.html
cat <<EOF | sudo targetcli
backstores/fileio create test /tmp/test.img 100m;
iscsi/ create iqn.2006-04.com.example:test-target;
cd iscsi/iqn.2006-04.com.example:test-target/tpg1/;
luns/ create /backstores/fileio/test;
set attribute generate_node_acls=1;
EOF

## confirm the test iSCSI target is discoverable locally and confirm the discovered one is output in the terminal
sudo iscsiadm --mode discoverydb --type sendtargets \
    --portal 127.0.0.1 --discover

## tear down the test iSCSI target and confirm there is no error returned
sudo targetcli clearconfig confirm=True

[ Where problems could occur ]

The worst case is a regression in environments still running the GA kernel, since the fix targets newer kernels.

To mitigate the risk, the same test case will be run on both the GA kernel and the HWE kernel machines with the -proposed package.
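
As a rough illustration of why the regression risk on the GA kernel should be low, the sketch below assumes the fix amounts to adding one more name to an exclusion set, and that the GA kernel does not expose that name at all. The directory listings and the `visible_targets` helper are hypothetical; the real check remains the test plan above.

```python
OLD_EXCLUDES = {"lio_version", "version", "discovery_auth"}
NEW_EXCLUDES = OLD_EXCLUDES | {"cpus_allowed_list"}

# Hypothetical listings of the iscsi fabric directory on each kernel.
GA_ENTRIES = ["discovery_auth", "lio_version", "iqn.2006-04.com.example:test-target"]
HWE_ENTRIES = GA_ENTRIES + ["cpus_allowed_list"]


def visible_targets(entries, excludes):
    """Return the entries that would be treated as targets, sorted by name."""
    return sorted(set(entries) - set(excludes))


# Excluding a name that is not present changes nothing on the GA listing.
ga_before = visible_targets(GA_ENTRIES, OLD_EXCLUDES)
ga_after = visible_targets(GA_ENTRIES, NEW_EXCLUDES)

# On the HWE listing, only the new set filters out the attribute file.
hwe_before = visible_targets(HWE_ENTRIES, OLD_EXCLUDES)
hwe_after = visible_targets(HWE_ENTRIES, NEW_EXCLUDES)
```

Under these assumptions the GA result is identical before and after the change, while the HWE result stops including cpus_allowed_list.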

[ Other Info ]

* upstream fix https://github.com/open-iscsi/rtslib-fb/commit/8d2543c4da62e962661011fea5b19252b9660822

====

python-rtslib-fb needs to properly handle the new kernel module attribute cpus_allowed_list.

This is causing a problem during targetcli-fb's autopkgtest on s390x:

https://autopkgtest.ubuntu.com/results/autopkgtest-kinetic/kinetic/s390x/t/targetcli-fb/20220830_075622_04113@/log.gz

Sergio Durigan Junior (sergiodj) wrote :
Changed in python-rtslib-fb (Ubuntu):
importance: Undecided → Critical
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python-rtslib-fb - 2.1.74-0ubuntu5

---------------
python-rtslib-fb (2.1.74-0ubuntu5) kinetic; urgency=medium

  * d/p/handle-target-kernel-module-new-attribute-cpus_allow.patch:
    Handle new Linux kernel module attribute "cpus_allowed_list".
    (LP: #1988366)

 -- Sergio Durigan Junior <email address hidden> Wed, 31 Aug 2022 22:41:11 -0400

Changed in python-rtslib-fb (Ubuntu):
status: In Progress → Fix Released
Nobuto Murata (nobuto) wrote :

The latest LTS (jammy) is missing this patch, which causes LUN operations to fail when the host is running the HWE kernel (v6.5).

 python3-rtslib-fb | 2.1.74-0ubuntu4 | jammy | all
 python3-rtslib-fb | 2.1.74-0ubuntu5 | mantic | all
 python3-rtslib-fb | 2.1.74-0ubuntu5 | noble | all

These are log lines from a ceph-iscsi use case (exposing an RBD volume over iSCSI/LIO). The export creation fails to complete and is left in an unrecoverable state unless gateway.conf in rados is fixed manually by deleting the half-broken volume.

====
(LUN.allocate) created test-iscsi-pool/disk_1 successfully
(LUN.add_dev_to_lio) Adding image 'test-iscsi-pool/disk_1' to LIO backstore user:rbd
tcmu-runner: tcmu_rbd_open:1162 rbd/test-iscsi-pool.disk_1: address: {172.16.12.185:0/2337103748}
(LUN.add_dev_to_lio) Successfully added test-iscsi-pool/disk_1 to LIO
LUN alloc problem - Delete from LIO/backstores failed - [Errno 20] Not a directory: '/sys/kernel/config/target/iscsi/cpus_allowed_list'
====

similar report:
https://bugs.launchpad.net/python-cinderclient/yoga/+bug/2008010

Nobuto Murata (nobuto) wrote :

The workaround is to switch back to the GA kernel (v5.15), but that is far from ideal on newer-generation servers (less than two years old).

Nobuto Murata (nobuto) wrote :

Ceph-iSCSI is a somewhat complicated reproducer:
https://docs.ceph.com/en/quincy/rbd/iscsi-overview/
The simplest reproducer is `targetctl clear` on the jammy HWE kernel.

$ sudo targetctl clear
Traceback (most recent call last):
  File "/usr/bin/targetctl", line 82, in <module>
    main()
  File "/usr/bin/targetctl", line 79, in main
    funcs[sys.argv[1]](savefile)
  File "/usr/bin/targetctl", line 57, in clear
    RTSRoot().clear_existing(confirm=True)
  File "/usr/lib/python3/dist-packages/rtslib_fb/root.py", line 318, in clear_existing
    so.delete()
  File "/usr/lib/python3/dist-packages/rtslib_fb/tcm.py", line 269, in delete
    for lun in self._gen_attached_luns():
  File "/usr/lib/python3/dist-packages/rtslib_fb/tcm.py", line 215, in _gen_attached_luns
    for tpgt_dir in listdir(tpgts_base):
NotADirectoryError: [Errno 20] Not a directory: '/sys/kernel/config/target/iscsi/cpus_allowed_list'

Nobuto Murata (nobuto)
description: updated
James Page (james-page) wrote :

Thanks Nobuto - uploaded to jammy UNAPPROVED for SRU team review.

Robie Basak (racb) wrote :

SRU review

> +-version_attributes = set(["lio_version", "version"])
> +-discovery_auth_attributes = set(["discovery_auth"])

These might be accessed by an API caller somewhere, so removing them would represent a regression. I spent some time digging around and didn't find a direct example, so I was going to leave it, but then I came across this in rtslib/tcm.py:

> from .fabric import target_names_excludes

target_names_excludes is being maintained so I think this is OK. However, I think it demonstrates that "stuff" does make use of these names arbitrarily, so we probably shouldn't be dropping items from the module namespace if we don't have to. Would there be a problem with simply not removing these two lines, to reduce the risk of breaking something somewhere, including something external to the archive that we cannot find? It should be trivial to do and could save users quite a bit of pain if they are impacted, even if that is unlikely.

Or, put another way, if we were writing this directly for Jammy, we certainly wouldn't be doing the refactoring that is being done here because it is riskier, so it seems to me that we shouldn't, and the change I suggest should be trivially safe. So I think we should do that.

Apart from this, +1 from an SRU perspective.

If you disagree let's discuss, so I'll leave this in the queue for now. If you agree, please upload an adjustment.
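
To make the namespace concern concrete, here is a small sketch under stated assumptions: two throwaway modules are built in memory, one in the refactored style that drops the two names and one that keeps them. The module names, `make_module` and `missing_names` are hypothetical; nothing here imports or modifies rtslib-fb itself.

```python
import types

# Module-level names an external caller might already rely on.
LEGACY_NAMES = ("version_attributes", "discovery_auth_attributes",
                "target_names_excludes")

REFACTORED_SOURCE = '''
excludes_list = ["lio_version", "version", "discovery_auth", "cpus_allowed_list"]
target_names_excludes = set(excludes_list)
'''

KEEP_NAMES_SOURCE = '''
version_attributes = set(["lio_version", "version"])
discovery_auth_attributes = set(["discovery_auth"])
target_names_excludes = (version_attributes | discovery_auth_attributes
                         | set(["cpus_allowed_list"]))
'''


def make_module(name, source):
    """Build an in-memory module from source text (illustration only)."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module


def missing_names(module, names=LEGACY_NAMES):
    """Return the names from `names` that `module` no longer provides."""
    return [name for name in names if not hasattr(module, name)]


refactored = make_module("fabric_refactored", REFACTORED_SOURCE)
keep_names = make_module("fabric_keep_names", KEEP_NAMES_SOURCE)
```

In this sketch both variants end up with the same target_names_excludes set, but only the refactored one would break a caller that looks up version_attributes or discovery_auth_attributes.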

Changed in python-rtslib-fb (Ubuntu Jammy):
status: New → Incomplete
Mauricio Faria de Oliveira (mfo) wrote :

Hi Robie, Sergio,

The concern about removing 'version_attributes' and 'discovery_auth_attributes'
can be addressed with this simple change/backport.

What do you think?

Don't remove the 2 attributes / move elements into 'target_names_excludes'.
Instead, add a new attribute / its element into 'target_names_excludes'.

i.e., this (backport):

  version_attributes = set(["lio_version", "version"])
  discovery_auth_attributes = set(["discovery_auth"])

 +cpus_allowed_list_attributes = set(["cpus_allowed_list"])

 -target_names_excludes = version_attributes | discovery_auth_attributes
 +target_names_excludes = version_attributes | discovery_auth_attributes | cpus_allowed_list_attributes

instead of this (upstream):

 -version_attributes = set(["lio_version", "version"])
 -discovery_auth_attributes = set(["discovery_auth"])
 -target_names_excludes = version_attributes | discovery_auth_attributes

 +excludes_list = [
 + # version_attributes
 + "lio_version", "version",
 + # discovery_auth_attributes
 + "discovery_auth",
 + # cpus_allowed_list_attributes
 + "cpus_allowed_list",
 +]
 +target_names_excludes = set(excludes_list)

Python confirms the results are equivalent:

 $ python3 -q

 Backport:

 >>> version_attributes = set(["lio_version", "version"])
 >>> discovery_auth_attributes = set(["discovery_auth"])
 >>> cpus_allowed_list_attributes = set(["cpus_allowed_list"])
 >>> target_names_excludes = version_attributes | discovery_auth_attributes | cpus_allowed_list_attributes

 >>> target_names_excludes
 {'lio_version', 'cpus_allowed_list', 'discovery_auth', 'version'}
 >>> backport = target_names_excludes

 Upstream:

 >>> excludes_list = [
 ... # version_attributes
 ... "lio_version", "version",
 ... # discovery_auth_attributes
 ... "discovery_auth",
 ... # cpus_allowed_list_attributes
 ... "cpus_allowed_list",
 ... ]
 >>> target_names_excludes = set(excludes_list)

 >>> target_names_excludes
 {'lio_version', 'cpus_allowed_list', 'discovery_auth', 'version'}
 >>> upstream = target_names_excludes

 The 2 sets are identical, and not the same object:

 >>> backport == upstream
 True

 >>> backport is upstream
 False

[1] https://github.com/open-iscsi/rtslib-fb/commit/8d2543c4da62e962661011fea5b19252b9660822

Jacob Becker (jacob-becker-h) wrote :

Hi,
Since I'm also affected by this bug, I wonder if there is anything I can do to help.
Is there any news, or are there packages I can test?

Kind regards
Jacob Becker

Mauricio Faria de Oliveira (mfo) wrote :

Hi Jacob,

Thanks for being willing to help!

Could you please test the package in ppa:mfo/lp1988366 [1] ?

It has the backport described in comment #8, which may help move the SRU forward.

If any assistance is needed, just let me know.

cheers,
Mauricio

[1] https://launchpad.net/~mfo/+archive/ubuntu/lp1988366

Changed in python-rtslib-fb (Ubuntu Jammy):
status: Incomplete → Triaged
importance: Undecided → Medium
assignee: nobody → Mauricio Faria de Oliveira (mfo)
Jacob Becker (jacob-becker-h) wrote :

Hi Mauricio,

I can confirm the issue with clearconfig no longer happens.
(NotADirectoryError: [Errno 20] Not a directory: '/sys/kernel/config/target/iscsi/cpus_allowed_list')
Instead it gives a smooth "All configuration cleared" with no errors at all. :-)

This is obviously not a thorough test, but it looks OK.

cheers,
Jacob

Mauricio Faria de Oliveira (mfo) wrote :

Hi Jacob,

Thanks for testing; that's good news.

Package uploaded to Jammy; now waiting on review by the SRU team.

cheers,
Mauricio

Changed in python-rtslib-fb (Ubuntu Jammy):
status: Triaged → In Progress
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello Sergio, or anyone else affected,

Accepted python-rtslib-fb into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/python-rtslib-fb/2.1.74-0ubuntu4.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in python-rtslib-fb (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-jammy
Nobuto Murata (nobuto) wrote :

The test case in the description succeeded for both the GA kernel and HWE kernel for jammy.

[GA kernel]

ubuntu@rtslib-fb-sru-testing-ga:~$ apt policy python3-rtslib-fb
python3-rtslib-fb:
  Installed: 2.1.74-0ubuntu4.1
  Candidate: 2.1.74-0ubuntu4.1
  Version table:
 *** 2.1.74-0ubuntu4.1 500
        500 http://archive.ubuntu.com/ubuntu jammy-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     2.1.74-0ubuntu4 500
        500 http://archive.ubuntu.com/ubuntu jammy/main amd64 Packages

ubuntu@rtslib-fb-sru-testing-ga:~$ ## create the test iSCSI target based on the quickstart guide in targetcli(8)
## https://manpages.ubuntu.com/manpages/jammy/en/man8/targetcli.8.html
cat <<EOF | sudo targetcli
backstores/fileio create test /tmp/test.img 100m
iscsi/ create iqn.2006-04.com.example:test-target
cd iscsi/iqn.2006-04.com.example:test-target/tpg1/
luns/ create /backstores/fileio/test
set attribute generate_node_acls=1
EOF

Warning: Could not load preferences file /root/.targetcli/prefs.bin.
targetcli shell version 2.1.53
Copyright 2011-2013 by Datera, Inc and others.
For help on commands, type 'help'.

/> Created fileio test with size 104857600
/> Created target iqn.2006-04.com.example:test-target.
Created TPG 1.
Global pref auto_add_default_portal=true
Created default portal listening on all IPs (0.0.0.0), port 3260.
/> /iscsi/iqn.20...t-target/tpg1> Created LUN 0.
/iscsi/iqn.20...t-target/tpg1> Parameter generate_node_acls is now '1'.
/iscsi/iqn.20...t-target/tpg1> exit
Global pref auto_save_on_exit=true
Configuration saved to /etc/rtslib-fb-target/saveconfig.json

ubuntu@rtslib-fb-sru-testing-ga:~$ ## confirm the test iSCSI target is discoverable locally and confirm the discovered one is output in the terminal
sudo iscsiadm --mode discoverydb --type sendtargets \
    --portal 127.0.0.1 --discover

127.0.0.1:3260,1 iqn.2006-04.com.example:test-target

ubuntu@rtslib-fb-sru-testing-ga:~$ ## tear down the test iSCSI target and confirm there is no error returned
sudo targetcli clearconfig confirm=True
All configuration cleared

[HWE kernel]

ubuntu@rtslib-fb-sru-testing-hwe:~$ apt policy python3-rtslib-fb
python3-rtslib-fb:
  Installed: 2.1.74-0ubuntu4.1
  Candidate: 2.1.74-0ubuntu4.1
  Version table:
 *** 2.1.74-0ubuntu4.1 500
        500 http://archive.ubuntu.com/ubuntu jammy-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     2.1.74-0ubuntu4 500
        500 http://archive.ubuntu.com/ubuntu jammy/main amd64 Packages

ubuntu@rtslib-fb-sru-testing-hwe:~$ ## create the test iSCSI target based on the quickstart guide in targetcli(8)
## https://manpages.ubuntu.com/manpages/jammy/en/man8/targetcli.8.html
cat <<EOF | sudo targetcli
backstores/fileio create test /tmp/test.img 100m
iscsi/ create iqn.2006-04.com.example:test-target
cd iscsi/iqn.2006-04.com.example:test-target/tpg1/
luns/ create /backstores/fileio/test
set attribute generate_node_acls=1
EOF

Warning: Could not load preferences file /root/.targetcli/prefs.bin.
targetcli shell version 2.1.53
Copyright 2011-2013 by Datera, Inc and others.
For help on commands, type 'help'.

/> Created fileio test with size 104857600
/> Created target iqn.2006-04.com.example:tes...

tags: added: verification-done verification-done-jammy
removed: verification-needed verification-needed-jammy
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for python-rtslib-fb has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package python-rtslib-fb - 2.1.74-0ubuntu4.1

---------------
python-rtslib-fb (2.1.74-0ubuntu4.1) jammy; urgency=medium

  [ Sergio Durigan Junior ]
  * d/p/lp1988366-handle-target-kernel-module-new-attribute-cpus_allow.patch
    Handle new Linux kernel module attribute "cpus_allowed_list", fixing the
    error "Not a directory: '/sys/kernel/config/target/iscsi/cpus_allowed_list'"
    (LP: #1988366)

 -- Mauricio Faria de Oliveira <email address hidden> Mon, 10 Jun 2024 15:05:54 -0300

Changed in python-rtslib-fb (Ubuntu Jammy):
status: Fix Committed → Fix Released