/proc/modules has Null references causing python parsing issues

Bug #1757143 reported by Frank Heimes
This bug affects 2 people
Affects                              Status        Importance  Assigned to  Milestone
Ubuntu on IBM z Systems              Fix Released  Undecided   Unassigned
linux (Ubuntu)                       Invalid       Undecided   Unassigned
plainbox-provider-checkbox (Ubuntu)  Fix Released  High        Jeff Lane

Bug Description

EDIT: This originally looked like a regression in the module_resource script of plainbox-provider-resource. On closer inspection, the root cause appears to be that /proc/modules in 4.15 differs from previous kernels.

This is a line from 4.15's /proc/modules data:
e1000e 249856 0 - Live 0x (null)

And this is the same module info from 4.13:
e1000e 249856 0 - Live 0x0000000000000000

The "(null)" placeholder in the offset field appears to be what makes the script choke.

Because /proc/modules has somehow changed to include null references ("0x (null)" cannot be parsed as a hex value, whereas 0x0000000000000000 can), I've added a kernel task to this bug.
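
For illustration, here is a minimal sketch of a parser that tolerates the masked offset instead of raising. This is not the actual module_resource code: the function name parse_module_line and every dictionary key except "offset" are assumptions made for the example. A masked or missing offset is reported as None.

```python
def parse_module_line(line):
    """Parse one /proc/modules line (hypothetical helper, not from module_resource)."""
    fields = line.split()
    name, size, instances, deps, state = fields[:5]
    # On 4.15 an unprivileged read shows "0x" followed by "(null)",
    # so the sixth field can be a bare "0x" that int() rejects.
    raw_offset = fields[5] if len(fields) > 5 else ""
    try:
        offset = int(raw_offset, 16)
    except ValueError:
        offset = None  # address not visible to this user
    return {
        "name": name,
        "size": int(size),
        "instances": int(instances),
        # Fourth field: "-" when empty, otherwise a comma-terminated list of names.
        "dependencies": [] if deps == "-" else [d for d in deps.split(",") if d],
        "state": state,
        "offset": offset,
    }
```

Returning None keeps the rest of the record usable, but a caller that needs real addresses still has to read the file with sufficient privileges.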

Original Summary:

I ran into this issue (and came across some further glitches) while running canonical-certification-server on Ubuntu Server 18.04 on s390x (with "16.04 full" selected in the canonical-certification-server user interface):

$ /usr/lib/plainbox-provider-resource-generic/bin/module_resource
Traceback (most recent call last):
  File "/usr/lib/plainbox-provider-resource-generic/bin/module_resource", line 73, in <module>
    sys.exit(main())
  File "/usr/lib/plainbox-provider-resource-generic/bin/module_resource", line 62, in main
    for module in modules:
  File "/usr/lib/plainbox-provider-resource-generic/bin/module_resource", line 57, in get_modules
    yield get_module(line)
  File "/usr/lib/plainbox-provider-resource-generic/bin/module_resource", line 49, in get_module
    "offset": int(offset, 16)}
ValueError: invalid literal for int() with base 16: '0x'

The result file and the console log are attached to the ticket.

Tags: s390x
Frank Heimes (fheimes) wrote :

Outcome: job passed
-------------[ Running job 23 / 113. Estimated time left: unknown ]-------------
------------------[ Collect information about kernel modules ]------------------
ID: com.canonical.certification::module
Category: com.canonical.plainbox::uncategorised
... 8< -------------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/nest-6gxdhm_0.80e53c0441ed77c4cf9047854d346bc637d09c9c37947a2f263c82c421d37b6a/module_resource", line 73, in <module>
    sys.exit(main())
  File "/tmp/nest-6gxdhm_0.80e53c0441ed77c4cf9047854d346bc637d09c9c37947a2f263c82c421d37b6a/module_resource", line 62, in main
    for module in modules:
  File "/tmp/nest-6gxdhm_0.80e53c0441ed77c4cf9047854d346bc637d09c9c37947a2f263c82c421d37b6a/module_resource", line 57, in get_modules
    yield get_module(line)
  File "/tmp/nest-6gxdhm_0.80e53c0441ed77c4cf9047854d346bc637d09c9c37947a2f263c82c421d37b6a/module_resource", line 49, in get_module
    "offset": int(offset, 16)}
ValueError: invalid literal for int() with base 16: '0x'
------------------------------------------------------------------------- >8 ---
Outcome: job failed
-------------[ Running job 24 / 113. Estimated time left: unknown ]-------------
-------------------[ Collect information about dpkg version ]-------------------
ID: com.canonical.certification::dpkg
Category: com.canonical.plainbox::uncategorised
... 8< -------------------------------------------------------------------------
version: 1.19.0.5
architecture: s390x
------------------------------------------------------------------------- >8 ---
Outcome: job passed
-------------[ Running job 25 / 113. Estimated time left: unknown ]-------------
-------------[ Check that data for a complete result are present ]--------------
ID: com.canonical.certification::miscellanea/submission-resources
Category: com.canonical.plainbox::miscellanea
Job cannot be started because:
 - required dependency 'com.canonical.certification::dkms_info_json' has failed
 - required dependency 'com.canonical.certification::dmi' has failed
 - required dependency 'com.canonical.certification::dmi_attachment' has failed
 - required dependency 'com.canonical.certification::module' has failed
 - required dependency 'com.canonical.certification::raw_devices_dmi_json' has failed
Outcome: job cannot be started

Frank Heimes (fheimes)
summary: - problem plainbox-provider while running canonical-certification-server
- on 18.04 / s390x
+ Problem with plainbox-provider while running canonical-certification-
+ server on 18.04 / s390x
Jeff Lane  (bladernr) wrote : Re: Problem with plainbox-provider while running canonical-certification-server on 18.04 / s390x

Frank,

Are you able to run the command directly from the shell? Does it produce any output at all other than the traceback?

The command to run is:

/usr/lib/plainbox-provider-resource-generic/bin/module_resource

Changed in plainbox-provider-checkbox (Ubuntu):
importance: Undecided → High
status: New → Confirmed
Dimitri John Ledkov (xnox) wrote :

Is this some kind of a "dependency packing solution of modules into a binary"? And is it big-endian aware?

Frank Heimes (fheimes) wrote :

A line break got lost in the bug description.
With the line break in the right place, you can see that I executed "/usr/lib/plainbox-provider-resource-generic/bin/module_resource" directly and that its output starts with the traceback on the next line:

$ /usr/lib/plainbox-provider-resource-generic/bin/module_resource
Traceback (most recent call last):
  File "/usr/lib/plainbox-provider-resource-generic/bin/module_resource", line 73, in <module>
    sys.exit(main())
  File "/usr/lib/plainbox-provider-resource-generic/bin/module_resource", line 62, in main
    for module in modules:
  File "/usr/lib/plainbox-provider-resource-generic/bin/module_resource", line 57, in get_modules
    yield get_module(line)
  File "/usr/lib/plainbox-provider-resource-generic/bin/module_resource", line 49, in get_module
    "offset": int(offset, 16)}
ValueError: invalid literal for int() with base 16: '0x'
$

Sorry for the bad formatting in the description - I just modified it.

description: updated
Jeff Lane  (bladernr) wrote :

Ahhh, thanks Frank. I was a bit confused initially :)

Frank, could you also add a dump of /proc/modules to this bug?

@xnox, this is a traceback occurring in a script in the certification suite. This bug is not an s390x issue, but more one involving the /proc/modules file on 18.04 and our parsing of it. This issue does not occur on 16.04.

I have verified this on amd64 as well:
ubuntu@xwing:~$ /usr/lib/plainbox-provider-resource-generic/bin/module_resource
Traceback (most recent call last):
  File "/usr/lib/plainbox-provider-resource-generic/bin/module_resource", line 73, in <module>
    sys.exit(main())
  File "/usr/lib/plainbox-provider-resource-generic/bin/module_resource", line 62, in main
    for module in modules:
  File "/usr/lib/plainbox-provider-resource-generic/bin/module_resource", line 57, in get_modules
    yield get_module(line)
  File "/usr/lib/plainbox-provider-resource-generic/bin/module_resource", line 49, in get_module
    "offset": int(offset, 16)}
ValueError: invalid literal for int() with base 16: '0x'

and here's the content of /proc/modules:
ubuntu@xwing:~$ cat /proc/modules
nls_iso8859_1 16384 1 - Live 0x (null)
intel_rapl 20480 0 - Live 0x (null)
x86_pkg_temp_thermal 16384 0 - Live 0x (null)
intel_powerclamp 16384 0 - Live 0x (null)
coretemp 16384 0 - Live 0x (null)
kvm_intel 204800 0 - Live 0x (null)
kvm 593920 1 kvm_intel, Live 0x (null)
irqbypass 16384 1 kvm, Live 0x (null)
joydev 24576 0 - Live 0x (null)
intel_cstate 20480 0 - Live 0x (null)
intel_rapl_perf 16384 0 - Live 0x (null)
input_leds 16384 0 - Live 0x (null)
mei_me 40960 0 - Live 0x (null)
mei 90112 1 mei_me, Live 0x (null)
shpchp 36864 0 - Live 0x (null)
ie31200_edac 16384 0 - Live 0x (null)
wmi_bmof 16384 0 - Live 0x (null)
lpc_ich 24576 0 - Live 0x (null)
mac_hid 16384 0 - Live 0x (null)
sch_fq_codel 20480 10 - Live 0x (null)
ib_iser 49152 0 - Live 0x (null)
rdma_cm 61440 1 ib_iser, Live 0x (null)
iw_cm 45056 1 rdma_cm, Live 0x (null)
ib_cm 53248 1 rdma_cm, Live 0x (null)
ib_core 225280 4 ib_iser,rdma_cm,iw_cm,ib_cm, Live 0x (null)
iscsi_tcp 20480 0 - Live 0x (null)
libiscsi_tcp 20480 1 iscsi_tcp, Live 0x (null)
libiscsi 53248 3 ib_iser,iscsi_tcp,libiscsi_tcp, Live 0x (null)
scsi_transport_iscsi 98304 4 ib_iser,iscsi_tcp,libiscsi, Live 0x (null)
ip_tables 28672 0 - Live 0x (null)
x_tables 40960 1 ip_tables, Live 0x (null)
autofs4 40960 2 - Live 0x (null)
btrfs 1122304 0 - Live 0x (null)
zstd_compress 163840 1 btrfs, Live 0x (null)
raid10 53248 0 - Live 0x (null)
raid456 143360 0 - Live 0x (null)
async_raid6_recov 20480 1 raid456, Live 0x (null)
async_memcpy 16384 2 raid456,async_raid6_recov, Live 0x (null)
async_pq 16384 2 raid456,async_raid6_recov, Live 0x (null)
async_xor 16384 3 raid456,async_raid6_recov,async_pq, Live 0x (null)
async_tx 16384 5 raid456,async_raid6_recov,async_memcpy,async...

Frank Heimes (fheimes) wrote :

$ cat /proc/modules
iptable_filter 16384 0 - Live 0x (null)
macsec 49152 0 - Live 0x (null)
vsock_diag 16384 0 - Live 0x (null)
vsock 45056 1 vsock_diag,[permanent], Live 0x (null)
sctp_diag 16384 0 - Live 0x (null)
sctp 385024 3 sctp_diag, Live 0x (null)
dccp_diag 16384 0 - Live 0x (null)
dccp 126976 1 dccp_diag, Live 0x (null)
tcp_diag 16384 0 - Live 0x (null)
udp_diag 16384 0 - Live 0x (null)
raw_diag 16384 0 - Live 0x (null)
inet_diag 24576 5 sctp_diag,dccp_diag,tcp_diag,udp_diag,raw_diag, Live 0x (null)
unix_diag 16384 0 - Live 0x (null)
af_packet_diag 16384 0 - Live 0x (null)
netlink_diag 16384 0 - Live 0x (null)
binfmt_misc 20480 1 - Live 0x (null)
algif_rng 16384 0 - Live 0x (null)
salsa20_generic 16384 0 - Live 0x (null)
camellia_generic 32768 0 - Live 0x (null)
cast6_generic 24576 0 - Live 0x (null)
cast_common 16384 1 cast6_generic, Live 0x (null)
serpent_generic 28672 0 - Live 0x (null)
twofish_generic 16384 0 - Live 0x (null)
twofish_common 28672 1 twofish_generic, Live 0x (null)
lrw 20480 0 - Live 0x (null)
algif_skcipher 16384 0 - Live 0x (null)
tgr192 24576 0 - Live 0x (null)
wp512 36864 0 - Live 0x (null)
rmd320 20480 0 - Live 0x (null)
rmd256 20480 0 - Live 0x (null)
rmd160 20480 0 - Live 0x (null)
rmd128 20480 0 - Live 0x (null)
md4 16384 0 - Live 0x (null)
algif_hash 20480 0 - Live 0x (null)
af_alg 28672 3 algif_rng,algif_skcipher,algif_hash, Live 0x (null)
8021q 40960 0 - Live 0x (null)
garp 20480 1 8021q, Live 0x (null)
mrp 20480 1 8021q, Live 0x (null)
stp 16384 1 garp, Live 0x (null)
llc 16384 2 garp,stp, Live 0x (null)
dm_service_time 16384 1 - Live 0x (null)
dm_multipath 40960 2 dm_service_time, Live 0x (null)
scsi_dh_rdac 20480 0 - Live 0x (null)
scsi_dh_emc 16384 0 - Live 0x (null)
scsi_dh_alua 24576 2 - Live 0x (null)
qeth_l2 57344 2 - Live 0x (null)
genwqe_card 86016 0 - Live 0x (null)
crc_itu_t 16384 1 genwqe_card, Live 0x (null)
chsc_sch 20480 0 - Live 0x (null)
eadm_sch 16384 0 - Live 0x (null)
qeth 143360 1 qeth_l2, Live 0x (null)
ctcm 102400 0 - Live 0x (null)
ccwgroup 20480 2 qeth,ctcm, Live 0x (null)
fsm 16384 1 ctcm, Live 0x (null)
vfio_ccw 24576 0 - Live 0x (null)
vfio_mdev 16384 0 - Live 0x (null)
mdev 20480 2 vfio_ccw,vfio_mdev, Live 0x (null)
zcrypt_cex4 16384 0 - Live 0x (null)
vfio_iommu_type1 28672 0 - Live 0x (null)
vfio 36864 3 vfio_ccw,vfio_mdev,vfio_iommu_type1, Live 0x (null)
zcrypt 69632 1 zcrypt_cex4, Live 0x (null)
sch_fq_codel 20480 3 - Live 0x (null)
iscsi_tcp 20480 0 - Live 0x (null)
libiscsi_tcp 32768 1 iscsi_tcp, Live 0x (null)
rdma_ucm 28672 0 - Live 0x (null)
ib_umad 24576 0 - Live 0x (null)
ib_ipoib ...

Jeff Lane  (bladernr) wrote :

So now I've got a dump of /proc/modules from 16.04.4 (4.13) and the issue is much more apparent:
ubuntu@xwing:~$ cat /proc/modules
nls_iso8859_1 16384 1 - Live 0x0000000000000000
ppdev 20480 0 - Live 0x0000000000000000
intel_rapl 20480 0 - Live 0x0000000000000000
x86_pkg_temp_thermal 16384 0 - Live 0x0000000000000000
intel_powerclamp 16384 0 - Live 0x0000000000000000
coretemp 16384 0 - Live 0x0000000000000000
kvm_intel 204800 0 - Live 0x0000000000000000
joydev 20480 0 - Live 0x0000000000000000
input_leds 16384 0 - Live 0x0000000000000000
kvm 589824 1 kvm_intel, Live 0x0000000000000000
irqbypass 16384 1 kvm, Live 0x0000000000000000
intel_cstate 20480 0 - Live 0x0000000000000000
intel_rapl_perf 16384 0 - Live 0x0000000000000000
mei_me 40960 0 - Live 0x0000000000000000
lpc_ich 24576 0 - Live 0x0000000000000000
wmi_bmof 16384 0 - Live 0x0000000000000000
mei 102400 1 mei_me, Live 0x0000000000000000
ie31200_edac 16384 0 - Live 0x0000000000000000
shpchp 36864 0 - Live 0x0000000000000000
parport_pc 32768 0 - Live 0x0000000000000000
parport 49152 2 ppdev,parport_pc, Live 0x0000000000000000
mac_hid 16384 0 - Live 0x0000000000000000
ib_iser 49152 0 - Live 0x0000000000000000
rdma_cm 57344 1 ib_iser, Live 0x0000000000000000
iw_cm 45056 1 rdma_cm, Live 0x0000000000000000
ib_cm 49152 1 rdma_cm, Live 0x0000000000000000
ib_core 217088 4 ib_iser,rdma_cm,iw_cm,ib_cm, Live 0x0000000000000000
iscsi_tcp 20480 0 - Live 0x0000000000000000
libiscsi_tcp 20480 1 iscsi_tcp, Live 0x0000000000000000
libiscsi 53248 3 ib_iser,iscsi_tcp,libiscsi_tcp, Live 0x0000000000000000
scsi_transport_iscsi 98304 4 ib_iser,iscsi_tcp,libiscsi, Live 0x0000000000000000
autofs4 40960 2 - Live 0x0000000000000000
btrfs 1101824 0 - Live 0x0000000000000000
raid10 49152 0 - Live 0x0000000000000000
raid456 143360 0 - Live 0x0000000000000000
async_raid6_recov 20480 1 raid456, Live 0x0000000000000000
async_memcpy 16384 2 raid456,async_raid6_recov, Live 0x0000000000000000
async_pq 16384 2 raid456,async_raid6_recov, Live 0x0000000000000000
async_xor 16384 3 raid456,async_raid6_recov,async_pq, Live 0x0000000000000000
async_tx 16384 5 raid456,async_raid6_recov,async_memcpy,async_pq,async_xor, Live 0x0000000000000000
xor 24576 2 btrfs,async_xor, Live 0x0000000000000000
hid_generic 16384 0 - Live 0x0000000000000000
hid_logitech_hidpp 32768 0 - Live 0x0000000000000000
hid_logitech_dj 20480 0 - Live 0x0000000000000000
usbhid 49152 0 - Live 0x0000000000000000
hid 118784 4 hid_generic,hid_logitech_hidpp,hid_logitech_dj,usbhid, Live 0x0000000000000000
raid6_pq 118784 4 btrfs,raid456,async_raid6_recov,async_pq, Live 0x0000000000000000
libcrc32c 16384 1 raid456, Live 0x0000000000000000
raid1 40960 0 - Live 0x0000000000000000
raid0 20480 0 - Live 0x0000000000000000
multipath 16384 0 - Live 0x0000000000000000
linear 16384 0 - Live 0x0000000000000000
i915 1830912 1 - Live 0x0000000000000000
crct10dif_pclmul 16384 0 - Live 0x0000000000000000
crc32_pclmul 16384 0 - Live 0x0000000000000000
ghash_clmulni_intel 16384 0 - Live 0x0000000000000000
pcbc 16384 0 - Live 0x0000000000000000
aesni_intel 188416 0 - Live 0x0000000000000000
drm_kms_helper 167936 1 i915, Live 0x0000000000000000
aes_x86_64 20480 1 aesni_intel, Live ...


summary: - Problem with plainbox-provider while running canonical-certification-
- server on 18.04 / s390x
+ /proc/modules has Null references causing python parsing issues
Jeff Lane  (bladernr)
description: updated
Changed in plainbox-provider-checkbox (Ubuntu):
assignee: nobody → Jeff Lane (bladernr)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1757143

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Jeff Lane  (bladernr) wrote :

After chatting with xnox on IRC, he discovered that the output of catting /proc/modules in 4.15 now depends on who is doing the catting.

So `cat /proc/modules` as a normal user will return null references while `sudo cat /proc/modules` will return the actual memory offsets, accurately.

For example:
ubuntu@xwing:~$ sudo cat /proc/modules |grep e1000e
e1000e 249856 0 - Live 0xffffffffc0225000
ptp 20480 2 igb,e1000e, Live 0xffffffffc00a9000
ubuntu@xwing:~$ cat /proc/modules |grep e1000e
e1000e 249856 0 - Live 0x (null)
ptp 20480 2 igb,e1000e, Live 0x (null)

So for the cert tooling, this should be a simple fix then, just to ensure that the module script runs as root.
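
As a sketch only of what such a check could look like (the fix that was actually committed is not shown in this report, and require_root is a hypothetical helper): the script could refuse to continue unless its effective UID is 0, so that it fails with a clear message instead of a ValueError deep in the parser.

```python
import os


def require_root(euid=None):
    """Raise PermissionError unless the effective UID is 0 (hypothetical helper)."""
    if euid is None:
        euid = os.geteuid()  # Unix only
    if euid != 0:
        raise PermissionError(
            "/proc/modules hides module addresses from non-root users; "
            "re-run as root"
        )
```

Calling require_root() at the top of main() would be one option; the euid parameter exists only so the check can be exercised without actually being root.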

Changed in plainbox-provider-checkbox (Ubuntu):
status: Confirmed → Triaged
status: Triaged → In Progress
status: In Progress → Fix Committed
Jeff Lane  (bladernr)
Changed in plainbox-provider-checkbox (Ubuntu):
status: Fix Committed → In Progress
Jeff Lane  (bladernr) wrote :

I'm going to go ahead and mark the kernel task as invalid as the workaround is to read /proc/modules as root.

Changed in plainbox-provider-checkbox (Ubuntu):
status: In Progress → Fix Committed
Changed in linux (Ubuntu):
status: Incomplete → Invalid
Jeff Lane  (bladernr)
Changed in plainbox-provider-checkbox (Ubuntu):
status: Fix Committed → Fix Released
Frank Heimes (fheimes)
Changed in ubuntu-z-systems:
status: New → Fix Released