2019-12-18 17:01:34 |
Eric Desrochers |
bug |
|
|
added bug |
2019-12-18 17:01:58 |
Eric Desrochers |
bug task added |
|
udev (Ubuntu) |
|
2019-12-18 17:03:09 |
Eric Desrochers |
description |
This is reproducible in Bionic and later.
Here's an example running 'focal':
$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic
How to trigger it:
$ sosreport -o block
or more precisely the command causing the situation inside the block plugin:
$ parted -s /dev/$(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
but if I run it on the loop device after the next unused one, in this case /dev/loop3 (which is also unused), there are no errors.
While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device.
What is the difference between loop2, loop3, and the other unused ones?
3 things I have noticed so far:
* The loop device needs to be the next unused loop device (losetup -f)
* A reboot is needed (if some loop modification (snap install, mount loop, ...) has been made at runtime)
* I have also noticed that loop2 (or whatever the next unused one is) has some stats, as opposed to other unused loop devices
/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0
while /dev/loop3 doesn't
/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html
Which tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it properly.
If loop2 is generating errors, and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device.
If I query loop3 with 'parted' right after, no errors.
If I reboot and query loop3 again, then I'll have an error.
To trigger the errors, it needs to happen after a reboot, and it only impacts the first unused loop device available (losetup -f).
This was tested with focal/systemd, which is very close to the latest upstream code.
This has been tested with the latest v5.5 kernel as well. For now, I don't think it's a kernel problem; I'm more thinking of a userspace misbehaviour dealing with loop devices at boot. |
This is reproducible in Bionic and later.
Here's an example running 'focal':
$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic
How to trigger it:
$ sosreport -o block
or more precisely the command causing the situation inside the block plugin:
$ parted -s /dev/$(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
but if I run it on the loop device after the next unused one, in this case /dev/loop3 (which is also unused), there are no errors.
While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device.
What is the difference between loop2, loop3, and the other unused ones?
3 things I have noticed so far:
* The loop device needs to be the next unused loop device (losetup -f)
* A reboot is needed (if some loop modification (snap install, mount loop, ...) has been made at runtime)
* I have also noticed that loop2 (or whatever the next unused one is) has some stats, as opposed to other unused loop devices. The stats already exist right after the system boots for the next unused loop device.
/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0
while /dev/loop3 doesn't
/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html
Which tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it properly.
If loop2 is generating errors, and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device.
If I query loop3 with 'parted' right after, no errors.
If I reboot and query loop3 again, then I'll have an error.
To trigger the errors, it needs to happen after a reboot, and it only impacts the first unused loop device available (losetup -f).
This was tested with focal/systemd, which is very close to the latest upstream code.
This has been tested with the latest v5.5 kernel as well. For now, I don't think it's a kernel problem; I'm more thinking of a userspace misbehaviour dealing with loop devices at boot. |
|
2019-12-18 17:05:41 |
Eric Desrochers |
description |
This is reproducible in Bionic and later.
Here's an example running 'focal':
$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic
How to trigger it:
$ sosreport -o block
or more precisely the command causing the situation inside the block plugin:
$ parted -s /dev/$(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
but if I run it on the loop device after the next unused one, in this case /dev/loop3 (which is also unused), there are no errors.
While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device.
What is the difference between loop2, loop3, and the other unused ones?
3 things I have noticed so far:
* The loop device needs to be the next unused loop device (losetup -f)
* A reboot is needed (if some loop modification (snap install, mount loop, ...) has been made at runtime)
* I have also noticed that loop2 (or whatever the next unused one is) has some stats, as opposed to other unused loop devices. The stats already exist right after the system boots for the next unused loop device.
/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0
while /dev/loop3 doesn't
/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html
Which tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it properly.
If loop2 is generating errors, and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device.
If I query loop3 with 'parted' right after, no errors.
If I reboot and query loop3 again, then I'll have an error.
To trigger the errors, it needs to happen after a reboot, and it only impacts the first unused loop device available (losetup -f).
This was tested with focal/systemd, which is very close to the latest upstream code.
This has been tested with the latest v5.5 kernel as well. For now, I don't think it's a kernel problem; I'm more thinking of a userspace misbehaviour dealing with loop devices at boot. |
This is reproducible in Bionic and later.
Here's an example running 'focal':
$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic
How to trigger it:
$ sosreport -o block
or more precisely the command causing the situation inside the block plugin:
$ parted -s /dev/$(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
but if I run it on the loop device after the next unused one, in this case /dev/loop3 (which is also unused), there are no errors.
While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device.
What is the difference between loop2, loop3, and the other unused ones?
3 things I have noticed so far:
* The loop device needs to be the next unused loop device (losetup -f)
* A reboot is needed (if some loop modification (snap install, mount loop, ...) has been made at runtime)
* I have also noticed that loop2 (or whatever the next unused one is) has some stats, as opposed to other unused loop devices. The stats already exist right after the system boots for the next unused loop device.
/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0
2 = number of read I/Os processed
10 = number of sectors read
1 = number of write I/Os processed
Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html
while /dev/loop3 doesn't
/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
Which tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it properly.
If loop2 is generating errors, and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device.
If I query loop3 with 'parted' right after, no errors.
If I reboot and query loop3 again, then I'll have an error.
To trigger the errors, it needs to happen after a reboot, and it only impacts the first unused loop device available (losetup -f).
This was tested with focal/systemd, which is very close to the latest upstream code.
This has been tested with the latest v5.5 kernel as well. For now, I don't think it's a kernel problem; I'm more thinking of a userspace misbehaviour dealing with loop devices at boot. |
|
2019-12-18 17:07:44 |
Eric Desrochers |
tags |
|
sts |
|
2019-12-18 17:20:56 |
Eric Desrochers |
description |
This is reproducible in Bionic and later.
Here's an example running 'focal':
$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic
How to trigger it:
$ sosreport -o block
or more precisely the command causing the situation inside the block plugin:
$ parted -s /dev/$(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
but if I run it on the loop device after the next unused one, in this case /dev/loop3 (which is also unused), there are no errors.
While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device.
What is the difference between loop2, loop3, and the other unused ones?
3 things I have noticed so far:
* The loop device needs to be the next unused loop device (losetup -f)
* A reboot is needed (if some loop modification (snap install, mount loop, ...) has been made at runtime)
* I have also noticed that loop2 (or whatever the next unused one is) has some stats, as opposed to other unused loop devices. The stats already exist right after the system boots for the next unused loop device.
/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0
2 = number of read I/Os processed
10 = number of sectors read
1 = number of write I/Os processed
Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html
while /dev/loop3 doesn't
/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
Which tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it properly.
If loop2 is generating errors, and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device.
If I query loop3 with 'parted' right after, no errors.
If I reboot and query loop3 again, then I'll have an error.
To trigger the errors, it needs to happen after a reboot, and it only impacts the first unused loop device available (losetup -f).
This was tested with focal/systemd, which is very close to the latest upstream code.
This has been tested with the latest v5.5 kernel as well. For now, I don't think it's a kernel problem; I'm more thinking of a userspace misbehaviour dealing with loop devices at boot. |
This is reproducible in Bionic and later.
Here's an example running 'focal':
$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic
The error is:
blk_update_request: I/O error, dev loop2, sector 0
How to trigger it:
$ sosreport -o block
or more precisely the cmd causing the situation inside the block plugin:
$ parted -s /dev/$(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
but if I run it on the loop device after the next unused one, in this case /dev/loop3 (which is also unused), there are no errors.
While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device.
What differentiates loop2 from loop3 and any other unused ones?
3 things I have noticed so far:
* loop2 is the next unused loop device (losetup -f)
* A reboot is needed (if some loop modification (snap install, mount loop, ...) has been made at runtime)
* I have also noticed that loop2 (or whatever the next unused one is) has some stats, as opposed to other unused loop devices. The stats already exist right after the system boots for the next unused loop device.
/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0
2 = number of read I/Os processed
10 = number of sectors read
1 = number of write I/Os processed
Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html
while /dev/loop3 doesn't
/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
Which tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it properly.
If loop2 is generating errors, and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device.
If I query loop3 with 'parted' right after, no errors.
If I reboot and query loop3 again, then I'll have an error.
To trigger the errors, it needs to happen after a reboot, and it only impacts the first unused loop device available (losetup -f).
This was tested with focal/systemd, which is very close to the latest upstream code.
This has been tested with the latest v5.5 mainline kernel as well.
For now, I don't think it's a kernel problem; I'm more thinking of a userspace misbehaviour dealing with loop devices (or block devices) at boot. |
|
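The stat columns quoted in the entries above can be decoded by name; the field order below follows the kernel's block/stat documentation for the 11-column layout shown in this report (newer kernels append discard and flush counters). This is only an illustrative helper, not part of sosreport or any tool mentioned in the bug:

```python
# Sketch: decode an 11-column /sys/block/<dev>/stat line per
# https://www.kernel.org/doc/html/latest/block/stat.html
FIELDS = [
    "read_ios", "read_merges", "read_sectors", "read_ticks",
    "write_ios", "write_merges", "write_sectors", "write_ticks",
    "in_flight", "io_ticks", "time_in_queue",
]

def parse_block_stat(line: str) -> dict:
    """Map the whitespace-separated counters to named fields."""
    values = [int(v) for v in line.split()]
    # Only name the fields we know about; any extra columns are dropped.
    return dict(zip(FIELDS, values))

# The next unused loop device in this report shows activity right after boot:
stats = parse_block_stat("2 0 10 0 1 0 0 0 0 0 0")
assert stats["read_ios"] == 2       # 2 read I/Os processed
assert stats["read_sectors"] == 10  # 10 sectors read
assert stats["write_ios"] == 1      # 1 write I/O processed
```

A truly unused device such as loop3 parses to all-zero counters, which is what makes the non-zero stats on the `losetup -f` device stand out.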
2019-12-18 17:23:29 |
Eric Desrochers |
description |
This is reproducible in Bionic and later.
Here's an example running 'focal':
$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic
The error is:
blk_update_request: I/O error, dev loop2, sector 0
How to trigger it:
$ sosreport -o block
or more precisely the cmd causing the situation inside the block plugin:
$ parted -s /dev/$(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
but if I run it on the loop device after the next unused one, in this case /dev/loop3 (which is also unused), there are no errors.
While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device.
What differentiates loop2 from loop3 and any other unused ones?
3 things I have noticed so far:
* loop2 is the next unused loop device (losetup -f)
* A reboot is needed (if some loop modification (snap install, mount loop, ...) has been made at runtime)
* I have also noticed that loop2 (or whatever the next unused one is) has some stats, as opposed to other unused loop devices. The stats already exist right after the system boots for the next unused loop device.
/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0
2 = number of read I/Os processed
10 = number of sectors read
1 = number of write I/Os processed
Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html
while /dev/loop3 doesn't
/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
Which tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it properly.
If loop2 is generating errors, and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device.
If I query loop3 with 'parted' right after, no errors.
If I reboot and query loop3 again, then I'll have an error.
To trigger the errors, it needs to happen after a reboot, and it only impacts the first unused loop device available (losetup -f).
This was tested with focal/systemd, which is very close to the latest upstream code.
This has been tested with the latest v5.5 mainline kernel as well.
For now, I don't think it's a kernel problem; I'm more thinking of a userspace misbehaviour dealing with loop devices (or block devices) at boot. |
This is reproducible in Bionic and later.
Here's an example running 'focal':
$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic
The error is:
blk_update_request: I/O error, dev loop2, sector 0
How to trigger it:
$ sosreport -o block
or more precisely the cmd causing the situation inside the block plugin:
$ parted -s /dev/$(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
but if I run it on the loop device after the next unused one, in this case /dev/loop3 (which is also unused), there are no errors.
While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device.
What differentiates loop2 from loop3 and any other unused ones?
3 things I have noticed so far:
* loop2 is the next unused loop device (losetup -f)
* A reboot is needed (if some loop modification (snap install, mount loop, ...) has been made at runtime)
* I have also noticed that loop2 (or whatever the next unused one is) has some stats, as opposed to other unused loop devices. The stats already exist right after the system boots for the next unused loop device.
/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0
2 = number of read I/Os processed
10 = number of sectors read
1 = number of write I/Os processed
Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html
while /dev/loop3 doesn't
/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
Which tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it properly.
If loop2 is generating errors, and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device.
If I query loop3 with 'parted' right after, no errors.
If I reboot and query loop3 again, then I'll have an error.
To trigger the errors, it needs to happen after a reboot, and it only impacts the first unused loop device available (losetup -f).
This was tested with focal/systemd, which is very close to the latest upstream code.
This has been tested with the latest v5.5 mainline kernel as well.
For now, I don't think it's a kernel problem; I'm more thinking of a userspace misbehaviour dealing with loop devices (or block devices) at boot. |
|
2019-12-18 17:36:42 |
Eric Desrochers |
description |
This is reproducible in Bionic and later.
Here's an example running 'focal':
$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic
The error is:
blk_update_request: I/O error, dev loop2, sector 0
How to trigger it:
$ sosreport -o block
or more precisely the cmd causing the situation inside the block plugin:
$ parted -s /dev/$(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
but if I run it on the loop device after the next unused one, in this case /dev/loop3 (which is also unused), there are no errors.
While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device.
What differentiates loop2 from loop3 and any other unused ones?
3 things I have noticed so far:
* loop2 is the next unused loop device (losetup -f)
* A reboot is needed (if some loop modification (snap install, mount loop, ...) has been made at runtime)
* I have also noticed that loop2 (or whatever the next unused one is) has some stats, as opposed to other unused loop devices. The stats already exist right after the system boots for the next unused loop device.
/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0
2 = number of read I/Os processed
10 = number of sectors read
1 = number of write I/Os processed
Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html
while /dev/loop3 doesn't
/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
Which tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it properly.
If loop2 is generating errors, and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device.
If I query loop3 with 'parted' right after, no errors.
If I reboot and query loop3 again, then I'll have an error.
To trigger the errors, it needs to happen after a reboot, and it only impacts the first unused loop device available (losetup -f).
This was tested with focal/systemd, which is very close to the latest upstream code.
This has been tested with the latest v5.5 mainline kernel as well.
For now, I don't think it's a kernel problem; I'm more thinking of a userspace misbehaviour dealing with loop devices (or block devices) at boot. |
This is reproducible in Bionic and later.
Here's an example running 'focal':
$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic
The error is:
blk_update_request: I/O error, dev loop2, sector 0
How to trigger it:
$ sosreport -o block
or more precisely the cmd causing the situation inside the block plugin:
$ parted -s $(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
but if I run it on the loop device after the next unused one, in this case /dev/loop3 (which is also unused), there are no errors.
While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device.
What differentiates loop2 from loop3 and any other unused ones?
3 things I have noticed so far:
* loop2 is the next unused loop device (losetup -f)
* A reboot is needed (if some loop modification (snap install, mount loop, ...) has been made at runtime)
* I have also noticed that loop2 (or whatever the next unused one is) has some stats, as opposed to other unused loop devices. The stats already exist right after the system boots for the next unused loop device.
/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0
2 = number of read I/Os processed
10 = number of sectors read
1 = number of write I/Os processed
Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html
while /dev/loop3 doesn't
/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
Which tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it properly.
If loop2 is generating errors, and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device.
If I query loop3 with 'parted' right after, no errors.
If I reboot and query loop3 again, then I'll have an error.
To trigger the errors, it needs to happen after a reboot, and it only impacts the first unused loop device available (losetup -f).
This was tested with focal/systemd, which is very close to the latest upstream code.
This has been tested with the latest v5.5 mainline kernel as well.
For now, I don't think it's a kernel problem; I'm more thinking of a userspace misbehaviour dealing with loop devices (or block devices) at boot. |
|
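Since the symptom follows whichever device `losetup -f` reports, a rough sketch of "find the first unused loop device" may help. The real losetup asks the kernel via the LOOP_CTL_GET_FREE ioctl on /dev/loop-control; this sysfs scan, and its `sys_block` parameter, are only an illustrative approximation (a bound loop device exposes a `loop/backing_file` attribute in sysfs, an unbound one does not):

```python
import os
import re

def next_unused_loop(sys_block="/sys/block"):
    """Approximate `losetup -f`: return the lowest-numbered existing loop
    device with no backing file attached, or None if all are in use."""
    # Collect loopN entries in numeric order (loop2 before loop10).
    loops = sorted(
        (int(m.group(1)), name)
        for name in os.listdir(sys_block)
        if (m := re.fullmatch(r"loop(\d+)", name))
    )
    for _, name in loops:
        # A configured loop device has loop/backing_file; a free one doesn't.
        if not os.path.exists(os.path.join(sys_block, name, "loop", "backing_file")):
            return "/dev/" + name
    return None
```

In the scenario above, loop0 and loop1 are held by snap squashfs mounts, so this scan would return /dev/loop2, the device whose stats are unexpectedly non-zero after boot.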
2019-12-18 17:39:50 |
Dan Streetman |
bug |
|
|
added subscriber Dan Streetman |
2019-12-18 18:08:14 |
Eric Desrochers |
bug watch added |
|
https://github.com/sosreport/sos/issues/1897 |
|
2019-12-18 18:13:37 |
Eric Desrochers |
description |
This is reproducible in Bionic and later.
Here's an example running 'focal':
$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic
The error is:
blk_update_request: I/O error, dev loop2, sector 0
How to trigger it:
$ sosreport -o block
or more precisely the cmd causing the situation inside the block plugin:
$ parted -s $(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
but if I run it on the loop device after the next unused one, in this case /dev/loop3 (which is also unused), there are no errors.
While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device.
What differentiates loop2 from loop3 and any other unused ones?
3 things I have noticed so far:
* loop2 is the next unused loop device (losetup -f)
* A reboot is needed (if some loop modification (snap install, mount loop, ...) has been made at runtime)
* I have also noticed that loop2 (or whatever the next unused one is) has some stats, as opposed to other unused loop devices. The stats already exist right after the system boots for the next unused loop device.
/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0
2 = number of read I/Os processed
10 = number of sectors read
1 = number of write I/Os processed
Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html
while /dev/loop3 doesn't
/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
Which tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it properly.
If loop2 is generating errors, and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device.
If I query loop3 with 'parted' right after, no errors.
If I reboot and query loop3 again, then I'll have an error.
To trigger the errors, it needs to happen after a reboot, and it only impacts the first unused loop device available (losetup -f).
This was tested with focal/systemd, which is very close to the latest upstream code.
This has been tested with the latest v5.5 mainline kernel as well.
For now, I don't think it's a kernel problem; I'm more thinking of a userspace misbehaviour dealing with loop devices (or block devices) at boot. |
This is reproducible in Bionic and later.
Here's an example running 'focal':
$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic
The error is:
blk_update_request: I/O error, dev loop2, sector 0
and on more recent kernel:
kernel: [18135.185709] blk_update_request: I/O error, dev loop18, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
How to trigger it:
$ sosreport -o block
or more precisely the cmd causing the situation inside the block plugin:
$ parted -s $(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
but if I run it on the loop device after the next unused one, in this case /dev/loop3 (which is also unused), there are no errors.
While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device.
What differentiates loop2 from loop3 and any other unused ones?
3 things I have noticed so far:
* loop2 is the next unused loop device (losetup -f)
* A reboot is needed (if some loop modification (snap install, mount loop, ...) has been made at runtime)
* I have also noticed that loop2 (or whatever the next unused one is) has some stats, as opposed to other unused loop devices. The stats already exist right after the system boots for the next unused loop device.
/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0
2 = number of read I/Os processed
10 = number of sectors read
1 = number of write I/Os processed
Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html
while /dev/loop3 doesn't
/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
Which tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it properly.
If loop2 is generating errors, and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device.
If I query loop3 with 'parted' right after, no errors.
If I reboot and query loop3 again, then I'll have an error.
To trigger the errors, it needs to happen after a reboot, and it only impacts the first unused loop device available (losetup -f).
This was tested with focal/systemd, which is very close to the latest upstream code.
This has been tested with the latest v5.5 mainline kernel as well.
For now, I don't think it's a kernel problem; I'm more thinking of a userspace misbehaviour dealing with loop devices (or block devices) at boot. |
|
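The two kernel error formats quoted in the description above (the short message and the newer one carrying op/flags fields) can be matched with a single pattern. This parser is only an illustration, not part of any tool mentioned in the bug:

```python
import re

# Matches both blk_update_request message formats quoted in the report:
#   blk_update_request: I/O error, dev loop2, sector 0
#   ... dev loop18, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
PATTERN = re.compile(
    r"blk_update_request: I/O error, dev (?P<dev>\S+), sector (?P<sector>\d+)"
    r"(?: op 0x[0-9a-f]+:\((?P<op>\w+)\))?"
)

def parse_blk_error(line):
    """Return the device, sector, and (if present) op name, else None."""
    m = PATTERN.search(line)
    if not m:
        return None
    return {"dev": m.group("dev"), "sector": int(m.group("sector")), "op": m.group("op")}

old = parse_blk_error("blk_update_request: I/O error, dev loop2, sector 0")
new = parse_blk_error(
    "kernel: [18135.185709] blk_update_request: I/O error, dev loop18,"
    " sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0"
)
assert old == {"dev": "loop2", "sector": 0, "op": None}
assert new == {"dev": "loop18", "sector": 0, "op": "WRITE"}
```

The WRITE op in the newer format is consistent with the eventual diagnosis: the failing request is the flush issued by fsync() on the detached device.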
2019-12-18 19:21:49 |
Eric Desrochers |
bug task added |
|
snapd (Ubuntu) |
|
2019-12-18 23:05:49 |
Kai Kasurinen |
bug |
|
|
added subscriber Kai Kasurinen |
2019-12-19 12:53:51 |
John Lenton |
snapd (Ubuntu): status |
New |
Invalid |
|
2019-12-19 17:04:09 |
Eric Desrochers |
bug task added |
|
linux (Ubuntu) |
|
2019-12-19 17:30:09 |
Ubuntu Kernel Bot |
linux (Ubuntu): status |
New |
Incomplete |
|
2019-12-25 18:23:53 |
Eric Desrochers |
bug task added |
|
parted (Ubuntu) |
|
2020-03-30 20:48:53 |
Dan Streetman |
systemd (Ubuntu): status |
New |
Invalid |
|
2020-08-06 21:12:28 |
Kai Kasurinen |
udev (Ubuntu): status |
New |
Invalid |
|
2021-02-22 16:50:57 |
Mauricio Faria de Oliveira |
linux (Ubuntu): status |
Incomplete |
In Progress |
|
2021-02-22 16:51:03 |
Mauricio Faria de Oliveira |
linux (Ubuntu): importance |
Undecided |
Medium |
|
2021-02-22 16:51:09 |
Mauricio Faria de Oliveira |
linux (Ubuntu): assignee |
|
Mauricio Faria de Oliveira (mfo) |
|
2021-02-22 16:51:15 |
Mauricio Faria de Oliveira |
parted (Ubuntu): status |
New |
Invalid |
|
2021-02-22 16:53:58 |
Mauricio Faria de Oliveira |
linux (Ubuntu): assignee |
Mauricio Faria de Oliveira (mfo) |
Eric Desrochers (slashd) |
|
2021-05-25 11:18:37 |
Eric Desrochers |
linux (Ubuntu): assignee |
Eric Desrochers (slashd) |
|
|
2023-03-14 23:06:11 |
Mauricio Faria de Oliveira |
bug |
|
|
added subscriber Mauricio Faria de Oliveira |
2023-03-14 23:07:08 |
Mauricio Faria de Oliveira |
nominated for series |
|
Ubuntu Jammy |
|
2023-03-14 23:07:08 |
Mauricio Faria de Oliveira |
bug task added |
|
parted (Ubuntu Jammy) |
|
2023-03-14 23:07:08 |
Mauricio Faria de Oliveira |
bug task added |
|
udev (Ubuntu Jammy) |
|
2023-03-14 23:07:08 |
Mauricio Faria de Oliveira |
bug task added |
|
linux (Ubuntu Jammy) |
|
2023-03-14 23:07:08 |
Mauricio Faria de Oliveira |
bug task added |
|
systemd (Ubuntu Jammy) |
|
2023-03-14 23:07:08 |
Mauricio Faria de Oliveira |
bug task added |
|
snapd (Ubuntu Jammy) |
|
2023-03-14 23:07:27 |
Mauricio Faria de Oliveira |
linux (Ubuntu): status |
In Progress |
Fix Released |
|
2023-03-14 23:08:12 |
Mauricio Faria de Oliveira |
linux (Ubuntu Jammy): status |
New |
Fix Released |
|
2023-03-14 23:08:24 |
Mauricio Faria de Oliveira |
parted (Ubuntu Jammy): status |
New |
Invalid |
|
2023-03-14 23:08:43 |
Mauricio Faria de Oliveira |
bug task deleted |
parted (Ubuntu Jammy) |
|
|
2023-03-14 23:08:52 |
Mauricio Faria de Oliveira |
bug task deleted |
snapd (Ubuntu Jammy) |
|
|
2023-03-14 23:09:01 |
Mauricio Faria de Oliveira |
bug task deleted |
udev (Ubuntu Jammy) |
|
|
2023-03-14 23:09:14 |
Mauricio Faria de Oliveira |
bug task deleted |
systemd (Ubuntu Jammy) |
|
|
2023-03-20 14:17:02 |
Dan Streetman |
removed subscriber Dan Streetman |
|
|
|
2023-03-29 20:25:43 |
Jorge Merlino |
description |
This is reproducible in Bionic and later.
Here's an example running 'focal':
$ lsb_release -cs
focal
$ uname -r
5.3.0-24-generic
The error is:
blk_update_request: I/O error, dev loop2, sector 0
and on more recent kernel:
kernel: [18135.185709] blk_update_request: I/O error, dev loop18, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
How to trigger it:
$ sosreport -o block
or more precisely the cmd causing the situation inside the block plugin:
$ parted -s $(losetup -f) unit s print
https://github.com/sosreport/sos/blob/master/sos/plugins/block.py#L52
but if I run it on the loop device after the next unused one, in this case /dev/loop3 (which is also unused), there are no errors.
While I agree that sosreport shouldn't query unused loop devices, there is definitely something going on with the next unused loop device.
What differentiates loop2 from loop3 and any other unused ones?
3 things I have noticed so far:
* loop2 is the next unused loop device (losetup -f)
* A reboot is needed (if some loop modification (snap install, mount loop, ...) has been made at runtime)
* I have also noticed that loop2 (or whatever the next unused one is) has some stats, as opposed to other unused loop devices. The stats already exist right after the system boots for the next unused loop device.
/sys/block/loop2/stat
::::::::::::::
2 0 10 0 1 0 0 0 0 0 0
2 = number of read I/Os processed
10 = number of sectors read
1 = number of write I/Os processed
Explanation of each column:
https://www.kernel.org/doc/html/latest/block/stat.html
while /dev/loop3 doesn't
/sys/block/loop3/stat
::::::::::::::
0 0 0 0 0 0 0 0 0 0 0
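The stat comparison above is easy to script; a minimal sketch (field names taken from the kernel block/stat documentation; the helper names are mine):

```python
# Parse a /sys/block/<dev>/stat line into named fields.
# Field order per Documentation/block/stat (first 11 columns).
FIELDS = [
    "read_ios", "read_merges", "read_sectors", "read_ticks",
    "write_ios", "write_merges", "write_sectors", "write_ticks",
    "in_flight", "io_ticks", "time_in_queue",
]

def parse_stat(line):
    values = [int(v) for v in line.split()[:len(FIELDS)]]
    return dict(zip(FIELDS, values))

def is_untouched(line):
    # True if the device has seen no I/O at all (like loop3 above).
    return not any(parse_stat(line).values())

loop2 = parse_stat("2 0 10 0 1 0 0 0 0 0 0")
print(loop2["read_ios"], loop2["read_sectors"], loop2["write_ios"])  # 2 10 1
print(is_untouched("0 0 0 0 0 0 0 0 0 0 0"))  # True
```

Running this over each /sys/block/loop*/stat right after boot makes the odd one out (the next unused device) stand out immediately.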
Which tells me that something during the boot process most likely acquired (on purpose or not) the next unused loop device and possibly didn't release it properly.
If loop2 is generating errors, and I install a snap, the snap squashfs will take loop2, making loop3 the next unused loop device.
If I query loop3 with 'parted' right after, no errors.
If I reboot, and query loop3 again, then now I'll have an error.
To trigger the errors it needs to be after a reboot, and it only impacts the first unused loop device available (losetup -f).
This was tested with focal/systemd which is very close to the latest upstream code.
This has been tested with the latest v5.5 mainline kernel as well.
For now, I don't think it's a kernel problem; I'm more inclined to suspect a userspace misbehaviour dealing with loop devices (or block devices) at boot. |
[Impact]
* There's an I/O error on fsync() in a detached loop device if it has
been previously attached. The issue is that write cache is enabled in
the attach path in loop_configure() but it isn't disabled in the detach
path; thus it remains enabled in the block device regardless of whether
it is attached or not.
* fsync() on detached loop devices can be called by partition tools and
commands run by sosreport, so the unexpected kernel error message might
surprise users or even distract from the actual issue being
investigated. It might also trigger alerts in
logging/monitoring/alerting stacks.
[Fix]
* Disable write cache in the detach path
[Test Plan]
* Attach and detach an image to a loop device and test the fsync return
value afterwards:
# DEV=/dev/loop7
# IMG=/tmp/image
# truncate --size 1M $IMG
# losetup $DEV $IMG
# losetup -d $DEV
Before:
# strace -e fsync parted -s $DEV print 2>&1 | grep fsync
fsync(3) = -1 EIO (Input/output error)
Warning: Error fsyncing/closing /dev/loop7: Input/output error
[ 982.529929] blk_update_request: I/O error, dev loop7, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
After:
# strace -e fsync parted -s $DEV print 2>&1 | grep fsync
fsync(3) = 0
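The Before/After strace check can also be reproduced in a few lines of Python; a sketch assuming the same attach/detach sequence as above (demonstrated here against a temporary file, since the /dev/loop7 steps need root):

```python
import errno
import os
import tempfile

def fsync_ok(path):
    """Open path read-write and fsync it; return False on EIO,
    which is the symptom of this bug on a detached loop device."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.fsync(fd)
        return True
    except OSError as e:
        if e.errno == errno.EIO:
            return False
        raise
    finally:
        os.close(fd)

# On an affected kernel, fsync_ok("/dev/loop7") returns False after the
# losetup attach/detach above; a regular file always syncs cleanly:
with tempfile.NamedTemporaryFile() as f:
    print(fsync_ok(f.name))  # True
```

This mirrors what parted does internally (open, fsync, close) without pulling in the whole tool.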
[Where problems could occur]
* The detach path for block devices is modified. Worst case scenario
would be an error when detaching loop devices. |
|
2023-03-29 20:26:00 |
Jorge Merlino |
linux (Ubuntu): milestone |
|
focal-updates |
|
2023-03-29 20:26:13 |
Jorge Merlino |
linux (Ubuntu): milestone |
focal-updates |
|
|
2023-03-29 20:26:59 |
Jorge Merlino |
nominated for series |
|
Ubuntu Focal |
|
2023-03-29 20:26:59 |
Jorge Merlino |
bug task added |
|
parted (Ubuntu Focal) |
|
2023-03-29 20:26:59 |
Jorge Merlino |
bug task added |
|
udev (Ubuntu Focal) |
|
2023-03-29 20:26:59 |
Jorge Merlino |
bug task added |
|
linux (Ubuntu Focal) |
|
2023-03-29 20:26:59 |
Jorge Merlino |
bug task added |
|
systemd (Ubuntu Focal) |
|
2023-03-29 20:26:59 |
Jorge Merlino |
bug task added |
|
snapd (Ubuntu Focal) |
|
2023-03-29 20:27:20 |
Jorge Merlino |
bug task deleted |
udev (Ubuntu Focal) |
|
|
2023-03-29 20:27:27 |
Jorge Merlino |
bug task deleted |
systemd (Ubuntu Focal) |
|
|
2023-03-29 20:27:33 |
Jorge Merlino |
bug task deleted |
snapd (Ubuntu Focal) |
|
|
2023-03-29 20:27:43 |
Jorge Merlino |
bug task deleted |
parted (Ubuntu Focal) |
|
|
2023-03-29 20:27:59 |
Jorge Merlino |
linux (Ubuntu Focal): status |
New |
In Progress |
|
2023-03-29 20:28:04 |
Jorge Merlino |
linux (Ubuntu Focal): assignee |
|
Jorge Merlino (jorge-merlino) |
|
2023-03-30 07:35:04 |
Stefan Bader |
linux (Ubuntu Focal): importance |
Undecided |
Medium |
|
2023-03-30 07:35:12 |
Stefan Bader |
linux (Ubuntu Jammy): importance |
Undecided |
Medium |
|
2023-03-30 20:08:13 |
Jorge Merlino |
nominated for series |
|
Ubuntu Bionic |
|
2023-03-30 20:08:13 |
Jorge Merlino |
bug task added |
|
parted (Ubuntu Bionic) |
|
2023-03-30 20:08:13 |
Jorge Merlino |
bug task added |
|
udev (Ubuntu Bionic) |
|
2023-03-30 20:08:13 |
Jorge Merlino |
bug task added |
|
linux (Ubuntu Bionic) |
|
2023-03-30 20:08:13 |
Jorge Merlino |
bug task added |
|
systemd (Ubuntu Bionic) |
|
2023-03-30 20:08:13 |
Jorge Merlino |
bug task added |
|
snapd (Ubuntu Bionic) |
|
2023-03-30 20:08:21 |
Jorge Merlino |
bug task deleted |
parted (Ubuntu Bionic) |
|
|
2023-03-30 20:08:28 |
Jorge Merlino |
bug task deleted |
snapd (Ubuntu Bionic) |
|
|
2023-03-30 20:08:36 |
Jorge Merlino |
bug task deleted |
systemd (Ubuntu Bionic) |
|
|
2023-03-30 20:08:43 |
Jorge Merlino |
bug task deleted |
udev (Ubuntu Bionic) |
|
|
2023-03-30 20:08:56 |
Jorge Merlino |
linux (Ubuntu Bionic): status |
New |
In Progress |
|
2023-03-30 20:08:59 |
Jorge Merlino |
linux (Ubuntu Bionic): assignee |
|
Jorge Merlino (jorge-merlino) |
|
2023-04-06 13:22:28 |
Stefan Bader |
linux (Ubuntu Focal): status |
In Progress |
Fix Committed |
|
2023-04-06 13:22:32 |
Stefan Bader |
linux (Ubuntu Bionic): importance |
Undecided |
Medium |
|
2023-04-19 17:46:33 |
Ubuntu Kernel Bot |
tags |
sts |
kernel-spammed-focal-linux sts verification-needed-focal |
|
2023-04-20 20:27:32 |
Jorge Merlino |
tags |
kernel-spammed-focal-linux sts verification-needed-focal |
kernel-spammed-focal-linux sts verification-done-focal |
|
2023-05-11 13:00:15 |
Luke Nowakowski-Krijger |
linux (Ubuntu Bionic): status |
In Progress |
Fix Committed |
|
2023-05-22 09:24:28 |
Launchpad Janitor |
linux (Ubuntu Focal): status |
Fix Committed |
Fix Released |
|
2023-05-22 09:24:28 |
Launchpad Janitor |
cve linked |
|
2023-1075 |
|
2023-05-22 09:24:28 |
Launchpad Janitor |
cve linked |
|
2023-1118 |
|
2023-05-22 22:11:36 |
Ubuntu Kernel Bot |
tags |
kernel-spammed-focal-linux sts verification-done-focal |
kernel-spammed-focal-linux kernel-spammed-focal-linux-bluefield sts verification-needed-focal |
|
2023-05-23 14:49:16 |
Jorge Merlino |
tags |
kernel-spammed-focal-linux kernel-spammed-focal-linux-bluefield sts verification-needed-focal |
kernel-spammed-focal-linux kernel-spammed-focal-linux-bluefield sts verification-done-focal |
|
2023-06-06 12:31:40 |
Ubuntu Kernel Bot |
tags |
kernel-spammed-focal-linux kernel-spammed-focal-linux-bluefield sts verification-done-focal |
kernel-spammed-focal-linux kernel-spammed-focal-linux-aws kernel-spammed-focal-linux-bluefield sts verification-needed-focal |
|
2023-06-06 12:52:45 |
Ubuntu Kernel Bot |
tags |
kernel-spammed-focal-linux kernel-spammed-focal-linux-aws kernel-spammed-focal-linux-bluefield sts verification-needed-focal |
kernel-spammed-focal-linux kernel-spammed-focal-linux-aws kernel-spammed-focal-linux-azure kernel-spammed-focal-linux-bluefield sts verification-needed-focal |
|
2023-10-11 01:08:50 |
Nobuto Murata |
bug |
|
|
added subscriber Nobuto Murata |