120 sec kernel timeout is seen during SCSI remove_device.

Bug #1914456 reported by Jitendra Sayanekar
This bug affects 2 people
Affects: open-iscsi (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

Ubuntu Release
--------------
#lsb_release -rd
Description: Ubuntu 18.04.5 LTS
Release: 18.04

#cat /proc/version_signature
Ubuntu 4.15.0-122.124-generic 4.15.18

Package version
---------------
#apt-cache policy open-iscsi
open-iscsi:
  Installed: 2.0.874-5ubuntu2.10
  Candidate: 2.0.874-5ubuntu2.10

Problem statement and details
-----------------------------
During automated testing of a vendor's iSCSI target using the open-iscsi initiator, the initiator host reported a 120-second kernel hung-task timeout. The automation test was performing a SCSI remove_device operation when the issue was observed.
The automation test performs the following sequence of operations (a minimal sketch of these steps follows the list):
1. Establish iSCSI session
2. Create a bunch of iSCSI LUNs.
3. Discover LUNs through sysfs scan
4. Format LUNs
5. Perform IO
6. Remove LUN
7. Delete LUN

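For reference, a minimal, hypothetical sketch of the host-side commands involved in such a sequence; the device name sdX, host number hostN, and the target IQN/IP are placeholders, and target-side LUN creation/deletion (steps 2 and 7) is vendor-specific and therefore omitted:

iscsiadm -m discovery -t sendtargets -p <target-ip>
iscsiadm -m node -T <target-iqn> -p <target-ip> --login     # step 1: establish iSCSI session
echo "- - -" > /sys/class/scsi_host/hostN/scan              # step 3: discover LUNs through sysfs scan
mkfs.ext4 /dev/sdX                                          # step 4: format LUN
dd if=/dev/zero of=/dev/sdX bs=1M count=1024 oflag=direct   # step 5: perform I/O
echo 1 > /sys/block/sdX/device/delete                       # step 6: remove LUN from the host
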
Observations from the initiator host (one way to inspect these states is sketched after the list):
1. Previously discovered iSCSI LUNs went to the offline state.
2. New LUNs are not being discovered.
3. NOP-in/NOP-out PDU exchange works fine from the iSCSI session.
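
A minimal way to inspect these states from the initiator, assuming a hypothetical device name sdX (note the session print-out does not show the NOP traffic itself, only that the session is still established):

cat /sys/block/sdX/device/state    # reports "offline" for the affected LUNs
iscsiadm -m session -P 3           # shows session/connection state and attached SCSI devices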

Note: Single iSCSI session is present between initiator and target.

Expected behavior
-----------------
SCSI remove_device should succeed and the automation test should continue.
The issue is observed even with the following commit applied, which fixes a similar issue:
https://kernel.ubuntu.com/git/ubuntu/ubuntu-bionic.git/commit/?id=27dfa4073289ee5737d45b4cfa40b11f5cdeeaa5

Stack trace
-----------
[91832.800739] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[91832.809982] Call Trace:
[91832.809994] __schedule+0x24e/0x880
[91832.810002] ? __enqueue_entity+0x5c/0x60
[91832.810006] ? select_task_rq_fair+0x642/0xab0
[91832.810008] schedule+0x2c/0x80
[91832.810010] schedule_preempt_disabled+0xe/0x10
[91832.810012] __mutex_lock.isra.5+0x276/0x4e0
[91832.810017] ? kernfs_name_hash+0x17/0x80
[91832.810020] __mutex_lock_slowpath+0x13/0x20
[91832.810021] ? __mutex_lock_slowpath+0x13/0x20
[91832.810023] mutex_lock+0x2f/0x40
[91832.810030] scsi_remove_device+0x1e/0x40
[91832.810033] sdev_store_delete+0x55/0xa0
[91832.810036] dev_attr_store+0x1b/0x30
[91832.810039] sysfs_kf_write+0x3c/0x50
[91832.810040] kernfs_fop_write+0x125/0x1a0
[91832.810046] __vfs_write+0x1b/0x40
[91832.810048] vfs_write+0xb1/0x1a0
[91832.810050] SyS_write+0x5c/0xe0
[91832.810055] do_syscall_64+0x73/0x130
[91832.810058] entry_SYSCALL_64_after_hwframe+0x41/0xa6

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in open-iscsi (Ubuntu):
status: New → Confirmed
Michael Reed (mreed8855)
information type: Public → Private
Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Please note that the problem observed here is not iSCSI-specific, as it is seen while deleting the SCSI device from sysfs; i.e., the Linux SCSI subsystem is performing the sysfs-initiated SCSI removal that caused the 120-second timeout.

Michael Reed (mreed8855)
description: updated
information type: Private → Public
Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Just for reference, the following command is used to delete the SCSI device:
echo "1" >> /sys/block/<device_name>/device/delete

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Hi Jitendra, thanks for your report! Could you collect a sosreport for us, so that we have more information (and can even try reproducing the issue in-house)?
For that, please execute the following commands as root user:

apt-get install sosreport
sosreport

A "sosreport-XXXXX.tar.xz" file will be created in your /tmp directory with some information from the system; if you feel there are information there you don't want to share publicly, let me know and we can consider options.
Thanks in advance,

Guilherme

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Thanks for the response Guilherme.
I will work with my team to see if there is anything specific that needs to be filtered out. I will provide you an update soon.

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Hi Guilherme,
There is some confidential data present in the sosreport and it is scattered all over the place, so the team is evaluating whether the .xz file can be sanitized.

Meanwhile, can you please let me know what specific information I can extract from the system to share with you?

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Hi Jitendra, thanks for your prompt response. I'd like to ask you for the following data in order to speed up the debugging process:

1) Right before the test, please collect the outputs of: "dmesg", "lspci -vvv", "lsblk", "ls -l /sys/block", "mount"

2) After the I/O test + SCSI removal, please collect "dmesg", "lsblk" and "mount" outputs (for example, as sketched below).
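
For example (a sketch only; the directory and file names are arbitrary):

mkdir -p /root/bug1914456/before /root/bug1914456/after

# before the test
dmesg            > /root/bug1914456/before/dmesg.out
lspci -vvv       > /root/bug1914456/before/lspci.out
lsblk            > /root/bug1914456/before/lsblk.out
ls -l /sys/block > /root/bug1914456/before/sys_block.out
mount            > /root/bug1914456/before/mount.out

# after the I/O test + SCSI removal
dmesg > /root/bug1914456/after/dmesg.out
lsblk > /root/bug1914456/after/lsblk.out
mount > /root/bug1914456/after/mount.out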

This is an initial data collection; it will help to understand the state of the system before and after the issue. As for the sosreport, let's set it aside for a while - I guess this data is good enough for now.
Thanks in advance!

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Hi Guilherme,
Here is the update from my side.
1. The sosreport command hung at the message below:
   Setting up archive ...
   Setting up plugins ...
   Running plugins. Please wait ...

   Finishing plugins [Running: block btrfs]

   Plugin block timed out
   Plugin btrfs timed out
2. The automation test is a long-running test that performs the operations below in sequence for infinite iterations.
  a. Create a bunch of iSCSI LUNs.
  b. Discover LUNs through sysfs scan
  c. Format LUNs
  d. Perform IO
  e. Remove LUN
  f. Delete LUN
3. It takes a couple of days to reproduce the issue and it is not 100% reproducible.
4. Currently the system is in the repro state.
So I would like to ask: would it be possible to get onto a WebEx call and collect specific information from the system? I would like to capture as much info as possible before trying another repro attempt.

Thanks for the support.

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Hi Jitendra, thanks for the heads-up. Can you collect the output of the following commands in the machine's current state? (No need to reproduce for that.)

"dmesg", "lspci -vvv", "lsblk", "ls -l /sys/block", "mount"

That'd help a lot to determine what your hardware is and maybe help me reproduce the issue in-house.
Thanks,

Guilherme

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Hi Guilherme,
Please find attached the output of the following commands:
"dmesg", "lspci -vvv", "lsblk", "ls -l /sys/block", "mount".

Also, please note that the Ubuntu host is running in a virtualized environment.

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Hi Jitendra, thanks for the data you provided. I have some questions so that I can better simulate your environment:

(a) You said "vendor's iSCSI target" <- can you elaborate more about this, like which vendor/solution are we talking about?

(b) You mention, in the last comment: "note that Ubuntu host is running in a virtualized environment" <- is the SCSI removal being executed in a virtualized environment? If so, what hypervisor/VMM is used? KVM/qemu, VMWare, etc.
Any clarification regarding the environment virtualization is very useful.

(c) "Currently system is in repro state" <- this is interesting, is it still in reproduce state? If so, can you run the following commands as root user?

echo 1 > /proc/sys/kernel/sysrq
dmesg -c > /root/dmesg.out

echo w > /proc/sysrq
dmesg > /root/dmesg.w

echo l > /proc/sysrq
dmesg > /root/dmesg.l

This would help to understand if there's a potential deadlock ongoing.
Thanks in advance,

Guilherme

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Hey Guilherme,
Please find below response,

(#a) We are using Cisco Hyperflex iSCSI storage. The following link has more details about it:
https://www.cisco.com/c/en/us/td/docs/hyperconverged_systems/HyperFlex_HX_DataPlatformSoftware/AdminGuide/4-5/b-hxdp-admin-guide-4-5/m-hxdp-iscsi-manage.html

(#b) Please ignore the comment about the "Ubuntu host running in a Virtualized Environment". The Ubuntu iSCSI initiator is running on bare metal.

(#c) Yes, the setup is still in the repro state. Please refer to the attached files for the output of the above commands.

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :
Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

The following command failed:
root# echo w > /proc/sysrq
bash: /proc/sysrq: No such file or directory

So I tried the command below and it worked:
root# echo w > /proc/sysrq-trigger

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :
Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Thanks for the clarification Jitendra, and for correcting my command! It was indeed wrong, apologies - the correct file to write to is the one you've used, "/proc/sysrq-trigger".

So, it seems we have a lock "situation" in iSCSI/SCSI. I'll need to study the code more to understand how that is triggered and whether it's expected when removing a SCSI device suddenly. I want to try reproducing this - the data you provided is useful for that.
Cheers,

Guilherme

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Thank you, Guilherme, for the response and for taking the time to look at this.
I thought of checking with you whether there is any further update on reproducing the issue at your end, or whether you need more information from the repro setup that I have.

Regards
Jitendra

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Hi Guilherme,

Sorry to bother you again.
Just thought of checking with you if you have any updates on the items below:
1. Were you able to repro the issue at your end?
2. Were you able to confirm whether it's a lock issue?

Thanks
Jitendra

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Hi Jitendra, my sincere apologies for the delay, I'm still trying to reproduce this.
I hope to have more meaningful updates for you by the end of this week.

Meanwhile, let me ask you something: would you be willing to collect a kdump if the issue reproduces again? That'd be a helpful tool for investigating the issue. If so, I can suggest a configuration setting for you.

Thanks!

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Thank you so much for the update, Guilherme.

I can surely try to reproduce the issue again to collect a kdump. Please share the configuration settings for kdump.

One question,
Is there anything more that can be captured from the current repro setup? If not, then I will repurpose the same setup.

Thanks
Jitendra

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Hi Jitendra, I am sorry for the delay - I've tried to reproduce this both with a regular SCSI device in a virtual machine and with iSCSI, and neither attempt reproduced the issue. Based on the stack traces I see in your "echo w" output, it seems to be related to your specific iSCSI target.
Also, was the SCSI device that was removed holding a btrfs filesystem?

I don't see any benefit in keeping the machine in this state; please go ahead and repurpose it to try reproducing. But I'd like you to set up kdump first, if possible, so we collect the dump there. Also, please try using the latest Bionic kernel, 4.15.0-139.

In order to setup kdump, please run the following as root user:
1) apt-get update; apt-get install linux-crashdump

2) Answer the installer questions using the default responses

3) Edit the file "/etc/default/grub.d/kdump-tools.cfg" and change the crashkernel setting to something like "crashkernel=440M" - you can try a bit less memory, but since the issue is hard to reproduce, it's safer to keep a large value to prevent kdump failure (a sketch of this edit is shown after these steps).

4) Please execute, as root user: echo "kernel.hung_task_panic=1" >> /etc/sysctl.conf

5) Reboot the machine and check the output of "kdump-config show" - it should show that kdump is ready. If so, please try a dummy kdump to check if it's working, by running:

echo 1 > /proc/sys/kernel/sysrq
echo c > /proc/sysrq-trigger

6) If it works, the node will reboot and you should have a dump collected in /var/crash/ .
In that case, go ahead and try to reproduce.
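
For reference, step 3 might look like the following; the exact default contents of kdump-tools.cfg can differ between installs, so treat this as a sketch only:

# /etc/default/grub.d/kdump-tools.cfg (after editing)
GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=440M"

# apply the new kernel command line before rebooting
update-grub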

Thanks for your effort here - I'll be out next week, as soon as I'm back I'll continue the work.
Cheers,

Guilherme

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Thanks, Guilherme, for the details on setting up kdump.
I tried these steps on an Ubuntu VM (just to ensure I don't mess up the physical setup) but ran into some issues.
I am still working on configuring kdump and reproducing the issue.
Will keep you posted as things progress.

Regards
Jitendra

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Hi Jitendra, thanks for the effort! Let me know about your issues with kdump; I hope I can help.
Cheers,

Guilherme

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Hi Guilherme,
I have configured kdump, enabled "/proc/sys/kernel/hung_task_panic", and started the test.
I observed a couple of test failures but those are unrelated.
I will keep monitoring the setup and update you as soon as I hit a repro.
Jitendra

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Perfect, thank you!

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Hi Guilherme,
Quick update on this: the test is still running and I haven't seen a kernel panic so far.
I will keep it running for a few more days.
Regards
Jitendra

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Cool, that's indeed good news Jitendra. Thanks for the heads-up.
Cheers,

Guilherme

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Hi Jitendra, how are you doing? Is there any news on this bug? If you cannot reproduce it anymore, we could mark it as "Invalid" - and "re-open" in case it reproduces again.
Cheers!

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Hi Guilherme,

I am doing good. Thank you.

About the bug, I am still trying to reproduce it. Because of some other issues, I could not get to the number of test cycles after which this bug reproduces.
Now I am working on two different setups to reproduce this. I suggest we wait until the middle of next week and then decide.

Please let me know if you are ok with this.

Regards
Jitendra

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Sure Jitendra, all fine by me!! Take your time - if you manage to reproduce, let me know.
Cheers,

Guilherme

Revision history for this message
Jitendra Sayanekar (jsayanek) wrote :

Hi Guilherme,

We have started focused long-run, performance, and stress testing on multiple setups. This is expected to take more than a month.
So, based on internal discussion, we request that this bug be kept open until the end of July.

Please let me know if you have any concerns.

Regards
Jitendra

Revision history for this message
Guilherme G. Piccoli (gpiccoli) wrote :

Hi Jitendra! No problem at all, let's keep it open for the time you need.
Let me know in case you have any news.
Cheers,

Guilherme

Changed in open-iscsi (Ubuntu):
assignee: nobody → Guilherme G. Piccoli (gpiccoli)
importance: Undecided → Medium
Changed in open-iscsi (Ubuntu):
assignee: Guilherme G. Piccoli (gpiccoli) → nobody