iscsid tries to reconnect existing session at startup, failing to do so and hanging the system

Reported by Stéphane Graber on 2011-09-15
This bug affects 4 people
Affects               Importance   Assigned to
open-iscsi (Ubuntu)   Undecided    Unassigned
Precise               Medium       Ante Karamatić

Bug Description

[Impact]
This bug affects iSCSI when acting as an initiator only.

Works: everything when not using an iSCSI root fs.
Works: an iSCSI root fs when not using iSCSI for any other mounts after the root fs is mounted.
Doesn't work: further iSCSI mounts after using an iSCSI root fs. For example: OpenStack won't work on a node using an iSCSI root fs, since OpenStack uses further iSCSI mounts.

[Original Description]

When starting open-iscsi with an already established session (from iscsistart), iscsid tries to reconnect it and fails to do so (wrong AuthMethod).

Before Oneiric, a separate bug prevented iscsid from starting, which made things appear to "work" when root is on iSCSI, as long as you didn't need to mount another LUN.

In Oneiric, that startup bug got fixed, exposing this open-iscsi bug. The workaround for now (bug 838809) is to exit the open-iscsi init script when it detects that a session has already been established from the initramfs.

Ideally, open-iscsi should be able to start, detect that a session is already established and either not touch it at all or be able to reconnect it with the right settings.
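
As a rough illustration of the "detect that a session is already established" step, here is a minimal Python sketch. It assumes the kernel lists active sessions as `sessionN` directories under `/sys/class/iscsi_session`; the function names are hypothetical and are not taken from the open-iscsi init script or from the upstream fix.

```python
import os
import re

# Assumption: each active iSCSI session appears as a directory named
# "session<N>" under this sysfs path.
DEFAULT_SYSFS_ROOT = "/sys/class/iscsi_session"
SESSION_NAME = re.compile(r"^session(\d+)$")


def established_sessions(sysfs_root=DEFAULT_SYSFS_ROOT):
    """Return the session directory names found under sysfs_root, in numeric order."""
    try:
        entries = os.listdir(sysfs_root)
    except OSError:
        # The directory is missing or unreadable, e.g. the iSCSI
        # transport module is not loaded: report no sessions.
        return []
    names = [e for e in entries if SESSION_NAME.match(e)]
    return sorted(names, key=lambda n: int(SESSION_NAME.match(n).group(1)))


def workaround_should_exit(sysfs_root=DEFAULT_SYSFS_ROOT):
    """Mimic the workaround's decision: bail out if any session already exists."""
    return bool(established_sessions(sysfs_root))


if __name__ == "__main__":
    sessions = established_sessions()
    if sessions:
        print("existing sessions:", ", ".join(sessions))
    else:
        print("no established iSCSI sessions found")
```

The workaround corresponds to exiting when `workaround_should_exit()` is true. The behaviour asked for above would use the same detection but still start iscsid and leave the listed sessions untouched.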

Dave Walker (davewalker) on 2011-10-07
tags: added: server-o-ro
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in open-iscsi (Ubuntu):
status: New → Confirmed
Ante Karamatić (ivoks) wrote :

Unmarking it as a duplicate, because it's not. This workaround is not needed with open-iscsi 2.0.873 and newer. The workaround also means that iscsid won't be running, making these systems unusable as OpenStack compute nodes.

Robie Basak (racb) wrote :

From IRC, Ante reports that this is fixed in the newest Quantal version, but needs an SRU for Precise.

Changed in open-iscsi (Ubuntu):
status: Confirmed → Fix Released
Robie Basak (racb) wrote :

Current status as I understood it (from Ante on IRC):

Fixed properly in Quantal.

A workaround is present in Precise that just prevents iscsid from starting, but then one cannot mount any other target, which means the system won't work with nova compute. So the fix that went into Quantal needs to be SRU'd.

The upstream commit is possibly https://github.com/mikechristie/open-iscsi/commit/5383b4b373bdea6cc50b2099201dde33de80d145

Robie Basak (racb) on 2012-07-17
Changed in open-iscsi (Ubuntu Precise):
status: New → Triaged
status: Triaged → In Progress
milestone: none → ubuntu-12.04.1
assignee: nobody → Ante Karamatić (ivoks)
Robie Basak (racb) on 2012-07-30
description: updated
Changed in open-iscsi (Ubuntu Precise):
importance: Undecided → Medium
Robie Basak (racb) on 2012-08-07
Changed in open-iscsi (Ubuntu Precise):
milestone: ubuntu-12.04.1 → precise-updates
Stephen Gran (sgran) wrote :

Hello,

We're still seeing iscsid not starting on a node with iscsi root. A quick test of the version in quantal looks like it starts iscsid and consequently handles things like target failovers properly.

We have diskless nodes that we would like to use as openstack compute nodes. While this bug is unresolved in precise, it's going to be difficult. How much effort is it to just backport 2.0.873 from Quantal?

Cheers,

Robie Basak (racb) wrote :

Anything but cherry-picking a fix would most likely violate SRU policy (https://wiki.ubuntu.com/StableReleaseUpdates). This is the only way a fix can go into precise-updates.

If you want a full backport, then this would need to go into precise-backports. Users would need to enable backports to make use of it, since it isn't the default. See https://wiki.ubuntu.com/UbuntuBackports for details of this process.

Right now, I think the easiest way for this bug to make progress is for someone to figure out what to cherry-pick, test it and then go through the SRU procedure at https://wiki.ubuntu.com/StableReleaseUpdates#Procedure

Stephen Gran (sgran) wrote :

Hello,

That seems to go against what is being said by both racb and stgraber above. The quick hack of not starting the daemon breaks using precise, an LTS release, as a compute node for openstack when iSCSI is in use for block storage. This seems a bit extreme to me.

Can you please reconsider?

Thanks,

Robie Basak (racb) wrote :

Please reconsider what?

Robie Basak (racb) wrote :

This bug having been "In Progress" for over a year, I presume it's not actually in progress any more.

Changed in open-iscsi (Ubuntu Precise):
status: In Progress → Triaged
Stephen Gran (sgran) wrote :

Adding this fix into a stable release update for precise.

Robie Basak (racb) wrote :

Yes we can, if someone can put forward a suitable patch. I said above:

"Right now, I think the easiest way for this bug to make progress is for someone to figure out what to cherry-pick, test it and then go through the SRU procedure at https://wiki.ubuntu.com/StableReleaseUpdates#Procedure"

I don't understand how you came to the conclusion that we couldn't do this. Backporting 2.0.873 from Quantal, as you requested, is likely to violate SRU policy (see the SRU policy itself for rationale). However, cherry-picking a suitable patch should be fine.

Stephen Gran (sgran) wrote :

so, the diffstat between the version I know doesn't work and the version I know does work is:
 335 files changed, 70473 insertions(+), 6135 deletions(-)

If what you're saying is, "you figure it out, that sounds like work to me", that's ok, I understand that. We have a working backport of the new version, and it works for us. I'm not going to figure out which of the 70k lines of code churn made the difference, sorry.

I was just trying to say that the bug that was band-aided to get open-iscsi out the door for precise is now fixed. If you prefer the band-aid at the expense of arguing for a backport, that is of course your prerogative.

Cheers,

Robie Basak (racb) wrote :

We will not backport 335 changed files to precise-updates. Not knowing what changes are in there introduces a significant chance of regression for existing open-iscsi users who are not affected by this bug. I'm certain that the SRU team would find this an entirely unacceptable risk. It defeats the whole point of having a stable release, and I understand that the development release is already fixed (and thus so will the next LTS).

Other routes are available, such as the backports repository, but backporting such churn and making it available to users automatically is not an option due to the risk of regression. Only a minimal patch is acceptable for an automatic update. I hope you understand this reasoning.

This bug remains open for Precise. Volunteers to drive any of these options are welcome.

Stephen Gran (sgran) wrote :

Have you actually asked the SRU team about this? The version of open-iscsi in precise has a bad hack that means that you can't use diskless nodes as openstack compute hosts. I would have thought that fixing a piece of software used by the UEC platform in an LTS release would qualify as something worth doing. If you don't want to ask yourself, can you point me to who to ask?

Robie Basak (racb) wrote :

> The version of open-iscsi in precise has a bad hack that means that you can't use diskless nodes as openstack compute hosts. I would have thought that fixing a piece of software used by the UEC platform in an LTS release would qualify as something worth doing.

Absolutely - it is definitely worth doing. I've said this multiple times, and at no point have I said otherwise. The *way* this needs to be done is with a minimal patch, so as to minimise regressions for other users, as per SRU policy. But nobody has brought such a patch forward.

I have not asked the SRU team. Instead, I'm going on my familiarity with SRU procedure, which is well documented (https://wiki.ubuntu.com/StableReleaseUpdates). But if you would like to consult them, then please do. Exceptions can be made. As for the venue, try https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel or https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel-discuss, whichever you feel is appropriate.

Lupu (wolfit-ro) wrote :

Has this bug been fixed in 12.04.4? (Or should I install it from source?)

- apt cache reports Version: 2.0.873-3ubuntu5~ubuntu12.04.1
- and apps are reporting: iscsiadm version 2.0-871
- the init script still exits when iscsi was run from the initramfs

Thanx.
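
The two version strings quoted above use different separators (`2.0.873` in the package version, `2.0-871` in the iscsiadm output), which makes them awkward to compare by eye. The following Python sketch reduces both to a numeric triple. The helper name is hypothetical, and the pattern only assumes the two formats shown in the comment above.

```python
import re

# Matches "2.0.873" as well as "2.0-871": two dot-separated numbers,
# then a third number after either "." or "-".
VERSION = re.compile(r"(\d+)\.(\d+)[.-](\d+)")


def version_triple(text):
    """Extract the first (major, minor, build) triple found in text."""
    match = VERSION.search(text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.groups())


if __name__ == "__main__":
    packaged = version_triple("2.0.873-3ubuntu5~ubuntu12.04.1")
    running = version_triple("iscsiadm version 2.0-871")
    print("packaged:", packaged)
    print("running: ", running)
    print("running binary older than package version:", running < packaged)
```

This only makes the mismatch explicit. It does not say why the running iscsiadm reports an older version than the one the apt cache shows.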
