Cinder doesn't restore LIO target ACLs for target after Cinder node reboots
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Cinder | Fix Released | High | Mitsuhiro Tanino | | |
Bug Description
The problem:
If the Cinder node reboots or crashes, Cinder doesn't restore the LIO target ACL rules for existing targets. That leads to:
1) On the Cinder node, targetcli reports, for example:
o- iqn.2010-
| o- tpgt1 .......
| o- acls .......
| o- luns .......
| | o- lun0 [iblock/
| o- portals .......
| o- 10.10.21.32:3260 .......
The expected output is [1 ACL] instead of [0 ACL], because that volume was in use at the moment of the reboot.
The following message is also repeated every second in syslog:
iSCSI Initiator Node: iqn.2015-
2) On the Compute node that hosts this VM, syslog shows:
iscsid: conn 0 login rejected: initiator failed authorization with target
3) On the VM that uses the volume, every attempt to use the volume fails with an I/O error.
The environment:
OpenStack multinode setup; each component is set up from git branch stable/liberty (cinder last commit is 6d0981b252835a5
Cinder host: Ubuntu 14.04, python 2.7.6, Cinder from stable/liberty (6d0981b252835a
Config part for these volumes:
[fast-1]
volume_
volume_
volume_
iscsi_protocol = iscsi
iscsi_helper = lioadm
The way to reproduce:
1) Install and set up Cinder from the stable/liberty GitHub version with the [fast-1] config part and LIO.
2) Create a volume and attach it to any VM.
3) Simulate a 'crash': reboot the Cinder node without stopping the VM.
4) After the reboot, start Cinder and look for errors and the missing ACL (the volume is not usable anymore).
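The check in step 4 can be done mechanically. The following is a minimal sketch, assuming the LIO configuration has been saved as JSON in the layout written by rtslib-fb (`targets`, each with `tpgs` that hold `luns` and `node_acls`). The function name `targets_without_acls` and all WWNs in the sample data are hypothetical and are not part of Cinder or targetcli.

```python
import json


def targets_without_acls(config):
    """Return WWNs of targets that export LUNs but have no node ACLs.

    Assumes the rtslib-fb saved-config layout:
    {"targets": [{"wwn": ..., "tpgs": [{"luns": [...], "node_acls": [...]}]}]}
    """
    broken = []
    for target in config.get("targets", []):
        for tpg in target.get("tpgs", []):
            has_luns = bool(tpg.get("luns"))
            has_acls = bool(tpg.get("node_acls"))
            if has_luns and not has_acls:
                broken.append(target.get("wwn"))
                break  # one ACL-less TPG is enough to flag the target
    return broken


# Hypothetical sample data; a real check would load the saved config file.
sample = json.loads("""
{
  "targets": [
    {"wwn": "iqn.2010-10.org.example:volume-a",
     "tpgs": [{"luns": [{"index": 0}],
               "node_acls": [{"node_wwn": "iqn.1993-08.org.example:initiator-1"}]}]},
    {"wwn": "iqn.2010-10.org.example:volume-b",
     "tpgs": [{"luns": [{"index": 0}], "node_acls": []}]}
  ]
}
""")

for wwn in targets_without_acls(sample):
    print("target has LUNs but no ACLs:", wwn)
```

A target with LUNs and no ACLs is only suspicious if the corresponding volume is in use, so the result should be cross-checked against the volume status in Cinder.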
The clues:
1) After Volume attachment /etc/target/
2) This may be connected to the following patches:
the first one - https:/
The workaround:
Nevertheless, there is a workaround (but it seems painful). If you use "sudo targetctl save backup/
Targetcli configs (IMPORTANT):
1) Right after volume creation and attachment - https:/
2) After Cinder reboot and restart - https:/
3) Diff between them (that's the actual bug occurrence) - https:/
NOTE: in all the LIO configs there are already broken volumes, so just look at the diff-file first.
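Because the configs already contain broken volumes, a plain text diff is noisy. The following is a rough sketch of a narrower comparison that reports only the initiator ACLs present in the first config and absent from the second. It assumes both configs are JSON in the rtslib-fb saved-config layout (`targets`, `tpgs`, `node_acls`, `node_wwn`); the helper names `acl_map` and `lost_acls` and the sample WWNs are hypothetical.

```python
def acl_map(config):
    """Map each target WWN to the set of initiator WWNs in its node ACLs."""
    result = {}
    for target in config.get("targets", []):
        initiators = set()
        for tpg in target.get("tpgs", []):
            for acl in tpg.get("node_acls", []):
                initiators.add(acl.get("node_wwn"))
        result[target.get("wwn")] = initiators
    return result


def lost_acls(before, after):
    """Return {target_wwn: sorted initiators present before but not after}."""
    after_map = acl_map(after)
    lost = {}
    for wwn, initiators in acl_map(before).items():
        missing = initiators - after_map.get(wwn, set())
        if missing:
            lost[wwn] = sorted(missing)
    return lost


# Hypothetical sample data standing in for the two saved configs.
before = {"targets": [
    {"wwn": "iqn.2010-10.org.example:volume-a",
     "tpgs": [{"node_acls": [{"node_wwn": "iqn.1993-08.org.example:initiator-1"}]}]},
    {"wwn": "iqn.2010-10.org.example:volume-b",
     "tpgs": [{"node_acls": []}]},
]}
after = {"targets": [
    {"wwn": "iqn.2010-10.org.example:volume-a", "tpgs": [{"node_acls": []}]},
    {"wwn": "iqn.2010-10.org.example:volume-b", "tpgs": [{"node_acls": []}]},
]}

for wwn, initiators in lost_acls(before, after).items():
    print(wwn, "lost ACLs for:", ", ".join(initiators))
```

Targets that had no ACLs in either config, such as the volumes that were already broken, do not appear in the result.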
Changed in cinder:
importance: Undecided → High
Let me try to reproduce and investigate.