Comment 0 for bug 1431406

Michael Steffens (michael-steffens-b) wrote :

We regularly encounter this situation when deleting stacks managed by Heat. However, it can be reproduced without Heat, using just Nova and Cinder:

1. Create a Windows VM, for example using CloudBase's image windows_server_2012_r2_standard_eval_kvm_20140607.

2. Create a volume (50 GB) and attach it to the instance.

3. Log into the instance. Start Computer Management -> Disk Management.

4. Online the disk. Initialize and format the volume, assign drive letter D. Create some small garbage data on D:.

5. In Glance, detach the volume from the instance without shutting down the instance first. (This is apparently what Heat does when deleting a stack.)

On the compute node you will now see dmesg and syslog being flooded with messages like

   [768938.979494] connection18:0: detected conn error (1020)

about once per second. On the compute node

  iscsiadm --mode session --print=1

displays the initiator iSCSI session still logged in, while on the Cinder storage node

  tgtadm --lld iscsi --op show --mode target

shows that the iSCSI target is gone. The recurring connection errors on the compute node persist until the iSCSI session is logged off manually.

You may argue that detaching a volume while it is online and in use is unclean, and that the issue is therefore Heat's responsibility. However, even if that were the case, such an operation should not result in stale iSCSI sessions accumulating until someone intervenes manually via a root shell on the compute node.
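
For illustration, the manual logoff mentioned above could be scripted roughly as follows. This is only a sketch, not part of any OpenStack tooling: the helper name logout_cmds is made up, and it assumes that each line printed by "iscsiadm --mode session" has the form "tcp: [18] 192.0.2.10:3260,1 iqn.example:volume-1". It only prints the logout command for a given target so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: read `iscsiadm --mode session` output on stdin and
# print (not run) a logout command for each session on the given target IQN.
# Assumed input line format:
#   tcp: [18] 192.0.2.10:3260,1 iqn.example:volume-1
logout_cmds() {
    awk -v iqn="$1" '
        $4 == iqn {
            portal = $3
            sub(/,.*/, "", portal)   # drop the ",<tpgt>" suffix of the portal
            printf "iscsiadm --mode node --targetname %s --portal %s --logout\n", $4, portal
        }'
}

# Example use on the compute node (replace the IQN with the stale target):
#   iscsiadm --mode session | logout_cmds "iqn.example:volume-1"
```

Printing instead of executing is deliberate, since logging out a session that is still backing a healthy volume would disrupt that instance.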

Additional information

 - We couldn't reproduce this problem with Linux instances. Even when detaching a volume while it is mounted and in use by the instance, iSCSI sessions are cleaned up gracefully.

 - We can reproduce this problem with both Icehouse and Juno.

 - We can reproduce the problem with both single-node and multi-node OpenStack configurations, the latter using separate hosts for compute and storage.