I analyzed another failure over the weekend [1], and it's the same failure I see here. For reasons I don't understand, it looks like the iscsid container is regenerating its IQN on startup, essentially overwriting the IQN initialization done by puppet. This creates a situation where the iSCSI subsystem thinks that Nova is trying to access a volume using the wrong IQN, which results in an iSCSI login failure. This failure can be seen in the log journal [2], where this appears:
Jan 06 11:48:13 centos-7-ovh-gra1-0001714610 kernel: iSCSI Initiator Node: iqn.1994-05.com.redhat:755d7cc2a05d is not authorized to access iSCSI target portal group: 1.
The IQN is supposed to be initialized once, during the iscsid container's puppet_config step. The same IQN value is shared by baremetal services [3] and containers [4]. That value is iqn.1994-05.com.redhat:dc122233454b, which differs from the iqn.1994-05.com.redhat:755d7cc2a05d value flagged in the login failure message. This is what causes the tempest tests to fail!
I tracked the problem further, to the iscsid container's stdout [5]. I don't know if I'm reading this correctly, but it suggests the IQN initialization code is running twice. The logs in [5] for "Copying service configuration files" don't look correct (the source and destination directory of the files look wrong), and there's no mention of the .initiator_reset file that's supposed to ensure the IQN is only reset once [6].
Also, from the log journal [2], you see the iscsid container's puppet_config performing the IQN reset at Jan 06 11:16:21:
Jan 06 11:16:21 centos-7-ovh-gra1-0001714610 puppet-user[35337]: (/Stage[main]/Tripleo::Profile::Base::Iscsid/Exec[reset-iscsi-initiator-name]/returns) executed successfully
But, at Jan 06 11:29:44 (13 minutes later), you see it running the commands that generate the IQN that triggers the login failure:
Jan 06 11:29:44 centos-7-ovh-gra1-0001714610 dockerd-current[25619]: Generating new iSCSI initiator name
Jan 06 11:29:44 centos-7-ovh-gra1-0001714610 dockerd-current[25619]: ++ echo 'Generating new iSCSI initiator name'
Jan 06 11:29:44 centos-7-ovh-gra1-0001714610 dockerd-current[25619]: +++ /sbin/iscsi-iname
Jan 06 11:29:44 centos-7-ovh-gra1-0001714610 dockerd-current[25619]: ++ echo InitiatorName=iqn.1994-05.com.redhat:755d7cc2a05d
We need to take a closer look at how the iscsid container is being launched to determine why the puppet code is executing more than once.
We also need to investigate how the puppet-generated configuration files are saved, and then restored when the container is launched. That code may not be properly handling dot-files such as the ".initiator_reset" file mentioned above.
I analyzed another failure over the weekend [1], and it's the same failure I see here. For reasons I don't understand, it looks like the iscsid container is regenerating its IQN on startup, essentially overwriting the IQN initialization done by puppet. This creates a situation where the iSCSI subsystem thinks that Nova is trying to access a volume using the wrong IQN, which results in an iSCSI login failure. This failure can be seen in the log journal [2], where this appears:
Jan 06 11:48:13 centos- 7-ovh-gra1- 0001714610 kernel: iSCSI Initiator Node: iqn.1994- 05.com. redhat: 755d7cc2a05d is not authorized to access iSCSI target portal group: 1.
[1] https:/ /review. openstack. org/530559 logs.openstack. org/82/ 531182/ 1/gate/ tripleo- ci-centos- 7-scenario002- multinode- oooq-container/ c99bd6b/ logs/subnode- 2/var/log/ journal. txt.gz# _Jan_06_ 11_48_13
[2] http://
The IQN is supposed to be initialized once, during the iscsid container's puppet_config step. The same IQN value is shared by baremetal services [3] and containers [4]. That value is iqn.1994- 05.com. redhat: dc122233454b, which differs from the iqn.1994- 05.com. redhat: 755d7cc2a05d value flagged in the login failure message. This is what causes the tempest tests to fail!
[3] http:// logs.openstack. org/82/ 531182/ 1/gate/ tripleo- ci-centos- 7-scenario002- multinode- oooq-container/ c99bd6b/ logs/subnode- 2/etc/iscsi/ initiatorname. iscsi.gz logs.openstack. org/82/ 531182/ 1/gate/ tripleo- ci-centos- 7-scenario002- multinode- oooq-container/ c99bd6b/ logs/subnode- 2/var/log/ config- data/iscsid/ etc/iscsi/ initiatorname. iscsi.gz
[4] http://
I tracked the problem further, to the iscsid container's stdout [5]. I don't know if I'm reading this correctly, but it suggests the IQN initialization code is running twice. The logs in [5] for "Copying service configuration files" don't look correct (the source and destination directory of the files look wrong), and there's no mention of the .initiator_reset file that's supposed to ensure the IQN is only reset once [6].
[5] http:// logs.openstack. org/82/ 531182/ 1/gate/ tripleo- ci-centos- 7-scenario002- multinode- oooq-container/ c99bd6b/ logs/subnode- 2/var/log/ extra/docker/ containers/ iscsid/ stdout. log.txt. gz /github. com/openstack/ puppet- tripleo/ blob/master/ manifests/ profile/ base/iscsid. pp#L36
[6] https:/
Also, from the log journal [2], you see the iscsid container's puppet_config performing the IQN reset at Jan 06 11:16:21:
Jan 06 11:16:21 centos- 7-ovh-gra1- 0001714610 puppet-user[35337]: (/Stage[ main]/Tripleo: :Profile: :Base:: Iscsid/ Exec[reset- iscsi-initiator -name]/ returns) executed successfully
But, at Jan 06 11:29:44 (13 minutes later), you see it running the commands that generate the IQN that triggers the login failure:
Jan 06 11:29:44 centos- 7-ovh-gra1- 0001714610 dockerd- current[ 25619]: Generating new iSCSI initiator name 7-ovh-gra1- 0001714610 dockerd- current[ 25619]: ++ echo 'Generating new iSCSI initiator name' 7-ovh-gra1- 0001714610 dockerd- current[ 25619]: +++ /sbin/iscsi-iname 7-ovh-gra1- 0001714610 dockerd- current[ 25619]: ++ echo InitiatorName= iqn.1994- 05.com. redhat: 755d7cc2a05d
Jan 06 11:29:44 centos-
Jan 06 11:29:44 centos-
Jan 06 11:29:44 centos-
We need to take a closer look at how the iscsid container is being launched to determine why the puppet code is executing more than once.
We also need to investigate how the puppet-generated configuration files are saved, and then restored when the container is launched. That code may not be properly handling dot-files such as the ".initiator_reset" file mentioned above.