After a significant amount of investigation I've found out several things about this bug:
1) the unitdata.set() that happens at [1] is only actually written to the file at the end of successful hook/action execution.
2) In my deployment, running the action openstack-upgrade with --wait shows that the action actually fails, despite having upgraded the packages successfully. It fails at [2] and does not even get to perform a db migration sync.
3) Diving deeper into [2], at [3] it gets a list of relation interfaces to process and render config files based on. When running 3 manila units, the cluster interface is present with class relations.openstack-ha.peers.OpenstackHAPeers. When running only 1 manila unit, the cluster interface is not present, and the upgrade does not fail (for me, in my deployment).
4) Diving deeper at [3], at [4] it processes the interfaces. Notice an interesting "except TypeError" there; more on this later. The constructor invoked through the variable class instantiation eventually reaches [5], where it iterates through the interfaces, and at [6] it hits the problem when it tries to set the attribute on the relation adapters class. The attribute is 'cluster', and the value is the class OpenstackHAPeers.
When it fails, there are absolutely no errors in the logs. The action run with --wait finishes with the following fields:
message: can't set attribute
status: failed
Those are easily overlooked due to the other stdout printed, such as packages installed during the upgrade. Fortunately this is an action that can be repeated over and over until successful.
That message "can't set attribute" is actually an AttributeError (not a TypeError). By adding an extra try/except block I can capture it and confirm it. No traceback information though (if I did things correctly). Also, adding the AttributeError to the outer try/except definition does not solve the problem because it is just invoked again and fails the same way.
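A minimal sketch of one way to get the missing traceback into the log, assuming only the standard library. The helper name call_with_traceback and the log callback are hypothetical stand-ins (in a charm the callback could be hookenv.log); nothing here is taken from charms.openstack itself.

```python
import traceback


def call_with_traceback(func, log=print):
    """Call func(); on AttributeError, log the full traceback and re-raise."""
    try:
        return func()
    except AttributeError:
        # format_exc() returns the traceback of the exception being handled
        log("TEST LOG: AttributeError caught\n" + traceback.format_exc())
        raise


def _fail():
    # Hypothetical stand-in for the code path that fails at [6]
    raise AttributeError("can't set attribute")


messages = []
try:
    call_with_traceback(_fail, log=messages.append)
except AttributeError:
    pass  # the action would still fail; we only wanted the traceback logged
```

Re-raising keeps the original behaviour (the action still fails), so the extra logging does not mask the error.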
So I added extra logging such as:
hookenv.log("TEST LOG: {} {} {} {} {}".format(relation, adapter_name, adapter, self, self.__dict__))
just above [6] and it printed the following:
TEST DATA: <relations.openstack-ha.peers.OpenstackHAPeers object at 0x7fea21c85cf8> cluster <charms_openstack.adapters.PeerHARelationAdapter object at 0x7fea21c9c908> <charm.openstack.manila.ManilaRelationAdapters object at 0x7fea21c85780> {'_charm_instance_weakref': <weakref at 0x7fea21c7a3b8; to 'ManilaCharmRocky' at 0x7fea21c85ba8>, '_relations': {'options', 'amqp'}, 'options': <charms_openstack.adapters.DefaultConfigurationAdapter object at 0x7fea21c9c3c8>, '_adapters': {'amqp': <class 'charm.openstack.manila.TransportURLAdapter'>, 'shared_db': <class 'charms_openstack.adapters.DatabaseRelationAdapter'>, 'cluster': <class 'charms_openstack.adapters.PeerHARelationAdapter'>, 'coordinator_memcached': <class 'charms_openstack.adapters.MemcacheRelationAdapter'>}, 'amqp': <charm.openstack.manila.TransportURLAdapter object at 0x7fea21c9cb00>}
I was unsure whether this was correct or not, so I compared to the placement charm which does not fail. It prints the following:
TEST DATA: <relations.openstack-ha.peers.OpenstackHAPeers object at 0x7f5bfd08e908> cluster <charms_openstack.adapters.PeerHARelationAdapter object at 0x7f5bfd08ec88> <charms_openstack.adapters.OpenStackAPIRelationAdapters object at 0x7f5bfd08ea58> {'_charm_instance_weakref': <weakref at 0x7f5bfd086c28; to 'PlacementCharm' at 0x7f5bfd08e470>, '_relations': {'options'}, 'options': <charms_openstack.adapters.APIConfigurationAdapter object at 0x7f5bfd08e9e8>, '_adapters': {'amqp': <class 'charms_openstack.adapters.RabbitMQRelationAdapter'>, 'shared_db': <class 'charms_openstack.adapters.DatabaseRelationAdapter'>, 'cluster': <class 'charms_openstack.adapters.PeerHARelationAdapter'>, 'coordinator_memcached': <class 'charms_openstack.adapters.MemcacheRelationAdapter'>}}
Diff'ing those two I noticed specific classes such as charm.openstack.manila.ManilaRelationAdapters and charm.openstack.manila.TransportURLAdapter that may be causing the issue. I edited the manila charm files and removed that customization from [7] and their definitions from that file. The result was:
TEST DATA: <relations.openstack-ha.peers.OpenstackHAPeers object at 0x7fdb7cd9a518> cluster <charms_openstack.adapters.PeerHARelationAdapter object at 0x7fdb7cdab6a0> <charms_openstack.adapters.OpenStackAPIRelationAdapters object at 0x7fdb7cd9a550> {'_charm_instance_weakref': <weakref at 0x7fdb7cd98f48; to 'ManilaCharmRocky' at 0x7fdb7cd9a940>, '_relations': {'options', 'amqp'}, 'options': <charms_openstack.adapters.DefaultConfigurationAdapter object at 0x7fdb7cdab160>, '_adapters': {'amqp': <class 'charms_openstack.adapters.RabbitMQRelationAdapter'>, 'shared_db': <class 'charms_openstack.adapters.DatabaseRelationAdapter'>, 'cluster': <class 'charms_openstack.adapters.PeerHARelationAdapter'>, 'coordinator_memcached': <class 'charms_openstack.adapters.MemcacheRelationAdapter'>}, 'amqp': <charms_openstack.adapters.RabbitMQRelationAdapter object at 0x7fdb7cdab6d8>}
and it still failed in exactly the same place. While looping through the interfaces, it goes through the 'amqp' interface first (that's why there is an extra field compared to placement's) and sets that attribute correctly, but setting the 'cluster' attribute fails.
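One possible explanation that could be checked, stated here as an assumption and not confirmed against the manila charm: in Python, setattr() raises AttributeError when the name is defined on the class (or a parent class) as a read-only property, and on older Python 3 releases the message is exactly "can't set attribute". Such a property would not show up in self.__dict__, which only holds instance attributes, so the dumps above would look normal. The sketch below uses hypothetical classes (Base, WithProperty) and a hypothetical helper (find_class_attr) to show the effect and a way to look for a class-level attribute by walking the MRO.

```python
class Base:
    """Hypothetical stand-in for a relation adapters base class."""

    def __init__(self):
        self._relations = set()


class WithProperty(Base):
    """Hypothetical subclass that defines 'cluster' as a read-only property."""

    @property
    def cluster(self):
        return None


def find_class_attr(obj, name):
    """Return (class, attribute) for the first class in the MRO that
    defines `name` in its own namespace, or None if no class does."""
    for klass in type(obj).__mro__:
        if name in vars(klass):
            return klass, vars(klass)[name]
    return None


obj = WithProperty()
setattr(obj, 'amqp', object())  # plain name: stored in obj.__dict__

try:
    setattr(obj, 'cluster', object())  # property without a setter
    failed = False
except AttributeError:
    failed = True

# Which class, if any, defines 'cluster' at class level?
owner = find_class_attr(obj, 'cluster')
```

Logging find_class_attr(self, adapter_name) just above [6] would show whether some class in the hierarchy already defines 'cluster'; if it returns None, this hypothesis can be ruled out.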
I also noticed that for placement there is an APIConfigurationAdapter class while for manila there is DefaultConfigurationAdapter; however, I haven't found where that is declared in the code to make that adjustment.
[1] https://github.com/openstack/charms.openstack/blob/10627ee5f991c268f174d6d100e218a0e1867af1/charms_openstack/charm/core.py#L1142
[2] https://github.com/openstack/charms.openstack/blob/10627ee5f991c268f174d6d100e218a0e1867af1/charms_openstack/charm/core.py#L1145
[3] https://github.com/openstack/charms.openstack/blob/10627ee5f991c268f174d6d100e218a0e1867af1/charms_openstack/charm/core.py#L957
[4] https://github.com/openstack/charms.openstack/blob/10627ee5f991c268f174d6d100e218a0e1867af1/charms_openstack/charm/core.py#L961
[5] https://github.com/openstack/charms.openstack/blob/d049eee8f47e3913123762b6cd4f493e8ff0d18d/charms_openstack/adapters.py#L1236
[6] https://github.com/openstack/charms.openstack/blob/d049eee8f47e3913123762b6cd4f493e8ff0d18d/charms_openstack/adapters.py#L1313
[7] https://github.com/openstack/charm-manila/blob/f2ab722fe8f6c8083d6b7a0a1d962bf44ff795cc/src/lib/charm/openstack/manila.py#L198