undercloud ffu breaks novajoin

Bug #1905526 reported by Michele Baldessari
Affects   Status         Importance   Assigned to          Milestone
tripleo   Fix Released   High         Michele Baldessari

Bug Description

After a queens -> train FFU, novajoin is broken (first seen via rhbz#1901157):

2020-11-24 20:01:31.569 7 CRITICAL join [-] Unhandled error: amqp.exceptions.AccessRefused: (0, 0): (403) ACCESS_REFUSED - Login was refused using authentication mechanism AMQPLAIN. For details see the broker logfile.
2020-11-24 20:01:31.569 7 ERROR join Traceback (most recent call last):
2020-11-24 20:01:31.569 7 ERROR join File "/usr/bin/novajoin-notify", line 10, in <module>
2020-11-24 20:01:31.569 7 ERROR join sys.exit(main())
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/novajoin/notifications.py", line 422, in main
2020-11-24 20:01:31.569 7 ERROR join server.start()
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/oslo_messaging/server.py", line 269, in wrapper
2020-11-24 20:01:31.569 7 ERROR join log_after, timeout_timer)
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/oslo_messaging/server.py", line 189, in run_once
2020-11-24 20:01:31.569 7 ERROR join post_fn = fn()
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/oslo_messaging/server.py", line 268, in <lambda>
2020-11-24 20:01:31.569 7 ERROR join states[state].run_once(lambda: fn(self, *args, **kwargs),
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/oslo_messaging/server.py", line 413, in start
2020-11-24 20:01:31.569 7 ERROR join self.listener = self._create_listener()
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/oslo_messaging/notify/listener.py", line 165, in _create_listener
2020-11-24 20:01:31.569 7 ERROR join self._batch_timeout
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 154, in _listen_for_notifications
2020-11-24 20:01:31.569 7 ERROR join targets_and_priorities, pool, batch_size, batch_timeout
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 671, in listen_for_notifications
2020-11-24 20:01:31.569 7 ERROR join conn = self._get_connection(rpc_common.PURPOSE_LISTEN)
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 562, in _get_connection
2020-11-24 20:01:31.569 7 ERROR join purpose=purpose)
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/common.py", line 432, in __init__
2020-11-24 20:01:31.569 7 ERROR join self.connection = connection_pool.create(purpose)
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/pool.py", line 148, in create
2020-11-24 20:01:31.569 7 ERROR join return self.connection_cls(self.conf, self.url, purpose)
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 603, in __init__
2020-11-24 20:01:31.569 7 ERROR join self.ensure_connection()
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/impl_rabbit.py", line 718, in ensure_connection
2020-11-24 20:01:31.569 7 ERROR join self.connection.ensure_connection(errback=on_error)
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/kombu/connection.py", line 405, in ensure_connection
2020-11-24 20:01:31.569 7 ERROR join callback)
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/kombu/utils/functional.py", line 332, in retry_over_time
2020-11-24 20:01:31.569 7 ERROR join return fun(*args, **kwargs)
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/kombu/connection.py", line 261, in connect
2020-11-24 20:01:31.569 7 ERROR join return self.connection
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/kombu/connection.py", line 802, in connection
2020-11-24 20:01:31.569 7 ERROR join self._connection = self._establish_connection()
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/kombu/connection.py", line 757, in _establish_connection
2020-11-24 20:01:31.569 7 ERROR join conn = self.transport.establish_connection()
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/kombu/transport/pyamqp.py", line 130, in establish_connection
2020-11-24 20:01:31.569 7 ERROR join conn.connect()
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/amqp/connection.py", line 313, in connect
2020-11-24 20:01:31.569 7 ERROR join self.drain_events(timeout=self.connect_timeout)
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/amqp/connection.py", line 500, in drain_events
2020-11-24 20:01:31.569 7 ERROR join while not self.blocking_read(timeout):
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/amqp/connection.py", line 506, in blocking_read
2020-11-24 20:01:31.569 7 ERROR join return self.on_inbound_frame(frame)
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/amqp/method_framing.py", line 55, in on_frame
2020-11-24 20:01:31.569 7 ERROR join callback(channel, method_sig, buf, None)
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/amqp/connection.py", line 510, in on_inbound_method
2020-11-24 20:01:31.569 7 ERROR join method_sig, payload, content,
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/amqp/abstract_channel.py", line 126, in dispatch_method
2020-11-24 20:01:31.569 7 ERROR join listener(*args)
2020-11-24 20:01:31.569 7 ERROR join File "/usr/lib/python3.6/site-packages/amqp/connection.py", line 639, in _on_close
2020-11-24 20:01:31.569 7 ERROR join (class_id, method_id), ConnectionError)
2020-11-24 20:01:31.569 7 ERROR join amqp.exceptions.AccessRefused: (0, 0): (403) ACCESS_REFUSED - Login was refused using authentication mechanism AMQPLAIN. For details see the broker logfile.

On the server side, the broker logs:
2020-11-24 20:12:31.426 [error] <0.5096.1> Error on AMQP connection <0.5096.1> (192.168.24.1:57254 -> 192.168.24.1:5672, state: starting):
AMQPLAIN login refused: user 'guest' - invalid credentials

So novajoin connects via:
 [root@undercloud-0 novajoin]# grep -v ^# join.conf |grep -i '[a-z]' |grep ^transport_url
transport_url=rabbit://guest:<email address hidden>:5672/?ssl=0

Everyone else connects via a randomly generated username:
 [root@undercloud-0 puppet-generated]# grep -ir ^transport_url | cut -f2- -d: |sort |uniq -c |sort -n -r
     16 transport_url=rabbit://bf2065d34572247c380ac04816dccfe2cbc12542:<email address hidden>:5672/?ssl=0
      1 transport_url=rabbit://guest:<email address hidden>:5672/?ssl=0
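
The same comparison can be scripted. The following is a minimal sketch, not part of the original report: it counts the user name embedded in each transport_url and reports any that differ from the majority. The function names are hypothetical, the password is a placeholder (the real one is hidden above), and the host is taken from the broker log line.

```python
from collections import Counter
from urllib.parse import urlsplit


def rabbit_users(transport_urls):
    """Count the user name embedded in each transport_url value."""
    return Counter(urlsplit(url).username for url in transport_urls)


def odd_ones_out(transport_urls):
    """Return the user names that differ from the most common one."""
    counts = rabbit_users(transport_urls)
    majority, _ = counts.most_common(1)[0]
    return [user for user in counts if user != majority]


# Sample data modelled on the grep output above.
# "secret" is a placeholder password, not a value from the report.
RANDOM_USER = "bf2065d34572247c380ac04816dccfe2cbc12542"
urls = [f"rabbit://{RANDOM_USER}:secret@192.168.24.1:5672/?ssl=0"] * 16
urls.append("rabbit://guest:secret@192.168.24.1:5672/?ssl=0")

print(odd_ones_out(urls))
```

With the sample data, the only user name that deviates from the majority is guest, which matches the single novajoin entry in the grep output.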

So it seems to me that novajoin is using 'guest' where it should not. On a clean new osp16 deployment, however, we have:
[root@undercloud-0 puppet-generated]# grep -ir ^transport_url | cut -f2- -d: |sort |uniq -c |sort -n -r
     17 transport_url=rabbit://guest:<email address hidden>:5672/?ssl=0

That random rabbit user name came from:
 [root@undercloud-0 stack]# grep RpcUser tripleo-undercloud-passwords.yaml
  RpcUserName: bf2065d34572247c380ac04816dccfe2cbc12542

So on queens the undercloud randomized the rabbitmq username, whereas on train it uses guest as the rabbitmq username. The upgrade in python-tripleoclient correctly migrates the username for every service except novajoin.

On a clean queens install we have:
 [root@undercloud-0 rabbitmq]# grep rabbit /home/stack/undercloud-passwords.conf
undercloud_rabbit_cookie=a5e7e00657aaca82805fdd65b894f405389deb54
undercloud_rabbit_password=38175989903a8fffb7ec822aedea5e7f872a165f
undercloud_rabbit_username=cc6497d1e4eee62095687259149df899dc97d406

These get translated as follows during a UC upgrade:
 [root@undercloud-0 openstack-tripleo-heat-templates]# grep -e Rabbit -e Rpc -e Notify /home/stack/tripleo-undercloud-passwords.yaml
  NotifyPassword: y2dkTWz1Oc74PmRcB1IxmevVy
  RabbitCookie: 5059f7a464386fc91088d10fcf2f13bae36a4db1
  RabbitPassword: 33d8b46fd381503901201c6644166e59f7a42191
  RpcPassword: 33d8b46fd381503901201c6644166e59f7a42191
  RpcUserName: bf2065d34572247c380ac04816dccfe2cbc12542

The problem is that novajoin uses RabbitUserName, and we only translate the old rabbit username (undercloud_rabbit_username) to RpcUserName, which novajoin does not use:
        tripleo::profile::base::novajoin::oslomsg_rpc_password: {get_param: RpcPassword}
        tripleo::profile::base::novajoin::oslomsg_rpc_username: {get_param: RabbitUserName}
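
To make the mismatch concrete, here is a small model of the parameter translation. It is a sketch under stated assumptions rather than the actual python-tripleoclient or heat code: the key-to-parameter mapping is inferred from the output quoted above, the 'guest' default for RabbitUserName is assumed from the train behaviour described in this report, and the helper names and sample values are hypothetical.

```python
# Assumption: how the legacy undercloud-passwords.conf keys map onto heat
# parameters during the upgrade, inferred from the output quoted above.
LEGACY_TO_HEAT = {
    "undercloud_rabbit_cookie": ("RabbitCookie",),
    "undercloud_rabbit_password": ("RabbitPassword", "RpcPassword"),
    "undercloud_rabbit_username": ("RpcUserName",),  # RabbitUserName is never set
}

# Assumption: the default that applies when a parameter is not passed.
HEAT_DEFAULTS = {"RabbitUserName": "guest"}


def translate(legacy):
    """Return the heat parameters produced from the legacy password file."""
    heat = {}
    for key, value in legacy.items():
        for param in LEGACY_TO_HEAT.get(key, ()):
            heat[param] = value
    return heat


def get_param(heat, name):
    """Mimic {get_param: name}: explicit value first, then the default."""
    return heat.get(name, HEAT_DEFAULTS.get(name))


# Placeholder values, not taken from the report.
legacy = {
    "undercloud_rabbit_password": "random-password-from-queens",
    "undercloud_rabbit_username": "random-user-from-queens",
}
heat = translate(legacy)

novajoin_user = get_param(heat, "RabbitUserName")  # what novajoin is given
other_user = get_param(heat, "RpcUserName")        # what the other services get
print(novajoin_user, other_user)
```

Under these assumptions novajoin ends up with guest and the migrated password, while every other service gets the migrated user name, which is consistent with the "invalid credentials" error in the broker log.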

Changed in tripleo:
milestone: wallaby-1 → wallaby-2
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 12.4.2

This issue was fixed in the openstack/tripleo-heat-templates 12.4.2 release.

Changed in tripleo:
milestone: wallaby-2 → wallaby-3
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 11.4.0

This issue was fixed in the openstack/tripleo-heat-templates 11.4.0 release.

Changed in tripleo:
milestone: wallaby-3 → wallaby-rc1
Michele Baldessari (michele) wrote :
Changed in tripleo:
status: Triaged → Fix Released