Heartbeat in pthreads in nova-wallaby crashes with greenlet error
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu Cloud Archive | Invalid | Undecided | Unassigned |
Yoga | Triaged | Medium | Unassigned |
oslo.messaging | Fix Released | Undecided | Unassigned |
python-oslo.messaging (Ubuntu) | Fix Released | Undecided | Unassigned |
Jammy | Incomplete | Undecided | Unassigned |
Bug Description
When performing a heartbeat to rabbit (inside a nova-compute process), there is a greenlet error which causes a hard crash.
I'm not exactly sure what details are relevant, but can provide more info if there's something that will be useful!
This is on RHEL7 (essentially... somewhat custom image based on it)
Log snippet:
```
2021-07-07 19:34:52,686 DEBUG [oslo.messaging
2021-07-07 19:34:52,699 DEBUG [amqp.connectio
2021-07-07 19:34:52,699 DEBUG [amqp.connectio
2021-07-07 19:34:52,700 DEBUG [amqp.connectio
2021-07-07 19:34:52,701 DEBUG [amqp.connectio
2021-07-07 19:34:52,718 DEBUG [amqp] /opt/openstack/
2021-07-07 19:34:52,719 DEBUG [amqp] /opt/openstack/
2021-07-07 19:34:52,720 DEBUG [amqp] /opt/openstack/
2021-07-07 19:34:52,721 DEBUG [amqp.connectio
Traceback (most recent call last):
File "/opt/openstack
timer()
File "/opt/openstack
cb(*args, **kw)
File "/opt/openstack
waiter.switch()
greenlet.error: cannot switch to a different thread
```
Versions:
```
oslo.messaging=
nova==23.0.2 (packaged locally from stable/wallaby as of July 3, 2021)
```
-------
[Impact]
The Nova default value of heartbeat_
[Test Plan]
* Deploy OpenStack Yoga on Jammy and ensure nova-compute has debug=True
* ensure "oslo_messaging
* By default a heartbeat is checked 2 times every 60 seconds
* Check /var/log/
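The configuration check in this plan can be scripted. The following is a minimal sketch, assuming the setting under test is oslo.messaging's `heartbeat_in_pthread` option in the `[oslo_messaging_rabbit]` section of nova.conf, and that "2 times every 60 seconds" corresponds to `heartbeat_rate = 2` and `heartbeat_timeout_threshold = 60`. The helper name `read_heartbeat_settings` and the sample configuration text are hypothetical, and the interval it reports is only the arithmetic implied by that statement; the driver's internal wake-up cadence may differ.

```python
import configparser

SECTION = "oslo_messaging_rabbit"

# Hypothetical nova.conf content, for illustration only.
SAMPLE_CONF = """
[DEFAULT]
debug = True

[oslo_messaging_rabbit]
heartbeat_in_pthread = False
"""


def read_heartbeat_settings(conf_text, rate=2, threshold=60):
    """Report heartbeat-related settings found in nova.conf-style text.

    `rate` and `threshold` are fallbacks taken from the test plan
    (2 checks per 60 seconds); they are assumptions, not values read
    from oslo.messaging itself.
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)

    # None means "not set explicitly", so the packaged default applies.
    in_pthread = None
    if parser.has_option(SECTION, "heartbeat_in_pthread"):
        in_pthread = parser.getboolean(SECTION, "heartbeat_in_pthread")

    rate = parser.getint(SECTION, "heartbeat_rate", fallback=rate)
    threshold = parser.getint(
        SECTION, "heartbeat_timeout_threshold", fallback=threshold
    )

    return {
        "heartbeat_in_pthread": in_pthread,
        "nominal_check_interval_seconds": threshold / rate,
    }


if __name__ == "__main__":
    print(read_heartbeat_settings(SAMPLE_CONF))
```

A result of `None` for `heartbeat_in_pthread` is deliberate: this bug is about what the default should be, so the sketch reports "unset" rather than guessing which default the installed package ships.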
[Regression Potential]
Changing the default to False will fix services that do not run under WSGI, but services that do run under WSGI will revert to their native threading method, i.e. greenthreads. That is considered suboptimal, and in heavily loaded environments it could have a perceived impact on API performance. A separate bug https:/
no longer affects: nova
description: updated
Changed in python-oslo.messaging (Ubuntu):
status: New → Fix Released
Changed in python-oslo.messaging (Ubuntu Jammy):
status: New → In Progress
no longer affects: nova (Ubuntu)
no longer affects: nova (Ubuntu Jammy)
Hello,
Thanks for reporting this bug.
Here are some notes related to the releases of the "heartbeat in pthread" patches.
Running the heartbeat in a native Python thread has been the default since oslo.messaging 12.6.0 [1]. This version was released during Wallaby (9 months ago - October 2020) [2].
Can you please tell us whether this behavior is systematic?
Can you reproduce it every time, or is this an isolated case?
[1] Tags that contain add5ab4: /releases.openstack.org/victoria/index.html#wallaby-oslo-messaging
```
$ git tag --contains add5ab4
12.6.0
12.6.1
12.7.0
12.7.1
12.8.0
```
[2] https:/