RuntimeError during calling log_opts_values

Bug #1856312 reported by norman shen on 2019-12-13
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Undecided
Unassigned
oslo.config
Undecided
Stephen Finucane

Bug Description

During starting up nova-compute service, we are hit by the following error message

+ sed -i s/HOST_IP// /tmp/logging-nova-compute.conf
+ exec nova-compute --config-file /etc/nova/nova.conf --config-file /tmp/pod-shared/nova-console.conf --config-file /tmp/pod-shared/nova-libvirt.conf --config-file /tmp/pod-shared/nova-hypervisor.conf --log-config-append /tmp/logging-nova-compute.conf
2019-12-13 06:53:09.556 29036 WARNING oslo_config.cfg [-] Deprecated: Option "use_neutron" from group "DEFAULT" is deprecated for removal (
nova-network is deprecated, as are any related configuration options.
). Its value may be silently ignored in the future.
2019-12-13 06:53:12.000 29036 INFO nova.compute.rpcapi [req-eec76cc3-35a9-4d1d-bb91-4c484f6ef855 - - - - -] Automatically selected compute RPC version 5.0 from minimum service version 35
2019-12-13 06:53:12.000 29036 INFO nova.compute.rpcapi [req-eec76cc3-35a9-4d1d-bb91-4c484f6ef855 - - - - -] Automatically selected compute RPC version 5.0 from minimum service version 35
2019-12-13 06:53:12.029 29036 INFO nova.virt.driver [req-eec76cc3-35a9-4d1d-bb91-4c484f6ef855 - - - - -] Loading compute driver 'libvirt.LibvirtDriver'
2019-12-13 06:53:12.029 29036 INFO nova.virt.driver [req-eec76cc3-35a9-4d1d-bb91-4c484f6ef855 - - - - -] Loading compute driver 'libvirt.LibvirtDriver'
2019-12-13 06:53:22.064 29036 WARNING oslo_config.cfg [req-eec76cc3-35a9-4d1d-bb91-4c484f6ef855 - - - - -] Deprecated: Option "firewall_driver" from group "DEFAULT" is deprecated for removal (
nova-network is deprecated, as are any related configuration options.
). Its value may be silently ignored in the future.
2019-12-13 06:53:22.192 29036 WARNING os_brick.initiator.connectors.remotefs [req-eec76cc3-35a9-4d1d-bb91-4c484f6ef855 - - - - -] Connection details not present. RemoteFsClient may not initialize properly.
2019-12-13 06:53:22.409 29036 WARNING oslo_config.cfg [req-eec76cc3-35a9-4d1d-bb91-4c484f6ef855 - - - - -] Deprecated: Option "linuxnet_interface_driver" from group "DEFAULT" is deprecated for removal (
nova-network is deprecated, as are any related configuration options.
). Its value may be silently ignored in the future.
2019-12-13 06:53:22.414 29036 WARNING oslo_config.cfg [req-eec76cc3-35a9-4d1d-bb91-4c484f6ef855 - - - - -] Deprecated: Option "metadata_port" from group "DEFAULT" is deprecated for removal (
nova-network is deprecated, as are any related configuration options.
). Its value may be silently ignored in the future.
2019-12-13 06:53:22.440 29036 INFO nova.service [-] Starting compute node (version 18.0.0)
2019-12-13 06:53:22.570 29036 WARNING oslo_config.cfg [req-eec76cc3-35a9-4d1d-bb91-4c484f6ef855 - - - - -] Deprecated: Option "api_endpoint" from group "ironic" is deprecated for removal (Endpoint lookup uses the service catalog via common keystoneauth1 Adapter configuration options. In the current release, api_endpoint will override this behavior, but will be ignored and/or removed in a future release. To achieve the same result, use the endpoint_override option instead.). Its value may be silently ignored in the future.
2019-12-13 06:53:22.440 29036 INFO nova.service [-] Starting compute node (version 18.0.0)
2019-12-13 06:53:22.594 29036 WARNING oslo_config.cfg [req-eec76cc3-35a9-4d1d-bb91-4c484f6ef855 - - - - -] Deprecated: Option "api_endpoint" from group "ironic" is deprecated. Use option "endpoint-override" from group "ironic".
2019-12-13 06:53:22.911 29036 CRITICAL nova [req-eec76cc3-35a9-4d1d-bb91-4c484f6ef855 - - - - -] Unhandled error: RuntimeError: dictionary changed size during iteration
2019-12-13 06:53:22.911 29036 ERROR nova Traceback (most recent call last):
2019-12-13 06:53:22.911 29036 ERROR nova File "/var/lib/openstack/bin/nova-compute", line 8, in <module>
2019-12-13 06:53:22.911 29036 ERROR nova sys.exit(main())
2019-12-13 06:53:22.911 29036 ERROR nova File "/var/lib/openstack/local/lib/python2.7/site-packages/nova/inspur/cmd/compute.py", line 71, in main
2019-12-13 06:53:22.911 29036 ERROR nova service.wait()
2019-12-13 06:53:22.911 29036 ERROR nova File "/var/lib/openstack/local/lib/python2.7/site-packages/nova/service.py", line 460, in wait
2019-12-13 06:53:22.911 29036 ERROR nova _launcher.wait()
2019-12-13 06:53:22.911 29036 ERROR nova File "/var/lib/openstack/local/lib/python2.7/site-packages/oslo_service/service.py", line 392, in wait
2019-12-13 06:53:22.911 29036 ERROR nova status, signo = self._wait_for_exit_or_signal()
2019-12-13 06:53:22.911 29036 ERROR nova File "/var/lib/openstack/local/lib/python2.7/site-packages/oslo_service/service.py", line 367, in _wait_for_exit_or_signal
2019-12-13 06:53:22.911 29036 ERROR nova self.conf.log_opt_values(LOG, logging.DEBUG)
2019-12-13 06:53:22.911 29036 ERROR nova File "/var/lib/openstack/local/lib/python2.7/site-packages/oslo_config/cfg.py", line 2579, in log_opt_values
2019-12-13 06:53:22.911 29036 ERROR nova for group_name in self._groups:
2019-12-13 06:53:22.911 29036 ERROR nova RuntimeError: dictionary changed size during iteration
2019-12-13 06:53:22.911 29036 ERROR nova
2019-12-13 06:53:22.911 29036 CRITICAL nova [req-eec76cc3-35a9-4d1d-bb91-4c484f6ef855 - - - - -] Unhandled error: RuntimeError: dictionary changed size during iteration
2019-12-13 06:53:22.911 29036 ERROR nova Traceback (most recent call last):
2019-12-13 06:53:22.911 29036 ERROR nova File "/var/lib/openstack/bin/nova-compute", line 8, in <module>
2019-12-13 06:53:22.911 29036 ERROR nova sys.exit(main())
2019-12-13 06:53:22.911 29036 ERROR nova File "/var/lib/openstack/local/lib/python2.7/site-packages/nova/inspur/cmd/compute.py", line 71, in main
2019-12-13 06:53:22.911 29036 ERROR nova service.wait()
2019-12-13 06:53:22.911 29036 ERROR nova File "/var/lib/openstack/local/lib/python2.7/site-packages/nova/service.py", line 460, in wait
2019-12-13 06:53:22.911 29036 ERROR nova _launcher.wait()
2019-12-13 06:53:22.911 29036 ERROR nova File "/var/lib/openstack/local/lib/python2.7/site-packages/oslo_service/service.py", line 392, in wait
2019-12-13 06:53:22.911 29036 ERROR nova status, signo = self._wait_for_exit_or_signal()
2019-12-13 06:53:22.911 29036 ERROR nova File "/var/lib/openstack/local/lib/python2.7/site-packages/oslo_service/service.py", line 367, in _wait_for_exit_or_signal
2019-12-13 06:53:22.911 29036 ERROR nova self.conf.log_opt_values(LOG, logging.DEBUG)
2019-12-13 06:53:22.911 29036 ERROR nova File "/var/lib/openstack/local/lib/python2.7/site-packages/oslo_config/cfg.py", line 2579, in log_opt_values
2019-12-13 06:53:22.911 29036 ERROR nova for group_name in self._groups:

2019-12-13 06:53:22.911 29036 ERROR nova RuntimeError: dictionary changed size during iteration

Looks like cfg._groups is changing while iterating through dictionary object.

After some debugging, I believe the problem is introduced by this commit https://review.opendev.org/#/c/564440/ where a new group is registered by `register_dynamic_opts`. This situation seems rare enough and I am suspecting we are caught because server's load is very high. But still, I think it's good to have a better way to lock the object.

Currently the workaround is to disable `log_options`, and I think the default value should also be False as well.

Matt Riedemann (mriedem) wrote :

Does this happen every time and is 100% reproducible or is it intermittent somehow?

Fix proposed to branch: master
Review: https://review.opendev.org/698954

Changed in oslo.config:
assignee: nobody → Stephen Finucane (stephenfinucane)
status: New → In Progress
Matt Riedemann (mriedem) on 2019-12-13
Changed in nova:
status: New → Invalid
norman shen (jshen28) wrote :

this is not 100% reproducible. But recently I have been hit by this error many times. guessing may be this has something to do with we are deploying openstack in containers?

Reviewed: https://review.opendev.org/698954
Committed: https://git.openstack.org/cgit/openstack/oslo.config/commit/?id=e3e2ba55eeeb86a9bc0624bb2592e46583e839e7
Submitter: Zuul
Branch: master

commit e3e2ba55eeeb86a9bc0624bb2592e46583e839e7
Author: Stephen Finucane <email address hidden>
Date: Fri Dec 13 16:18:33 2019 +0000

    Ensure option groups don't change during logging

    oslo.config allows us to configure groups and options dynamically. This
    can cause a race with our logging as we attempt to iterate through
    option groups that are changing under our feet. This wouldn't be a huge
    issue, since these are just logs are we can always log again, if needed,
    but we store groups in a dictionary and Python doesn't like us changing
    the size of a dict it's iterating through:

      RuntimeError: dictionary changed size during iteration

    Given that we're only reading through this option group and don't need
    to worry about a group _disappearing_, the solution is pretty simple:
    create a copy of our option group names ahead of time so we don't need
    to worry about new ones coming and messing things up.

    No tests are included since this is a race and the only way I see to
    reproduce this would involve lots of ugly threading.

    Change-Id: Id3b28465d645a24f0fcebff2dd68a9bd30e21594
    Signed-off-by: Stephen Finucane <email address hidden>
    Closes-Bug: #1856312

Changed in oslo.config:
status: In Progress → Fix Released

Change abandoned by Stephen Finucane (<email address hidden>) on branch: stable/ocata
Review: https://review.opendev.org/699527
Reason: Agreed

Change abandoned by Stephen Finucane (<email address hidden>) on branch: stable/pike
Review: https://review.opendev.org/699526

Change abandoned by Stephen Finucane (<email address hidden>) on branch: stable/queens
Review: https://review.opendev.org/699525

Reviewed: https://review.opendev.org/699522
Committed: https://git.openstack.org/cgit/openstack/oslo.config/commit/?id=22c286c63b15b6f0acf6f696b98226a9257bf745
Submitter: Zuul
Branch: stable/train

commit 22c286c63b15b6f0acf6f696b98226a9257bf745
Author: Stephen Finucane <email address hidden>
Date: Fri Dec 13 16:18:33 2019 +0000

    Ensure option groups don't change during logging

    oslo.config allows us to configure groups and options dynamically. This
    can cause a race with our logging as we attempt to iterate through
    option groups that are changing under our feet. This wouldn't be a huge
    issue, since these are just logs are we can always log again, if needed,
    but we store groups in a dictionary and Python doesn't like us changing
    the size of a dict it's iterating through:

      RuntimeError: dictionary changed size during iteration

    Given that we're only reading through this option group and don't need
    to worry about a group _disappearing_, the solution is pretty simple:
    create a copy of our option group names ahead of time so we don't need
    to worry about new ones coming and messing things up.

    No tests are included since this is a race and the only way I see to
    reproduce this would involve lots of ugly threading.

    Change-Id: Id3b28465d645a24f0fcebff2dd68a9bd30e21594
    Signed-off-by: Stephen Finucane <email address hidden>
    Closes-Bug: #1856312
    (cherry picked from commit e3e2ba55eeeb86a9bc0624bb2592e46583e839e7)

tags: added: in-stable-train

Reviewed: https://review.opendev.org/699523
Committed: https://git.openstack.org/cgit/openstack/oslo.config/commit/?id=0a9c6c156a6eedf149528fd853f72a8ddd83e147
Submitter: Zuul
Branch: stable/stein

commit 0a9c6c156a6eedf149528fd853f72a8ddd83e147
Author: Stephen Finucane <email address hidden>
Date: Fri Dec 13 16:18:33 2019 +0000

    Ensure option groups don't change during logging

    oslo.config allows us to configure groups and options dynamically. This
    can cause a race with our logging as we attempt to iterate through
    option groups that are changing under our feet. This wouldn't be a huge
    issue, since these are just logs are we can always log again, if needed,
    but we store groups in a dictionary and Python doesn't like us changing
    the size of a dict it's iterating through:

      RuntimeError: dictionary changed size during iteration

    Given that we're only reading through this option group and don't need
    to worry about a group _disappearing_, the solution is pretty simple:
    create a copy of our option group names ahead of time so we don't need
    to worry about new ones coming and messing things up.

    No tests are included since this is a race and the only way I see to
    reproduce this would involve lots of ugly threading.

    Change-Id: Id3b28465d645a24f0fcebff2dd68a9bd30e21594
    Signed-off-by: Stephen Finucane <email address hidden>
    Closes-Bug: #1856312
    (cherry picked from commit e3e2ba55eeeb86a9bc0624bb2592e46583e839e7)
    (cherry picked from commit 22c286c63b15b6f0acf6f696b98226a9257bf745)

tags: added: in-stable-stein

This issue was fixed in the openstack/oslo.config 7.0.0 release.

Reviewed: https://review.opendev.org/699524
Committed: https://git.openstack.org/cgit/openstack/oslo.config/commit/?id=c1394f6e2d3e55a8d4e4332915d26fbfd5f4ee9e
Submitter: Zuul
Branch: stable/rocky

commit c1394f6e2d3e55a8d4e4332915d26fbfd5f4ee9e
Author: Stephen Finucane <email address hidden>
Date: Fri Dec 13 16:18:33 2019 +0000

    Ensure option groups don't change during logging

    oslo.config allows us to configure groups and options dynamically. This
    can cause a race with our logging as we attempt to iterate through
    option groups that are changing under our feet. This wouldn't be a huge
    issue, since these are just logs are we can always log again, if needed,
    but we store groups in a dictionary and Python doesn't like us changing
    the size of a dict it's iterating through:

      RuntimeError: dictionary changed size during iteration

    Given that we're only reading through this option group and don't need
    to worry about a group _disappearing_, the solution is pretty simple:
    create a copy of our option group names ahead of time so we don't need
    to worry about new ones coming and messing things up.

    No tests are included since this is a race and the only way I see to
    reproduce this would involve lots of ugly threading.

    Change-Id: Id3b28465d645a24f0fcebff2dd68a9bd30e21594
    Signed-off-by: Stephen Finucane <email address hidden>
    Closes-Bug: #1856312
    (cherry picked from commit e3e2ba55eeeb86a9bc0624bb2592e46583e839e7)
    (cherry picked from commit 22c286c63b15b6f0acf6f696b98226a9257bf745)
    (cherry picked from commit 0a9c6c156a6eedf149528fd853f72a8ddd83e147)

tags: added: in-stable-rocky

This issue was fixed in the openstack/oslo.config 6.11.2 release.

This issue was fixed in the openstack/oslo.config 6.8.2 release.

This issue was fixed in the openstack/oslo.config 6.4.2 release.

Mark Goddard (mgoddard) wrote :

I raised a very similar issue in nova-scheduler on the mailing list today: http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014312.html

Only affects one specific dimension of the kolla-ansible support matrix - CentOS 8, binary (RDO) on Train. The code path in oslo.config is a little different to this one.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers