The scenario discussed in this bug so far was in the context of neutron-openvswitch-agent, and a neutron patch resolved that aspect. However, the problem has resurfaced in environments using neutron-l3-agent with very large numbers of ports. The cause and fix are as follows:
msgpack patch [1] landed in version v1.0.0 and made the following changes:
* set Unpacker() default max_buffer_size to 1M (was 0)
* set max_buffer_size to INT_MAX i.e. 2147483647 (~2G) if max_buffer_size is set to 0
* set individual buffer sizes to max_buffer_size
Prior to this (e.g. in v0.6.2, which ships with OpenStack Ussuri in Ubuntu Focal) the max_buffer_size default was 0 and the individual buffers had varying and small defaults (see https://github.com/msgpack/msgpack-python/blob/997b524f06176aaa6bd255a046a8746e99b4f87d/msgpack/_unpacker.pyx#L368). The individual limits fell back to those small defaults if max_buffer_size was 0, and this happened *before* max_buffer_size itself was set to INT_MAX, so they never picked up the larger value.
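To make the ordering issue concrete, here is a minimal pure-Python model of the limit resolution as described above. It is a sketch, not the msgpack source: the function names limits_pre_1_0 and limits_1_0 are made up for illustration, only max_str_len is modelled, and the 1M fallback is taken from the limit reported in the error in this bug.

```python
INT_MAX = 2**31 - 1  # 2147483647

# Assumption: fallback used for max_str_len before v1.0.0, matching the
# 1048576 limit reported in the ValueError in this bug.
PRE_1_0_STR_LEN_FALLBACK = 1024 * 1024


def limits_pre_1_0(max_buffer_size=0, max_str_len=-1):
    """Hypothetical model of the pre-1.0 ordering (e.g. v0.6.2)."""
    # The individual limit is resolved first, while max_buffer_size is
    # still 0, so it falls back to the small default.
    if max_str_len == -1:
        max_str_len = max_buffer_size or PRE_1_0_STR_LEN_FALLBACK
    # Only afterwards is max_buffer_size == 0 promoted to INT_MAX.
    if not max_buffer_size:
        max_buffer_size = INT_MAX
    return max_buffer_size, max_str_len


def limits_1_0(max_buffer_size, max_str_len=-1):
    """Hypothetical model of the ordering after patch [1]."""
    # max_buffer_size == 0 is promoted first ...
    if not max_buffer_size:
        max_buffer_size = INT_MAX
    # ... and the individual limit then inherits the resolved value.
    if max_str_len == -1:
        max_str_len = max_buffer_size
    return max_buffer_size, max_str_len


if __name__ == "__main__":
    print(limits_pre_1_0())                   # caller sets nothing
    print(limits_pre_1_0(100 * 1024 * 1024))  # caller sets 100M explicitly
    print(limits_1_0(0))
```

In this model a caller that leaves max_buffer_size unset on the older code gets a 1M string limit, while a caller that passes an explicit max_buffer_size gets a string limit of the same size.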
This meant that oslo.privsep in OpenStack Ussuri, which does not set max_buffer_size, ended up with e.g. a max_str_len of 1024 * 1024 (1M). That limit is easily hit on hosts with large numbers of ports, and the result is that oslo.privsep crashes with a log like:
$ journalctl -D /var/log/journal/ --unit neutron-l3-agent --grep ValueError
-- Boot 8fde93ff89f443f58f2a1d3307dad1d3 --
Jul 22 12:16:58 ubuntu neutron-l3-agent[32426]: ValueError: 1050710 exceeds max_str_len(1048576)
Note that this error is only logged to the journal and not to neutron-l3-agent.log, since it is happening in privsep.
Patch [2] landed in oslo.privsep 2.8.0 and sets max_buffer_size to 100M. This version of privsep is available with OpenStack Zed and beyond.
So in order to fix this for OpenStack Ussuri upwards we need to look at backporting patch [2] down to python3-oslo.privsep 2.1.1 (in focal-updates). It is a very small patch so I think it should be safe to do.
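For illustration, the kind of change involved can be sketched as below. This is not the oslo.privsep patch itself: the helper name unpack_large_string and the constant name are hypothetical, and the only thing shown is passing an explicit max_buffer_size to msgpack.Unpacker. The 100M value is the one stated above, and the payload size is the one from the error in this bug. msgpack is a third-party package, so it is imported inside the function.

```python
# Value stated above for the fix: 100M.
MAX_BUFFER_SIZE = 100 * 1024 * 1024

# Size of the string that triggered the ValueError in this bug.
FAILING_PAYLOAD_LEN = 1050710


def unpack_large_string(payload_len=FAILING_PAYLOAD_LEN):
    """Hypothetical helper: round-trip a large string through msgpack.

    Requires the third-party msgpack package to be installed.
    """
    import msgpack

    # Pass an explicit buffer limit instead of relying on the default,
    # so the per-type limits are derived from it.
    unpacker = msgpack.Unpacker(max_buffer_size=MAX_BUFFER_SIZE)
    unpacker.feed(msgpack.packb("x" * payload_len))
    # Depending on the msgpack version this is str or bytes; the length
    # is the same either way for an ASCII payload.
    return len(next(unpacker))


if __name__ == "__main__":
    print(unpack_large_string())
```

Creating the Unpacker without max_buffer_size on an affected msgpack version would be expected to reproduce the ValueError above for the same payload.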
[1] https://github.com/msgpack/msgpack-python/commit/c356035a576c38db5ca232ede07b291087f1b8b2
[2] https://github.com/openstack/oslo.privsep/commit/c223dbced7d5a8d1920fe764cbce42cf844538e1