python3-msgpack package broken due to outdated cython

Bug #1937261 reported by Christian Rohmann
Affects               | Status       | Importance | Assigned to | Milestone
Ubuntu Cloud Archive  | Invalid      | Undecided  | Unassigned  |
  Ussuri              | Fix Released | Medium     | Unassigned  |
neutron               | New          | Undecided  | Unassigned  |

Bug Description

After a successful upgrade of the control plane from Train -> Ussuri on Ubuntu Bionic, we upgraded a first compute / network node and immediately ran into issues with Neutron:

We noticed that Neutron is extremely slow in setting up and wiring the network ports, so slow that it never finishes and throws all sorts of errors (RabbitMQ connection timeouts, full sync required, ...)

We were now able to reproduce the error on our Ussuri DEV cloud as well:

1) First we used strace -ffff -p $PID_OF_NEUTRON_LINUXBRIDGE_AGENT and noticed that the data exchange on the unix socket between the rootwrap-daemon and the main process is extremely slow.
One could actually follow the read calls on the socket's fd line by line as they happened.

2) We then (after adding lots of log lines and other intensive manual debugging) used py-spy (https://github.com/benfred/py-spy) via "py-spy top --pid $PID" on the running neutron-linuxbridge-agent process and noticed all the CPU time (process was at 100% most of the time) was spent in msgpack/fallback.py

3) Since the issue was not observed in TRAIN we compared the msgpack version used and noticed that TRAIN was using version 0.5.6 while Ussuri upgraded this dependency to 0.6.2.

4) We then downgraded to version 0.5.6 of msgpack (ignoring the actual dependencies)

--- cut ---
apt policy python3-msgpack
python3-msgpack:
  Installed: 0.6.2-1~cloud0
  Candidate: 0.6.2-1~cloud0
  Version table:
 *** 0.6.2-1~cloud0 500
        500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-updates/ussuri/main amd64 Packages
     0.5.6-1 500
        500 http://de.archive.ubuntu.com/ubuntu bionic/main amd64 Packages
        100 /var/lib/dpkg/status
--- cut ---

vs.

--- cut ---
apt policy python3-msgpack
python3-msgpack:
  Installed: 0.5.6-1
  Candidate: 0.6.2-1~cloud0
  Version table:
     0.6.2-1~cloud0 500
        500 http://ubuntu-cloud.archive.canonical.com/ubuntu bionic-updates/ussuri/main amd64 Packages
 *** 0.5.6-1 500
        500 http://de.archive.ubuntu.com/ubuntu bionic/main amd64 Packages
        100 /var/lib/dpkg/status
--- cut ---

Et voilà: The Neutron-Linuxbridge-Agent worked just like before (building one port every few seconds) and all network ports eventually converged to ACTIVE.
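
To put a number on the difference seen in steps 2) to 4), the default packer can be timed against the pure-Python one. The following is only a rough sketch, not something that was run as part of this report: the helper names time_pack and compare_backends are made up for illustration, and it assumes msgpack is importable on the node.

```python
import time


def time_pack(pack, payload, rounds=5):
    """Return the best wall-clock time in seconds over `rounds` calls of pack(payload)."""
    best = float("inf")
    for _ in range(rounds):
        start = time.perf_counter()
        pack(payload)
        best = min(best, time.perf_counter() - start)
    return best


def compare_backends(payload, rounds=5):
    """Time msgpack's default packb against the pure-Python fallback Packer."""
    # Imported lazily so time_pack() stays usable without msgpack installed.
    import msgpack
    from msgpack import fallback

    return {
        # Whatever msgpack picked at import time (native extension if present).
        "default": time_pack(msgpack.packb, payload, rounds),
        # Always the pure-Python implementation.
        "fallback": time_pack(lambda obj: fallback.Packer().pack(obj), payload, rounds),
    }


if __name__ == "__main__":
    # Hypothetical payload, loosely resembling a large list of port dicts.
    sample = [{"id": i, "name": "tap%d" % i, "up": True} for i in range(20000)]
    print(compare_backends(sample))
```

If both timings come out roughly equal, the "default" packer is most likely the fallback itself, which is the situation described in this report.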

I could not yet spot which commit of msgpack changes (https://github.com/msgpack/msgpack-python/compare/0.5.6...v0.6.2) might have caused this issue, but I am really certain that this is a major issue for Ussuri on Ubuntu Bionic.

There are "similar" issues with
 * https://bugs.launchpad.net/oslo.privsep/+bug/1844822
 * https://bugs.launchpad.net/oslo.privsep/+bug/1896734

both related to msgpack or the size of messages exchanged.

affects: ubuntu → neutron
summary: - linuxbridge agent broken due to msgpack upgrade 0.6.2 for Ussuri on Bionic
         + linuxbridge agent broken due to msgpack upgrade to 0.6.2 for Ussuri on Bionic
summary: - linuxbridge agent broken due to msgpack upgrade to 0.6.2 for Ussuri on Bionic
         + msgpack upgrade to 0.6.2 for Ussuri on Bionic breaks linuxbridge agent
summary: - msgpack upgrade to 0.6.2 for Ussuri on Bionic breaks linuxbridge agent
         + msgpack upgrade to 0.6.2 breaks linuxbridge agent
description: updated
affects: neutron (Ubuntu) → cloud-archive
Christian Rohmann (christian-rohmann) wrote: Re: msgpack upgrade to 0.6.2 breaks linuxbridge agent

I dug a little deeper into the issue: apparently the python3-msgpack package provided by the Ubuntu Cloud Archive does not contain the compiled cmsg extension, which is why msgpack is using its pure-Python, but slow, fallback.
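
A quick way to confirm which implementation a running interpreter actually picked up is to look at the module that defines msgpack.Packer. This is a minimal sketch; the function names backend_of and detect_msgpack_backend are hypothetical, and it relies on the pure-Python classes living in the module msgpack.fallback (the file that showed up in py-spy).

```python
def backend_of(module_name):
    """Classify a Packer/Unpacker class by the name of the module defining it."""
    # The pure-Python implementation lives in msgpack/fallback.py; anything
    # else (e.g. msgpack._packer or msgpack._cmsgpack) is a compiled extension.
    return "fallback" if module_name == "msgpack.fallback" else "native"


def detect_msgpack_backend():
    """Report which implementation the installed msgpack exposes as Packer."""
    import msgpack  # imported lazily so backend_of() works without msgpack

    return backend_of(msgpack.Packer.__module__)


if __name__ == "__main__":
    print(detect_msgpack_backend())
```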

python3-msgpack=0.5.6 from Ubuntu Bionic contains:

--- cut ---
# dpkg -L python3-msgpack
/.
/usr
/usr/lib
/usr/lib/python3
/usr/lib/python3/dist-packages
/usr/lib/python3/dist-packages/msgpack
/usr/lib/python3/dist-packages/msgpack/__init__.py
/usr/lib/python3/dist-packages/msgpack/_packer.cpython-36m-x86_64-linux-gnu.so
/usr/lib/python3/dist-packages/msgpack/_unpacker.cpython-36m-x86_64-linux-gnu.so
/usr/lib/python3/dist-packages/msgpack/_version.py
/usr/lib/python3/dist-packages/msgpack/exceptions.py
/usr/lib/python3/dist-packages/msgpack/fallback.py
/usr/lib/python3/dist-packages/msgpack-0.5.6.egg-info
/usr/lib/python3/dist-packages/msgpack-0.5.6.egg-info/PKG-INFO
/usr/lib/python3/dist-packages/msgpack-0.5.6.egg-info/dependency_links.txt
/usr/lib/python3/dist-packages/msgpack-0.5.6.egg-info/top_level.txt
/usr/share
/usr/share/doc
/usr/share/doc/python3-msgpack
/usr/share/doc/python3-msgpack/README.rst.gz
/usr/share/doc/python3-msgpack/changelog.Debian.gz
/usr/share/doc/python3-msgpack/copyright
/usr/share/python3
/usr/share/python3/dist
/usr/share/python3/dist/python3-msgpack
--- cut ---

while python3-msgpack=0.6.2 from Cloud Archive lists ...

--- cut ---
# dpkg -L python3-msgpack
/.
/usr
/usr/lib
/usr/lib/python3
/usr/lib/python3/dist-packages
/usr/lib/python3/dist-packages/msgpack
/usr/lib/python3/dist-packages/msgpack/__init__.py
/usr/lib/python3/dist-packages/msgpack/_version.py
/usr/lib/python3/dist-packages/msgpack/exceptions.py
/usr/lib/python3/dist-packages/msgpack/fallback.py
/usr/lib/python3/dist-packages/msgpack-0.6.2.egg-info
/usr/lib/python3/dist-packages/msgpack-0.6.2.egg-info/PKG-INFO
/usr/lib/python3/dist-packages/msgpack-0.6.2.egg-info/dependency_links.txt
/usr/lib/python3/dist-packages/msgpack-0.6.2.egg-info/top_level.txt
/usr/share
/usr/share/doc
/usr/share/doc/python3-msgpack
/usr/share/doc/python3-msgpack/README.rst.gz
/usr/share/doc/python3-msgpack/changelog.Debian.gz
/usr/share/doc/python3-msgpack/copyright
/usr/share/python3
/usr/share/python3/dist
/usr/share/python3/dist/python3-msgpack
--- cut ---

This is likely due to a missing cython on the build system, which is gracefully "ignored" by the msgpack setup.py, see: https://github.com/msgpack/msgpack-python/blob/38dba9634e4efa7886a777b9e7c739dc148da457/setup.py#L54

TL;DR: python3-msgpack provided for Ussuri on Ubuntu-Bionic lacks cmsg due to a potentially missing cython dependency on build system.
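
The comparison of the two file listings above can also be automated, e.g. as a packaging smoke test. Below is a small sketch under the assumption that the package installs its modules below a directory named msgpack and that compiled extensions end in .so (as in the 0.5.6 listing); the function names are made up for illustration.

```python
import subprocess


def compiled_extensions(dpkg_listing):
    """Return all shared objects inside a msgpack/ directory from `dpkg -L` output."""
    paths = (line.strip() for line in dpkg_listing.splitlines())
    return [p for p in paths if "/msgpack/" in p and p.endswith(".so")]


def installed_listing(package="python3-msgpack"):
    """Fetch the file listing of an installed package via dpkg -L."""
    # stdout=PIPE / universal_newlines keep this compatible with Python 3.6.
    result = subprocess.run(
        ["dpkg", "-L", package],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    found = compiled_extensions(installed_listing())
    print(found if found else "no compiled msgpack extension in package")
```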

Christian Rohmann (christian-rohmann) wrote:

One more update on the actual cause of the missing cmsg shared object of msgpack in the Ubuntu Cloud Archive package for Ussuri on Ubuntu Bionic ...

Even with cython available the build simply fails, but that is gracefully ignored:

--- cut ---

[...]
running build_py
creating /root/SOURCE/python-msgpack-0.6.2/.pybuild/cpython3_3.6/build/msgpack
copying msgpack/_version.py -> /root/SOURCE/python-msgpack-0.6.2/.pybuild/cpython3_3.6/build/msgpack
copying msgpack/__init__.py -> /root/SOURCE/python-msgpack-0.6.2/.pybuild/cpython3_3.6/build/msgpack
copying msgpack/fallback.py -> /root/SOURCE/python-msgpack-0.6.2/.pybuild/cpython3_3.6/build/msgpack
copying msgpack/exceptions.py -> /root/SOURCE/python-msgpack-0.6.2/.pybuild/cpython3_3.6/build/msgpack
running build_ext
cythonize: 'msgpack/_cmsgpack.pyx'

Error compiling Cython file:
------------------------------------------------------------
...
# coding: utf-8

from cpython cimport *
from cpython.bytearray cimport PyByteArray_Check, PyByteArray_CheckExact
^
------------------------------------------------------------ ...


summary: - msgpack upgrade to 0.6.2 breaks linuxbridge agent
+ python3-msgpack package broken due to outdated cython
James Page (james-page) wrote :

From buildlog of msgpack on bionic/ussuri:

cythonize: 'msgpack/_cmsgpack.pyx'

Error compiling Cython file:
------------------------------------------------------------
...
# coding: utf-8

from cpython cimport *
from cpython.bytearray cimport PyByteArray_Check, PyByteArray_CheckExact
^
------------------------------------------------------------

msgpack/_packer.pyx:4:0: 'cpython/bytearray.pxd' not found

Error compiling Cython file:
------------------------------------------------------------
...
# coding: utf-8

from cpython cimport *
from cpython.bytearray cimport PyByteArray_Check, PyByteArray_CheckExact
^
------------------------------------------------------------

msgpack/_packer.pyx:4:0: 'cpython/bytearray/PyByteArray_Check.pxd' not found

Error compiling Cython file:
------------------------------------------------------------
...
# coding: utf-8

from cpython cimport *
from cpython.bytearray cimport PyByteArray_Check, PyByteArray_CheckExact
^
------------------------------------------------------------

msgpack/_packer.pyx:4:0: 'cpython/bytearray/PyByteArray_CheckExact.pxd' not found

Error compiling Cython file:
------------------------------------------------------------
...
cdef int DEFAULT_RECURSE_LIMIT=511
cdef long long ITEM_LIMIT = (2**32)-1

cdef inline int PyBytesLike_Check(object o):
    return PyBytes_Check(o) or PyByteArray_Check(o)
                                               ^
------------------------------------------------------------

msgpack/_packer.pyx:49:48: 'PyByteArray_Check' is not a constant, variable or function identifier

Error compiling Cython file:
------------------------------------------------------------
...
cdef inline int PyBytesLike_Check(object o):
    return PyBytes_Check(o) or PyByteArray_Check(o)

cdef inline int PyBytesLike_CheckExact(object o):
    return PyBytes_CheckExact(o) or PyByteArray_CheckExact(o)
                                                         ^
------------------------------------------------------------

msgpack/_packer.pyx:53:58: 'PyByteArray_CheckExact' is not a constant, variable or function identifier

James Page (james-page) wrote :

I suspect that the cython in bionic is too old to support msgpack - we'll need to figure out how to resolve that.
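
The first error in the build log ('cpython/bytearray.pxd' not found) suggests a simple check for whether a given Cython ships that declaration file. The sketch below assumes that Cython keeps its bundled .pxd files under an Includes/cpython directory inside the Cython package; the function names are hypothetical and this was not part of the actual fix.

```python
import os


def has_bytearray_pxd(cython_dir):
    """Check whether a Cython package directory ships cpython/bytearray.pxd."""
    # Assumed layout: <Cython package>/Includes/cpython/bytearray.pxd
    return os.path.isfile(
        os.path.join(cython_dir, "Includes", "cpython", "bytearray.pxd")
    )


def installed_cython_has_bytearray_pxd():
    """Run the check against the Cython found on the current interpreter."""
    import Cython  # imported lazily; only needed on the build system

    return has_bytearray_pxd(os.path.dirname(Cython.__file__))


if __name__ == "__main__":
    print(installed_cython_has_bytearray_pxd())
```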

James Page (james-page)
Changed in cloud-archive:
status: New → Confirmed
status: Confirmed → Invalid
no longer affects: python-msgpack (Ubuntu)
no longer affects: python-oslo.privsep (Ubuntu)
James Page (james-page) wrote :

cython from focal and rebuild of python-msgpack now in bionic-ussuri/proposed (will take a little time to build and publish).

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

Is this error still affecting Neutron or oslo.privsep?

Thank you and regards.

James Page (james-page) wrote : Re: [Bug 1937261] Re: python3-msgpack package broken due to outdated cython

No - I think it's just the lack of native extension support in msgpack so
those tasks can be removed.

On Mon, Jul 26, 2021 at 3:41 PM Rodolfo Alonso <email address hidden>
wrote:

> Hello:
>
> Is this error still affecting Neutron or oslo.privsep?
>
> Thank you and regards.

Christian Rohmann (christian-rohmann) wrote:

@James Page:

1) We ran your package for a few days and things seem to be smooth now.
Thanks again for picking this up so quickly.

2) May I suggest keeping this open as an issue with oslo.privsep and having them add a (unit) test that ensures every supported platform always has the native extension for pack/unpack available, plus a warning at privsep init if it is missing. Going even further and refusing to start without an explicit override flag when the native msgpack extension is not available seems sensible to me. OpenStack projects using oslo.privsep exchange vast amounts of data (see https://bugs.launchpad.net/oslo.privsep/+bug/1896734), so using the pure-Python fallback for msgpack is simply not an option in any real-world scenario. Even without very many interfaces in Neutron, just switching to debug logs (sic) could break things as well, leaving operators without proper logs.

A condition of intermittent errors or extreme degradation is really the worst kind of issue to have, as it is very hard to debug, especially since things seem to work in the beginning and one does not know where to look because absolutely no helpful errors are thrown in the logs.
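
To illustrate the kind of startup check proposed in 2), here is a rough sketch. It is not oslo.privsep code: the exception NativeMsgpackMissing, the function check_msgpack_backend and the allow_fallback flag are all hypothetical names, and the check again assumes the pure-Python classes are defined in msgpack.fallback.

```python
import warnings


class NativeMsgpackMissing(RuntimeError):
    """Raised when only the pure-Python msgpack implementation is available."""


def check_msgpack_backend(packer_module, allow_fallback=False):
    """Return True for a native packer; warn or refuse when only the fallback exists.

    packer_module is the value of msgpack.Packer.__module__.
    allow_fallback is the explicit override flag (hypothetical).
    """
    if packer_module != "msgpack.fallback":
        return True
    message = (
        "msgpack is running on its pure-Python fallback; "
        "expect severe slowdowns when exchanging large messages"
    )
    if allow_fallback:
        # Operator explicitly accepted the slow path: warn loudly, keep going.
        warnings.warn(message, RuntimeWarning)
        return False
    raise NativeMsgpackMissing(message)


if __name__ == "__main__":
    import msgpack

    check_msgpack_backend(msgpack.Packer.__module__)
```

Failing hard by default and only warning behind an explicit flag turns the silent degradation into an obvious error at startup.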

Christian Rohmann (christian-rohmann) wrote :

I now also raised an issue with python-msgpack about their build script being too nice and gracefully ignoring all potential build errors of the extension: https://github.com/msgpack/msgpack-python/issues/481

James Page (james-page) wrote :

python-msgpack promoted to Ussuri updates pocket.

Slawek Kaplonski (slaweq) wrote :

Is this really a neutron issue? I think we can remove neutron from the list of the projects, no?

Takashi Kajinami (kajinamit) wrote :

This is a problem with python-msgpack in Ubuntu, and does not look like a bug in oslo.privsep or neutron.

no longer affects: oslo.privsep