Comment 7 for bug 2063094

Revision history for this message
Adrien Nader (adrien) wrote :

I'm seeing very different traces. Sorry. :)

I have processes which appear stuck and which have been using more than 11 weeks of CPU time over the past 132 days. No idea if it was in only one run (I think I looked at CPU time less than 132 days ago, and even less than 11 weeks ago, so it probably hasn't been stuck for 11 weeks). This happens in three LXC containers which run debian stable. It seems I ran out of space on the containers, but definitely not 11 weeks ago, maybe two weeks ago at most.

Traces I see:

  Thread 1 (Thread 0x7f9ca69f4040 (LWP 115) "unattended-upgr"):
  #0 0x00007f9ca6a7ad26 in _xbegin () at /usr/lib/gcc/x86_64-linux-gnu/12/include/rtmintrin.h:52
  #1 __GI___lll_trylock_elision (futex=0x2b0c9a0, adapt_count=0x89) at ../sysdeps/unix/sysv/linux/x86/elision-trylock.c:46
  #2 0x0000000000a840d8 in _PyRuntime ()
  #3 0x000000000058a52a in PyArg_ParseTupleAndKeywords (args=<optimized out>, keywords=<optimized out>, format=<optimized out>, kwlist=<optimized out>) at ../Python/getargs.c:1394
  #4 0x000000000058a245 in acquire_timed (timeout=<optimized out>, lock=0x7fffcf0f55f0) at ../Modules/_threadmodule.c:98
  #5 lock_PyThread_acquire_lock (self=0x2b0c9a0, args=<optimized out>, kwds=<optimized out>) at ../Modules/_threadmodule.c:179
  #6 0x0000000000550bbe in method_vectorcall_VARARGS_KEYWORDS (func=0x7f9ca6976f70, args=0x7f9ca6cff398, nargsf=<optimized out>, kwnames=<optimized out>) at ../Objects/descrobject.c:364
  #7 0x000000000053ac2c in _PyObject_VectorcallTstate (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=0x7f9ca6976f70, tstate=0xa840d8 <_PyRuntime+166328>) at ../Include/internal/pycore_call.h:92
  #8 PyObject_Vectorcall (callable=0x7f9ca6976f70, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at ../Objects/call.c:299
  #9 0x000000000052b940 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at ../Python/ceval.c:4772
  #10 0x000000000052360b in _PyEval_EvalFrame (throwflag=0, frame=0x7f9ca6cff020, tstate=0xa840d8 <_PyRuntime+166328>) at ../Include/internal/pycore_ceval.h:73
  #11 _PyEval_Vector (args=0x0, argcount=0, kwnames=0x0, locals=<optimized out>, func=0x7f9ca69ba020, tstate=0xa840d8 <_PyRuntime+166328>) at ../Python/ceval.c:6435
  #12 PyEval_EvalCode (co=0x2959f10, globals=<optimized out>, locals=<optimized out>) at ../Python/ceval.c:1154
  #13 0x0000000000647497 in run_eval_code_obj (tstate=0xa840d8 <_PyRuntime+166328>, co=0x2959f10, globals=0x7f9ca69dec80, locals=0x7f9ca69dec80) at ../Python/pythonrun.c:1714
  #14 0x0000000000644d4f in run_mod (mod=<optimized out>, filename=<optimized out>, globals=0x7f9ca69dec80, locals=0x7f9ca69dec80, flags=<optimized out>, arena=<optimized out>) at ../Python/pythonrun.c:1735
  #15 0x0000000000651010 in pyrun_file (fp=fp@entry=0x2906830, filename=filename@entry=0x7f9ca6918f80, start=start@entry=257, globals=globals@entry=0x7f9ca69dec80, locals=locals@entry=0x7f9ca69dec80, closeit=closeit@entry=1, flags=0x7fffcf0f5a68) at ../Python/pythonrun.c:1630
  #16 0x0000000000650d5b in _PyRun_SimpleFileObject (fp=fp@entry=0x2906830, filename=filename@entry=0x7f9ca6918f80, closeit=closeit@entry=1, flags=flags@entry=0x7fffcf0f5a68) at ../Python/pythonrun.c:440
  #17 0x0000000000650b84 in _PyRun_AnyFileObject (fp=0x2906830, filename=0x7f9ca6918f80, closeit=1, flags=0x7fffcf0f5a68) at ../Python/pythonrun.c:79
  #18 0x000000000064f90f in pymain_run_file_obj (skip_source_first_line=0, filename=0x7f9ca6918f80, program_name=0x7f9ca69e02b0) at ../Modules/main.c:360
  #19 pymain_run_file (config=0xa6a120 <_PyRuntime+59904>) at ../Modules/main.c:379
  #20 pymain_run_python (exitcode=0x7fffcf0f5a64) at ../Modules/main.c:601
  #21 Py_RunMain () at ../Modules/main.c:680
  #22 0x00000000006275c7 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at ../Modules/main.c:734
  #23 0x00007f9ca6a1c1ca in ?? () from /lib/x86_64-linux-gnu/libc.so.6
  #24 0x00007fffcf0f5c80 in ?? ()
  #25 0x0000000000627530 in frame_dummy ()
  #26 0x0000000300400040 in ?? ()
  #27 0x00007fffcf0f5c98 in ?? ()
  #28 0x00007fffcf0f5c98 in ?? ()
  #29 0x108cfa692055372a in ?? ()
  #30 0x0000000000000000 in ?? ()

Another case:

  Thread 1 (Thread 0x7f9ca69f4040 (LWP 115) "unattended-upgr"):
  #0 0x00007f9ca643818d in _dbus_header_cache_check (field=5, header=0x9156278) at ../../../dbus/dbus-marshal-header.c:252
  #1 _dbus_header_get_field_basic (header=header@entry=0x9156278, field=field@entry=5, type=type@entry=117, value=value@entry=0x7fffcf0f5484) at ../../../dbus/dbus-marshal-header.c:1374
  #2 0x00007f9ca643cde1 in dbus_message_get_reply_serial (message=message@entry=0x9156270) at ../../../dbus/dbus-message.c:1202
  #3 0x00007f9ca6432034 in check_for_reply_unlocked (client_serial=3814221, connection=0x2a0f1e0) at ../../../dbus/dbus-connection.c:2287
  #4 check_for_reply_and_update_dispatch_unlocked (connection=connection@entry=0x2a0f1e0, pending=pending@entry=0x2a40c60) at ../../../dbus/dbus-connection.c:2355
  #5 0x00007f9ca6432233 in _dbus_connection_block_pending_call (pending=0x2a40c60) at ../../../dbus/dbus-connection.c:2474
  #6 0x00007f9ca644404a in dbus_pending_call_block (pending=<optimized out>) at ../../../dbus/dbus-pending-call.c:768
  #7 0x00007f9ca6432872 in dbus_connection_send_with_reply_and_block (connection=0x2a0f1e0, message=message@entry=0xb4df650, timeout_milliseconds=timeout_milliseconds@entry=2000, error=error@entry=0x7fffcf0f5600) at ../../../dbus/dbus-connection.c:3586
  #8 0x00007f9ca647ed3d in Connection_send_message_with_reply_and_block (self=0x7f9ca5ebb890, args=<optimized out>) at ../dbus_bindings/conn-methods.c:501
  #9 0x00000000005230d0 in method_vectorcall_VARARGS (func=0x7f9ca6506f70, args=0x7f9ca6cff508, nargsf=<optimized out>, kwnames=<optimized out>) at ../Objects/descrobject.c:330
  #10 0x000000000053ac2c in _PyObject_VectorcallTstate (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=0x7f9ca6506f70, tstate=0xa840d8 <_PyRuntime+166328>) at ../Include/internal/pycore_call.h:92
  #11 PyObject_Vectorcall (callable=0x7f9ca6506f70, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at ../Objects/call.c:299
  #12 0x000000000052b940 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at ../Python/ceval.c:4772
  #13 0x000000000052360b in _PyEval_EvalFrame (throwflag=0, frame=0x7f9ca6cff020, tstate=0xa840d8 <_PyRuntime+166328>) at ../Include/internal/pycore_ceval.h:73
  #14 _PyEval_Vector (args=0x0, argcount=0, kwnames=0x0, locals=<optimized out>, func=0x7f9ca69ba020, tstate=0xa840d8 <_PyRuntime+166328>) at ../Python/ceval.c:6435
  #15 PyEval_EvalCode (co=0x2959f10, globals=<optimized out>, locals=<optimized out>) at ../Python/ceval.c:1154
  #16 0x0000000000647497 in run_eval_code_obj (tstate=0xa840d8 <_PyRuntime+166328>, co=0x2959f10, globals=0x7f9ca69dec80, locals=0x7f9ca69dec80) at ../Python/pythonrun.c:1714
  #17 0x0000000000644d4f in run_mod (mod=<optimized out>, filename=<optimized out>, globals=0x7f9ca69dec80, locals=0x7f9ca69dec80, flags=<optimized out>, arena=<optimized out>) at ../Python/pythonrun.c:1735
  #18 0x0000000000651010 in pyrun_file (fp=fp@entry=0x2906830, filename=filename@entry=0x7f9ca6918f80, start=start@entry=257, globals=globals@entry=0x7f9ca69dec80, locals=locals@entry=0x7f9ca69dec80, closeit=closeit@entry=1, flags=0x7fffcf0f5a68) at ../Python/pythonrun.c:1630
  #19 0x0000000000650d5b in _PyRun_SimpleFileObject (fp=fp@entry=0x2906830, filename=filename@entry=0x7f9ca6918f80, closeit=closeit@entry=1, flags=flags@entry=0x7fffcf0f5a68) at ../Python/pythonrun.c:440
  #20 0x0000000000650b84 in _PyRun_AnyFileObject (fp=0x2906830, filename=0x7f9ca6918f80, closeit=1, flags=0x7fffcf0f5a68) at ../Python/pythonrun.c:79
  #21 0x000000000064f90f in pymain_run_file_obj (skip_source_first_line=0, filename=0x7f9ca6918f80, program_name=0x7f9ca69e02b0) at ../Modules/main.c:360
  #22 pymain_run_file (config=0xa6a120 <_PyRuntime+59904>) at ../Modules/main.c:379
  #23 pymain_run_python (exitcode=0x7fffcf0f5a64) at ../Modules/main.c:601
  #24 Py_RunMain () at ../Modules/main.c:680
  #25 0x00000000006275c7 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at ../Modules/main.c:734
  #26 0x00007f9ca6a1c1ca in ?? () from /lib/x86_64-linux-gnu/libc.so.6
  #27 0x00007fffcf0f5c80 in ?? ()
  #28 0x0000000000627530 in frame_dummy ()
  #29 0x0000000300400040 in ?? ()
  #30 0x00007fffcf0f5c98 in ?? ()
  #31 0x00007fffcf0f5c98 in ?? ()
  #32 0x108cfa692055372a in ?? ()
  #33 0x0000000000000000 in ?? ()

It seems I only see these two backtraces. I'll dive into the python side later on.

Using dbus-monitor --system, I see the following activity every three seconds:

method call time=1717064374.442063 sender=:1.1 -> destination=org.freedesktop.login1 serial=3814267 path=/org/freedesktop/login1; interface=org.freedesktop.login1.Manager; member=Inhibit
   string "shutdown"
   string "Unattended Upgrades Shutdown"
   string "Stop ongoing upgrades or perform upgrades before shutdown"
   string "delay"

Every ... I see errors in batch (they are only spaced every 30µs) for all the messages that have been sent since the last error batch:

error time=1717064399.444494 sender=org.freedesktop.DBus -> destination=:1.1 error_name=org.freedesktop.DBus.Error.TimedOut reply_serial=3814267
   string "Failed to activate service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)"

In summary: 9 messages sent 3 seconds apart and 25 seconds after the first of these, errors for all of the 9 messages.

These LXC containers have been created years ago using LXC templates and I never gave a thought to logind in their context.

The rate of dbus messages is pretty low and dbus-daemon CPU time is quite low at less than 14 minutes total.