autopkgtest fails because mailman3 takes too long to start

Bug #2004658 reported by Olivier Gayot
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
mailman3 (Ubuntu)
Fix Released
Undecided
Unassigned

Bug Description

The autopkgtest test suite for mailman3 is often failing on Ubuntu because the service takes too long to start.

What the mailman3-api test does is it restarts the mailman3 service (using a legacy systemV command) and then tries to connect to it using curl after 10 seconds.

```
service mailman3 restart
# wait for mailman3 to come back after restart
sleep 10
curl -s --user "$admin_user:$admin_pass" http://localhost:$port/3.1/system/versions
```
Sadly, the service takes longer than 10 seconds to start. Here's an excerpt (with debug logs enabled) in an amd64 VM:

Feb 03 15:17:13 2023 (6613) Master stopped
Feb 03 15:17:16 2023 (11627) Master started
Feb 03 15:17:29 2023 (11646) command runner started.
Feb 03 15:17:30 2023 (11644) archive runner started.
Feb 03 15:17:30 2023 (11653) retry runner started.
Feb 03 15:17:31 2023 (11651) pipeline runner started.
Feb 03 15:17:31 2023 (11647) in runner started.
Feb 03 15:17:31 2023 (11654) task runner started.
Feb 03 15:17:31 2023 (11656) digest runner started.
Feb 03 15:17:32 2023 (11650) out runner started.
Feb 03 15:17:32 2023 (11648) lmtp runner started.
Feb 03 15:17:32 2023 (11654) Task runner evicted 0 expired pendings
Feb 03 15:17:32 2023 (11654) Task runner deleted 0 orphaned workflows
Feb 03 15:17:32 2023 (11654) Task runner deleted 0 orphaned requests
Feb 03 15:17:32 2023 (11654) Task runner deleted 0 orphaned messages
Feb 03 15:17:32 2023 (11654) Task runner evicted expired cache entries
Feb 03 15:17:32 2023 (11655) virgin runner started.
Feb 03 15:17:32 2023 (11649) nntp runner started.
Feb 03 15:17:32 2023 (11652) rest runner started.
[2023-02-03 15:17:32 +0100] [11652] [INFO] Starting gunicorn 20.1.0
[2023-02-03 15:17:32 +0100] [11652] [INFO] Listening at: http://127.0.0.1:8001 (11652)
[2023-02-03 15:17:32 +0100] [11652] [INFO] Using worker: sync
[2023-02-03 15:17:32 +0100] [11811] [INFO] Booting worker with pid: 11811
[2023-02-03 15:17:32 +0100] [11816] [INFO] Booting worker with pid: 11816

^ in the example above, the webserver was started 19 seconds after the master process was started.

It could just be a performance issue of the test-bed. Although I am surprised to see that so far Debian is having better success with their runs.

Since the "sleep 10" is a pretty arbitrary value, I suggest with bump it up to 30 seconds and see if it helps.

Olivier Gayot (ogayot)
Changed in mailman3 (Ubuntu):
assignee: nobody → Olivier Gayot (ogayot)
Revision history for this message
Olivier Gayot (ogayot) wrote :

Running the test locally with the default memory size (1GiB) of autopkgtest-virt-qemu brings the test-bed down ; and it makes it feel as if mailman takes forever to start.

I looked and noticed that mailman3 consumes > 750 MiB at startup.

Testing again locally with 1592 MiB of memory (which is theoretically the amount of memory of a typical test-bed in the infrastructure) feels much better.

If increasing the "sleep" value does not help on the long run, I suggest we add mailman3 to big_packages.

Revision history for this message
Olivier Gayot (ogayot) wrote :

Attaching debdiff which gives more time to mailman3 to start.

If the patch does not help on the long run, we could add mailman3 to the list of big_packages.

Successful autopkgtest run on PPA:
https://autopkgtest.ubuntu.com/results/autopkgtest-lunar-ogayot-lunar-proposed/lunar/amd64/m/mailman3/20230203_172856_d71c9@/log.gz

Changed in mailman3 (Ubuntu):
assignee: Olivier Gayot (ogayot) → nobody
Revision history for this message
Simon Quigley (tsimonq2) wrote :

Hi Olivier, a couple of thoughts on this one.

For such a small change, usually I would not be sure it's worth keeping this as an Ubuntu delta. That being said, I familiarized myself with bug 1960547 and the RFH bug and won't consider it a blocker for this upload.

Please follow up with the Release Team if the autopkgtests fail as a result of this upload.

Thanks!

Changed in mailman3 (Ubuntu):
status: New → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mailman3 - 3.3.8-1ubuntu1

---------------
mailman3 (3.3.8-1ubuntu1) lunar; urgency=medium

  * Give mailman3 more time to start in the mailman3-api test. (LP: #2004658).
  * If the mailman3-api test fails because the service has not started fast
    enough, make sure the reason is visible in the output.

 -- Olivier Gayot <email address hidden> Fri, 03 Feb 2023 13:46:38 +0100

Changed in mailman3 (Ubuntu):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.