autopkgtest fails because mailman3 takes too long to start
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
mailman3 (Ubuntu) | Fix Released | Undecided | Unassigned | | |
Bug Description
The autopkgtest test suite for mailman3 often fails on Ubuntu because the service takes too long to start.
The mailman3-api test restarts the mailman3 service (using a legacy SysV command), waits 10 seconds, and then tries to connect to it using curl.
```
service mailman3 restart
# wait for mailman3 to come back after restart
sleep 10
curl -s --user "$admin_
```
Sadly, the service takes longer than 10 seconds to start. Here's an excerpt (with debug logs enabled) from an amd64 VM:
```
Feb 03 15:17:13 2023 (6613) Master stopped
Feb 03 15:17:16 2023 (11627) Master started
Feb 03 15:17:29 2023 (11646) command runner started.
Feb 03 15:17:30 2023 (11644) archive runner started.
Feb 03 15:17:30 2023 (11653) retry runner started.
Feb 03 15:17:31 2023 (11651) pipeline runner started.
Feb 03 15:17:31 2023 (11647) in runner started.
Feb 03 15:17:31 2023 (11654) task runner started.
Feb 03 15:17:31 2023 (11656) digest runner started.
Feb 03 15:17:32 2023 (11650) out runner started.
Feb 03 15:17:32 2023 (11648) lmtp runner started.
Feb 03 15:17:32 2023 (11654) Task runner evicted 0 expired pendings
Feb 03 15:17:32 2023 (11654) Task runner deleted 0 orphaned workflows
Feb 03 15:17:32 2023 (11654) Task runner deleted 0 orphaned requests
Feb 03 15:17:32 2023 (11654) Task runner deleted 0 orphaned messages
Feb 03 15:17:32 2023 (11654) Task runner evicted expired cache entries
Feb 03 15:17:32 2023 (11655) virgin runner started.
Feb 03 15:17:32 2023 (11649) nntp runner started.
Feb 03 15:17:32 2023 (11652) rest runner started.
[2023-02-03 15:17:32 +0100] [11652] [INFO] Starting gunicorn 20.1.0
[2023-02-03 15:17:32 +0100] [11652] [INFO] Listening at: http://
[2023-02-03 15:17:32 +0100] [11652] [INFO] Using worker: sync
[2023-02-03 15:17:32 +0100] [11811] [INFO] Booting worker with pid: 11811
[2023-02-03 15:17:32 +0100] [11816] [INFO] Booting worker with pid: 11816
```
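In the example above, the web server started 16 seconds after the new master process started, which is 19 seconds after the previous master process stopped. Either way, it was not yet listening when the 10-second sleep expired.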
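For reference, the delays can be recomputed from the log timestamps. This is only a small sketch; the helper names `to_seconds` and `elapsed_seconds` are made up for illustration, and it assumes both timestamps fall on the same day.

```shell
# Convert an HH:MM:SS timestamp to seconds since midnight.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Print the number of seconds between two same-day timestamps.
elapsed_seconds() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

elapsed_seconds 15:17:16 15:17:32   # "Master started" to "Starting gunicorn"
elapsed_seconds 15:17:13 15:17:32   # "Master stopped" to "Starting gunicorn"
```

With the timestamps from the excerpt, the two calls print 16 and 19 respectively.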
It could just be a performance issue of the test-bed, although I am surprised to see that Debian has had better success with its runs so far.
Since the "sleep 10" is a pretty arbitrary value, I suggest we bump it up to 30 seconds and see if it helps.
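As an alternative to a longer fixed sleep, the test could poll until the service answers. The following is only a sketch of that idea, not what the test currently does: `wait_until` is a hypothetical helper, and `api_url` is a placeholder rather than the URL used by the real test.

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or until the timeout (in seconds) has elapsed.
wait_until() {
    timeout_s=$1
    shift
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            return 1    # gave up: the command never succeeded in time
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Possible use in the test (api_url is a placeholder, not the real URL):
#   service mailman3 restart
#   wait_until 30 curl -s -o /dev/null "$api_url"
```

The upper bound is still 30 seconds, but a fast test-bed would not have to wait the full duration.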
Changed in mailman3 (Ubuntu):
assignee: nobody → Olivier Gayot (ogayot)
Running the test locally with the default memory size (1GiB) of autopkgtest-virt-qemu brings the test-bed down, which makes it look as if mailman takes forever to start.
I looked and noticed that mailman3 consumes more than 750 MiB of memory at startup.
Testing again locally with 1592 MiB of memory (which is theoretically the amount of memory of a typical test-bed in the infrastructure) feels much better.
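For anyone who wants to reproduce this, a local run with more memory could look roughly like the sketch below. It relies on the `--ram-size` option of autopkgtest-virt-qemu (value in MiB); the image path is a placeholder and has to be replaced with a locally built test-bed image.

```shell
# Placeholder values: adjust the image path to a locally built image.
image=/path/to/autopkgtest-image.img
ram_mib=1592    # memory size used in the local run described above

# Build and print the command so it can be reviewed before running it.
cmd="autopkgtest mailman3 -- qemu --ram-size=$ram_mib $image"
echo "$cmd"
```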
If increasing the "sleep" value does not help in the long run, I suggest we add mailman3 to big_packages.