Promoter: running more than one promotion at once is causing overload to the docker dm subsystem
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
tripleo | Won't Fix | High | Gabriele Cerami | | |
Bug Description
On the promoter server, when more than one promotion process is running at the same time, the logs show unexpected errors, for example:
failed: [localhost] (item=[u'etcd', u'f106094e961c5
failed: [localhost] (item=[
we then see
[4147491.688016] XFS (dm-1): Ending clean mount
[4147491.808620] XFS (dm-1): Unmounting Filesystem
[4147623.826142] device-mapper: thin: Deletion of thin device 38037 failed.
[4147623.837952] device-mapper: ioctl: remove_all left 1 open device(s)
[4148168.722891] device-mapper: thin: Deletion of thin device 38448 failed.
in dmesg, and
Apr 18 13:44:13 promoter-
Apr 18 13:44:14 promoter-
in the journal
We had a lot of promotions in the past two days, and sometimes two releases were promoting at the same time, each with all of its containers needing to be pulled and pushed. We suspect that these concurrent promotions are causing I/O overload on the server.
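To gauge how often the thin-device deletion failures occur, the dmesg output could be scanned with something like the following. This is only a rough sketch: the function name `find_thin_delete_failures` is made up for illustration, and the regular expression assumes the message format is exactly the one seen in the dmesg excerpt above.

```python
import re

# Matches kernel lines of the form seen in this report, e.g.
# [4147623.826142] device-mapper: thin: Deletion of thin device 38037 failed.
THIN_DELETE_FAILED = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+device-mapper: thin: "
    r"Deletion of thin device (?P<dev>\d+) failed\."
)


def find_thin_delete_failures(dmesg_text):
    """Return (timestamp_seconds, device_id) for each failed thin-device deletion."""
    failures = []
    for line in dmesg_text.splitlines():
        match = THIN_DELETE_FAILED.match(line.strip())
        if match:
            failures.append((float(match.group("ts")), int(match.group("dev"))))
    return failures


# Sample input copied from the dmesg excerpt in this report.
SAMPLE = """\
[4147491.688016] XFS (dm-1): Ending clean mount
[4147491.808620] XFS (dm-1): Unmounting Filesystem
[4147623.826142] device-mapper: thin: Deletion of thin device 38037 failed.
[4147623.837952] device-mapper: ioctl: remove_all left 1 open device(s)
[4148168.722891] device-mapper: thin: Deletion of thin device 38448 failed.
"""

if __name__ == "__main__":
    for ts, dev in find_thin_delete_failures(SAMPLE):
        print(f"{ts:.6f} thin device {dev}")
```

Comparing the timestamps of these failures with the start and end times of overlapping promotions would help confirm or rule out the I/O overload suspicion.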
Changed in tripleo: | |
assignee: | nobody → Gabriele Cerami (gcerami) |
Changed in tripleo: | |
milestone: | rocky-1 → rocky-2 |
Changed in tripleo: | |
status: | Triaged → Won't Fix |
Uploaded patch to do only a single promotion at a time:
https://review.rdoproject.org/r/13429
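For illustration, one common way to allow only a single promotion at a time is an exclusive, non-blocking file lock. The sketch below is not the content of the uploaded patch; the lock path, the `single_promotion` helper, and the `PromotionInProgress` exception are all hypothetical names, and it assumes a Linux host where `fcntl.flock` is available.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock location; a real deployment would pick its own path.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "promoter.lock")


class PromotionInProgress(Exception):
    """Raised when another promotion already holds the lock."""


@contextmanager
def single_promotion(lock_path=LOCK_PATH):
    """Hold an exclusive lock for the duration of one promotion run."""
    lock_file = open(lock_path, "w")
    try:
        try:
            # LOCK_NB makes the call fail immediately instead of waiting.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            raise PromotionInProgress(lock_path)
        yield
    finally:
        # Closing the file releases the lock if it was acquired.
        lock_file.close()


if __name__ == "__main__":
    try:
        with single_promotion():
            print("promotion running")  # placeholder for the real promotion work
    except PromotionInProgress:
        print("another promotion is already running, skipping")
```

With this approach a second promotion started while the first is still pulling and pushing containers exits early instead of adding to the load; it could also wait for the lock by dropping `LOCK_NB`.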