Activity log for bug #2025160

Date Who What changed Old value New value Message
2023-06-27 15:17:02 Ponnuvel Palaniyappan bug added bug
2023-06-27 15:17:40 Ponnuvel Palaniyappan description
Old value:
Juju controllers use a lot of socket connections that they frequently hit the open files limit 64000.
```
2023-06-07 11:35:41 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 320ms
2023-06-07 11:35:41 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 640ms
2023-06-07 11:35:42 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:43 ERROR juju.worker.dependency engine.go:695 "instance-poller" manifold worker returned unexpected error: unexpected: Get "http://10.134.171.3/MAAS/api/2.0/machines/?agent_name=fb611b32-8f03-4ab6-8e68-398a7f50ca86&id=we7mgc&id=amd4rr&id=4tgxyw&id=mar6yg&id=k4h8pt&id=tfsgh6&id=pwnn8y&id=4gt4ka": dial tcp 10.134.171.3:80: socket: too many open files
2023-06-07 11:35:43 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:44 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:45 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:46 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:47 ERROR juju.apiserver.metricsender metricsender.go:114 Post "https://api.jujucharms.com/omnibus/v3/metrics": dial tcp: lookup api.jujucharms.com on 127.0.0.53:53: dial udp 127.0.0.53:53: socket: too many open files
github.com/juju/juju/apiserver/facades/agent/metricsender.(*HTTPSender).Send:31:
2023-06-07 11:35:47 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:48 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:48 WARNING juju.apiserver.metricsmanager metricsmanager.go:242 failed to send metrics for model-b2b523b4-18d9-4b62-8615-d2f3d96e3e51: Post "https://api.jujucharms.com/omnibus/v3/metrics": dial tcp: lookup api.jujucharms.com on 127.0.0.53:53: dial udp 127.0.0.53:53: socket: too many open files
2023-06-07 11:35:49 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
```
At the time this was observed the controllers were at 2.9.42. Since then they've been upgraded to 2.9.43. I doubt this matters - just noting it.
New value:
Juju controllers use a lot of socket connections that they frequently hit the open files limit 64000.
```
2023-06-07 11:35:41 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 320ms
2023-06-07 11:35:41 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 640ms
2023-06-07 11:35:42 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:43 ERROR juju.worker.dependency engine.go:695 "instance-poller" manifold worker returned unexpected error: unexpected: Get "http://10.134.171.3/MAAS/api/2.0/machines/?agent_name=fb611b32-8f03-4ab6-8e68-398a7f50ca86&id=we7mgc&id=amd4rr&id=4tgxyw&id=mar6yg&id=k4h8pt&id=tfsgh6&id=pwnn8y&id=4gt4ka": dial tcp 10.134.171.3:80: socket: too many open files
2023-06-07 11:35:43 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:44 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:45 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:46 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:47 ERROR juju.apiserver.metricsender metricsender.go:114 Post "https://api.jujucharms.com/omnibus/v3/metrics": dial tcp: lookup api.jujucharms.com on 127.0.0.53:53: dial udp 127.0.0.53:53: socket: too many open files
github.com/juju/juju/apiserver/facades/agent/metricsender.(*HTTPSender).Send:31:
2023-06-07 11:35:47 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:48 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:48 WARNING juju.apiserver.metricsmanager metricsmanager.go:242 failed to send metrics for model-b2b523b4-18d9-4b62-8615-d2f3d96e3e51: Post "https://api.jujucharms.com/omnibus/v3/metrics": dial tcp: lookup api.jujucharms.com on 127.0.0.53:53: dial udp 127.0.0.53:53: socket: too many open files
2023-06-07 11:35:49 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
```
At the time this was observed the controllers were at 2.9.42. Since then they've been upgraded to 2.9.43. I doubt this matters - just noting it.
This looks similar to https://bugs.launchpad.net/juju/+bug/1979957.
2023-06-27 15:20:08 Ponnuvel Palaniyappan description
Old value:
Juju controllers use a lot of socket connections that they frequently hit the open files limit 64000.
```
2023-06-07 11:35:41 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 320ms
2023-06-07 11:35:41 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 640ms
2023-06-07 11:35:42 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:43 ERROR juju.worker.dependency engine.go:695 "instance-poller" manifold worker returned unexpected error: unexpected: Get "http://10.134.171.3/MAAS/api/2.0/machines/?agent_name=fb611b32-8f03-4ab6-8e68-398a7f50ca86&id=we7mgc&id=amd4rr&id=4tgxyw&id=mar6yg&id=k4h8pt&id=tfsgh6&id=pwnn8y&id=4gt4ka": dial tcp 10.134.171.3:80: socket: too many open files
2023-06-07 11:35:43 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:44 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:45 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:46 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:47 ERROR juju.apiserver.metricsender metricsender.go:114 Post "https://api.jujucharms.com/omnibus/v3/metrics": dial tcp: lookup api.jujucharms.com on 127.0.0.53:53: dial udp 127.0.0.53:53: socket: too many open files
github.com/juju/juju/apiserver/facades/agent/metricsender.(*HTTPSender).Send:31:
2023-06-07 11:35:47 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:48 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:48 WARNING juju.apiserver.metricsmanager metricsmanager.go:242 failed to send metrics for model-b2b523b4-18d9-4b62-8615-d2f3d96e3e51: Post "https://api.jujucharms.com/omnibus/v3/metrics": dial tcp: lookup api.jujucharms.com on 127.0.0.53:53: dial udp 127.0.0.53:53: socket: too many open files
2023-06-07 11:35:49 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
```
At the time this was observed the controllers were at 2.9.42. Since then they've been upgraded to 2.9.43. I doubt this matters - just noting it.
This looks similar to https://bugs.launchpad.net/juju/+bug/1979957.
New value:
Juju controllers use a lot of socket connections that they frequently hit the open files limit 64000.
```
2023-06-07 11:35:41 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 320ms
2023-06-07 11:35:41 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 640ms
2023-06-07 11:35:42 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:43 ERROR juju.worker.dependency engine.go:695 "instance-poller" manifold worker returned unexpected error: unexpected: Get "http://10.134.171.3/MAAS/api/2.0/machines/?agent_name=fb611b32-8f03-4ab6-8e68-398a7f50ca86&id=we7mgc&id=amd4rr&id=4tgxyw&id=mar6yg&id=k4h8pt&id=tfsgh6&id=pwnn8y&id=4gt4ka": dial tcp 10.134.171.3:80: socket: too many open files
2023-06-07 11:35:43 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:44 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:45 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:46 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:47 ERROR juju.apiserver.metricsender metricsender.go:114 Post "https://api.jujucharms.com/omnibus/v3/metrics": dial tcp: lookup api.jujucharms.com on 127.0.0.53:53: dial udp 127.0.0.53:53: socket: too many open files
github.com/juju/juju/apiserver/facades/agent/metricsender.(*HTTPSender).Send:31:
2023-06-07 11:35:47 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:48 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
2023-06-07 11:35:48 WARNING juju.apiserver.metricsmanager metricsmanager.go:242 failed to send metrics for model-b2b523b4-18d9-4b62-8615-d2f3d96e3e51: Post "https://api.jujucharms.com/omnibus/v3/metrics": dial tcp: lookup api.jujucharms.com on 127.0.0.53:53: dial udp 127.0.0.53:53: socket: too many open files
2023-06-07 11:35:49 WARNING juju.worker.httpserver log.go:198 http: Accept error: accept tcp [::]:17070: accept4: too many open files; retrying in 1s
```
At the time this was observed the controllers were at 2.9.42. Since then they've been upgraded to 2.9.43. I doubt this matters - just noting it.
After restarting the controllers, they started gradually going up again - in about 2 days the fd usage of one of the controllers reached ~15000 and growing.
This looks similar to https://bugs.launchpad.net/juju/+bug/1979957.
2023-06-28 14:00:22 Ponnuvel Palaniyappan summary juju controllers git open file limit juju controllers hit open file limit
2023-06-29 10:38:23 Joseph Phillips juju: status New Incomplete
2023-06-29 11:35:39 Ponnuvel Palaniyappan attachment added juju_engine_report https://bugs.launchpad.net/juju/+bug/2025160/+attachment/5682833/+files/juju_engine_report
2023-07-04 08:10:37 Ponnuvel Palaniyappan attachment added machine-2-2023-06-08T16-44-14.977.log.gz https://bugs.launchpad.net/juju/+bug/2025160/+attachment/5683785/+files/machine-2-2023-06-08T16-44-14.977.log.gz
2023-09-07 04:17:15 Launchpad Janitor juju: status Incomplete Expired
2023-10-29 07:32:21 Ponnuvel Palaniyappan juju: status Expired New
2023-10-30 02:26:14 Nobuto Murata bug added subscriber Nobuto Murata
2023-11-23 10:43:46 Joseph Phillips juju: status New Incomplete
2024-01-15 09:24:14 Benjamin Allot bug added subscriber The Canonical Sysadmins
2024-03-16 04:17:08 Launchpad Janitor juju: status Incomplete Expired
2024-03-18 14:52:48 Junien F juju: status Expired New
2024-04-18 09:27:17 Joseph Phillips juju: status New Incomplete
2024-06-11 10:37:13 James Simpson juju: status Incomplete New
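The log excerpt quoted in the description shows several different workers failing with "too many open files". When triaging a full controller log such as the attached machine-2 log, it can help to tally those failures per logger. The following is only a rough sketch in Python, not part of Juju: the regular expression assumes each entry starts with date, time, level, logger name and source location, as in the excerpt, and the names `ENTRY`, `count_fd_exhaustion` and `SAMPLE` are hypothetical.

```python
import re
from collections import Counter

# Assumed entry layout, inferred from the excerpt in the bug description:
# "<date> <time> <LEVEL> <logger> <file>:<line> <message>"
ENTRY = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<logger>\S+) (?P<source>\S+:\d+) (?P<message>.*)$"
)


def count_fd_exhaustion(lines):
    """Count 'too many open files' entries per logger name.

    Lines that do not look like the start of a log entry (for example
    continuation lines of an error trace) are skipped.
    """
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line.strip())
        if match and "too many open files" in match.group("message"):
            counts[match.group("logger")] += 1
    return counts


# Three entries and one continuation line copied from the bug description.
SAMPLE = [
    "2023-06-07 11:35:41 WARNING juju.worker.httpserver log.go:198 http: Accept error: "
    "accept tcp [::]:17070: accept4: too many open files; retrying in 320ms",
    "2023-06-07 11:35:41 WARNING juju.worker.httpserver log.go:198 http: Accept error: "
    "accept tcp [::]:17070: accept4: too many open files; retrying in 640ms",
    '2023-06-07 11:35:47 ERROR juju.apiserver.metricsender metricsender.go:114 Post '
    '"https://api.jujucharms.com/omnibus/v3/metrics": dial tcp: lookup api.jujucharms.com '
    "on 127.0.0.53:53: dial udp 127.0.0.53:53: socket: too many open files",
    "github.com/juju/juju/apiserver/facades/agent/metricsender.(*HTTPSender).Send:31:",
]

counts = count_fd_exhaustion(SAMPLE)
for logger, n in counts.most_common():
    print(f"{logger}: {n}")
```

To run it against a real log, pass an open file object instead of `SAMPLE`. Which logger reports the error most often only shows who ran into the limit, not which component is leaking descriptors.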