Active ceph-mgr memory leak in Ceph Mimic
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
openstack-helm-infra | New | Undecided | Unassigned |
Bug Description
Deploy ceph-client and observe the logs of the active ceph-mgr container. How quickly the problem surfaces depends on host memory: I deploy Ceph on three 32 GB hosts, and their memory is exhausted within 3-4 days. I have reproduced this on different Kubernetes setups (kubeadm and kubespray) and see the behavior consistently, so this is a serious problem. As a temporary workaround, the active ceph-mgr pod can be killed so that Kubernetes creates a new one, as sketched below.
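The workaround can be scripted until the leak is fixed. Below is a minimal sketch using the official Kubernetes Python client; the namespace and label selector are assumptions for a typical openstack-helm ceph deployment (adjust to your cluster), and the sketch simply deletes the mgr pod(s) matching the selector rather than querying Ceph (e.g. via `ceph mgr dump`) for which instance is active.

```python
# Minimal sketch of the temporary workaround: delete the ceph-mgr pod so
# Kubernetes reschedules it before host memory is exhausted.
from kubernetes import client, config

NAMESPACE = "ceph"                                 # assumed deployment namespace
LABEL_SELECTOR = "application=ceph,component=mgr"  # assumed mgr pod labels

def restart_mgr_pods():
    config.load_kube_config()  # or config.load_incluster_config() inside a pod
    v1 = client.CoreV1Api()
    pods = v1.list_namespaced_pod(NAMESPACE, label_selector=LABEL_SELECTOR)
    for pod in pods.items:
        # Deleting the pod lets the controller create a fresh replacement,
        # releasing the leaked memory until the underlying bug is fixed.
        v1.delete_namespaced_pod(pod.metadata.name, NAMESPACE)
        print(f"deleted {pod.metadata.name}")

if __name__ == "__main__":
    restart_mgr_pods()
```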
```
2019-02-15 14:57:50.539 7f936f6ab700 0 -- xx.xx.xx.xx:6820/1 >> yy.yy.yy.yy:0/1 conn(0x3d03800 :6820 s=STATE_
2019-02-15 14:57:50.547 7f936f6ab700 0 -- xx.xx.xx.xx:6820/1 >> xx.xx.xx.xx:0/1 conn(0x3e48000 :6820 s=STATE_
2019-02-15 14:57:50.859 7f9367ced700 1 mgr send_beacon active
2019-02-15 14:57:51.539 7f936f6ab700 0 -- xx.xx.xx.xx:6820/1 >> xx.xx.xx.xx:0/1 conn(0x3e4b100 :6820 s=STATE_
2019-02-15 14:57:51.539 7f936b4f4700 0 client.0 ms_handle_reset on xx.xx.xx.xx:6820/1
2019-02-15 14:57:51.539 7f936f6ab700 0 -- xx.xx.xx.xx:6820/1 >> xx.xx.xx.xx:0/1 conn(0x3e4b800 :6820 s=STATE_
2019-02-15 14:57:51.539 7f93706ad700 0 -- xx.xx.xx.xx:6820/1 >> yy.yy.yy.yy:0/1 conn(0x3e4aa00 :6820 s=STATE_
2019-02-15 14:57:51.543 7f936f6ab700 0 -- xx.xx.xx.xx:6820/1 >> yy.yy.yy.yy:0/1 conn(0x2182700 :6820 s=STATE_
2019-02-15 14:57:52.539 7f93706ad700 0 -- xx.xx.xx.xx:6820/1 >> yy.yy.yy.yy:0/1 conn(0x2182000 :6820 s=STATE_
2019-02-15 14:57:52.539 7f936f6ab700 0 -- xx.xx.xx.xx:6820/1 >> xx.xx.xx.xx:0/1 conn(0x3ffae00 :6820 s=STATE_
2019-02-15 14:57:52.539 7f936b4f4700 0 client.0 ms_handle_reset on xx.xx.xx.xx:6820/1
2019-02-15 14:57:52.539 7f936f6ab700 0 -- xx.xx.xx.xx:6820/1 >> xx.xx.xx.xx:0/1 conn(0x3ffb500 :6820 s=STATE_
2019-02-15 14:57:52.543 7f936f6ab700 0 -- xx.xx.xx.xx:6820/1 >> yy.yy.yy.yy:0/1 conn(0x3ffbc00 :6820 s=STATE_
2019-02-15 14:57:52.863 7f9367ced700 1 mgr send_beacon active
2019-02-15 14:57:53.539 7f93706ad700 0 -- xx.xx.xx.xx:6820/1 >> yy.yy.yy.yy:0/1 conn(0x3ffc300 :6820 s=STATE_
2019-02-15 14:57:53.539 7f936f6ab700 0 -- xx.xx.xx.xx:6820/1 >> xx.xx.xx.xx:0/1 conn(0x3ffca00 :6820 s=STATE_
2019-02-15 14:57:53.539 7f936b4f4700 0 client.0 ms_handle_reset on xx.xx.xx.xx:6820/1
2019-02-15 14:57:53.539 7f936f6ab700 0 -- xx.xx.xx.xx:6820/1 >> xx.xx.xx.xx:0/1 conn(0x3ffd100 :6820 s=STATE_
2019-02-15 14:57:53.543 7f936f6ab700 0 -- xx.xx.xx.xx:6820/1 >> yy.yy.yy.yy:0/1 conn(0x3ffd800 :6820 s=STATE_
2019-02-15 14:57:54.539 7f93706ad700 0 -- xx.xx.xx.xx:6820/1 >> yy.yy.yy.yy:0/1 conn(0x3ea4000 :6820 s=STATE_
2019-02-15 14:57:54.539 7f936f6ab700 0 -- xx.xx.xx.xx:6820/1 >> xx.xx.xx.xx:0/1 conn(0x3ea4700 :6820 s=STATE_
2019-02-15 14:57:54.539 7f936b4f4700 0 client.0 ms_handle_reset on xx.xx.xx.xx:6820/1
2019-02-15 14:57:54.539 7f936f6ab700 0 -- xx.xx.xx.xx:6820/1 >> xx.xx.xx.xx:0/1 conn(0x3ea4e00 :6820 s=STATE_
```