2014-02-05 10:24:51 |
Bogdan Dobrelya |
bug |
|
|
added bug |
2014-02-05 10:42:02 |
Bogdan Dobrelya |
description |
We need to introduce periodic restarts for OpenStack services, MySQL, Corosync, and maybe some other services as well.
(too bad OpenStack services and Corosync do not understand 'kill -HUP' and die instead of being HUPed :-) )
Please check the output at http://pastebin.com/gUnjC0WD for more details.
Suggested solution:
make '/etc/logrotate.d/10-fuel.conf' (the daily logrotate job) periodically restart the services mentioned above, in order to release their open file descriptors and free the disk space they hold.
Note:
1) Restarting will cause a very short service outage.
2) Gracefully restarting Corosync might be a *big* problem: it might hang for a long time (up to half an hour, or even forever), but killing it with -9 would cause stonith-ng to consider the node failed and fence it immediately (we do not have fencing yet, but we will in the near future)... |
We need to introduce periodic restarts for OpenStack services, MySQL, Corosync, and maybe some other services as well.
(too bad OpenStack services and Corosync do not understand 'kill -HUP' and die instead of being HUPed :-) )
Please check the output at http://pastebin.com/gUnjC0WD for more details.
Suggested solution:
make '/etc/logrotate.d/10-fuel.conf' (the daily logrotate job) periodically restart the services mentioned above, in order to release their open file descriptors and free the disk space they hold.
Note:
1) Restarting will cause a very short service outage.
2) Gracefully restarting Corosync might be a *big* problem: it might hang for a long time (up to half an hour, or even forever), but killing it with -9 would cause stonith-ng to consider the node failed and fence it immediately (we do not have fencing yet, but we will in the near future)...
Note: to estimate the disk space (in bytes) held by deleted files on 'node-foo', you can use:
ssh node-foo "lsof +L1 -s | awk '\$7 ~ /^[0-9]+\$/ {print \$7}' | paste -sd+ - | bc" |
|
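The estimate in the note above can also be sketched as a small shell function, which makes the filtering step easier to check against sample input. This is only an illustration: `sum_deleted_bytes` is a hypothetical helper name, and it assumes the SIZE value is the seventh whitespace-separated field of `lsof +L1 -s` output, as in the one-liner from the description.

```shell
# Hypothetical helper (not part of Fuel): sum the SIZE column of
# `lsof +L1 -s` output read from stdin.
# Assumption: SIZE is field 7; the header row and any non-numeric
# values in that field are skipped.
sum_deleted_bytes() {
    # printf with %.0f avoids exponent notation for large totals in some awks
    awk '$7 ~ /^[0-9]+$/ { sum += $7 } END { printf "%.0f\n", sum }'
}

# Usage on a node (requires lsof):
#   lsof +L1 -s | sum_deleted_bytes
```

Reading from stdin keeps the function independent of lsof itself, so it can be tried on captured output before being run over ssh.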
2014-02-05 13:56:27 |
Dmitry Pyzhov |
fuel: importance |
Undecided |
Medium |
|
2014-02-05 13:56:30 |
Dmitry Pyzhov |
fuel: milestone |
|
5.0 |
|
2014-04-24 09:55:48 |
Vladimir Kuklin |
fuel: importance |
Medium |
High |
|
2014-04-24 09:55:59 |
Vladimir Kuklin |
fuel: assignee |
|
Fuel OSCI Team (fuel-osci) |
|
2014-04-25 06:56:25 |
Bogdan Dobrelya |
fuel: importance |
High |
Medium |
|
2014-04-25 06:56:33 |
Bogdan Dobrelya |
fuel: milestone |
5.0 |
5.1 |
|
2014-06-19 13:26:01 |
Bogdan Dobrelya |
fuel: assignee |
Fuel OSCI Team (fuel-osci) |
Fuel Linux Hardening Team (fuel-linux) |
|
2014-06-19 13:27:58 |
Bogdan Dobrelya |
description |
We need to introduce periodic restarts for OpenStack services, MySQL, Corosync, and maybe some other services as well.
(too bad OpenStack services and Corosync do not understand 'kill -HUP' and die instead of being HUPed :-) )
Please check the output at http://pastebin.com/gUnjC0WD for more details.
Suggested solution:
make '/etc/logrotate.d/10-fuel.conf' (the daily logrotate job) periodically restart the services mentioned above, in order to release their open file descriptors and free the disk space they hold.
Note:
1) Restarting will cause a very short service outage.
2) Gracefully restarting Corosync might be a *big* problem: it might hang for a long time (up to half an hour, or even forever), but killing it with -9 would cause stonith-ng to consider the node failed and fence it immediately (we do not have fencing yet, but we will in the near future)...
Note: to estimate the disk space (in bytes) held by deleted files on 'node-foo', you can use:
ssh node-foo "lsof +L1 -s | awk '\$7 ~ /^[0-9]+\$/ {print \$7}' | paste -sd+ - | bc" |
We need to introduce periodic restarts for OpenStack services, MySQL, Corosync, and maybe some other services as well.
(too bad OpenStack services and Corosync do not understand 'kill -HUP' and die instead of being HUPed :-) )
Please check the output at http://pastebin.com/gUnjC0WD for more details.
Solution:
Introduce SIGHUP support for all OpenStack services that lack it (https://bugs.launchpad.net/oslo/+bug/1276694)
Suggested workaround (kludge):
make '/etc/logrotate.d/10-fuel.conf' (the daily logrotate job) periodically restart the services mentioned above, in order to release their open file descriptors and free the disk space they hold.
Note:
1) Restarting will cause a very short service outage.
2) Gracefully restarting Corosync might be a *big* problem: it might hang for a long time (up to half an hour, or even forever), but killing it with -9 would cause stonith-ng to consider the node failed and fence it immediately (we do not have fencing yet, but we will in the near future)...
Note: to estimate the disk space (in bytes) held by deleted files on 'node-foo', you can use:
ssh node-foo "lsof +L1 -s | awk '\$7 ~ /^[0-9]+\$/ {sum += \$7} END {print sum + 0}'" |
|
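As a rough illustration of the workaround described above, a logrotate postrotate script could call a helper like the one below. This is a sketch under stated assumptions, not the actual contents of '/etc/logrotate.d/10-fuel.conf': `restart_services` and the `DRY_RUN` switch are hypothetical names, the service names are placeholders, and it assumes the SysV-style `service NAME restart` command is available on the node. Corosync is deliberately not shown in the usage, given the hang and fencing risk noted in the description.

```shell
# Hypothetical postrotate helper: restart each listed service so that it
# reopens its log files and releases descriptors of deleted files.
# With DRY_RUN=1 it only prints the commands it would run.
restart_services() {
    for svc in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "service $svc restart"
        else
            # Report a failed restart but continue with the remaining services
            service "$svc" restart || echo "restart of $svc failed" >&2
        fi
    done
}

# Example (service names are placeholders):
#   DRY_RUN=1 restart_services nova-api mysql
```

The dry-run mode is there so the list of affected services can be reviewed before accepting the short outage that each restart causes.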
2014-06-19 13:28:09 |
Bogdan Dobrelya |
tags |
library |
oslo |
|
2014-06-19 13:28:29 |
Bogdan Dobrelya |
fuel: assignee |
Fuel Linux Hardening Team (fuel-linux) |
Fuel Hardening Team (fuel-hardening) |
|
2014-06-24 13:55:09 |
Ilya Shakhat |
fuel: assignee |
Fuel Hardening Team (fuel-hardening) |
Fuel for Openstack (fuel) |
|
2014-07-04 07:01:02 |
Bogdan Dobrelya |
tags |
oslo |
oslo to-be-covered-by-tests |
|
2014-07-14 17:17:02 |
Dmitry Pyzhov |
fuel: assignee |
Fuel for Openstack (fuel) |
Fuel Library Team (fuel-library) |
|
2014-07-15 08:20:54 |
Bogdan Dobrelya |
fuel: status |
Triaged |
Won't Fix |
|