2020-06-15 20:17:03 |
Ghada Khalil |
bug |
|
|
added bug |
2020-06-15 20:19:49 |
Ghada Khalil |
description |
Brief Description
-----------------
The StarlingX config code calls the k8s python client to perform a number of operations. The k8s python client creates a file under /tmp and continues to use this tmp file for the lifetime of the sysinv-conductor process. After 10 days, sysinv starts to fail with an error that the tmp file is no longer there. A cleanup service in StarlingX/CentOS runs daily and removes /tmp files that have not been used for 10 days.
This is a known issue with the k8s python client:
https://github.com/kubernetes-client/python/issues/765
Options for resolution:
Severity
--------
Major - sysinv/config cmds will start failing after the system is up for 10 days w/o any controller swact
Steps to Reproduce
------------------
- Leave a system up for more than 10 days
- Attempt to make a config change -- For example: updating from http to https
Expected Behavior
------------------
config cmds remain functional regardless of how long the system has been up
Actual Behavior
----------------
config
Reproducibility
---------------
Was seen on one system which was up for more than 10 days, but expected to be reproducible
System Configuration
--------------------
any
Branch/Pull Time/Commit
-----------------------
Seen with a recent stx master load, but is a day 1 issue
Last Pass
---------
Never
Timestamp/Logs
--------------
sysinv 2020-06-11 20:51:51.446 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
sysinv 2020-06-11 22:27:03.641 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
sysinv 2020-06-11 22:29:09.146 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
sysinv 2020-06-11 22:40:19.170 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
Test Activity
-------------
System soak
Workaround
----------
Restart the sysinv-conductor to recover the system:
sudo sm-restart service sysinv-conductor |
Brief Description
-----------------
The StarlingX config code calls the k8s python client to perform a number of operations. The k8s python client creates a file under /tmp and continues to use this tmp file for the lifetime of the sysinv-conductor process. After 10 days, sysinv starts to fail with an error that the tmp file is no longer there. A cleanup service in StarlingX/CentOS runs daily and removes /tmp files that have not been used for 10 days.
This is a known issue with the k8s python client:
https://github.com/kubernetes-client/python/issues/765
The best option is to keep these files in a location other than /tmp. This is required for any StarlingX process that calls the k8s python client. Keeping the files in /var/run is a good option.
Severity
--------
Major - sysinv/config cmds will start failing after the system is up for 10 days w/o any controller swact
Steps to Reproduce
------------------
- Leave a system up for more than 10 days
- Attempt to make a config change -- For example: updating from http to https
Expected Behavior
------------------
config cmds remain functional regardless of how long the system has been up
Actual Behavior
----------------
config
Reproducibility
---------------
Was seen on one system which was up for more than 10 days, but expected to be reproducible
System Configuration
--------------------
any
Branch/Pull Time/Commit
-----------------------
Seen with a recent stx master load, but is a day 1 issue
Last Pass
---------
Never
Timestamp/Logs
--------------
sysinv 2020-06-11 20:51:51.446 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
sysinv 2020-06-11 22:27:03.641 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
sysinv 2020-06-11 22:29:09.146 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
sysinv 2020-06-11 22:40:19.170 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
Test Activity
-------------
System soak
Workaround
----------
Restart the sysinv-conductor to recover the system:
sudo sm-restart service sysinv-conductor |
|
2020-06-15 20:20:23 |
Ghada Khalil |
description |
Brief Description
-----------------
The StarlingX config code calls the k8s python client to perform a number of operations. The k8s python client creates a file under /tmp and continues to use this tmp file for the lifetime of the sysinv-conductor process. After 10 days, sysinv starts to fail with an error that the tmp file is no longer there. A cleanup service in StarlingX/CentOS runs daily and removes /tmp files that have not been used for 10 days.
This is a known issue with the k8s python client:
https://github.com/kubernetes-client/python/issues/765
The best option is to keep these files in a location other than /tmp. This is required for any StarlingX process that calls the k8s python client. Keeping the files in /var/run is a good option.
Severity
--------
Major - sysinv/config cmds will start failing after the system is up for 10 days w/o any controller swact
Steps to Reproduce
------------------
- Leave a system up for more than 10 days
- Attempt to make a config change -- For example: updating from http to https
Expected Behavior
------------------
config cmds remain functional regardless of how long the system has been up
Actual Behavior
----------------
config
Reproducibility
---------------
Was seen on one system which was up for more than 10 days, but expected to be reproducible
System Configuration
--------------------
any
Branch/Pull Time/Commit
-----------------------
Seen with a recent stx master load, but is a day 1 issue
Last Pass
---------
Never
Timestamp/Logs
--------------
sysinv 2020-06-11 20:51:51.446 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
sysinv 2020-06-11 22:27:03.641 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
sysinv 2020-06-11 22:29:09.146 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
sysinv 2020-06-11 22:40:19.170 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
Test Activity
-------------
System soak
Workaround
----------
Restart the sysinv-conductor to recover the system:
sudo sm-restart service sysinv-conductor |
Brief Description
-----------------
The StarlingX config code calls the k8s python client to perform a number of operations. The k8s python client creates a file under /tmp and continues to use this tmp file for the lifetime of the sysinv-conductor process. After 10 days, sysinv starts to fail with an error that the tmp file is no longer there. A cleanup service in StarlingX/CentOS runs daily and removes /tmp files that have not been used for 10 days.
This is a known issue with the k8s python client:
https://github.com/kubernetes-client/python/issues/765
The best option is to keep these files in a location other than /tmp. This is required for any StarlingX process that calls the k8s python client. Keeping the files in /var/run is a good option.
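For illustration only, one possible way to move such files out of /tmp is to redirect Python's tempfile module before the client is loaded. This is a minimal sketch, not the fix that was merged. It assumes the client creates its file through the tempfile module's default directory (the /tmp/tmpXXXXXX name in the logs suggests this, but it is not confirmed here). The helper name use_private_tempdir and the demo directory are hypothetical.

```python
import os
import tempfile


def use_private_tempdir(path):
    """Point this process's tempfile module at `path` instead of /tmp.

    Hypothetical helper: any later tempfile.mkstemp() call that does not
    pass an explicit dir= argument will create its file under `path`.
    """
    os.makedirs(path, mode=0o700, exist_ok=True)
    tempfile.tempdir = path
    return path


if __name__ == "__main__":
    # Demo only: a throwaway directory stands in for a location such as
    # one under /var/run, which a real service would need permission to use.
    base = tempfile.mkdtemp()
    target = use_private_tempdir(os.path.join(base, "k8s-client"))

    fd, name = tempfile.mkstemp()
    os.close(fd)
    print(name)  # path now lives under `target`, not directly under /tmp
```

The redirect is process-wide, so it would also move every other temp file the process creates. Whether that is acceptable depends on the service.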
Severity
--------
Major - sysinv/config cmds will start failing after the system is up for 10 days w/o any controller swact
Steps to Reproduce
------------------
- Leave a system up for more than 10 days
- Attempt to make a config change -- For example: updating from http to https
Expected Behavior
------------------
config cmds remain functional regardless of how long the system has been up
Actual Behavior
----------------
config cmds start failing after the system is up for 10 days
Reproducibility
---------------
Was seen on one system which was up for more than 10 days, but expected to be reproducible
System Configuration
--------------------
any
Branch/Pull Time/Commit
-----------------------
Seen with a recent stx master load, but is a day 1 issue
Last Pass
---------
Never
Timestamp/Logs
--------------
sysinv 2020-06-11 20:51:51.446 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
sysinv 2020-06-11 22:27:03.641 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
sysinv 2020-06-11 22:29:09.146 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
sysinv 2020-06-11 22:40:19.170 106052 ERROR sysinv.puppet.puppet [-] failed to create secure_system config: ConfigException: File does not exists: /tmp/tmpFQ1byr
Test Activity
-------------
System soak
Workaround
----------
Restart the sysinv-conductor to recover the system:
sudo sm-restart service sysinv-conductor |
|
2020-06-15 22:32:39 |
Ghada Khalil |
tags |
|
stx.config stx.containers |
|
2020-06-15 22:32:50 |
Ghada Khalil |
bug |
|
|
added subscriber Daniel Badea |
2020-06-15 22:33:06 |
Ghada Khalil |
starlingx: assignee |
|
Andy (andy.wrs) |
|
2020-06-15 22:33:10 |
Ghada Khalil |
starlingx: importance |
Undecided |
High |
|
2020-06-16 14:41:45 |
Ghada Khalil |
starlingx: status |
New |
In Progress |
|
2020-06-17 18:00:13 |
Ghada Khalil |
tags |
stx.config stx.containers |
stx.4.0 stx.config stx.containers |
|
2020-06-18 13:02:51 |
OpenStack Infra |
starlingx: status |
In Progress |
Fix Released |
|
2020-06-18 13:02:52 |
OpenStack Infra |
bug watch added |
|
https://github.com/kubernetes-client/python/issues/765 |
|
2020-06-18 14:38:14 |
Ghada Khalil |
starlingx: status |
Fix Released |
Confirmed |
|
2020-06-18 15:34:37 |
Ghada Khalil |
starlingx: status |
Confirmed |
In Progress |
|
2020-06-18 20:30:08 |
OpenStack Infra |
starlingx: status |
In Progress |
Fix Released |
|
2020-09-25 01:09:01 |
Ghada Khalil |
starlingx: status |
Fix Released |
Triaged |
|
2020-09-25 01:09:08 |
Ghada Khalil |
bug |
|
|
added subscriber Allain Legacy |
2020-09-25 14:52:56 |
OpenStack Infra |
starlingx: status |
Triaged |
In Progress |
|
2020-09-25 15:46:59 |
OpenStack Infra |
starlingx: status |
In Progress |
Fix Released |
|
2020-10-31 15:17:13 |
Bart Wensley |
bug |
|
|
added subscriber Bart Wensley |
2021-06-16 12:26:18 |
OpenStack Infra |
tags |
stx.4.0 stx.config stx.containers |
in-f-centos8 stx.4.0 stx.config stx.containers |
|