9.2 stops collecting logs after some time

Bug #1659210 reported by Sergey Galkin
This bug affects 1 person
Affects             Status   Importance  Assigned to      Milestone
Fuel for OpenStack  Invalid  High        Fuel Sustaining
Mitaka              Invalid  High        Fuel Sustaining
Newton              Invalid  High        Fuel Sustaining

Bug Description

Steps to reproduce:
1. Install 9.0
2. Upgrade to 9.2 from http://mirror.fuel-infra.org/mos-repos/centos/mos9.0-centos7/snapshots/proposed-2017-01-13-184421/x86_64
3. Start deploying cluster (~300 nodes in my case)

After some time, the logs in /var/log/remote stop being updated.

For example, on one of the controllers, node-2009:

[root@fuel remote]# date
Wed Jan 25 08:56:38 UTC 2017

[root@fuel remote]# tail -n1 node-2009.domain.tld/kernel.log
2017-01-24T18:25:27.175605+00:00 warning: [12855.742914] nr_pdflush_threads exported in /proc is scheduled for removal

[root@fuel remote]# ssh node-2009.domain.tld tail -n1 /var/log/kern.log
Warning: Permanently added 'node-2009.domain.tld' (ECDSA) to the list of known hosts.
<4>Jan 25 03:18:07 node-2009 kernel: [44816.762676] NOHZ: local_softirq_pending 08

None of the controllers has a puppet-apply.log in /var/log/remote.

Another example, a random compute node:
[root@fuel remote]# date
Wed Jan 25 09:02:46 UTC 2017
[root@fuel remote]# tail -n1 node-2228.domain.tld/kernel.log
2017-01-24T18:43:09.624266+00:00 info: [ 9867.217408] Process accounting resumed
[root@fuel remote]# ssh node-2228.domain.tld tail -n1 /var/log/kern.log
Warning: Permanently added 'node-2228.domain.tld' (ECDSA) to the list of known hosts.
<4>Jan 25 06:27:02 node-2228 kernel: [52101.701376] NOHZ: local_softirq_pending 08
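
The manual check shown here (comparing the current date with the timestamp on the last line of a node's log under /var/log/remote) could be scripted roughly as follows. This is only a sketch and not part of the original report: the function names line_timestamp, log_lag and stale_logs and the one-hour threshold are hypothetical, and it assumes every remote log line starts with an ISO 8601 timestamp, as in the kernel.log samples.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def line_timestamp(line):
    """Parse the leading ISO 8601 timestamp of a log line (assumed format)."""
    token = line.split(maxsplit=1)[0]
    return datetime.fromisoformat(token)


def log_lag(line, now):
    """Return how far the given log line is behind `now`."""
    return now - line_timestamp(line)


def last_line(path):
    """Return the last non-empty line of a text file, or None if there is none."""
    last = None
    with open(path, "r", errors="replace") as fh:
        for line in fh:
            if line.strip():
                last = line.rstrip("\n")
    return last


def stale_logs(root, now, max_lag, filename="kernel.log"):
    """Map node directory name -> lag for logs older than max_lag.

    `root` is expected to contain one directory per node, each holding
    `filename` (a hypothetical layout modelled on /var/log/remote).
    """
    stale = {}
    for path in sorted(Path(root).glob(f"*/{filename}")):
        line = last_line(path)
        if line is None:
            continue
        try:
            lag = log_lag(line, now)
        except (ValueError, TypeError):
            # Last line does not start with a usable timestamp; skip it.
            continue
        if lag > max_lag:
            stale[path.parent.name] = lag
    return stale


if __name__ == "__main__":
    # Hypothetical threshold: report nodes whose log is over an hour behind.
    now = datetime.now(timezone.utc)
    for node, lag in stale_logs("/var/log/remote", now, timedelta(hours=1)).items():
        print(f"{node}: last entry is {lag} old")
```

Applied to the node-2009 sample, the last kernel.log entry is about 14.5 hours behind the output of date on the master node.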

/var/log has plenty of free space:
/dev/mapper/os-varlog 281G 14G 254G 5% /var/log

rsyslogd is constantly at the top of the top output, using 90%-120% CPU:
  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
25707 root 20 0 950796 70852 31100 S 97.3 0.1 2934:18 rsyslogd

Changed in fuel:
milestone: none → 10.1
milestone: 10.1 → 9.3
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
status: New → Confirmed
importance: Undecided → High
tags: added: area-library
Vladimir Kuklin (vkuklin) wrote :

It seems that rsyslog is not accepting logs from the nodes because it is under high CPU load. We need to investigate this behaviour more deeply with some SMEs.

Dmitry Pyzhov (dpyzhov) wrote :

Marking as Incomplete because we can't do anything without a live environment.

Changed in fuel:
milestone: 9.x-updates → 11.0
status: Confirmed → Incomplete
Oleksiy Molchanov (omolchanov) wrote :

Marking as Invalid because of no activity for more than a month.

Changed in fuel:
status: Incomplete → Invalid