Distributed Cloud - dcorch.logs only capturing four days history with 10 subclouds

Bug #1857069 reported by Gerry Kopec
This bug affects 1 person
Affects    Status        Importance  Assigned to        Milestone
StarlingX  Fix Released  Medium      Jessica Castelino

Bug Description

Brief Description
-----------------
When testing a distributed cloud system with 10 subclouds, I noticed that logs under /var/log/dcorch were being rotated out after 4+ days. Because the volume of logs scales with the number of subclouds, either the syslog size/rotate parameters have to be increased or the content of the logs needs to be reduced if more subclouds are added.
There seem to be a lot of debug-like info logs.

Severity
--------
Minor

Steps to Reproduce
------------------
Set up DC system with 10 subclouds. Observe /var/log/dcorch

Expected Behavior
------------------
Not sure if there's a specific requirement, but we should not be filling logs unnecessarily.

Actual Behavior
----------------
dcorch logs are rotated out after 4-5 days

Reproducibility
---------------
Reproducible

System Configuration
--------------------
All-in-one duplex plus worker, DC system controller with 10 subclouds

Branch/Pull Time/Commit
-----------------------
2019-12-09_20-00-00

Last Pass
---------
n/a

Timestamp/Logs
--------------
-rw-r--r-- 1 root root 712681 Dec 12 04:09 dcorch.log.19.gz
-rw-r--r-- 1 root root 650633 Dec 12 10:17 dcorch.log.18.gz
-rw-r--r-- 1 root root 654417 Dec 12 16:27 dcorch.log.17.gz
-rw-r--r-- 1 root root 659792 Dec 12 23:20 dcorch.log.16.gz
-rw-r--r-- 1 root root 665438 Dec 13 06:35 dcorch.log.15.gz
-rw-r--r-- 1 root root 666915 Dec 13 13:59 dcorch.log.14.gz
-rw-r--r-- 1 root root 663523 Dec 13 20:21 dcorch.log.13.gz
-rw-r--r-- 1 root root 658569 Dec 14 02:33 dcorch.log.12.gz
-rw-r--r-- 1 root root 655500 Dec 14 08:44 dcorch.log.11.gz
-rw-r--r-- 1 root root 656587 Dec 14 14:53 dcorch.log.10.gz
-rw-r--r-- 1 root root 653580 Dec 14 21:02 dcorch.log.9.gz
-rw-r--r-- 1 root root 666141 Dec 15 03:21 dcorch.log.8.gz
-rw-r--r-- 1 root root 666080 Dec 15 09:38 dcorch.log.7.gz
-rw-r--r-- 1 root root 657471 Dec 15 15:47 dcorch.log.6.gz
-rw-r--r-- 1 root root 656949 Dec 15 21:59 dcorch.log.5.gz
-rw-r--r-- 1 root root 656141 Dec 16 04:08 dcorch.log.4.gz
-rw-r--r-- 1 root root 655956 Dec 16 10:14 dcorch.log.3.gz
-rw-r--r-- 1 root root 654930 Dec 16 16:27 dcorch.log.2.gz
-rw-r--r-- 1 root root 662068 Dec 16 22:26 dcorch.log.1.gz

dcorch logs are set to size 10M, rotate 20 in /etc/logrotate.d/syslog
Also dcorch openstack logs are duplicated in /var/log/openstack.log
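
The retention seen in the listing above can be estimated from the rotation timestamps. The following is a minimal Python sketch and not part of the original report: the function names are hypothetical, the year 2019 is assumed from the Branch/Pull Time field because `ls -l` omits it, and each archive's modification time is taken as an approximation of when it was rotated.

```python
from datetime import datetime


def rotation_times(ls_lines, year=2019):
    """Parse `ls -l` lines and return modification times, oldest first.

    The year is an assumption: `ls -l` output for recent files omits it.
    """
    times = []
    for line in ls_lines:
        fields = line.split()
        if len(fields) < 9:
            continue  # skip blank or malformed lines
        month, day, clock = fields[5], fields[6], fields[7]
        stamp = f"{year} {month} {day} {clock}"
        times.append(datetime.strptime(stamp, "%Y %b %d %H:%M"))
    return sorted(times)


def estimate_retention(ls_lines, rotate=20, year=2019):
    """Return (mean interval between rotations, estimated history kept).

    `rotate` is the number of rotated files kept, as in the logrotate
    `rotate` setting quoted in the report.
    """
    times = rotation_times(ls_lines, year)
    if len(times) < 2:
        raise ValueError("need at least two rotated files")
    mean_interval = (times[-1] - times[0]) / (len(times) - 1)
    return mean_interval, mean_interval * rotate
```

For the 19 archives listed, the span from Dec 12 04:09 to Dec 16 22:26 is about 114 hours over 18 intervals, or roughly 6.3 hours per rotation. With rotate 20 that works out to about 5.3 days of history, which is consistent with the 4-5 days observed.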

Test Activity
-------------
DC

Workaround
----------
n/a

Gerry Kopec (gerry-kopec) wrote :
summary:
- distributed cloud - dcorch.logs only capturing four days history with 10 subclouds
+ Distributed Cloud - dcorch.logs only capturing four days history with 10 subclouds
Ghada Khalil (gkhalil) wrote :

stx.4.0 / medium priority - should look into reducing the logs or increasing the log size as suggested by the reporter.

tags: added: stx.distcloud
Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
tags: added: stx.4.0
Ghada Khalil (gkhalil)
Changed in starlingx:
assignee: nobody → Gerry Kopec (gerry-kopec)
Gerry Kopec (gerry-kopec) wrote :

A somewhat related issue is that dcdbsync.log isn't rotating:
controller-0:/scratch/dimensioning/dclogs7# ls -l /var/log/dcdbsync/dcdbsync.log
-rw-r--r-- 1 root root 172385686 Feb 11 19:29 /var/log/dcdbsync/dcdbsync.log
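
A check like the following could flag log files that have grown well past the intended rotation size, as dcdbsync.log has here. This is only an illustrative Python sketch: the function name and the threshold used in the usage comment are assumptions, not part of any StarlingX tooling.

```python
import os


def oversized_logs(paths, limit_bytes):
    """Return (path, size) pairs for existing files larger than limit_bytes."""
    found = []
    for path in paths:
        try:
            size = os.path.getsize(path)
        except OSError:
            continue  # missing or unreadable file: nothing to report
        if size > limit_bytes:
            found.append((path, size))
    return found


# Hypothetical usage; the 20 MiB threshold is an arbitrary example value:
# oversized_logs(["/var/log/dcdbsync/dcdbsync.log"], 20 * 1024 * 1024)
```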

Changed in starlingx:
assignee: Gerry Kopec (gerry-kopec) → Jessica Castelino (jcasteli)
Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config-files (master)

Fix proposed to branch: master
Review: https://review.opendev.org/711775

OpenStack Infra (hudson-openstack) wrote : Fix proposed to distcloud (master)

Fix proposed to branch: master
Review: https://review.opendev.org/711778

OpenStack Infra (hudson-openstack) wrote : Fix merged to distcloud (master)

Reviewed: https://review.opendev.org/711352
Committed: https://git.openstack.org/cgit/starlingx/distcloud/commit/?id=48268f420e1bbfbdf284fc3ef513626d1d2b051c
Submitter: Zuul
Branch: master

commit 48268f420e1bbfbdf284fc3ef513626d1d2b051c
Author: Jessica Castelino <email address hidden>
Date: Wed Mar 4 16:55:04 2020 -0500

    Remove remote logging configuration

    Remote logging is removed from dcorch logs to avoid
    sync of unnecessary system data

    Story: 2007267
    Task: 38970
    Change-Id: I36fade7ff4a87855207f570f103b7e1b8fc1262a
    Partial-Bug: 1857069
    Signed-off-by: Jessica Castelino <email address hidden>

OpenStack Infra (hudson-openstack) wrote : Fix merged to config-files (master)

Reviewed: https://review.opendev.org/711775
Committed: https://git.openstack.org/cgit/starlingx/config-files/commit/?id=b95127d6800612776adbb4307bc97a7a14105762
Submitter: Zuul
Branch: master

commit b95127d6800612776adbb4307bc97a7a14105762
Author: Jessica Castelino <email address hidden>
Date: Fri Mar 6 16:27:28 2020 -0500

    Log rotation for Distributed Cloud

    Implemented log rotation for dcdbsync.log and increased the size of
    dcorch.log to 20M

    Change-Id: I29f701fa0d4701820f6409a08478bf2d84e4dc10
    Story: 2007267
    Task: 38978
    Partial-Bug: 1857069
    Signed-off-by: Jessica Castelino <email address hidden>

OpenStack Infra (hudson-openstack) wrote : Fix merged to distcloud (master)

Reviewed: https://review.opendev.org/711778
Committed: https://git.openstack.org/cgit/starlingx/distcloud/commit/?id=3c8d6bc7f56c97350ca255502462d565eaed38b2
Submitter: Zuul
Branch: master

commit 3c8d6bc7f56c97350ca255502462d565eaed38b2
Author: Jessica Castelino <email address hidden>
Date: Fri Mar 6 17:09:27 2020 -0500

    Reduce dcorch log size

    Removes unnecessary dcorch info logs to reduce log size

    Change-Id: Ib8b3b31c4c174b85cf95cb1a4fc06b3ae10f4d32
    Story: 2007267
    Task: 38740
    Depends-On: https://review.opendev.org/#/c/711775/
    Closes-Bug: 1857069
    Signed-off-by: Jessica Castelino <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config-files (f/centos8)

Fix proposed to branch: f/centos8
Review: https://review.opendev.org/716138

OpenStack Infra (hudson-openstack) wrote : Fix proposed to distcloud (f/centos8)

Fix proposed to branch: f/centos8
Review: https://review.opendev.org/716140

OpenStack Infra (hudson-openstack) wrote : Fix merged to config-files (f/centos8)

Reviewed: https://review.opendev.org/716138
Committed: https://git.openstack.org/cgit/starlingx/config-files/commit/?id=77460a9893ddbec82cf2a370e2434d5970b556f9
Submitter: Zuul
Branch: f/centos8

commit de8d65efdf298d23ad690fb0b97d209cc95e9354
Author: Robert Church <email address hidden>
Date: Wed Mar 25 17:19:57 2020 -0400

    Reserve ephemeral ports that are expected by system services

    Update sysctl.conf to reserve keystone and tiller ports so that any
    initial system processes do not claim these ports.

    These are also reserved in puppet and part of initial system
    provisioning.

    Change-Id: I3bae661348718df00f7b50ba15931281a744d473
    Closes-Bug: #1869011
    Related-Bug: #1851533
    Signed-off-by: Robert Church <email address hidden>

commit b95127d6800612776adbb4307bc97a7a14105762
Author: Jessica Castelino <email address hidden>
Date: Fri Mar 6 16:27:28 2020 -0500

    Log rotation for Distributed Cloud

    Implemented log rotation for dcdbsync.log and increased the size of
    dcorch.log to 20M

    Change-Id: I29f701fa0d4701820f6409a08478bf2d84e4dc10
    Story: 2007267
    Task: 38978
    Partial-Bug: 1857069
    Signed-off-by: Jessica Castelino <email address hidden>

commit aecd17c5e3e928d84c7ac14f247bab2fbee5b6d5
Author: Bin Qian <email address hidden>
Date: Wed Feb 5 14:17:37 2020 -0500

    Adding job to upload commits to GitHub

    Add job to publish config-files repo to GitHub

    Change-Id: I5e08200ed748e080f2629ac5c1af05d8fddbb497
    Story: 2007252
    Task: 38665
    Signed-off-by: Bin Qian <email address hidden>

tags: added: in-f-centos8
OpenStack Infra (hudson-openstack) wrote : Fix merged to distcloud (f/centos8)

Reviewed: https://review.opendev.org/716140
Committed: https://git.openstack.org/cgit/starlingx/distcloud/commit/?id=04b49dd093ab850f4520cdb85638221120dd7568
Submitter: Zuul
Branch: f/centos8

commit 25c9d6ed3861f2d783404fcf84b186441ab9cd4d
Author: albailey <email address hidden>
Date: Wed Mar 25 15:43:32 2020 -0500

    Removing ddt from unit tests

    This cleanup should assist in transitioning to
    stestr and fixtures, as well as py3 support.

    The ddt data is primarily unused, only subcloud, route
    and endpoints were being loaded.

    The information in the data files was out of date,
    and not necessarily matching the current product model.

    Story: 2004515
    Task: 39160
    Change-Id: Iddd7ed4664b0d59dbc58aae5c3fedd74c9a138c0
    Signed-off-by: albailey <email address hidden>

commit 7f3827f24d2fb3cb546d3caf71d505d23187b0dc
Author: Tao Liu <email address hidden>
Date: Thu Mar 12 09:46:29 2020 -0400

    Keystone token and resource caching

    Add the following misc. changes to dcorch and dcmanager components:
    - Cache the master resource in dcorch audit
    - Consolidate the openstack drivers to common module, combine the
      dcmanager and dcorch sysinv client. (Note: the sdk driver that
      used by nova, neutron and cinder will be cleaned as part of
      story 2006588).
    - Update the common sdk driver:
      . in order to avoid creating new keystone client multiple times
      . to add a option for caching region clients, in addition to the
        keystone client
      . finally, to randomize the token early renewal duration
    - Change subcloud audit manager, patch audit manager,
      and sw update manager to:
      utilize the sdk driver which caches the keystone client and token

    Test cases:
    1. Manage/unmanage subclouds
    2. Platform resources sync and audit
    3. Verify the keystone token is cached until the token is
       expired
    4. Add/delete subclouds
    5. Managed subcloud goes offline/online (power off/on)
    6. Managed subcloud goes offline/online (delete/add a static route)
    7. Apply a patch to all subclouds via patch Orchestration

    Story: 2007267
    Task: 38865

    Change-Id: I75e0cf66a797a65faf75e7c64dafb07f54c2df06
    Signed-off-by: Tao Liu <email address hidden>

commit 3a1bf60caddfa2e807d4f5996ff94fea7dde5477
Author: Jessica Castelino <email address hidden>
Date: Wed Mar 11 16:23:21 2020 -0400

    Cleanup subcloud details when subcloud add fails

    Failure during add subcloud prevents subcloud from being added again
    with the same name as the subcloud details are not cleaned up
    properly. Fixes have been added for proper cleanup of dcorch database
    tables, ansible subcloud inventory files, keystone endpoints, keystone
    region, and addn_hosts_dc file when failure is encountered.

    Test cases:
    1. Add subcloud
    2. Add subcloud with "--deploy-playbook"
    3. Delete subcloud
    4. Raise explicit exception in dcorch/objects/subcloud.py
    5. Raise explicit exception in dcmanager/manager/subcloud_manager.py

    Change-Id: Iedf172c3e9c3c4bdb9b9482dc5d46f072b3ccf61
    ...
