Unrecoverable resyncs if DB is restored from backup

Bug #1616116 reported by Andreas Hasenack
Affects                    Status        Importance  Assigned to    Milestone
Landscape Client           Fix Released  Medium      Simon Poirier
landscape-client (Ubuntu)  Fix Released  Medium      Simon Poirier
  Trusty                   Fix Released  Medium      Simon Poirier
  Xenial                   Fix Released  Medium      Simon Poirier
  Bionic                   Fix Released  Medium      Simon Poirier
  Cosmic                   Fix Released  Medium      Simon Poirier

Bug Description

[Impact]

 * In some cases, generally involving backup/restore, the client would get
   inconsistent package data and keep that data upon resync, thus getting
   stuck in a resync loop. This usually gets noticed through the stress
   it adds on the server and through logs which grow abnormally.

[Test Case]

  * Deploy landscape-server quickstart from ppa:landscape/18.03.
  * Register a client against the server and wait for package info.
  * Run pg_dumpall to take a database backup.
  * Add a repo and wait for the new packages to show up in Landscape.
  * Restore the postgres backup.
  * Run ./scripts/hash_id_databases.sh from the server to complete
    the restore.
  * Trigger a package install from the new repo to create some package
    info to update.
  * The client should resync once, then re-fetch the hash-id database
    on the next run.

[Regression Potential]

 * Modified code is used only during resync operations and removes
   cached data when the client state is deemed inconsistent.

 * In the unlikely event the code is called outside of the expected
   cases, the end result would be limited to the package-monitor
   having to re-download the hash-id databases, which shouldn't
   cause issues as that is the behaviour at client registration.
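
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the function name clear_hash_id_databases and its parameter are hypothetical, and only the warning text and the idea of dropping the cached hash-id databases come from this report.

```python
import logging
import os


def clear_hash_id_databases(hash_id_dir):
    """Remove every cached hash=>id database file found in hash_id_dir.

    Hypothetical helper for illustration; the real patch may be structured
    differently. Returns the sorted list of paths that were removed.
    """
    removed = []
    if not os.path.isdir(hash_id_dir):
        # Nothing cached yet, which is also the state at client registration.
        return removed
    for name in sorted(os.listdir(hash_id_dir)):
        path = os.path.join(hash_id_dir, name)
        if os.path.isfile(path):
            logging.warning("Removing cached hash=>id database %s", path)
            os.remove(path)
            removed.append(path)
    return removed
```

After such a cleanup the package monitor would have to download the hash-id databases again, which matches the worst case described in the regression analysis.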

[Other Info]

 * Cases other than server restores have been seen to trigger the
   bug, but they are far less common.

[Original description]

Landscape with live clients cannot handle a DB restore to a point in the past.

The scenario is Landscape running as usual, with live clients, and then being restored from a DB backup taken in the past. After the service is brought up again with this data, clients start resyncing and become wedged, with all sorts of tracebacks on the message server.

I left such a scenario running overnight, hoping that eventually the resyncs would settle down and everything recover, but that didn't happen. The resyncs continued, in the packages scope.

An interesting one in particular was this:
Aug 22 21:46:26 message-server-2 ERR Error handling message 'operation-result' for computer 104: {'status': 6, 'timestamp': 1471901963, 'result-text': u'Mon Aug 22 21:39:23 UTC 2016\n', 'api': '3.3', 'operation-id': 533, 'type': 'operation-result'}
Traceback (most recent call last):
  File "/opt/canonical/landscape/canonical/landscape/message/apis.py", line 358, in _process_messages
    self.handle(message["type"], message)
  File "/opt/canonical/landscape/canonical/message/api.py", line 66, in handle
    return handler(type, body)
  File "/opt/canonical/landscape/canonical/message/handler.py", line 30, in __call__
    return function(self.message_api, type, body)
  File "/opt/canonical/landscape/canonical/lib/arguments.py", line 79, in replacement
    return original(*new_args, **new_kwargs)
  File "/opt/canonical/landscape/canonical/landscape/message/handlers/activity.py", line 32, in handle_activity_result
    activity.succeed(code=result_code, text=result_text)
AttributeError: 'NoneType' object has no attribute 'succeed'

That was about an activity that had been delivered already, but did not exist in the restored DB.
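
A handler that tolerates this case would check for a missing activity before calling into it. The sketch below is illustrative only: the Activity class, the dictionary lookup, the 'result-code' key and the return values are hypothetical stand-ins, not the Landscape server's code, and this report does not say whether the server side was ever changed this way.

```python
import logging


class Activity:
    """Hypothetical stand-in for a server-side activity record."""

    def __init__(self, activity_id):
        self.id = activity_id
        self.result = None

    def succeed(self, code=None, text=None):
        self.result = ("succeeded", code, text)


def handle_activity_result(activities, message):
    """Apply an 'operation-result' message to its activity, if it still exists.

    `activities` maps operation ids to Activity objects. Returns True when
    the result was applied and False when the activity is unknown, for
    example because the database was restored to a point before the
    activity was created.
    """
    activity = activities.get(message["operation-id"])
    if activity is None:
        logging.warning(
            "Ignoring result for unknown activity %s", message["operation-id"]
        )
        return False
    # 'result-code' is an assumed key; the logged message above lacks it.
    activity.succeed(
        code=message.get("result-code"), text=message.get("result-text")
    )
    return True
```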

description: updated
Simon Poirier (simpoir) wrote :

Hit a similar issue. We have 2 known workarounds:

- purge landscape-client and re-register
- stop the client, remove its cached package and message state, and restart it, then wait up to 6h for package data to update:
  service landscape-client stop && rm -fr /var/lib/landscape/client/package/hash-id/* /var/lib/landscape/client/package/database /var/lib/landscape/client/messages/* && service landscape-client restart

Simon Poirier (simpoir)
Changed in landscape:
status: New → In Progress
tags: added: lds-squad
Simon Poirier (simpoir)
Changed in landscape:
assignee: nobody → Simon Poirier (simpoir)
Simon Poirier (simpoir) wrote :
Simon Poirier (simpoir)
Changed in landscape:
status: In Progress → Fix Committed
Simon Poirier (simpoir)
description: updated
information type: Proprietary → Public
affects: landscape → landscape-client
Simon Poirier (simpoir)
Changed in landscape-client:
importance: Undecided → Medium
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package landscape-client - 18.01-0ubuntu6

---------------
landscape-client (18.01-0ubuntu6) disco; urgency=medium

  * debian/patches/nutanix-kvm.patch: Update vm_info.py to include Nutanix
    hypervisor.
  * Fixes for release-upgrade (LP: #1699179).
    - debian/patches/release-upgrade-success.patch: Enable landscape-client to
      survive trusty upgrade. (LP: #1670291)
    - debian/patches/post-upgrade-reboot.patch: Force reboot operation in
      case systemd fails. (LP: #1670291)
  * debian/patches/unicode-tags-script.patch: Permit environments
    containing unicode chars for script execution. (LP: #1765518)
  * debian/patches/1616116-resync-loop.patch:
    Clear hash id database on package resync. (LP: #1616116)

 -- Simon Poirier <email address hidden> Tue, 27 Nov 2018 09:24:22 -0500

Changed in landscape-client (Ubuntu):
status: New → Fix Released
Changed in landscape-client (Ubuntu Bionic):
status: New → In Progress
Changed in landscape-client (Ubuntu Cosmic):
status: New → In Progress
Changed in landscape-client (Ubuntu Bionic):
importance: Undecided → Medium
Changed in landscape-client (Ubuntu):
importance: Undecided → Medium
Changed in landscape-client (Ubuntu Cosmic):
importance: Undecided → Medium
Changed in landscape-client (Ubuntu):
assignee: nobody → Simon Poirier (simpoir)
Changed in landscape-client (Ubuntu Bionic):
assignee: nobody → Simon Poirier (simpoir)
Changed in landscape-client (Ubuntu Cosmic):
assignee: nobody → Simon Poirier (simpoir)
Changed in landscape-client (Ubuntu Xenial):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Simon Poirier (simpoir)
Changed in landscape-client (Ubuntu Trusty):
status: New → In Progress
assignee: nobody → Simon Poirier (simpoir)
importance: Undecided → Medium
Timo Aaltonen (tjaalton) wrote : Proposed package upload rejected

An upload of landscape-client to cosmic-proposed has been rejected from the upload queue for the following reason: "Includes a change without an SRU bug attached to it".

Timo Aaltonen (tjaalton) wrote :

An upload of landscape-client to bionic-proposed has been rejected from the upload queue for the following reason: "Includes a change without an SRU bug attached to it".

Timo Aaltonen (tjaalton) wrote :

An upload of landscape-client to xenial-proposed has been rejected from the upload queue for the following reason: "Includes a change without an SRU bug attached to it".

Timo Aaltonen (tjaalton) wrote :

An upload of landscape-client to trusty-proposed has been rejected from the upload queue for the following reason: "Includes a change without an SRU bug attached to it".

Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello Andreas, or anyone else affected,

Accepted landscape-client into cosmic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/landscape-client/18.01-0ubuntu4.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-cosmic to verification-done-cosmic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-cosmic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in landscape-client (Ubuntu Cosmic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-cosmic
Changed in landscape-client (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed-bionic
Timo Aaltonen (tjaalton) wrote :

Hello Andreas, or anyone else affected,

Accepted landscape-client into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/landscape-client/18.01-0ubuntu3.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in landscape-client (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed-xenial
Timo Aaltonen (tjaalton) wrote :

Hello Andreas, or anyone else affected,

Accepted landscape-client into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/landscape-client/16.03-0ubuntu2.16.04.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Timo Aaltonen (tjaalton) wrote :

Hello Andreas, or anyone else affected,

Accepted landscape-client into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/landscape-client/14.12-0ubuntu6.14.04.4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-trusty to verification-done-trusty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-trusty. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in landscape-client (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: added: verification-needed-trusty
Simon Poirier (simpoir)
tags: added: verification-done-cosmic
removed: verification-needed-cosmic
Simon Poirier (simpoir)
tags: added: verification-done verification-done-bionic verification-done-trusty verification-done-xenial
removed: verification-needed verification-needed-bionic verification-needed-trusty verification-needed-xenial
Simon Poirier (simpoir) wrote :

I tested this by deploying landscape-server from ppa:landscape/18.03, connecting clients from {trusty,xenial,bionic,cosmic} against it, and restoring a database dump to trigger the issue.
The landscape-clients were then upgraded to *-proposed. As per the test case, the resync loop stopped afterwards.

tail -f /var/log/landscape/package-reporter.log
2019-02-05 19:01:07,139 WARNING [MainThread] Removing cached hash=>id database /var/lib/landscape/client/package/hash-id/2b0e1068-20d4-11e9-8e28-00163e8887ff_bionic_amd64
2019-02-05 19:01:27,315 INFO [MainThread] Queuing message with changes in known packages: 216 installed, 26775 available, 48 available upgrades, 0 locked, 0 autoremovable, 0 not installed, 0 not available, 0 not available upgrades, 0 not locked, 0 not autoremovable.
2019-02-05 19:02:07,812 INFO [MainThread] Downloaded hash=>id database from https://lscape.lxd/hash-id-databases/2b0e1068-20d4-11e9-8e28-00163e8887ff_bionic_amd64

and the server output: https://screenshots.firefox.com/kU7atd0RI411o7r4/lscape.lxd
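
A check like this can be automated. The following sketch scans package-reporter log lines and reports which hash=>id databases were removed and later downloaded again; the function name and return shape are hypothetical, and the regular expressions assume the exact message wording shown in the excerpt above.

```python
import re

# Both patterns capture the final path component, i.e. the database name.
REMOVED_RE = re.compile(r"Removing cached hash=>id database \S*/([^/\s]+)$")
DOWNLOADED_RE = re.compile(r"Downloaded hash=>id database from \S*/([^/\s]+)$")


def refetched_databases(lines):
    """Return names of databases that were removed and later downloaded again."""
    removed = set()
    refetched = []
    for line in lines:
        line = line.rstrip()
        match = REMOVED_RE.search(line)
        if match:
            removed.add(match.group(1))
            continue
        match = DOWNLOADED_RE.search(line)
        if match and match.group(1) in removed:
            refetched.append(match.group(1))
    return refetched
```

A non-empty result for a client indicates that the cached database was dropped and fetched again, which is the sequence the test case expects after the fix.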

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package landscape-client - 18.01-0ubuntu4.1

---------------
landscape-client (18.01-0ubuntu4.1) cosmic; urgency=medium

  * debian/patches/nutanix-kvm.patch: Update vm_info.py to include Nutanix
    hypervisor. (LP: #1788219)
  * Fixes for release-upgrade:
    - debian/patches/release-upgrade-success.patch: Enable landscape-client to
      survive trusty upgrade. (LP: #1670291)
    - debian/patches/post-upgrade-reboot.patch: Force reboot operation in
      case systemd fails. (LP: #1670291)
  * debian/patches/unicode-tags-script.patch: Permit environments
    containing unicode chars for script execution. (LP: #1765518)
  * debian/patches/1616116-resync-loop.patch:
    Clear hash id database on package resync. (LP: #1616116)

 -- Simon Poirier <email address hidden> Tue, 27 Nov 2018 09:24:22 -0500

Changed in landscape-client (Ubuntu Cosmic):
status: Fix Committed → Fix Released
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for landscape-client has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package landscape-client - 18.01-0ubuntu3.2

---------------
landscape-client (18.01-0ubuntu3.2) bionic; urgency=medium

  * debian/patches/nutanix-kvm.patch: Update vm_info.py to include Nutanix
    hypervisor. (LP: #1788219)
  * Fixes for release-upgrade:
    - debian/patches/release-upgrade-success.patch: Enable landscape-client to
      survive trusty upgrade. (LP: #1670291)
    - debian/patches/post-upgrade-reboot.patch: Force reboot operation in
      case systemd fails. (LP: #1670291)
  * debian/patches/unicode-tags-script.patch: Permit environments
    containing unicode chars for script execution. (LP: #1765518)
  * debian/patches/1616116-resync-loop.patch:
    Clear hash id database on package resync. (LP: #1616116)

 -- Simon Poirier <email address hidden> Tue, 27 Nov 2018 09:24:22 -0500

Changed in landscape-client (Ubuntu Bionic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package landscape-client - 16.03-0ubuntu2.16.04.5

---------------
landscape-client (16.03-0ubuntu2.16.04.5) xenial; urgency=medium

  * debian/patches/nutanix-kvm.patch: Update vm_info.py to include Nutanix
    hypervisor (LP: #1788219)
  * Fixes for release-upgrade (LP: #1699179).
    - debian/patches/1699179-release-upgrade-check.diff: Check if ubuntu-
      release-upgrader is running before apt-update. (LP: #1699179)
    - debian/patches/release-upgrade-success.patch: Enable landscape-client to
      survive trusty upgrade. (LP: #1670291)
    - debian/patches/post-upgrade-reboot.patch: Force reboot operation in
      case systemd fails. (LP: #1670291)
  * debian/patches/1616116-resync-loop.patch:
    Clear hash id database on package resync. (LP: #1616116)

 -- Simon Poirier <email address hidden> Tue, 27 Nov 2018 09:24:22 -0500

Changed in landscape-client (Ubuntu Xenial):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package landscape-client - 14.12-0ubuntu6.14.04.4

---------------
landscape-client (14.12-0ubuntu6.14.04.4) trusty; urgency=medium

  * debian/patches/nutanix-kvm.patch: Update vm_info.py to include Nutanix
    hypervisor. (LP: #1788219)
  * Fixes for release-upgrade (LP: #1699179).
    - debian/patches/1699179-release-upgrade-check.diff: Check if ubuntu-
      release-upgrader is running before apt-update. (LP: #1699179)
    - debian/patches/release-upgrade-success.patch: Enable landscape-client to
      survive trusty upgrade. (LP: #1670291)
    - debian/patches/post-upgrade-reboot.patch: Force reboot operation in
      case systemd fails. (LP: #1670291)
  * debian/patches/1616116-resync-loop.patch:
    Clear hash id database on package resync. (LP: #1616116)

 -- Simon Poirier <email address hidden> Tue, 27 Nov 2018 09:24:22 -0500

Changed in landscape-client (Ubuntu Trusty):
status: Fix Committed → Fix Released
Simon Poirier (simpoir)
Changed in landscape-client:
status: Fix Committed → Fix Released