Deployment fails if apt/deb maintenance is started on schedule (cron)

Bug #1510107 reported by Tatyanka
Affects: Fuel for OpenStack
Status: Fix Released
Importance: High
Assigned to: Kyrylo Galanov

Bug Description

Steps:
Deploy Ceph in HA mode with 1 controller and RadosGW for objects

Scenario:
1. Create cluster
2. Add 1 node with controller role
3. Add 1 node with compute role
4. Add 3 nodes with ceph-osd role
5. Deploy the cluster
6. Check ceph status
7. Run OSTF tests
8. Check the radosgw daemon is started

Actual:
The cluster is in an error state: Puppet failed on the primary controller (node-1) in the firewall task with these errors:

http://paste.openstack.org/show/477388/

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "8.0"
  openstack_version: "2015.1.0-8.0"
  api: "1.0"
  build_number: "66"
  build_id: "66"
  fuel-nailgun_sha: "2476325f95f3bbdc0ff5dbd827868f2ab243e1b4"
  python-fuelclient_sha: "8ea3b64d21c4d729d1069f3aa5528ede3c76b412"
  fuel-agent_sha: "e4056a7923dd607521d97763d5dfb6de8a33ab5d"
  fuel-nailgun-agent_sha: "e377e83268abd406f22b656b76014656077a6a74"
  astute_sha: "eebbb2470cb800e532de19c29673558aeb86aae4"
  fuel-library_sha: "bc044a0562cda204245b2a9136fa4bd6d7ef723e"
  fuel-ostf_sha: "9f500668555292add5d87c942e0cd804aefa6df2"
  fuel-createmirror_sha: "0315aa30aee56e10f142683a25340c3c9d2f1e85"
  fuelmain_sha: "21b84eb3d09883a7da526ebc4bd21458d2e9844a"

https://product-ci.infra.mirantis.net/job/8.0.system_test.ubuntu.ha_neutron_tun/28/testReport/junit/(root)/ceph_rados_gw/ceph_rados_gw/

Tags: area-library
Tatyanka (tatyana-leontovich) wrote :
description: updated
Matthew Mosesohn (raytrac3r) wrote :

2015-10-26T04:06:28.273792+00:00 err: (/Stage[main]/Firewall::Linux::Debian/Package[iptables-persistent]/ensure) dpkg: error: dpkg status database is locked by another process
This means something went wrong during provisioning or deployment: another process was still holding the dpkg lock. This is not directly a Puppet bug and should not be assigned to mos-puppet.

Changed in fuel:
assignee: MOS Puppet Team (mos-puppet) → nobody
Maciej Relewicz (rlu)
Changed in fuel:
assignee: nobody → Fuel Library Team (fuel-library)
Dmitry Pyzhov (dpyzhov)
tags: added: area-library
Dmitry Klenov (dklenov)
Changed in fuel:
status: New → Confirmed
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Kyrylo Galanov (kgalanov)
Kyrylo Galanov (kgalanov) wrote :

Hello,

According to the logs, provisioning finished at about 4:00 AM:
[node-1.test.domain.local] out: 04:07:28 up 6 min, 1 user, load average: 0.20, 0.36, 0.19

That is when cron runs its daily tasks.
Surprisingly, one of those tasks is /etc/cron.daily/apt. That is why the dpkg database was locked.

Any suggestions for a work-around except stopping cron before provisioning and starting it after?

--
Kyrylo

Kyrylo Galanov (kgalanov) wrote :

Hi,

It is possible to use Puppet's prerun and postrun command settings to stop cron before the run and start it again afterwards. The same hooks could handle other locks or cleanup in the future.

--
Kyrylo
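
The stop-before, start-after idea from the comment above can be illustrated with a minimal sketch. This is not the fix that was merged, and it is not Puppet configuration; in Puppet itself the hooks would be the `prerun_command` and `postrun_command` settings. The helper name `cron_paused`, the `run` parameter, the service name `cron`, and the manifest path in the usage note are assumptions made for illustration.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def cron_paused(run=subprocess.check_call, service="cron"):
    """Stop the cron service for the duration of the block, then restart it.

    `run` is injectable so the sketch can be exercised without root access.
    The service name "cron" is an assumption (the usual name on Ubuntu).
    """
    run(["service", service, "stop"])      # plays the role of a prerun hook
    try:
        yield
    finally:
        # Plays the role of a postrun hook; runs even if the block fails,
        # so cron is not left stopped after an error.
        run(["service", service, "start"])
```

A hypothetical use would be `with cron_paused(): subprocess.check_call(["puppet", "apply", "site.pp"])`, where `site.pp` stands in for whatever manifest is being applied. Note that this only keeps cron from starting new jobs; a job that is already running still holds its lock.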

summary: - Deploy ceph rados gw failed on firewall task on installation iptables-persistent package
+ Deployment fails if apt/deb maintenance is started on schedule (cron)
Changed in fuel:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/241583

Changed in fuel:
assignee: Kyrylo Galanov (kgalanov) → Matthew Mosesohn (raytrac3r)
Changed in fuel:
assignee: Matthew Mosesohn (raytrac3r) → Kyrylo Galanov (kgalanov)
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/241583
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=44f3a2be74d72c68857e18b49f145039532a5e03
Submitter: Jenkins
Branch: master

commit 44f3a2be74d72c68857e18b49f145039532a5e03
Author: Kyrylo Galanov <email address hidden>
Date: Wed Nov 4 14:44:11 2015 +0200

    Add custom package apt provides with timeout

    It may be that dpkg database is locked by another process during
    the Puppet run. Consequently, the deployment will fail.
    To mitigate the issue, it is possible to wrap default apt provider
    with a class which waits until the lock is released or timeout
    expired.

    Change-Id: I691ca96cced232483fb2f3e79cbf3ee402f765ea
    Fixes-bug: #1510107
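
The wait-until-released-or-timeout behaviour described in the commit message can be sketched as follows. The merged change is a custom Puppet package provider; the Python below is only an illustration of the same idea, not that code. The lock path `/var/lib/dpkg/lock`, the function names, and the default timeout and polling interval are assumptions, and the sketch assumes the lock is a POSIX `fcntl` lock.

```python
import errno
import fcntl
import os
import time

DPKG_LOCK = "/var/lib/dpkg/lock"  # assumed lock file path on Debian/Ubuntu


def lock_is_free(path):
    """Return True if no other process holds an fcntl lock on `path`."""
    fd = os.open(path, os.O_RDWR)
    try:
        # Non-blocking attempt: fails immediately if the lock is held.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return False
        raise
    else:
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)


def wait_for_lock(path=DPKG_LOCK, timeout=300.0, interval=5.0, probe=lock_is_free):
    """Poll until the lock is free; return False if `timeout` seconds pass first.

    `probe` is injectable so the loop can be tested without a real lock holder.
    """
    deadline = time.monotonic() + timeout
    while True:
        if probe(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would run `wait_for_lock()` before installing a package and treat a `False` result as a failure. The check is advisory only: another process can still take the lock between the probe and the installation, so this narrows the window rather than closing it.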

Changed in fuel:
status: In Progress → Fix Committed
Tatyanka (tatyana-leontovich) wrote :

Verified on ISO 169.

Changed in fuel:
status: Fix Committed → Fix Released