CI: ovb-ha jobs fail because of lack of memory

Bug #1693174 reported by Sagi (Sergey) Shnaidman
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
tripleo | Fix Released | Critical | Alfredo Moralejo |

Bug Description

OVB HA jobs fail because the nodes run out of memory:

http://logs.openstack.org/90/467490/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha-oooq/dac8a88/logs/undercloud/home/jenkins/failed_deployment_list.log.txt.gz

Error: Puppet::Util::FileType::FileTypeCrontab could not write nova: Cannot allocate memory - crontab
    Error: /Stage[main]/Nova::Cron::Archive_deleted_rows/Cron[nova-manage db archive_deleted_rows]: Could not evaluate: Puppet::Util::FileType::FileTypeCrontab could not write nova: Cannot allocate memory - crontab
    Error: /Stage[main]/Nova::Conductor/Nova::Generic_service[conductor]/Service[nova-conductor]: Could not evaluate: Cannot allocate memory - fork(2)
    Error: /Stage[main]/Nova::Api/Nova::Generic_service[api]/Service[nova-api]: Could not evaluate: Cannot allocate memory - fork(2)
    Error: /Stage[main]/Nova::Scheduler/Nova::Generic_service[scheduler]/Service[nova-scheduler]: Could not evaluate: Cannot allocate memory - fork(2)
    Error: /Stage[main]/Nova::Consoleauth/Nova::Generic_service[consoleauth]/Service[nova-consoleauth]: Could not evaluate: Cannot allocate memory - fork(2)
    Error: /Stage[main]/Nova::Vncproxy/Nova::Generic_service[vncproxy]/Service[nova-vncproxy]: Could not evaluate: Cannot allocate memory - fork(2)

http://logs.openstack.org/84/467484/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha-oooq/bd60898/logs/undercloud/home/jenkins/failed_deployment_list.log.txt.gz

Error: /Stage[main]/Neutron::Agents::Dhcp/Service[neutron-dhcp-service]: Failed to call refresh: Cannot allocate memory - fork(2)
    Error: /Stage[main]/Neutron::Agents::Dhcp/Service[neutron-dhcp-service]: Cannot allocate memory - fork(2)
    Error: /Stage[main]/Neutron::Server/Service[neutron-server]: Could not evaluate: Cannot allocate memory - fork(2)
    Error: /Stage[main]/Neutron::Server/Service[neutron-server]: Failed to call refresh: Cannot allocate memory - fork(2)
    Error: /Stage[main]/Neutron::Server/Service[neutron-server]: Cannot allocate memory - fork(2)
    Warning: /Stage[main]/Neutron::Agents::L3/Service[neutron-l3]: Skipping because of failed dependencies
    Warning: /Stage[main]/Neutron::Agents::Metadata/Service[neutron-metadata]: Skipping because of failed dependencies
    Warning: /Stage[main]/Neutron::Agents::Ml2::Ovs/Service[neutron-ovs-agent-service]: Skipping because of failed dependencies
    Warning: /Stage[main]/Neutron::Deps/Anchor[neutron::service::end]: Skipping because of failed dependencies
    Error: Failed to apply catalog: Cannot allocate memory - fork(2)

http://logs.openstack.org/77/430277/107/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha-oooq/e9e6adb/logs/undercloud/home/jenkins/failed_deployment_list.log.txt.gz

Error: /Stage[main]/Tripleo::Profile::Pacemaker::Haproxy/Tripleo::Pacemaker::Haproxy_with_vip[haproxy_and_redis_vip]/Pacemaker::Resource::Ip[redis_vip]/Pcmk_resource[ip-172.17.0.14]: Could not evaluate: Cannot allocate memory - /usr/sbin/pcs
    Error: /Stage[main]/Tripleo::Profile::Pacemaker::Haproxy/Tripleo::Pacemaker::Haproxy_with_vip[haproxy_and_internal_api_vip]/Pacemaker::Resource::Ip[internal_api_vip]/Pcmk_resource[ip-172.17.0.20]: Could not evaluate: Cannot allocate memory - /usr/sbin/pcs
    Error: /Stage[main]/Tripleo::Profile::Pacemaker::Haproxy/Tripleo::Pacemaker::Haproxy_with_vip[haproxy_and_storage_vip]/Pacemaker::Resource::Ip[storage_vip]/Pcmk_resource[ip-172.18.0.11]: Could not evaluate: Cannot allocate memory - /usr/sbin/pcs
    Error: /Stage[main]/Tripleo::Profile::Pacemaker::Haproxy/Tripleo::Pacemaker::Haproxy_with_vip[haproxy_and_storage_mgmt_vip]/Pacemaker::Resource::Ip[storage_mgmt_vip]/Pcmk_resource[ip-172.19.0.14]: Could not evaluate: Cannot allocate memory - /usr/sbin/pcs
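
For triage, the resources hit by the allocation failures can be pulled out of a failed_deployment_list log with a short script. This is only a minimal sketch and not part of the TripleO CI tooling: the function name `oom_resources` and the regular expression are assumptions based solely on the log lines quoted above.

```python
import re
from collections import Counter

# Matches resource-scoped Puppet error lines such as:
#   Error: /Stage[main]/.../Service[nova-api]: Could not evaluate: Cannot allocate memory - fork(2)
# Lines that carry no /Stage[main]/ resource path (for example
# "Error: Failed to apply catalog: ...") are deliberately not matched.
OOM_LINE = re.compile(
    r"Error: (?P<resource>/Stage\[main\]/.+?): .*Cannot allocate memory - (?P<what>\S+)"
)


def oom_resources(log_text):
    """Return a Counter mapping (last resource segment, failed call) to hit count."""
    hits = Counter()
    for line in log_text.splitlines():
        # search() rather than match(), so leading whitespace or stray
        # colour codes in front of "Error:" do not matter.
        found = OOM_LINE.search(line)
        if found:
            # Keep only the last path segment, e.g. "Service[nova-api]".
            # Assumption: the resource title itself contains no "/".
            leaf = found.group("resource").rsplit("/", 1)[-1]
            hits[(leaf, found.group("what"))] += 1
    return hits
```

Calling `oom_resources(open(path).read())` on a downloaded and decompressed log gives a quick view of whether the failures come from `fork(2)`, `crontab`, or `/usr/sbin/pcs`, and which resources were affected.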

Changed in tripleo:
milestone: none → pike-2
Emilien Macchi (emilienm) wrote :

We had a similar issue in Puppet CI:
https://review.openstack.org/#/c/467129/

It's probably due to the latest eventlet: python2-eventlet-0.20.1-1.el7.noarch
(it was upgraded yesterday from python2-eventlet-0.18.4-2.el7.noarch).
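
To confirm which eventlet build a given job ran with, the package strings above can be split into name, version, release and architecture and then compared. This is a rough sketch: `parse_nvra` and `version_tuple` are hypothetical helpers, and the numeric comparison is a simplification that does not implement RPM's full version-comparison rules.

```python
def parse_nvra(package):
    """Split 'name-version-release.arch' into its four parts."""
    nvr, arch = package.rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release, arch


def version_tuple(version):
    """Turn a purely numeric dotted version such as '0.20.1' into (0, 20, 1)."""
    return tuple(int(part) for part in version.split("."))


old = parse_nvra("python2-eventlet-0.18.4-2.el7.noarch")
new = parse_nvra("python2-eventlet-0.20.1-1.el7.noarch")

# True when both strings name the same package and the second is newer.
upgraded = old[0] == new[0] and version_tuple(new[1]) > version_tuple(old[1])
```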

Alfredo Moralejo (amoralej) wrote :

We are working on updating python-idna to version 2.5 in the RDO deps, which should bring memory usage back to a level similar to what we had with python2-eventlet-0.18.4.

Alfredo Moralejo (amoralej) wrote :

The package has been tagged and is available in the repos. The issue should be fixed.

Changed in tripleo:
assignee: nobody → Alfredo Moralejo (amoralej)
tags: removed: alert
Changed in tripleo:
status: Triaged → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/468031
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=0751d69e3b6560ef87ed43859df92fdcc08f9cd1
Submitter: Jenkins
Branch: master

commit 0751d69e3b6560ef87ed43859df92fdcc08f9cd1
Author: John Trowbridge <email address hidden>
Date: Thu May 25 09:24:57 2017 -0400

    Add heat environment for disabling all telemetry services

    This will be used in our HA OVB CI job where we currently are
    failing due to running out of memory. Telemetry will still be
    tested via scenarios, but this will free up a large chunk of
    memory in the most memory intensive job.

    Closes-Bug: 1693174
    Change-Id: Idefe9f0de47c5b0f29b7326642d697ed179e2eb8
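
The commit above describes a Heat environment that turns the telemetry services off. Such an environment typically maps each service to `OS::Heat::None` in its `resource_registry`. The sketch below renders a file of that shape. The service names in `TELEMETRY_SERVICES` are assumptions chosen for illustration and are not copied from the merged change, so check the review for the actual list.

```python
# Illustrative service names only (an assumption); the authoritative list
# is whatever the merged tripleo-heat-templates change contains.
TELEMETRY_SERVICES = [
    "CeilometerApi",
    "CeilometerCollector",
    "GnocchiApi",
    "AodhApi",
    "PankoApi",
]


def render_disable_environment(services):
    """Render a Heat environment mapping each service to OS::Heat::None."""
    lines = ["resource_registry:"]
    for service in services:
        lines.append("  OS::TripleO::Services::%s: OS::Heat::None" % service)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_disable_environment(TELEMETRY_SERVICES), end="")
```

An environment file of this kind would be passed to the deployment as an additional `-e` argument, so the job skips those services without any change to the roles themselves.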

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-heat-templates (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/468818

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-heat-templates (stable/ocata)

Reviewed: https://review.openstack.org/468818
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=7859f30ebf76edaa1f9af83f5783eb60ddb96fb8
Submitter: Jenkins
Branch: stable/ocata

commit 7859f30ebf76edaa1f9af83f5783eb60ddb96fb8
Author: John Trowbridge <email address hidden>
Date: Thu May 25 09:24:57 2017 -0400

    Add heat environment for disabling all telemetry services

    This will be used in our HA OVB CI job where we currently are
    failing due to running out of memory. Telemetry will still be
    tested via scenarios, but this will free up a large chunk of
    memory in the most memory intensive job.

    Closes-Bug: 1693174
    Change-Id: Idefe9f0de47c5b0f29b7326642d697ed179e2eb8
    (cherry picked from commit 0751d69e3b6560ef87ed43859df92fdcc08f9cd1)

tags: added: in-stable-ocata
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 7.0.0.0b2

This issue was fixed in the openstack/tripleo-heat-templates 7.0.0.0b2 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-heat-templates 6.2.0

This issue was fixed in the openstack/tripleo-heat-templates 6.2.0 release.
