[k8s-R5.0-Single-Ansible-Prov]: Process crash Alarms observed on UI on k8s provisioning using Ansible

Bug #1752661 reported by Pulkit Tandon
This bug affects 1 person
Affects              Status        Importance  Assigned to    Milestone
Juniper Openstack    (status tracked in Trunk)
  Trunk              Fix Released  Medium      Aniket Gawade

Bug Description

Configuration:
K8s 1.9.2
Docker version 17.03.1-ce
CentOS 7.4
master-centos7-bld-8

Setup:
3 node setup.
1 Kube master. 1 Controller.
2 Agent + K8s slaves

I followed the steps suggested by Aniket to provision a k8s Contrail cluster using Ansible provisioning.
After provisioning completed, multiple alarms related to process crashes were observed in the web UI.

Please check the attached snapshot.

Tags: provisioning
Pulkit Tandon (pulkitt) wrote :
summary: - [k8s-R5.0-Single-Ansible-Prov]: k8s provisioning using Ansible
+ [k8s-R5.0-Single-Ansible-Prov]: Process crash Alarms observed on UI on
+ k8s provisioning using Ansible
Pulkit Tandon (pulkitt)
information type: Proprietary → Public
Aniket Gawade (aniketgawade) wrote :

Hey Pulkit,

I tried the latest build recently and I don't see this issue. You can verify and close this.

Changed in juniperopenstack:
status: New → Incomplete
importance: High → Medium
assignee: Aniket Gawade (aniketgawade) → Pulkit Tandon (pulkitt)
Pulkit Tandon (pulkitt) wrote :

Hi Aniket,

The issue still exists. I have verified it on build contrail-5.0.0-25.
Reopening the issue and attaching the latest snapshot from that build.

Pulkit Tandon (pulkitt) wrote :
Changed in juniperopenstack:
status: Incomplete → New
assignee: Pulkit Tandon (pulkitt) → Aniket Gawade (aniketgawade)
Aniket Gawade (aniketgawade) wrote :

Pulkit,

Can you provide the setup details?

Pulkit Tandon (pulkitt) wrote :

Hi Aniket!
I shared the details of the setup on the Slack channel.
To see the alarms, please use the Contrail GUI on the following node:
10.204.217.108:10000/

Sundaresan Rajangam (srajanga) wrote :

The issue is due to an invalid core file list (all_core_file_list in the NodeStatus below) reported by the nodemgr.
The kernel core_pattern is not set correctly on the node:

[root@nodeg12 nodemgr]# cat /proc/sys/kernel/core_pattern
core

core_pattern should be /var/crashes/core.%e.%p.%h.%t
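
To illustrate why the bare pattern "core" matters, here is a minimal Python sketch. It is not the nodemgr code; it only assumes (unconfirmed from the nodemgr source) that a tool derives the core-dump directory by taking the directory part of core_pattern. The function name core_dir_from_pattern is hypothetical. A pattern starting with "|" pipes the dump to a program, so no directory is implied in that case.

```python
import os

# Value quoted in this bug as the expected setting.
EXPECTED_PATTERN = "/var/crashes/core.%e.%p.%h.%t"


def core_dir_from_pattern(pattern):
    """Return the directory implied by a core_pattern string.

    Hypothetical helper for illustration only. Returns None when the
    pattern pipes the dump to a program (leading '|'), and "." when the
    pattern has no directory component, i.e. the crashing process's
    current working directory.
    """
    pattern = pattern.strip()
    if pattern.startswith("|"):
        return None
    return os.path.dirname(pattern) or "."


def read_core_pattern(path="/proc/sys/kernel/core_pattern"):
    """Read the live kernel setting (Linux only)."""
    with open(path) as handle:
        return handle.read().strip()


print(core_dir_from_pattern("core"))            # .
print(core_dir_from_pattern(EXPECTED_PATTERN))  # /var/crashes
```

Under that assumption, the pattern "core" resolves to the current directory, which would be consistent with the root-directory entries in the NodeStatus output. On a live node, core_dir_from_pattern(read_core_pattern()) reports what the current setting implies.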

NodeStatus: {
build_info: "{"build-info" : [{"build-version" : "5.0.0", "build-time" : "2018-03-13 21:17:45.828547", "build-user" : "zuul", "build-hostname" : "centos-7-4-builder-juniper-contrail-ci-0000017874", "build-id" : "5.0.0-27.el7.centos", "build-number" : "@contrail"}]}",
installed_package_version: "5.0.0-27.el7.centos",
deleted: false,
disk_usage_info: {},
__T: 1522083698032223,
running_package_version: "5.0.0-27.el7.centos",
process_mem_cpu_usage: {},
system_cpu_usage: {},
system_mem_usage: {},
process_status: [],
all_core_file_list: [
"./srv",
"./mnt",
"./media",
"./home",
"./lost+found",
"./sbin",
"./lib64",
"./lib",
"./bin",
"./anaconda-post.log",
"./provision.sh",
"./functions.sh",
"./entrypoint.sh",
"./contrail-functions.sh",
"./common.sh",
"./opt",
"./root",
"./usr",
"./var",
"./etc",
"./proc",
"./dev",
"./run",
"./tmp",
"./sys"
],
system_cpu_info: {},
process_info: []
},
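
The entries in all_core_file_list above are ordinary root-directory names, not core dumps. A small Python sketch of a sanity check follows, assuming core files are named according to the expected pattern core.%e.%p.%h.%t (executable, PID, hostname, epoch time). The regular expression and the helper names are illustrative assumptions, not part of Contrail.

```python
import os
import re

# Matches basenames shaped like core.<exe>.<pid>.<host>.<epoch>,
# following the expected pattern core.%e.%p.%h.%t from this bug.
CORE_NAME_RE = re.compile(
    r"^core\.(?P<exe>.+?)\.(?P<pid>\d+)\.(?P<host>.+)\.(?P<time>\d+)$"
)


def looks_like_core_file(path):
    """Return True if the basename follows the expected core naming."""
    return CORE_NAME_RE.match(os.path.basename(path)) is not None


def suspicious_entries(core_file_list):
    """Return the entries that do not look like core dumps."""
    return [entry for entry in core_file_list if not looks_like_core_file(entry)]


# A few of the entries reported in the NodeStatus above.
reported = ["./srv", "./anaconda-post.log", "./entrypoint.sh", "./var"]
print(suspicious_entries(reported))
```

Every entry in the sample is flagged, whereas a name such as core.contrail-vroute.1234.nodeg12.1522083698 (a made-up example) would pass the check.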

This should be fixed by https://bugs.launchpad.net/juniperopenstack/+bug/1757340, hence marking this bug as Fix Committed.

Pulkit Tandon (pulkitt) wrote :

Recently verified on build R5.0-ocata-53.
The issue was not seen, hence closing the bug.
