k8s monitoring fails due to error.

Bug #1595373 reported by Madhuri Kumari
This bug affects 6 people
Affects   Status         Importance   Assigned to       Milestone
Magnum    Fix Released   High         Spyros Trigazis

Bug Description

2016-06-23 09:03:30.625 WARNING magnum.service.periodic [req-030f6686-19a4-4249-b29f-8e8d27b99860 None None] Skip pulling data from bay d7d96734-77be-46f5-8e2b-e1e46f8d6569 due to error: malformed string
2016-06-23 09:03:30.625 TRACE magnum.service.periodic Traceback (most recent call last):
2016-06-23 09:03:30.625 TRACE magnum.service.periodic File "/opt/stack/magnum/magnum/service/periodic.py", line 160, in _send_bay_metrics
2016-06-23 09:03:30.625 TRACE magnum.service.periodic monitor.pull_data()
2016-06-23 09:03:30.625 TRACE magnum.service.periodic File "/opt/stack/magnum/magnum/conductor/k8s_monitor.py", line 44, in pull_data
2016-06-23 09:03:30.625 TRACE magnum.service.periodic self.data['nodes'] = self._parse_node_info(nodes)
2016-06-23 09:03:30.625 TRACE magnum.service.periodic File "/opt/stack/magnum/magnum/conductor/k8s_monitor.py", line 161, in _parse_node_info
2016-06-23 09:03:30.625 TRACE magnum.service.periodic capacity = ast.literal_eval(node.status.capacity)
2016-06-23 09:03:30.625 TRACE magnum.service.periodic File "/usr/lib/python2.7/ast.py", line 80, in literal_eval
2016-06-23 09:03:30.625 TRACE magnum.service.periodic return _convert(node_or_string)
2016-06-23 09:03:30.625 TRACE magnum.service.periodic File "/usr/lib/python2.7/ast.py", line 79, in _convert
2016-06-23 09:03:30.625 TRACE magnum.service.periodic raise ValueError('malformed string')
2016-06-23 09:03:30.625 TRACE magnum.service.periodic ValueError: malformed string

Changed in magnum:
assignee: nobody → Madhuri Kumari (madhuri-rai07)
hongbin (hongbin034)
Changed in magnum:
status: New → Triaged
importance: Undecided → High
huang.huayong (huang-huayong2) wrote :

I also hit this error. I think node.status.capacity may be malformed some of the time.

information type: Public → Public Security
information type: Public Security → Private Security
information type: Private Security → Public
Michael liu (ztehypervisor) wrote :

I hit this bug too and traced it. It turns out to be a bug in python-k8sclient: in class V1NodeStatus, the swagger type 'object' should be changed to 'str'. Currently it is:

        self.swagger_types = {
            'capacity': 'object',
            'allocatable': 'object',
            'phase': 'str',
            'conditions': 'list[V1NodeCondition]',
            'addresses': 'list[V1NodeAddress]',
            'daemon_endpoints': 'V1NodeDaemonEndpoints',
            'node_info': 'V1NodeSystemInfo',
            'images': 'list[V1ContainerImage]'
        }

like this:

        self.swagger_types = {
            'capacity': 'str',
            'allocatable': 'str',
            'phase': 'str',
            'conditions': 'list[V1NodeCondition]',
            'addresses': 'list[V1NodeAddress]',
            'daemon_endpoints': 'V1NodeDaemonEndpoints',
            'node_info': 'V1NodeSystemInfo',
            'images': 'list[V1ContainerImage]'
        }
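
To illustrate why the declared type matters here, the following is a minimal sketch, assuming the client hands back either a string representation of the capacity map or an already-deserialized dict. The sample values and the helper name `try_literal_eval` are hypothetical and are not taken from Magnum or python-k8sclient. It is written for Python 3, where the error text differs from the 'malformed string' message in the Python 2.7 traceback above, but the exception type is the same.

```python
import ast

# Hypothetical stand-ins for what node.status.capacity might contain.
capacity_as_str = "{'cpu': '1', 'memory': '2048Ki', 'pods': '40'}"
capacity_as_dict = {'cpu': '1', 'memory': '2048Ki', 'pods': '40'}


def try_literal_eval(value):
    """Return the parsed value, or the ValueError raised by literal_eval."""
    try:
        return ast.literal_eval(value)
    except ValueError as exc:
        return exc


# A string holding a dict literal parses into a dict.
parsed = try_literal_eval(capacity_as_str)

# A dict is neither a string nor an AST node, so literal_eval rejects it.
failed = try_literal_eval(capacity_as_dict)
```

Under these assumptions, `parsed` is a dict and `failed` is a `ValueError`, which matches the failure mode in the traceback when the value is no longer a string.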

Michael liu (ztehypervisor) wrote :

@Madhuri Kumari, hi Madhuri, are you working on this bug? If not, can I fix it?

Changed in magnum:
assignee: Madhuri Kumari (madhuri-rai07) → Michael liu (ztehypervisor)
yatin (yatinkarel) wrote :

<<< I hit this bug too and traced it. It turns out to be a bug in python-k8sclient: in class V1NodeStatus, the swagger type 'object' should be changed to 'str'.

k8sclient is code generated by swagger-codegen.

I think it broke because of the following changes in Kubernetes:
Kubernetes: https://github.com/kubernetes/kubernetes/pull/22897/files
Generated k8sclient for the v1.2 API: https://github.com/openstack/python-k8sclient/commit/5d1a429016a22c18d9e999a0bb482fa316dbff85

yatin (yatinkarel) wrote :

@Michael, are you working on this bug? If not, I would like to work on it.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to magnum (master)

Fix proposed to branch: master
Review: https://review.openstack.org/397902

Changed in magnum:
assignee: Michael liu (ztehypervisor) → Jason Dunsmore (jasondunsmore)
status: Triaged → In Progress
Jason Dunsmore (jasondunsmore) wrote :

I posted a fix for the version of this bug that I was seeing. Will y'all confirm that this fixes it on your end? It's possible that other conditions (besides the OutOfDisk status) also lead to this bug.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on magnum (master)

Change abandoned by Jason Dunsmore (<email address hidden>) on branch: master
Review: https://review.openstack.org/397902
Reason: You're right, it's due to the k8sclient change

Michal Jura (mjura) wrote :

After applying the changes from comment #2, I hit another issue:

2016-12-27 14:17:00.327 7959 ERROR oslo.messaging._drivers.impl_rabbit [-] [7b051fde-a2b6-491b-a8d8-dbba0ca41e2e] AMQP server on 192.168.124.81:5672 is unreachable: [Errno 104] Connection reset by peer. Trying again in 1 seconds. Client port: 27146
2016-12-27 14:31:13.342 7959 WARNING magnum.service.periodic [req-ccd0c2d7-a916-42b5-ad74-8846d967107b - - - - -] Skip pulling data from cluster 0e2a5cb5-5787-4e1b-ba0e-a18b9efafc41 due to error: malformed string
2016-12-27 14:31:13.342 7959 ERROR magnum.service.periodic Traceback (most recent call last):
2016-12-27 14:31:13.342 7959 ERROR magnum.service.periodic File "/usr/lib/python2.7/site-packages/magnum/service/periodic.py", line 210, in _send_cluster_metrics
2016-12-27 14:31:13.342 7959 ERROR magnum.service.periodic monitor.pull_data()
2016-12-27 14:31:13.342 7959 ERROR magnum.service.periodic File "/usr/lib/python2.7/site-packages/magnum/conductor/k8s_monitor.py", line 46, in pull_data
2016-12-27 14:31:13.342 7959 ERROR magnum.service.periodic self.data['pods'] = self._parse_pod_info(pods)
2016-12-27 14:31:13.342 7959 ERROR magnum.service.periodic File "/usr/lib/python2.7/site-packages/magnum/conductor/k8s_monitor.py", line 116, in _parse_pod_info
2016-12-27 14:31:13.342 7959 ERROR magnum.service.periodic limits = ast.literal_eval(limits)
2016-12-27 14:31:13.342 7959 ERROR magnum.service.periodic File "/usr/lib64/python2.7/ast.py", line 80, in literal_eval
2016-12-27 14:31:13.342 7959 ERROR magnum.service.periodic return _convert(node_or_string)
2016-12-27 14:31:13.342 7959 ERROR magnum.service.periodic File "/usr/lib64/python2.7/ast.py", line 79, in _convert
2016-12-27 14:31:13.342 7959 ERROR magnum.service.periodic raise ValueError('malformed string')
2016-12-27 14:31:13.342 7959 ERROR magnum.service.periodic ValueError: malformed string
2016-12-27 14:31:13.342 7959 ERROR magnum.service.periodic

Michal Jura (mjura) wrote :

There is also a problem in class V1ResourceRequirements(object).

Currently it is:

        self.swagger_types = {
            'limits': 'object',
            'requests': 'object'
        }

        self.attribute_map = {
            'limits': 'limits',
            'requests': 'requests'
        }

and should be

        self.swagger_types = {
            'limits': 'str',
            'requests': 'str'
        }

        self.attribute_map = {
            'limits': 'limits',
            'requests': 'requests'
        }

Michal Jura (mjura) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/415455/

Spyros Trigazis (strigazi) wrote :

I proposed this change [1], based on this PR [2] in swagger-api/swagger-codegen.

[1] https://review.openstack.org/423709
[2] https://github.com/swagger-api/swagger-codegen/pull/1192

OpenStack Infra (hudson-openstack) wrote : Fix proposed to magnum (master)

Fix proposed to branch: master
Review: https://review.openstack.org/423710

Changed in magnum:
assignee: Jason Dunsmore (jasondunsmore) → Spyros Trigazis (strigazi)
OpenStack Infra (hudson-openstack) wrote : Fix merged to magnum (master)

Reviewed: https://review.openstack.org/423710
Committed: https://git.openstack.org/cgit/openstack/magnum/commit/?id=52be59345bc7e78917c6d145a927c157e436e78a
Submitter: Jenkins
Branch: master

commit 52be59345bc7e78917c6d145a927c157e436e78a
Author: Spyros Trigazis <email address hidden>
Date: Sat Jan 21 23:22:01 2017 +0100

    Fix getting capacity in k8s_monitor

    Remove parsing literal for capacity. K8s client returns an object
    now.

    Change-Id: I26b3e529ee69ea9e48e0bedfbf95dd77d2b78593
    Depends-On: Ia55d01a7cfd6e11448272e5859dd84e40147b618
    Closes-Bug: #1595373
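
The commit message above says the literal parsing was removed because the client now returns an object. As an illustration only, and not the merged change, a caller that has to cope with both client behaviours could normalize the value first. This is a minimal sketch in Python 3; the helper name `parse_resource_map` and the sample values are hypothetical.

```python
import ast


def parse_resource_map(value):
    """Normalize a capacity/limits value to a dict.

    Hypothetical helper: accepts None, an already-deserialized dict,
    or a string containing a dict literal.
    """
    if value is None:
        return {}
    if isinstance(value, dict):
        # Newer generated client: the value is already an object.
        return value
    if isinstance(value, str):
        # Older generated client: the value is a string to be parsed.
        parsed = ast.literal_eval(value)
        if not isinstance(parsed, dict):
            raise ValueError("expected a dict literal, got %r" % (parsed,))
        return parsed
    raise TypeError("unsupported resource value type: %s" % type(value).__name__)


from_str = parse_resource_map("{'cpu': '1', 'memory': '2048Ki'}")
from_dict = parse_resource_map({'cpu': '1', 'memory': '2048Ki'})
from_none = parse_resource_map(None)
```

Accepting both shapes avoids coupling the monitor to one generated version of the client, at the cost of a little extra branching. Simply dropping the `ast.literal_eval` call, as the commit message describes, is the simpler choice when only the newer client is supported.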

Changed in magnum:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/magnum 4.0.0

This issue was fixed in the openstack/magnum 4.0.0 release.
