Creating pods results in "EOF occurred in violation of protocol" exception

Bug #1506226 reported by hongbin
This bug affects 4 people
Affects: Magnum
Status: Fix Released
Importance: Critical
Assigned to: hongbin
Milestone:

Bug Description

Bertrand NOEL reported the following on the mailing list:

Hi,
I tried Magnum, following the instructions on the quickstart page [1]. I
successfully created the baymodel and the bay. When I run the command to
create the redis pods (magnum pod-create --manifest ./redis-master.yaml
--bay k8sbay), the client times out, and on the server side (m-cond.log)
I get the stack trace below. It also happens with the other Kubernetes
examples.
I tried this on Ubuntu 14.04, with Magnum at commit
fc8f412c87ea0f9dc0fc1c24963013e6d6209f27.

2015-10-14 12:16:40.877 ERROR oslo_messaging.rpc.dispatcher [req-960570cf-17b2-489f-9376-81890e2bf2d8 admin admin] Exception during message handling: [Errno 8] _ssl.c:510: EOF occurred in violation of protocol
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     executor_callback))
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     executor_callback)
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 129, in _do_dispatch
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/magnum/magnum/conductor/handlers/k8s_conductor.py", line 89, in pod_create
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     namespace='default')
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/apis/apiv_api.py", line 3596, in create_namespaced_pod
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     callback=params.get('callback'))
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/api_client.py", line 320, in call_api
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     response_type, auth_settings, callback)
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/api_client.py", line 148, in __call_api
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     post_params=post_params, body=body)
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/api_client.py", line 350, in request
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     body=body)
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/rest.py", line 265, in POST
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     return self.IMPL.POST(*n, **kw)
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/rest.py", line 187, in POST
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     body=body)
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/rest.py", line 133, in request
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     headers=headers)
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/urllib3/request.py", line 72, in request
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     **urlopen_kw)
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/urllib3/request.py", line 149, in request_encode_body
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     return self.urlopen(method, url, **extra_kw)
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/urllib3/poolmanager.py", line 161, in urlopen
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     response = conn.urlopen(method, u.request_uri, **kw)
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 588, in urlopen
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher     raise SSLError(e)
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher SSLError: [Errno 8] _ssl.c:510: EOF occurred in violation of protocol
2015-10-14 12:16:40.877 TRACE oslo_messaging.rpc.dispatcher
2015-10-14 12:16:40.879 ERROR oslo_messaging._drivers.common [req-960570cf-17b2-489f-9376-81890e2bf2d8 admin admin] Returning exception [Errno 8] _ssl.c:510: EOF occurred in violation of protocol to caller
2015-10-14 12:16:40.879 ERROR oslo_messaging._drivers.common [req-960570cf-17b2-489f-9376-81890e2bf2d8 admin admin] ['Traceback (most recent call last):\n', '  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply\n    executor_callback))\n', '  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch\n    executor_callback)\n', '  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 129, in _do_dispatch\n    result = func(ctxt, **new_args)\n', '  File "/opt/stack/magnum/magnum/conductor/handlers/k8s_conductor.py", line 89, in pod_create\n    namespace=\'default\')\n', '  File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/apis/apiv_api.py", line 3596, in create_namespaced_pod\n    callback=params.get(\'callback\'))\n', '  File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/api_client.py", line 320, in call_api\n    response_type, auth_settings, callback)\n', '  File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/api_client.py", line 148, in __call_api\n    post_params=post_params, body=body)\n', '  File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/api_client.py", line 350, in request\n    body=body)\n', '  File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/rest.py", line 265, in POST\n    return self.IMPL.POST(*n, **kw)\n', '  File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/rest.py", line 187, in POST\n    body=body)\n', '  File "/opt/stack/magnum/magnum/common/pythonk8sclient/swagger_client/rest.py", line 133, in request\n    headers=headers)\n', '  File "/usr/local/lib/python2.7/dist-packages/urllib3/request.py", line 72, in request\n    **urlopen_kw)\n', '  File "/usr/local/lib/python2.7/dist-packages/urllib3/request.py", line 149, in request_encode_body\n    return self.urlopen(method, url, **extra_kw)\n', '  File "/usr/local/lib/python2.7/dist-packages/urllib3/poolmanager.py", line 161, in urlopen\n    response = conn.urlopen(method, u.request_uri, **kw)\n', '  File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 588, in urlopen\n    raise SSLError(e)\n', 'SSLError: [Errno 8] _ssl.c:510: EOF occurred in violation of protocol\n']
2015-10-14 12:16:40.880 ERROR oslo_messaging._utils [req-960570cf-17b2-489f-9376-81890e2bf2d8 admin admin] The dispatcher method must catches all exceptions
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils Traceback (most recent call last):
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_utils.py", line 81, in run
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils     self._executor_callback)
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 152, in _dispatch_and_reply
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils     incoming.reply(failure=exc_info)
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 101, in reply
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils     self._send_reply(conn, reply, failure, log_failure=log_failure)
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 59, in _send_reply
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils     log_failure)
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/common.py", line 199, in serialize_remote_exception
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils     json_data = jsonutils.dumps(data)
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils   File "/usr/local/lib/python2.7/dist-packages/oslo_serialization/jsonutils.py", line 185, in dumps
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils     return json.dumps(obj, default=default, **kwargs)
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils   File "/usr/lib/python2.7/json/__init__.py", line 250, in dumps
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils     sort_keys=sort_keys, **kw).encode(obj)
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils   File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils     chunks = self.iterencode(o, _one_shot=True)
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils   File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils     return _iterencode(o, 0)
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils ValueError: Circular reference detected
2015-10-14 12:16:40.880 TRACE oslo_messaging._utils

[1] http://docs.openstack.org/developer/magnum/dev/dev-quickstart.html

hongbin (hongbin034)
Changed in magnum:
importance: Undecided → Critical
status: New → Confirmed
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to magnum (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/235010

Ton Ngo (ton-i) wrote :

This problem occurs when TLS is enabled in Kubernetes. It is observed on both the previous and the current atomic-5 image, so it does not appear to be related to the images.
kube-apiserver does not seem to be configured properly for TLS, since the API call returns EOF.
This also causes kube-proxy and kubelet on the nodes to fail, since they cannot query kube-apiserver.
It could also be that the client is not configured properly to make TLS API calls to kube-apiserver.
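
One way to narrow this down is to probe the secure port directly from the conductor host. A quick check, as a sketch (the address is illustrative; substitute the bay master's floating IP):

$ # 6443 is the Kubernetes secure API port.
$ openssl s_client -connect <master-floating-ip>:6443

If the connection is refused or is closed before the TLS handshake completes, the failure is at the network/port level rather than in the certificates; a completed handshake that prints the server certificate points instead at a client-side TLS configuration issue.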

hongbin (hongbin034)
Changed in magnum:
assignee: nobody → hongbin (hongbin034)
Changed in magnum:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to magnum (master)

Reviewed: https://review.openstack.org/235010
Committed: https://git.openstack.org/cgit/openstack/magnum/commit/?id=da294e0ced1da7bd2492f07b600f2076caf72591
Submitter: Jenkins
Branch: master

commit da294e0ced1da7bd2492f07b600f2076caf72591
Author: Hongbin Lu <email address hidden>
Date: Wed Oct 14 18:50:56 2015 -0400

    Open port 6443 in security group for k8s bay

    This should fix a SSLError, which is due to the blocking of the port.

    Change-Id: I8cee3b4189bc84d095461abffefd969796ca7774
    Closes-Bug: #1506226
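
For bays created before this fix landed, a manual workaround in the same spirit should work: add the rule to the bay's security group yourself. A sketch (the group ID is illustrative; find the bay's group with neutron security-group-list):

$ neutron security-group-rule-create --direction ingress --protocol tcp \
      --port-range-min 6443 --port-range-max 6443 <bay-secgroup-id>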

Changed in magnum:
status: In Progress → Fix Committed
Bertrand NOEL (bertrand-noel-88) wrote :

It seems this bug came back. At commit 'Merge "Fix incorrect usage of CertManager in k8s_api"' (d59d4c24655a9f08a9ae723b8a6e73228143b8ee), I can reproduce this error.

Changed in magnum:
status: Fix Committed → Confirmed
Bertrand NOEL (bertrand-noel-88) wrote :

The problem was coming from my config: Google DNS was not reachable from my environment. After changing to my own DNS server, it works.
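
If DNS is the suspect, a quick sanity check from inside a bay node might look like this (a sketch; the ssh user depends on the image, e.g. 'minion' on the fedora-atomic images):

$ ssh minion@<master-floating-ip>
$ cat /etc/resolv.conf          # should list the --dns-nameserver given to the baymodel
$ nslookup discovery.etcd.io    # fails if the configured nameserver is unreachable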

Changed in magnum:
status: Confirmed → Fix Released
Ligong Duan (duanlg) wrote :

Hi, this issue also occurs in my devstack+magnum environment when I try to execute "magnum pod-create". Below is the detailed trace. I've checked the source code, and the fix has already been applied to my code.

Any suggestions?

-----------------
2015-11-06 23:32:43.534 DEBUG magnum.conductor.handlers.k8s_conductor [req-13e11083-e60b-4334-acee-2040ab068b05 admin admin] pod_create from (pid=11232) pod_create /opt/stack/magnum/magnum/conductor/handlers/k8s_conductor.py:84
2015-11-06 23:32:43.543 WARNING magnum.common.cert_manager.local_cert_manager [req-13e11083-e60b-4334-acee-2040ab068b05 admin admin] Loading certificate bc914ed4-fd10-4e3b-9b49-af89cabf1e53 from the local filesystem. CertManager type 'local' should be used for testing purpose.
2015-11-06 23:32:43.551 WARNING magnum.common.cert_manager.local_cert_manager [req-13e11083-e60b-4334-acee-2040ab068b05 admin admin] Loading certificate 5609ea85-0f86-4a5b-a5de-b15e8b3866f7 from the local filesystem. CertManager type 'local' should be used for testing purpose.
/usr/local/lib/python2.7/dist-packages/urllib3/util/ssl_.py:100: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
2015-11-06 23:32:43.568 ERROR oslo_messaging.rpc.dispatcher [req-13e11083-e60b-4334-acee-2040ab068b05 admin admin] Exception during message handling: [Errno 8] _ssl.c:510: EOF occurred in violation of protocol
2015-11-06 23:32:43.568 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2015-11-06 23:32:43.568 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2015-11-06 23:32:43.568 TRACE oslo_messaging.rpc.dispatcher     executor_callback))
2015-11-06 23:32:43.568 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2015-11-06 23:32:43.568 TRACE oslo_messaging.rpc.dispatcher     executor_callback)
2015-11-06 23:32:43.568 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 129, in _do_dispatch
2015-11-06 23:32:43.568 TRACE oslo_messaging.rpc.dispatcher     result = func(ctxt, **new_args)
2015-11-06 23:32:43.568 TRACE oslo_messaging.rpc.dispatcher   File "/opt/stack/magnum/magnum/conductor/handlers/k8s_conductor.py", line 89, in pod_create
2015-11-06 23:32:43.568 TRACE oslo_messaging.rpc.dispatcher     namespace='default')
2015-11-06 23:3...


hongbin (hongbin034) wrote :

This error means Magnum is not able to communicate with the k8s cluster. There could be many reasons:

1. The k8s processes don't start properly in the cluster.
2. The k8s processes start properly, but Magnum and the k8s cluster are disconnected (due to nova, neutron, etc.).
3. Magnum is able to communicate with the k8s API server, but is not able to pass authentication due to an issue with the TLS certificates.

To verify #1, you can ssh into the nova instances and check that k8s started properly. Reference: https://github.com/kubernetes/kubernetes/wiki/Debugging-FAQ
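
For example, on the fedora-atomic images, where the k8s services run under systemd, a minimal check might look like this (node address and user are illustrative):

$ ssh minion@<node-floating-ip>
$ sudo systemctl status kube-apiserver            # on the master
$ sudo journalctl -u kube-apiserver | tail -n 50  # recent apiserver logs
$ sudo systemctl status kubelet kube-proxy        # on the minions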

To verify #3, you can disable TLS and check if it works:

$ magnum baymodel-delete k8sbaymodel # delete the old baymodel
$ magnum baymodel-create --name k8sbaymodel \
                       --image-id fedora-21-atomic-5 \
                       --keypair-id testkey \
                       --external-network-id public \
                       --dns-nameserver 8.8.8.8 \
                       --flavor-id m1.small \
                       --docker-volume-size 5 \
                       --network-driver flannel \
                       --coe kubernetes \
                       --tls-disabled

Then, create a new bay based on the new baymodel.
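
For example, as in the quickstart [1]:

$ magnum bay-create --name k8sbay \
                    --baymodel k8sbaymodel \
                    --node-count 1

If pod-create succeeds against the TLS-disabled bay, the problem is most likely in the certificate setup (#3) rather than in networking.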

Ligong Duan (duanlg) wrote :

Thanks Hongbin, this is very helpful.
