test_network_policies fails due to Fatal error on SSL transport

Bug #2007997 reported by Konstantinos Kaskavelis
This bug affects 1 person
Affects: Charmed Kubernetes Testing
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

The SQA team has a failed run (yoga on focal) in which a Juju controller is bootstrapped on top of OpenStack and Kubernetes is deployed on it. When the k8s-suite runs, the test_network_policies test fails with the following error:

Task was destroyed but it is pending!
task: <Task pending name='Task-34' coro=<WebSocketCommonProtocol.keepalive_ping() running at /home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/websockets/protocol.py:977> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fab24dd7490>()]>>
Task was destroyed but it is pending!
task: <Task pending name='Task-41' coro=<Connection._pinger() running at /home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/juju/client/connection.py:583> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fab256782e0>()]>>
Fatal error on SSL transport
protocol: <asyncio.sslproto.SSLProtocol object at 0x7fab24de2af0>
transport: <_SelectorSocketTransport closing fd=10>
Traceback (most recent call last):
  File "/usr/lib/python3.8/asyncio/selector_events.py", line 910, in write
    n = self._sock.send(data)
OSError: [Errno 9] Bad file descriptor

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/asyncio/sslproto.py", line 685, in _process_write_backlog
    self._transport.write(chunk)
  File "/usr/lib/python3.8/asyncio/selector_events.py", line 916, in write
    self._fatal_error(exc, 'Fatal write error on socket transport')
  File "/usr/lib/python3.8/asyncio/selector_events.py", line 711, in _fatal_error
    self._force_close(exc)
  File "/usr/lib/python3.8/asyncio/selector_events.py", line 723, in _force_close
    self._loop.call_soon(self._call_connection_lost, exc)
  File "/usr/lib/python3.8/asyncio/base_events.py", line 719, in call_soon
    self._check_closed()
  File "/usr/lib/python3.8/asyncio/base_events.py", line 508, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
Task was destroyed but it is pending!
task: <Task pending name='Task-33' coro=<WebSocketCommonProtocol.transfer_data() running at /home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/websockets/protocol.py:674> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fab256d2fd0>()]> cb=[<TaskWakeupMethWrapper object at 0x7fab24dd7ca0>(), _wait.<locals>._on_completion() at /usr/lib/python3.8/asyncio/tasks.py:518]>
Task was destroyed but it is pending!
task: <Task pending name='Task-35' coro=<WebSocketCommonProtocol.close_connection() running at /home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/websockets/protocol.py:1019> wait_for=<Task pending name='Task-33' coro=<WebSocketCommonProtocol.transfer_data() running at /home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/websockets/protocol.py:674> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fab256d2fd0>()]> cb=[<TaskWakeupMethWrapper object at 0x7fab24dd7ca0>(), _wait.<locals>._on_completion() at /usr/lib/python3.8/asyncio/tasks.py:518]>>
Exception ignored in: <generator object WebSocketCommonProtocol.close_connection at 0x7fab24de1190>
Traceback (most recent call last):
  File "/home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/websockets/protocol.py", line 1056, in close_connection
    if (yield from self.wait_for_connection_lost()):
  File "/home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/websockets/protocol.py", line 1077, in wait_for_connection_lost
    yield from asyncio.wait_for(
  File "/usr/lib/python3.8/asyncio/tasks.py", line 472, in wait_for
    timeout_handle = loop.call_later(timeout, _release_waiter, waiter)
  File "/usr/lib/python3.8/asyncio/base_events.py", line 687, in call_later
    timer = self.call_at(self.time() + delay, callback, *args,
  File "/usr/lib/python3.8/asyncio/base_events.py", line 698, in call_at
    self._check_closed()
  File "/usr/lib/python3.8/asyncio/base_events.py", line 508, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
Task was destroyed but it is pending!
task: <Task pending name='Task-36' coro=<Connection._receiver() running at /home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/juju/client/connection.py:535> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fab256d2dc0>()]>>
Task was destroyed but it is pending!
task: <Task pending name='Task-122' coro=<WebSocketCommonProtocol.recv() running at /home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/websockets/protocol.py:416> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fab256d2370>()]> cb=[create_task_with_handler.<locals>._task_result_exp_handler(task_name='tmp', logger=<Logger juju....ection (INFO)>)() at /home/ubuntu/k8s-validation/.tox/py3/lib/python3.8/site-packages/juju/jasyncio.py:64]>
Task was destroyed but it is pending!
task: <Task pending name='Task-123' coro=<Event.wait() running at /usr/lib/python3.8/asyncio/locks.py:309> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fab29605850>()]>>
ERROR: InvocationError for command /home/ubuntu/k8s-validation/.tox/py3/bin/pytest -v -s --junit-xml=/home/ubuntu/project/generated/kubernetes-openstack/k8s-suite/test_network_policies-junit.xml --controller=foundation-openstack --model=kubernetes-openstack /home/ubuntu/k8s-validation/jobs/integration/validation.py::test_network_policies (exited with code 1)
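The "Task was destroyed but it is pending!" and "Event loop is closed" messages above are a generic asyncio symptom rather than the root cause: the event loop was closed while the websocket keepalive and connection tasks were still running, so their late writes hit a closed loop. A minimal sketch of the pattern (not the test suite's actual code; the `keepalive` coroutine here just stands in for the long-running websocket tasks):

```python
import asyncio

async def keepalive():
    # Stand-in for a long-lived background task such as
    # websockets' keepalive_ping() or juju's Connection._pinger().
    while True:
        await asyncio.sleep(3600)

def broken_shutdown():
    # Closing the loop while a task is still pending is what produces
    # "Task was destroyed but it is pending!" when the task is later
    # garbage-collected, and any write attempted after close() raises
    # "RuntimeError: Event loop is closed".
    loop = asyncio.new_event_loop()
    task = loop.create_task(keepalive())
    loop.run_until_complete(asyncio.sleep(0))  # let the task start
    loop.close()                               # task is still pending here
    return task

def clean_shutdown():
    # The fix: cancel and drain pending tasks before closing the loop.
    loop = asyncio.new_event_loop()
    task = loop.create_task(keepalive())
    loop.run_until_complete(asyncio.sleep(0))
    task.cancel()
    loop.run_until_complete(asyncio.gather(task, return_exceptions=True))
    loop.close()
    return task

pending = broken_shutdown()
done = clean_shutdown()
print(pending.done(), done.cancelled())  # the first task never completed; the second was cancelled cleanly
```

In this failure the warnings are collateral damage from the test aborting, not the bug itself; the actionable error is the one in the comment below.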

Test run:

https://solutions.qa.canonical.com/v2/testruns/932c10a8-2437-42f9-8777-5ee020ca6aa1

Logs:

https://oil-jenkins.canonical.com/artifacts/932c10a8-2437-42f9-8777-5ee020ca6aa1/index.html

Revision history for this message
George Kraft (cynerva) wrote :

Traceback (most recent call last):
  File "/home/ubuntu/k8s-validation/jobs/integration/validation.py", line 474, in test_network_policies
    await scp_to(
  File "/home/ubuntu/k8s-validation/jobs/integration/utils.py", line 196, in scp_to
    await asyncify(subprocess.check_call)(cmd.split())
  File "/home/ubuntu/k8s-validation/jobs/integration/utils.py", line 104, in wrapper
    return await loop.run_in_executor(None, partial)
  File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['juju', 'scp', '-m', 'foundation-openstack:admin/kubernetes-openstack', '/home/ubuntu/.local/share/juju/tmpkny3oqxm/netpolicy-test.yaml', 'kubernetes-control-plane/0:netpolicy-test.yaml']' returned non-zero exit status 1.

And the command output:

ERROR cannot connect to any address: [172.16.0.170:22]

This appears to be the same underlying issue as https://bugs.launchpad.net/bugs/2007708, in which juju scp commands fail when the kubernetes-control-plane units do not have floating IPs.

Revision history for this message
George Kraft (cynerva) wrote :

PR: https://github.com/charmed-kubernetes/jenkins/pull/1213

The above PR adds a --juju-ssh-proxy option to the validation suite that should allow juju scp calls to work in this scenario. You will need to specify the new option when running test_network_policies.
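For anyone re-running the failed test, the invocation would presumably look something like the following. This is an assumption pieced together from the InvocationError line above plus the new option from the PR; check the merged code for the option's exact spelling and semantics:

```shell
# Hypothetical invocation; only --juju-ssh-proxy is new relative to the
# failing command logged above.
.tox/py3/bin/pytest -v -s \
    --controller=foundation-openstack \
    --model=kubernetes-openstack \
    --juju-ssh-proxy \
    jobs/integration/validation.py::test_network_policies
```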

Changed in charmed-kubernetes-testing:
status: New → Fix Committed
milestone: none → 1.27
Changed in charmed-kubernetes-testing:
status: Fix Committed → Fix Released