network sometimes unreachable accessing our git server

Bug #1298006 reported by Doug Hellmann
This bug affects 4 people
Affects | Status | Importance | Assigned to | Milestone
OpenStack Core Infrastructure | Triaged | Medium | Unassigned |
OpenStack-Gate | Triaged | Medium | Unassigned |

Bug Description

The neutron-large-ops job failed with:

2014-03-26 17:02:26.542 | Fetching origin
2014-03-26 17:02:46.573 | error: Failed to connect to 2001:4800:7813:516:3bc3:d7f6:ff04:aacb: Network is unreachable while accessing https://git.openstack.org/openstack/swift/info/refs
2014-03-26 17:02:46.575 | fatal: HTTP request failed
2014-03-26 17:02:46.577 | error: Could not fetch origin

http://logs.openstack.org/56/82356/2/check/gate-tempest-dsvm-neutron-large-ops/1ca16bc/

Tags: gate-failure
Arx Cruz (arxcruz) wrote :

The last time this happened to me, it was because one of the git0x.openstack.org servers had failed to sync and replicate the repository.

Jeremy Stanley (fungi) wrote :

Right. If memory serves, you observed that while caching git repositories as part of building nodepool images at the same time as we were creating a new project in Gerrit: the nodepool scripts tried to clone a new project listed in projects.yaml while it was still being added to the git server farm. That shouldn't happen in devstack-gate jobs themselves, because they only update repositories that have already been pre-cached on their images, by which point the repositories are already on the git server farm.

Sean Dague (sdague) wrote :

There is a new manifestation of this bug:

2014-06-24 04:13:56.120 | Started by user anonymous
2014-06-24 04:13:56.122 | [EnvInject] - Loading node environment variables.
2014-06-24 04:13:59.483 | Building remotely on devstack-precise-hpcloud-b1-526670 in workspace /home/jenkins/workspace/gate-tempest-dsvm-neutron
2014-06-24 04:14:22.522 | [gate-tempest-dsvm-neutron] $ /bin/sh /tmp/hudson2325080147448468575.sh
2014-06-24 04:14:23.337 | Detailed logs: http://logs.openstack.org/05/95105/16/gate/gate-tempest-dsvm-neutron/3db45bd/
2014-06-24 04:14:24.096 | [gate-tempest-dsvm-neutron] $ /bin/sh /tmp/hudson7919123659949194155.sh
2014-06-24 04:14:24.209 | Network interface addresses...
2014-06-24 04:14:24.210 | 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
2014-06-24 04:14:24.210 | link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2014-06-24 04:14:24.210 | inet 127.0.0.1/8 scope host lo
2014-06-24 04:14:24.210 | inet6 ::1/128 scope host
2014-06-24 04:14:24.210 | valid_lft forever preferred_lft forever
2014-06-24 04:14:24.210 | 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
2014-06-24 04:14:24.210 | link/ether fa:16:3e:bd:8d:35 brd ff:ff:ff:ff:ff:ff
2014-06-24 04:14:24.210 | inet 10.0.0.188/24 brd 10.0.0.255 scope global eth0
2014-06-24 04:14:24.210 | inet6 fe80::f816:3eff:febd:8d35/64 scope link
2014-06-24 04:14:24.210 | valid_lft forever preferred_lft forever
2014-06-24 04:14:24.211 | Network routing tables...
2014-06-24 04:14:24.211 | default via 10.0.0.1 dev eth0 metric 100
2014-06-24 04:14:24.211 | 10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.188
2014-06-24 04:14:24.212 | fe80::/64 dev eth0 proto kernel metric 256
2014-06-24 04:14:24.212 | Network neighbors...
2014-06-24 04:14:24.212 | 10.0.0.1 dev eth0 lladdr fa:16:3e:c4:c8:74 REACHABLE
2014-06-24 04:14:24.786 | [gate-tempest-dsvm-neutron] $ /bin/bash -xe /tmp/hudson3865244588641370424.sh
2014-06-24 04:14:24.823 | + [[ ! -e devstack-gate ]]
2014-06-24 04:14:24.823 | + git clone git://git.openstack.org/openstack-infra/devstack-gate
2014-06-24 04:15:27.930 | fatal: unable to connect to git.openstack.org:
2014-06-24 04:15:27.930 | git.openstack.org[0: 192.237.223.224]: errno=Connection timed out
2014-06-24 04:15:27.931 | git.openstack.org[1: 2001:4800:7813:516:3bc3:d7f6:ff04:aacb]: errno=Network is unreachable
2014-06-24 04:15:27.931 |
2014-06-24 04:15:27.931 | Cloning into 'devstack-gate'...
2014-06-24 04:15:28.121 | Build step 'Execute shell' marked build as failure
2014-06-24 04:15:29.504 | [SCP] Connecting to static.openstack.org
2014-06-24 04:15:33.294 | [SCP] No file(s) found: logs/**
2014-06-24 04:15:33.761 | [SCP] ‘logs/**’ doesn’t match anything, but ‘**’ does. Perhaps that’s what you mean?
2014-06-24 04:15:33.840 | [SCP] Copying console log.
2014-06-24 04:15:37.566 | [SCP] Trying to create /srv/static/logs/05/95105/16/gate/gate-tempest-dsvm-neutron/3db45bd
2014-06-24 04:15:37.595 | Finished: FAILURE

http://logs.openstack.org/05/95105/16/gate/gate-tempest-dsvm-neutron/3db45bd/console.html
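
The git output above shows one failure per resolved address: the IPv4 address timed out and the IPv6 address was unreachable. A minimal Python sketch of that kind of per-address probe follows, assuming nothing about how git or the gate jobs actually do it; `probe_addresses`, `probe_host`, and the `connect` test hook are hypothetical names introduced for illustration.

```python
import socket


def _tcp_connect(family, socktype, proto, sockaddr, timeout):
    """Open and immediately close a TCP connection to one address."""
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        sock.connect(sockaddr)


def probe_addresses(addrinfos, connect=None, timeout=5.0):
    """Try every resolved address in turn and record the outcome.

    addrinfos: tuples in the shape returned by socket.getaddrinfo().
    connect:   hypothetical hook so the probe can be exercised offline;
               defaults to a real TCP connect.
    Returns a list of (address, error) pairs; error is None on success.
    """
    if connect is None:
        connect = _tcp_connect
    results = []
    for family, socktype, proto, _canonname, sockaddr in addrinfos:
        try:
            connect(family, socktype, proto, sockaddr, timeout)
            results.append((sockaddr[0], None))
        except OSError as exc:
            # Timeouts and "Network is unreachable" both surface as OSError.
            results.append((sockaddr[0], exc.strerror or str(exc)))
    return results


def probe_host(host, port, timeout=5.0):
    """Resolve host (IPv4 and IPv6) and probe each address over TCP."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return probe_addresses(infos, timeout=timeout)
```

Calling `probe_host("git.openstack.org", 9418)` on an affected node would be expected to list an error for every address, since 9418 is the port used by the `git://` protocol; a healthy node should report at least one address with no error.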

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to elastic-recheck (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/102181

OpenStack Infra (hudson-openstack) wrote : Related fix merged to elastic-recheck (master)

Reviewed: https://review.openstack.org/102181
Committed: https://git.openstack.org/cgit/openstack-infra/elastic-recheck/commit/?id=fac1bef9430c10b5b9d19583b0f86c63a96855a7
Submitter: Jenkins
Branch: master

commit fac1bef9430c10b5b9d19583b0f86c63a96855a7
Author: Sean Dague <email address hidden>
Date: Tue Jun 24 06:44:10 2014 -0400

    add bug for git.o.o going away

    Change-Id: Ib5b2d2e227e387fdfa3694930cf982cad9595b1a
    Related-Bug: #1298006

Jeremy Stanley (fungi)
Changed in openstack-ci:
status: New → Triaged
importance: Undecided → Medium
tags: added: gate-failure
Matt Riedemann (mriedem) wrote :

Saw this here:

http://logs.openstack.org/28/111328/2/gate/gate-devstack-dsvm-cells/23222cf/logs/devstack-gate-setup-workspace-new.txt.gz

The error is similar but a bit different so e-r didn't catch it:

2014-08-03 21:14:17.478 | fatal: unable to access 'https://git.openstack.org/openstack-infra/devstack-gate/': Failed to connect to git.openstack.org port 443: Connection timed out

We should update the logstash query.
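
To illustrate why a fingerprint written for one wording misses the others, here is a rough Python sketch that matches the error messages quoted so far in this report. It is not the elastic-recheck query itself (that is a logstash query, not Python); the pattern list and the `matches_git_outage` helper are assumptions made up for this example.

```python
import re

# One pattern per wording seen in this report (assumed, not exhaustive).
GIT_OUTAGE_PATTERNS = [
    # HTTPS fetch failing over IPv6.
    re.compile(
        r"error: Failed to connect to \S+ Network is unreachable "
        r"while accessing https?://git\.openstack\.org"
    ),
    # git:// clone failing before any address could be reached.
    re.compile(r"fatal: unable to connect to git\.openstack\.org"),
    # HTTPS clone timing out on a TCP port.
    re.compile(
        r"fatal: unable to access 'https?://git\.openstack\.org/[^']*': "
        r"Failed to connect to git\.openstack\.org port \d+: "
        r"Connection timed out"
    ),
]


def matches_git_outage(line):
    """Return True if a console log line matches any known wording."""
    return any(pattern.search(line) for pattern in GIT_OUTAGE_PATTERNS)
```

A fingerprint containing only the first two patterns would not match the August message, which is the gap described above.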

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to elastic-recheck (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/111617

OpenStack Infra (hudson-openstack) wrote : Related fix merged to elastic-recheck (master)

Reviewed: https://review.openstack.org/111617
Committed: https://git.openstack.org/cgit/openstack-infra/elastic-recheck/commit/?id=939255f9207805837aac7bfda506649492803a2c
Submitter: Jenkins
Branch: master

commit 939255f9207805837aac7bfda506649492803a2c
Author: Matt Riedemann <email address hidden>
Date: Sun Aug 3 20:05:29 2014 -0700

    Update query for bug 129806

    I hit a similar infra/git clone timeout bug
    so update the fingerprint for multiple types
    of fails.

    Change-Id: I69dd2b32072a5961c3de3dcaa46082a31769d25b
    Related-Bug: #1298006

Jeremy Stanley (fungi)
Changed in openstack-gate:
status: New → Triaged
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to project-config (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/308585

OpenStack Infra (hudson-openstack) wrote : Related fix merged to project-config (master)

Reviewed: https://review.openstack.org/308585
Committed: https://git.openstack.org/cgit/openstack-infra/project-config/commit/?id=1c7e030265827066afd0f60c73a7dc39f38ee3f1
Submitter: Jenkins
Branch: master

commit 1c7e030265827066afd0f60c73a7dc39f38ee3f1
Author: Jeremy Stanley <email address hidden>
Date: Wed Apr 20 20:56:32 2016 +0000

    Slowly turn rax-ord back on in nodepool

    The provider indicates they've completed some very recent
    maintenance on private circuits between ORD and DFW regions, so our
    previously witnessed network issues may be solved by that. If this
    still doesn't work, we'll revert again.

    Change-Id: I88f1634b04869e3223481002c4f99a10e5351540
    Related-Bug: #1298006

Dr. Jens Harbott (j-harbott) wrote :

There seems to be another variant in which this issue now manifests itself:

2018-04-07 01:21:29.234042 | TASK [validate-host : Collect information about zuul worker]
2018-04-07 01:21:29.890927 | ubuntu-xenial | ERROR: No viable v4 or v6 route found to git.openstack.org. The build node is assumed to be invalid.

368 hits in the last 7 days; I will update the ES query.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to elastic-recheck (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/559681

OpenStack Infra (hudson-openstack) wrote : Related fix merged to elastic-recheck (master)

Reviewed: https://review.openstack.org/559681
Committed: https://git.openstack.org/cgit/openstack-infra/elastic-recheck/commit/?id=55c0c977af83f91bff67ac727fdfe761f3ad925f
Submitter: Zuul
Branch: master

commit 55c0c977af83f91bff67ac727fdfe761f3ad925f
Author: Jens Harbott <email address hidden>
Date: Mon Apr 9 09:51:30 2018 +0000

    Update query for bug 1298006

    The playbooks now check the connectivity towards git.openstack.org
    before any actual connection is attempted, so we need to check for a new
    message.

    Change-Id: Ieedef559c5c99e39acabf5d07a4d3c53d51ee6b4
    Related-Bug: 1298006
