Timeout errors on git operations

Bug #1042746 reported by Milo Casagrande
Affects                Status         Importance   Assigned to       Milestone
Linaro patch metrics   Fix Released   Medium       Milo Casagrande

Bug Description

Following on from bug 1012037, patches.l.o is still reporting errors on git operations: mostly timeouts, plus a few smaller ones.

The script that produces the errors is this one:

/srv/patches.linaro.org/apps/patchwork/bin/update-committed-patches.py

The problem with these errors is that they cannot easily be reproduced locally.
One idea was to stop suppressing the output of the git commands we run, since we are now storing it. Trying this locally showed some improvement.
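
As an illustration of that idea, here is a minimal sketch of running a command with a timeout while keeping its stdout and stderr instead of discarding them. It assumes Python 3 and is not the code in update-committed-patches.py; `run_command`, `_as_text` and the default timeout of 600 seconds are hypothetical names and values chosen for the example.

```python
import subprocess
import sys


def _as_text(data):
    """Normalise captured output, which may be None, bytes or str."""
    if data is None:
        return ""
    if isinstance(data, bytes):
        return data.decode("utf-8", errors="replace")
    return data


def run_command(cmd, cwd=None, timeout=600):
    """Run cmd and return (returncode, stdout, stderr).

    returncode is None when the command was killed after `timeout` seconds;
    stdout and stderr then hold whatever was captured before the kill.
    """
    try:
        proc = subprocess.run(
            cmd, cwd=cwd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        return None, _as_text(exc.stdout), _as_text(exc.stderr)
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Stand-in command so the sketch runs without a repository; in the
    # script this would be something like ["git", "fetch", "origin"].
    rc, out, err = run_command([sys.executable, "-c", "print('hello')"])
    print(rc, out.strip(), err.strip())
```

Keeping stderr in particular means that a message such as "RPC failed" can be stored next to the timeout report rather than lost.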

Changed in linaro-patchmetrics:
assignee: nobody → Deepti B. Kalakeri (deeptik)
milestone: none → 2012.09
Deepti B. Kalakeri (deeptik) wrote :

Attaching several of the timeout errors we have been seeing lately from the update-committed-patches script.

Changed in linaro-patchmetrics:
importance: Undecided → Medium
Changed in linaro-patchmetrics:
status: New → Confirmed
Deepti B. Kalakeri (deeptik) wrote :

Latest timeout git errors attached.

Deepti B. Kalakeri (deeptik) wrote :

There are now 2 sets of failures.
1) Some of the errors, for example the ones below, occur because branch information is missing for the repository, which makes the git pull commands fail.

(i)
Failed to fetch http://gcc.gnu.org/git/gcc.git (gcc-patches - gcc-patches).
 stdout:
 stderr: From http://gcc.gnu.org/git/gcc
   6a27289..a06bbd8 gcc-4_6-branch -> origin/gcc-4_6-branch
   caab129..18f7be2 gcc-4_7-branch -> origin/gcc-4_7-branch
 * [new branch] hjl/pr54037 -> origin/hjl/pr54037
   53057a1..74e1f4d hjl/x32/gcc-4_7-branch -> origin/hjl/x32/gcc-4_7-branch
   eaf4b52..88fe75b master -> origin/master
   103a005..84972db melt-branch -> origin/melt-branch
   eaf4b52..88fe75b trunk -> origin/trunk
You asked to pull from the remote 'origin', but did not specify
a branch. Because this is not the default configured remote
for your current branch, you must specify a branch on the command line.

(ii)
Timeout when performing git operation on 'Linux wireless' (linux-wireless).
 Project path: /tmp/linux-wireless
 Project tree: http://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next.git
 Killing processes ['30579', '30580', '30593', '30594', '30597', '30598'] and moving on. Output of the kill command: (, kill: No such process
kill: No such process
kill: No such process
kill: No such process
).

To resolve the above errors, the best approach would be to change patchwork to accept both the repository and the branch name when a new project is added to p.l.o.

2) Some repositories are not getting cloned on peony; they fail with errors like the one below.

error: RPC failed; result=18, HTTP code = 200
fatal: protocol error: bad pack header
warning: http unexpectedly said: '0000'
Timeout when performing git operation on 'linux-serial' (linux-serial).
 Project path: /tmp/linux-serial
 Project tree: http://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
 Killing processes ['19534', '19535', '19537', '19538'] and moving on. Output of the kill command: (, kill: No such process
kill: No such process
).
My first guess was that these errors appear because of network issues at the time the script tries to clone the repository.
However, I was not able to reproduce the error listed in (2), and I was able to clone the repository on peony without any problems.
We need to investigate the errors in (2) further.
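
For the first set of failures, the proposal to store a branch alongside the repository could look roughly like the sketch below. This is only an illustration: `ProjectSource`, its fields and the defaults of "origin" and "master" are hypothetical and are not taken from the patchwork code.

```python
from dataclasses import dataclass


@dataclass
class ProjectSource:
    """Hypothetical record of where a project's code lives."""
    linkname: str
    tree: str
    branch: str = "master"   # assumed default, not confirmed by the thread
    remote: str = "origin"

    def pull_command(self):
        # Naming the remote and the branch explicitly avoids git's
        # "did not specify a branch" failure shown in (i).
        return ["git", "pull", self.remote, self.branch]


gcc = ProjectSource("gcc-patches", "http://gcc.gnu.org/git/gcc.git")
print(gcc.pull_command())
```

With the branch stored per project, the pull no longer depends on how the local clone's tracking configuration happens to be set up.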

Milo Casagrande (milo) wrote : Re: [Bug 1042746] Re: Timeout errors on git operations

On Tue, Sep 25, 2012 at 7:48 AM, Deepti B. Kalakeri
<email address hidden> wrote:
>
> To resolve the above errors the best approach would be to change the
> patchwork to accept the repository and the branch name when adding a new
> project to the p.l.o.

Indeed. I think I once discussed such a change with Danilo. I
will try to see this month whether it is possible to do it.
There is a bug to track it:
https://bugs.launchpad.net/linaro-patchmetrics/+bug/1017933

I thought I had added a small change to patchmetrics to default to
the master branch in the case of the GCC repository... We should
check whether it is actually working.

> 2) There are repositories which are not getting cloned on peony as they
> fail with errors like the one below.
>
> error: RPC failed; result=18, HTTP code = 200
> fatal: protocol error: bad pack header
> warning: http unexpectedly said: '0000'
> Timeout when performing git operation on 'linux-serial' (linux-serial).
> Project path: /tmp/linux-serial
> Project tree: http://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
> Killing processes ['19534', '19535', '19537', '19538'] and moving on. Output of the kill command: (, kill: No such process
> kill: No such process
> ).
> My first guess was that these might be appearing because of the network issues at the time when they try to clone it.
> But, I was not able to reproduce the error listed in (2) and was able to clone the repository on peony without any problems.
> We need to investigate further on the errors (2).

I have another option, though it is a shortcut: all these errors come from
git.kernel.org, and all of them via the HTTP protocol. If you look at the
projects you enabled with the GIT protocol, none of them time out.
We might ask IS to enable access to git://git.kernel.org and test with at
least a couple of projects to see whether they stop timing out. This
is only practical if we need to give IS just the "host" to open, and not
the full URL. What do you think? Worth a try?

--
Milo Casagrande
Infrastructure Engineer
Linaro.org <www.linaro.org> │ Open source software for ARM SoCs

Changed in linaro-patchmetrics:
milestone: 2012.09 → 2012.10
Deepti B. Kalakeri (deeptik) wrote :

On 9/25/12, Milo Casagrande <email address hidden> wrote:
> On Tue, Sep 25, 2012 at 7:48 AM, Deepti B. Kalakeri
> <email address hidden> wrote:
>>
>> To resolve the above errors the best approach would be to change the
>> patchwork to accept the repository and the branch name when adding a new
>> project to the p.l.o.
>
> Indeed. I think I once discussed with Danilo about such a change. I
> will try and see this month if it is possible to do it.
> There is a bug to track it:
> https://bugs.launchpad.net/linaro-patchmetrics/+bug/1017933
>
> I thought I had inserted a small change to patchmetrics to default to
> the master branch in case of the GCC repository... It is better to
> check if it is working then.
>
The default branches were added to only a couple of trees, I guess.
The timeouts are now occurring for more repositories.
>> 2) There are repositories which are not getting cloned on peony as they
>> fail with errors like the one below.
>>
>> error: RPC failed; result=18, HTTP code = 200
>> fatal: protocol error: bad pack header
>> warning: http unexpectedly said: '0000'
>> Timeout when performing git operation on 'linux-serial' (linux-serial).
>> Project path: /tmp/linux-serial
>> Project tree:
>> http://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
>> Killing processes ['19534', '19535', '19537', '19538'] and moving on.
>> Output of the kill command: (, kill: No such process
>> kill: No such process
>> ).
>> My first guess was that these might be appearing because of the network
>> issues at the time when they try to clone it.
>> But, I was not able to reproduce the error listed in (2) and was able to
>> clone the repository on peony without any problems.
>> We need to investigate further on the errors (2).
>
> I have another option, but is a shortcut: all these errors come from
> git.kernel.org, all of them via the HTTP protocol. If you look at the
> ones you enabled with the GIT protocol, none of them time out.
> We might ask IS to enable access to git://git.kernel.org and test at
> least with a couple of projects if they do not time out anymore. This
> holds valid if we need to provide IS only the "host" and not the full
> URL to open. What do you think? Worth a try?
>

Sure, it would be worth trying.

Changed in linaro-patchmetrics:
assignee: Deepti B. Kalakeri (deeptik) → Milo Casagrande (milo)
Milo Casagrande (milo) wrote :

On Tue, Sep 25, 2012 at 12:04 PM, Deepti B. Kalakeri
<email address hidden> wrote:
> The default branches were added to only a couple of trees, I guess.
> The timeouts are now occurring for more repositories.

True.

I was referring to the first error we have, which is not a timeout but
a git error about the wrong "branch". When you clone gcc (gcc-patches
in this case), it does not have a default branch set.
We added a workaround for two trees:
REPO_BRANCHES = {'cpufreq': 'next', 'gcc-patches': 'master'}
But for some reason this is not being picked up.
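
A minimal sketch of how such a mapping might be consulted when building the pull command is shown below. Only the `REPO_BRANCHES` dictionary comes from this thread; `pull_args` and the "origin" remote are hypothetical, and which project attribute the real script uses as the key is not stated here. If the key passed in does not match an entry exactly, the lookup quietly falls back to a plain `git pull`, which would be one possible, unconfirmed, way for the workaround to be skipped.

```python
REPO_BRANCHES = {'cpufreq': 'next', 'gcc-patches': 'master'}


def pull_args(project_key, remote="origin"):
    """Return the git pull command for a project (hypothetical helper)."""
    branch = REPO_BRANCHES.get(project_key)
    if branch is None:
        # No override registered under this exact key: let git decide.
        return ["git", "pull"]
    return ["git", "pull", remote, branch]


for key in ("gcc-patches", "cpufreq", "linux-wireless"):
    print(key, pull_args(key))
```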

Milo Casagrande (milo) wrote :

I asked IS to enable the GIT protocol for git.kernel.org, to see whether this resolves some of the timeouts.
For now I have changed 2 projects to use a GIT URL instead of the usual HTTP one. The two projects are cpufreq and linux-usb.
The effect might not be visible today, since I guess a sync job is already running, but tomorrow we should have a better picture of whether it solved the problem.
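
The URL change itself can be sketched as follows. This is an illustration only: `to_git_protocol` is a hypothetical helper, and it assumes the repository path is the same under both protocols and that only hosts explicitly listed (here git.kernel.org) should be switched.

```python
from urllib.parse import urlsplit, urlunsplit


def to_git_protocol(url, hosts=("git.kernel.org",)):
    """Rewrite an HTTP(S) tree URL to git:// for the listed hosts only."""
    parts = urlsplit(url)
    if parts.scheme in ("http", "https") and parts.hostname in hosts:
        # Keep host and path, drop any query string or fragment.
        return urlunsplit(("git", parts.netloc, parts.path, "", ""))
    return url


print(to_git_protocol(
    "http://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git"))
# URLs on other hosts are returned unchanged.
print(to_git_protocol("http://gcc.gnu.org/git/gcc.git"))
```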

Milo Casagrande (milo) wrote :

As of today, all the projects that take code from kernel.org are using the GIT protocol.
We have not yet received the usual error email about timeouts, although that could also be because the process is taking too long to execute.

Milo Casagrande (milo) wrote :

It has been 3 (or 4) days since we last received an error email from patches.l.o about timeouts, although we are still getting the one about being unable to acquire the lock. I guess that means the script is running and doing its job, only that it is taking quite a long time to complete (it might also be that the cloned repositories are quite out of date...).

I'm closing this for this cycle as "Fix Released", but it could be re-opened if the problem arises again.

Changed in linaro-patchmetrics:
status: Confirmed → Fix Released