short readline in the readvfile hunk on smb mount

Bug #360476 reported by Ted Gould
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Bazaar
Invalid
Medium
Martin Pool

Bug Description

ted@shi:~/Photos$ bzr push /media/nas/Photos/
bzr: ERROR: [Errno 37] No locks available
ted@shi:~/Photos$ bzr push /media/nas/Photos/
Unable to obtain lock file:///media/nas/Photos/.bzr/branch/lock
held by <email address hidden> on host shi [process #1166]
locked 8 hours, 56 minutes ago
Will continue to try until 07:51:38, unless you press Ctrl-C
If you're sure that it's not being modified, use bzr break-lock file:///media/nas/Photos/.bzr/branch/lock
^Cbzr: interrupted
ted@shi:~/Photos$ bzr break-lock file:///media/nas/Photos/.bzr/branch/lock
/usr/lib/python2.6/dist-packages/subvertpy/delta.py:21: DeprecationWarning: the md5 module is deprecated; use hashlib instead
  import md5
Break lock file:///media/nas/Photos/.bzr/checkout/lock
held by <email address hidden> on host shi [process #1166]
locked 8 hours, 57 minutes ago? [y/n]: y
Break lock file:///media/nas/Photos/.bzr/branch/lock
held by <email address hidden> on host shi [process #1166]
locked 8 hours, 57 minutes ago? [y/n]: y
ted@shi:~/Photos$ bzr push /media/nas/Photos/
bzr: ERROR: short readline in the readvfile hunk.
ted@shi:~/Photos$ bzr push /media/nas/Photos/
bzr: ERROR: short readline in the readvfile hunk.
ted@shi:~/Photos$ bzr version
Bazaar (bzr) 1.13.1
  Python interpreter: /usr/bin/python 2.6.2
  Python standard library: /usr/lib/python2.6
  bzrlib: /usr/lib/python2.6/dist-packages/bzrlib
  Bazaar configuration: /home/ted/.bazaar
  Bazaar log file: /home/ted/.bzr.log

---

Wed 2009-06-17 14:20:50 -0500
0.039 bzr arguments: [u'push', u'/media/nas/Photos/']
0.051 looking for plugins in /home/ted/.bazaar/plugins
0.196 looking for plugins
in /usr/lib/python2.6/dist-packages/bzrlib/plugins
0.229 encoding stdout as sys.stdout encoding 'UTF-8'
10.805 opening working tree '/media/nas/Photos'
21.586 Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 729, in exception_to_return_code
    return the_callable(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 924, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 560, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/usr/lib/python2.6/dist-packages/bzrlib/builtins.py", line 1077, in run
    use_existing_dir=use_existing_dir)
  File "/usr/lib/python2.6/dist-packages/bzrlib/push.py", line 128, in _show_push_branch
    remember)
  File "/usr/lib/python2.6/dist-packages/bzrlib/bzrdir.py", line 1286, in push_branch
    tree_to.update()
  File "/usr/lib/python2.6/dist-packages/bzrlib/workingtree.py", line 2180, in update
    return self._update_tree(old_tip, change_reporter)
  File "/usr/lib/python2.6/dist-packages/bzrlib/mutabletree.py", line 52, in tree_write_locked
    return unbound(self, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/workingtree.py", line 2211, in _update_tree
    to_tree = self.branch.basis_tree()
  File "/usr/lib/python2.6/dist-packages/bzrlib/branch.py", line 2206, in basis_tree
    return self.repository.revision_tree(self.last_revision())
  File "/usr/lib/python2.6/dist-packages/bzrlib/decorators.py", line 138, in read_locked
    result = unbound(self, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 2313, in revision_tree
    inv = self.get_revision_inventory(revision_id)
  File "/usr/lib/python2.6/dist-packages/bzrlib/decorators.py", line 138, in read_locked
    result = unbound(self, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 2272, in get_revision_inventory
    return self.get_inventory(revision_id)
  File "/usr/lib/python2.6/dist-packages/bzrlib/decorators.py", line 138, in read_locked
    result = unbound(self, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 2164, in get_inventory
    return self.iter_inventories([revision_id]).next()
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 2183, in _iter_inventories
    for text, revision_id in self._iter_inventory_xmls(revision_ids):
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line 2190, in _iter_inventory_xmls
    for record in stream:
  File "/usr/lib/python2.6/dist-packages/bzrlib/knit.py", line 1345, in get_record_stream
    ordering, include_delta_closure):
  File "/usr/lib/python2.6/dist-packages/bzrlib/knit.py", line 1452, in _get_remaining_record_stream
    global_map)
  File "/usr/lib/python2.6/dist-packages/bzrlib/knit.py", line 2172, in __init__
    allow_missing=True)
  File "/usr/lib/python2.6/dist-packages/bzrlib/knit.py", line 1236, in _get_record_map_unparsed
    for key, data in self._read_records_iter_unchecked(records):
  File "/usr/lib/python2.6/dist-packages/bzrlib/knit.py", line 1907, in _read_records_iter_unchecked
    data = raw_records.next()
  File "/usr/lib/python2.6/dist-packages/bzrlib/knit.py", line 3245, in get_raw_records
    for names, read_func in reader.iter_records():
  File "/usr/lib/python2.6/dist-packages/bzrlib/pack.py", line 253, in iter_records
    self._read_format()
  File "/usr/lib/python2.6/dist-packages/bzrlib/pack.py", line 294, in _read_format
    format = self._read_line()
  File "/usr/lib/python2.6/dist-packages/bzrlib/pack.py", line 221, in _read_line
    line = self._source.readline()
  File "/usr/lib/python2.6/dist-packages/bzrlib/pack.py", line 188, in readline
    raise errors.BzrError('short readline in the readvfile hunk.')
BzrError: short readline in the readvfile hunk.

Revision history for this message
Martin Pool (mbp) wrote :

Thanks for the report Ted.

Could you please paste the traceback from ~/.bzr.log, and tell us what kind of filesystem and/or device this is?

Changed in bzr:
status: New → Incomplete
Revision history for this message
Ted Gould (ted) wrote : Re: [Bug 360476] Re: short readline in the readvfile hunk
Download full text (4.4 KiB)

On Wed, 2009-06-17 at 02:00 +0000, Martin Pool wrote:
> Could you please paste the traceback from ~/.bzr.log, and tell us what
> kind of filesystem and/or device this is?

The device is a Linksys NAS200. Basically it's a couple of HDDs
attached to a network interface. I'm mounting it over SMB using the
Linux filesystem driver. I use it for backups, and this archive is my
photos. So I'm pushing it to the backup copy of the repository.

Wed 2009-06-17 14:20:50 -0500
0.039 bzr arguments: [u'push', u'/media/nas/Photos/']
0.051 looking for plugins in /home/ted/.bazaar/plugins
0.196 looking for plugins
in /usr/lib/python2.6/dist-packages/bzrlib/plugins
0.229 encoding stdout as sys.stdout encoding 'UTF-8'
10.805 opening working tree '/media/nas/Photos'
21.586 Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 729,
in exception_to_return_code
    return the_callable(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 924,
in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 560,
in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/usr/lib/python2.6/dist-packages/bzrlib/builtins.py", line 1077,
in run
    use_existing_dir=use_existing_dir)
  File "/usr/lib/python2.6/dist-packages/bzrlib/push.py", line 128, in
_show_push_branch
    remember)
  File "/usr/lib/python2.6/dist-packages/bzrlib/bzrdir.py", line 1286,
in push_branch
    tree_to.update()
  File "/usr/lib/python2.6/dist-packages/bzrlib/workingtree.py", line
2180, in update
    return self._update_tree(old_tip, change_reporter)
  File "/usr/lib/python2.6/dist-packages/bzrlib/mutabletree.py", line
52, in tree_write_locked
    return unbound(self, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/workingtree.py", line
2211, in _update_tree
    to_tree = self.branch.basis_tree()
  File "/usr/lib/python2.6/dist-packages/bzrlib/branch.py", line 2206,
in basis_tree
    return self.repository.revision_tree(self.last_revision())
  File "/usr/lib/python2.6/dist-packages/bzrlib/decorators.py", line
138, in read_locked
    result = unbound(self, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line
2313, in revision_tree
    inv = self.get_revision_inventory(revision_id)
  File "/usr/lib/python2.6/dist-packages/bzrlib/decorators.py", line
138, in read_locked
    result = unbound(self, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line
2272, in get_revision_inventory
    return self.get_inventory(revision_id)
  File "/usr/lib/python2.6/dist-packages/bzrlib/decorators.py", line
138, in read_locked
    result = unbound(self, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line
2164, in get_inventory
    return self.iter_inventories([revision_id]).next()
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line
2183, in _iter_inventories
    for text, revision_id in self._iter_inventory_xmls(revision_ids):
  File "/usr/lib/python2.6/dist-packages/bzrlib/repository.py", line
2190, in _iter_inventory_xmls
    for record in stream:
  Fi...

Read more...

Martin Pool (mbp)
summary: - short readline in the readvfile hunk
+ short readline in the readvfile hunk on smb mount
Revision history for this message
Martin Pool (mbp) wrote : Re: short readline in the readvfile hunk

So this indicates some kind of problem reading back from these files. We may have written something wrong into them but I'm inclined to attribute it to the cifs mount.

Things we could do from here:

1- you could determine (eg using strace) which file is affected, and see if it seems truncated or damaged
2- see if you can reproduce this pushing the same data to a fresh repository on the nas
3- see if you can reproduce this with a smaller case

Revision history for this message
Martin Pool (mbp) wrote :

Low if it turns out to be due to a quirk in this setup or unreproducible, otherwise higher.

Changed in bzr:
importance: Undecided → Low
Revision history for this message
Ted Gould (ted) wrote :

I've got an strace. When doing the new push bazaar decided to send a bunch of data, about 40 GB or so. I've attached the full strace, I'm not sure which file I should be looking at next.

Revision history for this message
Martin Pool (mbp) wrote : Re: [Bug 360476] Re: short readline in the readvfile hunk on smb mount

2009/6/23 Ted Gould <email address hidden>:
> I've got an strace. When doing the new push bazaar decided to send a
> bunch of data, about 40 GB or so. I've attached the full strace, I'm
> not sure which file I should be looking at next.

Robert recently fixed bug 162930 relating to IO problems on cifs/smb
filesystems. It's just possible that might fix your problem too, if
you can try a snapshot of bzr from today or later.

--
Martin <http://launchpad.net/~mbp/>

Revision history for this message
Martin Pool (mbp) wrote :

The same error was reported on the list as occurring on Windows. There's not much more data there yet:

https://lists.ubuntu.com/archives/bazaar/2009q2/059969.html

Revision history for this message
Martin Pool (mbp) wrote : Re: bzr: ERROR: short readline in the readvfile hunk.

2009/6/26 Raindog <email address hidden>:
> Hmm, I think in this case it's not a samba mount as I am on windows, which I
> handily overlooked in my original mail =(. The repository sits on my main
> hard drive. I am using bzr 1.11
>
> I've managed to fix the issue by copying over the repository from my laptop.

Do you have a traceback from the time of the original error in your
.bzr.log file? Could you please attach that to your bug?

Also, I nearly didn't see your message because your smtp sender is
listed in spamhaus.org...

--
Martin <http://launchpad.net/~mbp/>

Revision history for this message
Ted Gould (ted) wrote : Re: [Bug 360476] Re: short readline in the readvfile hunk on smb mount

On Thu, 2009-06-25 at 04:20 +0000, Martin Pool wrote:
> 2009/6/23 Ted Gould <email address hidden>:
> > I've got an strace. When doing the new push bazaar decided to send a
> > bunch of data, about 40 GB or so. I've attached the full strace, I'm
> > not sure which file I should be looking at next.
>
> Robert recently fixed bug 162930 relating to IO problems on cifs/smb
> filesystems. It's just possible that might fix your problem too, if
> you can try a snapshot of bzr from today or later.

No, it did not. Tried with bzr 4476 which I believe had Robert's fix in
it. I've attached the strace of this run as well.

Revision history for this message
Martin Pool (mbp) wrote :

The fix for bug 340347 is similar to this (see also https://code.edge.launchpad.net/~mbp/bzr/340347-log-decorator/+merge/8075) so I'm going to think about whether this might be a bug in ReadVFile.

Changed in bzr:
assignee: nobody → Martin Pool (mbp)
importance: Low → Medium
status: Incomplete → Confirmed
description: updated
Revision history for this message
Martin Pool (mbp) wrote :

Hypothesis: the OS is returning all the right data, but it's fragmented into smaller chunks than we expected. This behaviour is typical of network filesystems and explains why we only see it there. We coalesce neighbouring chunks in sending the request but we don't do this when processing the response. Therefore, ReadVFile should tolerate a line spanning different chunks in the result as long as there's no actual gap between them.

Revision history for this message
Martin Pool (mbp) wrote :

Ted, if you don't mind trying one more thing, that branch may give some more debugging information in the error - or you can wait for it to land in trunk.

More investigation: Transport._seek_and_read checks that the result of each fp.read() call contains as many bytes as were expected. So if the underlying file is giving back too little data, it's surprising that we're not seeing ShortReadvError raised from there.

Revision history for this message
Martin Pool (mbp) wrote :

Thanks for the strace, that's quite useful.

The peccant part seems to be this:

open("/media/nas/Photos/.bzr/repository/packs/3ba59619a476d59dd8870c4608ce0014.pack", O_RDONLY|O_LARGEFILE) = 9
fstat64(9, {st_mode=S_IFREG|0664, st_size=1719731989, ...}) = 0
fstat64(9, {st_mode=S_IFREG|0664, st_size=1719731989, ...}) = 0
mmap2(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb79a3000
_llseek(9, 0, [0], SEEK_SET) = 0
read(9, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384) = 16384

This is followed by various operations to release the lock and then print a traceback so I think this is where the error was raised. And it's not surprising that it would be, because there's something seriously wrong if we're trying to read the start of a pack file and seeing a block of nuls, without even a format marker.

Compare the IO when we read the previous pack file:

munmap(0xb79a3000, 16384) = 0
gettimeofday({1245746161, 82841}, NULL) = 0
open("/media/nas/Photos/.bzr/repository/packs/19abcaadba2fc93ce5c7098363767bbc.pack", O_RDONLY|O_LARGEFILE) = 9
fstat64(9, {st_mode=S_IFREG|0664, st_size=113767636, ...}) = 0
fstat64(9, {st_mode=S_IFREG|0664, st_size=113767636, ...}) = 0
mmap2(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb79a3000
_llseek(9, 0, [0], SEEK_SET) = 0

It may help if you can look in the first of those files and confirm whether it is in fact full of nuls, or starting with them. Also, knowing the mtime of that file would be useful.

If it's true that that's what's on disk:

1- how can you recover?
2- how did this happen?

You may be able to recover from this if you have files in the obsolete_packs directory, by moving away the broken pack file and its indices, and then bzr should look at the obsolete files and repack them. If possible make a complete backup first.

As for how it happened: I haven't checked through the code but I think we normally write pack files by just starting at the beginning and writing sequentially. So it would be pretty strange if we left a block of zeros in there. Perhaps the machine crashed or lost its connection just after the file was written, before it was flushed to disk?

Martin Pool (mbp)
Changed in bzr:
status: Confirmed → Incomplete
Revision history for this message
Ted Gould (ted) wrote :

On Wed, 2009-07-01 at 08:35 +0000, Martin Pool wrote:
> This is followed by various operations to release the lock and then
> print a traceback so I think this is where the error was raised. And
> it's not surprising that it would be, because there's something
> seriously wrong if we're trying to read the start of a pack file and
> seeing a block of nuls, without even a format marker.

It looks like the first few K of the file are all nulls.

> It may help if you can look in the first of those files and confirm
> whether it is in fact full of nuls, or starting with them. Also,
> knowing the mtime of that file would be useful.

It was last modified April 12th. Though, I'm not sure which version of
Bazaar I had at that time. That's different than all the other files in
the system as there are packs that are newer.

> You may be able to recover from this if you have files in the
> obsolete_packs directory, by moving away the broken pack file and its
> indices, and then bzr should look at the obsolete files and repack them.
> If possible make a complete backup first.

Since this itself is a backup, and I haven't lost the original, I can
just repush it. There was nothing in the obsolete_packs.

> As for how it happened: I haven't checked through the code but I think
> we normally write pack files by just starting at the beginning and
> writing sequentially. So it would be pretty strange if we left a block
> of zeros in there. Perhaps the machine crashed or lost its connection
> just after the file was written, before it was flushed to disk?

All possible. I have no way to tell really. :-/

I think this one is going to have to be marked up to unreproducable. As
I have no way to recreate the error'd packs file. Thanks! Ted.

  status invalid

Changed in bzr:
status: Incomplete → Invalid
Revision history for this message
Martin Pool (mbp) wrote : Re: [Bug 360476] Re: short readline in the readvfile hunk on smb mount

See also bug 402857.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.