Should use the revision timestamp when exporting

Bug #262261 reported by James Westby on 2008-08-28
2
Affects Status Importance Assigned to Milestone
Bazaar
Low
Unassigned
Breezy
Medium
Jelmer Vernooij
bzr (Debian)
Fix Released
Unknown

Bug Description

Hi,

The export code should use the timestamp of the revision when exporting.

This was discussed on the mailing list a little while ago, I'm filing this
as house keeping.

Thanks,

James

Related branches

Changed in bzr:
status: Unknown → New
John A Meinel (jameinel) wrote :

The code I see has this:

    now = time.time()
...
    for dp, ie in _export_iter_entries(tree, subdir):
        filename = osutils.pathjoin(root, dp).encode('utf8')
        item = tarfile.TarInfo(filename)
        item.mtime = now

Obviously, that is not a stable solution.

There are two possibilities:

1) Use repository.get_revision(revision_id).timestamp for all files
2) Use repository.get_revision(ie.revision).timestamp for each file

(1) is nice because it is stable for a given revision_id, and should fix our current problem of "bzr builddeb" creating different tarballs accidentally.

(2) might be more correct, but has the overhead of needing to extract a lot of Revision texts when exporting a large tree. Making it configurable would be a bit of a trick, so my recommendation would be to go with (1), and if it is an issue, switch to (2).

Note that some people have requested similar functionality for "bzr export directory". So it might not be a bad idea to make it configurable and expose it to all exporters.

Changed in bzr:
importance: Undecided → Low
status: New → Triaged

On Thu, 2008-08-28 at 16:52 +0000, John A Meinel wrote:
> The code I see has this:
>
> now = time.time()
> ...
> for dp, ie in _export_iter_entries(tree, subdir):
> filename = osutils.pathjoin(root, dp).encode('utf8')
> item = tarfile.TarInfo(filename)
> item.mtime = now
>
> Obviously, that is not a stable solution.
>
> There are two possibilities:
>
> 1) Use repository.get_revision(revision_id).timestamp for all files
> 2) Use repository.get_revision(ie.revision).timestamp for each file

I think there's a method on tree that gives you the timestamp. This is
important to me as I export working trees as well.

The thing that I found strange was that for three different
implementations of the method that I found I found three different
results.

> (1) is nice because it is stable for a given revision_id, and should fix
> our current problem of "bzr builddeb" creating different tarballs
> accidentally.

It may not fix it, and isn't the solution we want long term, but it
would be nice if it at least fixed some cases.

Thanks,

James

John A Meinel (jameinel) wrote :

So there *is* Tree.get_file_mtime() and for RevisionTrees it is defined as:
    def get_file_mtime(self, file_id, path=None):
        ie = self._inventory[file_id]
        revision = self._repository.get_revision(ie.revision)
        return revision.timestamp

^- Notice that this deserializes a new Revision object for every request, and does *no* caching. So for 10,000 files in a tree, it does 10,000 deserializations.

Which is probably why we don't use it at present.

Changed in bzr:
status: New → Confirmed
Jelmer Vernooij (jelmer) wrote :

I haven't worked on caching interfaces like this before. Could you perhaps give me a pointer as to how to approach this?

John A Meinel (jameinel) wrote :

Well, you could do it in the exporter code, though it is a bit of an abstraction violation. Something like:

revisions_to_mtimes = {}
for ie in inventory:
  mtime = revisions_to_mtimes.get(ie.revision, None)
  if mtime is None:
    mtime = tree.get_file_mtime(ie.file_id)
    revisions_to_mtimes[ie.revision] = mtime

It's bad because tree could be a WorkingTree which doesn't work this way.

I feel it is a little bad for RevisionTree.get_file_mtime() to be peeking at other Revision objects, which at least hints to me that maybe we should be using a different api in general.

The problem is that I think the cache lifetime should be managed by the export api, but the actual caching seems like it should happen underneath RevisionTree.get_file_mtime. (By RevisionTree or by Repository)

I wonder if we would want to move this sort of information into the Inventory (in the future). That would let us record the last modified time as part of commit.

Anyway, I would probably just change the RevisionTree.get_file_mtime() to cache the Revision objects for the lifetime of a lock. We might also want to test that it will, indeed, be an issue during export, but from what I can tell, it seems like it would.

Robert Collins (lifeless) wrote :

Just to note, we now set a mtime for files we export to a dir of time.time(), which we didn't do before,and this works well enough for builddeb users.

Martin Pool (mbp) on 2010-03-19
Changed in bzr:
status: Triaged → Confirmed
Jelmer Vernooij (jelmer) wrote :

This is now partially supported - we have --per-file-timestamps to do this. It would be nice to use as default, but for that we'd have to improve performance first.

Parth Malwankar (parthm) on 2010-09-22
tags: added: export
Changed in bzr (Debian):
status: Confirmed → Fix Released
Jelmer Vernooij (jelmer) on 2011-03-13
Changed in bzr:
assignee: nobody → Jelmer Vernooij (jelmer)
assignee: Jelmer Vernooij (jelmer) → nobody
Jelmer Vernooij (jelmer) on 2017-11-09
tags: added: check-for-breezy
Jelmer Vernooij (jelmer) on 2019-01-13
Changed in brz:
status: New → Triaged
importance: Undecided → Medium
tags: removed: check-for-breezy
Changed in brz:
assignee: nobody → Jelmer Vernooij (jelmer)
milestone: none → 3.0.0
status: Triaged → Fix Committed
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.