commands updating working tree should provide the same modification time for all modified files

Bug #488724 reported by Vincent Ladeuil on 2009-11-26
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Bazaar
High
John A Meinel

Bug Description

Many package sources includes files generated from other
files which are also included in the tree.
There are cases where bzr produce inconsistent time stamps
(the generated file is modified before its source file).

This occurs rarely but kind of makes it worse. It's true that the
packages have build dependencies so the needed tools are known,
but in some circumstances, the upstream packager may have use a
different version of a tool and produced a file that the Ubuntu packager
cannot reproduce exactly (we encounter a variation of
that problem on the bzr project when people use different pyrex
versions).

Ideally, all files modified by a given command (branch, checkout, update, pull, etc) should receive
the exact same timestamp.

Using futimes() should help address the problem on Linux.
I don't know if that's available on Windows though (it is on OSX and FreeBSD)

Tags: udd Edit Tag help

Related branches

Vincent Ladeuil (vila) on 2009-11-26
Changed in bzr:
status: New → Confirmed
importance: Undecided → High
tags: added: udd
Vincent Ladeuil (vila) wrote :

>>>>> "robert" == Robert Collins writes:

    robert> Thanks for filing this Vincent. Uhm, Its a dup :).

Ha ! I was so sure it was a duplicate... but I couldn't find it... Can you ?

John A Meinel (jameinel) wrote :

Vincent, could you explain why this is 'udd'?

I think it would be okay for TreeTransform to set the mtime of all of the files it creates to a fixed time. I would be a little bit concerned about overhead, but 1 sys-call to set the time for the file we just created doesn't seem too bad. What would be bad is if we can't set the time backwards.

Meaning, I would set the time for all files to the creation time of the *first* file. Then you can do it on-the-fly. If you have to do it for the time of the *last* file, then you have to go back and touch all the files you just created after-the-fact, which isn't nice.

>>>>> "jam" == John A Meinel writes:

    jam> Vincent, could you explain why this is 'udd'?

Because it was mentioned at UDS as the most annoying bug in bzr
when it comes to use bzr for packages.

    jam> I think it would be okay for TreeTransform to set the
    jam> mtime of all of the files it creates to a fixed time.

Yes, that's the point. Which time doesn't really matter as long
as it's the same for all files.

The bug is clearly about a generated file getting an earlier date
than its corresponding source file because then it appears to
build tools as it needs to be rebuilt.

    jam> I would be a little bit concerned about overhead, but 1
    jam> sys-call to set the time for the file we just created
    jam> doesn't seem too bad.

That's the idea, since the file is already opened, it shouldn't
add any IO overhead only a syscall which should already have
whatever data is needed in memory.

    jam> What would be bad is if we can't set the time backwards.

That's not what was asked for.

    jam> Meaning, I would set the time for all files to the
    jam> creation time of the *first* file.

Yes.

    jam> Then you can do it on-the-fly. If you have to do it for
    jam> the time of the *last* file, then you have to go back
    jam> and touch all the files you just created after-the-fact,
    jam> which isn't nice.

Agreed.

John A Meinel (jameinel) wrote :

The associated branch does exactly this on Windows, using ctypes to get at the "GetFileInformationByHandleEx" and "SetFileInformationByHandle" functions.

It is ugly and hackish, and needs the 'set_mtime()' functionality to be factored out into a helper (like something in osutils, etc.)

Also, I think eventually we would move the functions into _walkdirs_win32_pyx.pyx and _read_dir_pyx.pyx since they are platform specific stuff.

I think the test case for TreeTransform is reasonable, as is the basic layout of calling self._set_mtime() on the newly created file. It does test that:

1) The mtime is set to self._creation_mtime. We may actually want to expose this via an api, for people who want the timestamp of files to be set to the commit timestamp... (though that would be different for each file.)
2) Ensures that creating 2 different files gets the same timestamp, and that matches _creation_mtime.

Potentially the test needs to be updated so that comparisons are in seconds or milliseconds. I think it would be reasonable to test that st1.st_mtime is exactly st2.st_mtime, but we can compare that to _creation_mtime with just 1-sec resolution. (We should ensure that the time for the files is exact, but we don't care the exact time.)

I'll probably poke at this some more.

Changed in bzr:
assignee: nobody → John A Meinel (jameinel)
milestone: none → 2.1.0
status: Confirmed → In Progress
Robert Collins (lifeless) wrote :

On Tue, 2010-01-05 at 16:54 +0000, John A Meinel wrote:
> Vincent, could you explain why this is 'udd'?

Packages which are for autoconf style build systems with maintainer-mode
always enabled, will trigger autoreconfiscation when the timestamps are
skewed, and this can fail (in particular when the packager doesn't
expect this to happen), and can add huge diffs that are not desired.
Some folk even patch configure itself, so this can interfere with that
too. (I'm not arguing for or against these things, just that they
happen, and we interfere).

-Rob

Aaron Bentley (abentley) wrote :

You're calling self._set_mtime from DiskTreeTransform methods, but it's only defined in TreeTransform. That will be bad with PreviewTree. I think it makes sense to move it into DiskTreeTransform (and move the ctypes code into osutils). It might be useful to allow specifying a specific mtime in create_file et al, but I can't think of a good use case at the moment.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Bentley wrote:
> You're calling self._set_mtime from DiskTreeTransform methods, but it's
> only defined in TreeTransform. That will be bad with PreviewTree. I
> think it makes sense to move it into DiskTreeTransform (and move the
> ctypes code into osutils). It might be useful to allow specifying a
> specific mtime in create_file et al, but I can't think of a good use
> case at the moment.
>

People wanting the working tree files to be at the 'commit' timestamp
would be one possibility.

I certainly still have *lots* of room for improvement, this was just a
quick stab to see if I could get the Windows side of it working.

However, the code in question is calling the builtin open(), I don't see
why it would be bad with PreviewTree. Unless you are doing something
terribly like monkey-patching open() to do something other than
returning a real disk file.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktDn88ACgkQJdeBCYSNAAOxegCfd9AmcbOl8sEOl5HXPUa2rQk/
fUEAn1U1UAxAwNs8zJEXoDepI5O17pSV
=Hh8b
-----END PGP SIGNATURE-----

John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

John Arbash Meinel wrote:
> Aaron Bentley wrote:
>> You're calling self._set_mtime from DiskTreeTransform methods, but it's
>> only defined in TreeTransform. That will be bad with PreviewTree. I
>> think it makes sense to move it into DiskTreeTransform (and move the
>> ctypes code into osutils). It might be useful to allow specifying a
>> specific mtime in create_file et al, but I can't think of a good use
>> case at the moment.
>
>
> People wanting the working tree files to be at the 'commit' timestamp
> would be one possibility.
>
> I certainly still have *lots* of room for improvement, this was just a
> quick stab to see if I could get the Windows side of it working.
>
> However, the code in question is calling the builtin open(), I don't see
> why it would be bad with PreviewTree. Unless you are doing something
> terribly like monkey-patching open() to do something other than
> returning a real disk file.
>
> John
> =:->
>

NM this last bit. I didn't realize you define DiskTreeTransform *before*
you defined the base TreeTransform. I'll move the function. The layering
here certainly seems a bit unexpected.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktDoC8ACgkQJdeBCYSNAAO4KgCdFSF1bSvZ7+/7RHTnjePN/Uh0
XeQAmQF6ChvQfsktuWIIekEW2iKxok3j
=4L8a
-----END PGP SIGNATURE-----

John A Meinel (jameinel) wrote :

In the associated branch, I have an 'futimes()' wrapper for Linux and a "SetFileInformationByHandle" wrapper on Windows. I also moved the functions around as mentioned by Aaron.
This seems to work, though I'm now benchmarking if this has significant overhead.

Changed in bzr:
status: In Progress → Fix Committed
Vincent Ladeuil (vila) wrote :

@Scott: I just subscribed you to this bug, can you test the patch proposed by jam (see also https://code.edge.launchpad.net/~jameinel/bzr/2.1.0rc1-set-mtime/+merge/16881) and give us feedback ?

John A Meinel (jameinel) on 2010-01-11
Changed in bzr:
milestone: 2.1.0 → 2.1.0rc1
status: Fix Committed → Fix Released
Neal McBurnett (nealmcb) wrote :

I wish bug 245170 would be fixed instead, so all the file times themselves were simply preserved, as apparently even CVS does. Look there for more details, complications, etc.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers