2009-03-30 01:20:44 |
Eric Anderson |
bug |
|
|
added bug |
2009-03-30 01:23:45 |
Eric Anderson |
description |
When trying to use subtree formats I cannot join in more than one git repository. To reproduce try the following:
$ mkdir test
$ cd test/
test$ bzr init --development-subtree
Created a standalone tree (format: development2-subtree)
test$ bzr branch git://github.com/harukizaemon/schema_validations.git schema_validations
Branched 4 revision(s).
test$ bzr join --reference schema_validations
test$ bzr branch git://github.com/harukizaemon/redhillonrails_core.git redhillonrails_core
Branched 1 revision(s).
test$ bzr join --reference redhillonrails_core
bzr: ERROR: Cannot join redhillonrails_core. Root id already present in tree
These git repositories are just used as an example because they are small and therefore quick to branch from. Any repo will get the same behavior. The problem is in the mapping. The root id for all git repositories are the same which is the constant ROOT_ID. If I make the following change I can add a new repository as a subtree:
def generate_file_id(self, path):
# Git paths are just bytestrings
# We must just hope they are valid UTF-8..
assert isinstance(path, str)
if path == "":
return ROOT_ID.join('-a')
return escape_file_id(path)
def parse_file_id(self, file_id):
if file_id.startswith(ROOT_ID):
return ""
return unescape_file_id(file_id)
But then I am back to the same problem of not being able to do anymore. I can change the '-a' to '-b' (or anything not already used) and therefore work around the issue. But obviously this is not a solution.
I tried just appending a randomly generated string to the suffix but within the joining process it seems we need to have the same value returned every time generate_file_id is called. My next attempt was to try affixing the current time under the idea that withing the joining operation the time is not likely to change but within different joining operations it will. This seems to work. My naive code for returning the root id is:
ROOT_ID.join(str(time.mktime(datetime.datetime.now().timetuple())))
I know nothing of Python and this was just borrowed from some site explaining how to get the current number of seconds since the unix epoch (I'm just a Ruby programmer so our stuff would just be ROOT_ID + Time.now.to_i). Anyway this obviously has two problems:
* It is possible that two repositories could be joined within the same second (via a script or something). Then we are back to our problem.
* It is also possible that joining a repo could span multiple seconds meaning the generate_file_id will not always return the same value within a joining operation causing an error.
But it seems to work well enough for my purposes until I real fix get's created. I would imagine the best thing to do would be to append a suffix based on the repo's URI (maybe hashed for fun). But the mapping object doesn't seem to have any reference to the repo it is mapping from what I can tell making that not possible unless we pass more info into the generate_file_id method. |
When trying to use subtree formats I cannot join in more than one git repository. To reproduce try the following:
$ mkdir test
$ cd test/
test$ bzr init --development-subtree
Created a standalone tree (format: development2-subtree)
test$ bzr branch git://github.com/harukizaemon/schema_validations.git schema_validations
Branched 4 revision(s).
test$ bzr join --reference schema_validations
test$ bzr branch git://github.com/harukizaemon/redhillonrails_core.git redhillonrails_core
Branched 1 revision(s).
test$ bzr join --reference redhillonrails_core
bzr: ERROR: Cannot join redhillonrails_core. Root id already present in tree
These git repositories are just used as an example because they are small and therefore quick to branch from; any repo shows the same behavior. The problem is in the mapping: the root id is the same for all git repositories, namely the constant ROOT_ID. If I make the following change I can add a new repository as a subtree:
    def generate_file_id(self, path):
        # Git paths are just bytestrings
        # We must just hope they are valid UTF-8..
        assert isinstance(path, str)
        if path == "":
            # Suffix the root id so it differs from the one already in the tree
            return ROOT_ID + '-a'
        return escape_file_id(path)

    def parse_file_id(self, file_id):
        # Any id that starts with ROOT_ID maps back to the root path
        if file_id.startswith(ROOT_ID):
            return ""
        return unescape_file_id(file_id)
But then I am back to the same problem: I cannot join any more repositories after that one. I can change the '-a' to '-b' (or anything not already used) to work around the issue, but obviously this is not a solution.
I tried just appending a randomly generated string as the suffix, but within the joining process it seems we need the same value returned every time generate_file_id is called. My next attempt was to affix the current time, on the idea that within a single joining operation the time is not likely to change but across different joining operations it will. This seems to work. My naive code for returning the root id is:
    ROOT_ID + str(time.mktime(datetime.datetime.now().timetuple()))
I know nothing of Python, and this was just borrowed from some site explaining how to get the current number of seconds since the Unix epoch (I'm just a Ruby programmer, so our stuff would just be ROOT_ID + Time.now.to_i.to_s). Anyway, this obviously has two problems:
* It is possible that two repositories could be joined within the same second (via a script or something). Then we are back to our problem.
* It is also possible that joining a repo could span multiple seconds, meaning generate_file_id will not always return the same value within a joining operation, causing an error.
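The two problems above can be illustrated with a small standalone sketch. It does not use bzr-git itself: the value of ROOT_ID and the helper timestamp_root_id are assumptions made only for illustration, and the timestamps are passed in explicitly so the behavior is reproducible.

```python
# Assumed stand-in for the root id constant; not taken from bzr-git.
ROOT_ID = "TREE_ROOT"


def timestamp_root_id(now_seconds):
    """Hypothetical helper: suffix ROOT_ID with a whole-second timestamp."""
    return ROOT_ID + "-" + str(int(now_seconds))


# Problem 1: two different repositories joined within the same second
# end up with the same root id, so the second join would still collide.
first_repo_root = timestamp_root_id(1238376044.2)
second_repo_root = timestamp_root_id(1238376044.9)
same_second_collision = first_repo_root == second_repo_root

# Problem 2: a single join that crosses a second boundary sees two
# different root ids for the same repository.
early_call = timestamp_root_id(1238376044.9)
late_call = timestamp_root_id(1238376045.1)
unstable_within_join = early_call != late_call
```

Both flags come out true, which is why a clock-based suffix can only ever be a stopgap.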
But it seems to work well enough for my purposes until a real fix gets created. I would imagine the best thing to do would be to append a suffix based on the repo's URI (maybe hashed for fun). But from what I can tell, the mapping object doesn't have any reference to the repo it is mapping, which makes that impossible unless we pass more info into the generate_file_id method. |
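A minimal sketch of the URI-based idea, assuming the repository URI could somehow be made available to the mapping (which, as noted, it currently is not). The names root_id_for and is_root_id and the value of ROOT_ID are hypothetical and not part of bzr-git.

```python
import hashlib

# Assumed stand-in for the root id constant; not taken from bzr-git.
ROOT_ID = "TREE_ROOT"


def root_id_for(repo_uri):
    """Hypothetical: derive a stable, per-repository root id from its URI."""
    digest = hashlib.sha1(repo_uri.encode("utf-8")).hexdigest()
    return ROOT_ID + "-" + digest


def is_root_id(file_id):
    """Recognise both the plain root id and any URI-suffixed variant."""
    return file_id == ROOT_ID or file_id.startswith(ROOT_ID + "-")


schema_root = root_id_for("git://github.com/harukizaemon/schema_validations.git")
core_root = root_id_for("git://github.com/harukizaemon/redhillonrails_core.git")
```

Unlike a random or time-based suffix, the hash is the same on every call for a given URI and differs between repositories, so it would address both problems listed above. One caveat of this sketch is that the same repository reached through two different URIs would get two different root ids.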
|
2009-03-30 01:49:25 |
Jelmer Vernooij |
bzr-git: importance |
Undecided |
Wishlist |
|
2009-03-30 01:49:25 |
Jelmer Vernooij |
bzr-git: status |
New |
Triaged |
|
2009-05-16 15:31:59 |
Jelmer Vernooij |
tags |
|
next-mapping-format |
|
2010-12-25 23:23:47 |
Jelmer Vernooij |
summary |
Cannot join by reference more than one repository |
file ids are not very unique |
|
2013-08-25 15:33:28 |
Sergei Golubchik |
bug |
|
|
added subscriber Sergei |
2013-08-25 15:38:57 |
Sergei Golubchik |
attachment added |
|
bzr-git.file-id.patch https://bugs.launchpad.net/bzr-git/+bug/351317/+attachment/3787167/+files/bzr-git.file-id.patch |
|
2013-08-26 15:06:34 |
Sergei Golubchik |
attachment removed |
bzr-git.file-id.patch https://bugs.launchpad.net/bzr-git/+bug/351317/+attachment/3787167/+files/bzr-git.file-id.patch |
|
|
2013-08-26 15:08:08 |
Sergei Golubchik |
attachment added |
|
bzr-diff-fileid.patch https://bugs.launchpad.net/bzr-git/+bug/351317/+attachment/3788340/+files/bzr-diff-fileid.patch |
|
2018-03-06 01:20:15 |
Jelmer Vernooij |
bug task added |
|
brz-git |
|
2018-03-06 01:23:22 |
Jelmer Vernooij |
brz-git: status |
New |
Triaged |
|
2018-05-06 11:50:40 |
Jelmer Vernooij |
brz-git: importance |
Undecided |
Medium |
|
2018-05-10 01:20:14 |
Jelmer Vernooij |
summary |
file ids are not very unique |
git: file ids are not very unique |
|
2018-05-10 01:21:11 |
Jelmer Vernooij |
tags |
next-mapping-format |
git next-mapping-format |
|
2018-05-10 01:22:20 |
Jelmer Vernooij |
affects |
brz-git |
brz |
|