Better darcs support needed

Bug #232177 reported by Saša Janiška
Affects: Bazaar Fast Import
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

Hi!

Having recently migrated from darcs to bzr, I'm naturally looking for the best way to migrate my old darcs repos to Bazaar.

I've tried using the darcs2git front-end, but all I get is something like:

gour@nitai ~/r/b/fish> python ../../git/darcs2git/darcs2git.py ../../darcs/fish/ > fish.fi
reading patches.
Starting afresh at -1
Trying None -> patch 0
Pull patch 0
conflict, going one back
Export -3 -> 0 (total 1732)
Trying patch 0 -> patch 1
Pull patch 1
Trying None -> patch 1
Pull patch 1
conflict, going one back
Can't import patch 1, need conflict resolution patch?
Trying patch 0 -> patch 2
Rewinding 1 patches
yes: standard output: Broken pipe
yes: write error
Pull patch 2
Trying patch 0 -> patch 2
Rewinding 2 patches
yes: standard output: Broken pipe
yes: write error
Pull patch 2
conflict, going one back
Can't import patch 2, need conflict resolution patch?
Trying patch 0 -> patch 3
Rewinding 2 patches
yes: standard output: Broken pipe
yes: write error
Pull patch 3
Trying patch 1 -> patch 3
Pulling 1 patches to go to 1
Pull patch 3
conflict, going one back
Trying patch 0 -> patch 3
Rewinding 2 patches
yes: standard output: Broken pipe
yes: write error
Pull patch 3
conflict, going one back
Can't import patch 3, need conflict resolution patch?
Trying patch 0 -> patch 4
Rewinding 1 patches
yes: standard output: Broken pipe
yes: write error
Pull patch 4
Trying patch 2 -> patch 4
Rewinding 1 patches
yes: standard output: Broken pipe
yes: write error
....

i.e. it fails after a very long run on quite a small repo.

I tried another tool, darcs-to-git, and it works, but, afaik, it cannot be used as a 'bzr fast-import' front-end.

So the only remaining tool to try was tailor, which works nicely.

At the moment I'm converting a darcs-2 repository to bzr and will report how long it takes.

Sincerely,
Gour

Saša Janiška (gour) wrote :

Hi!

I just tried to convert a darcs-2 repo with the help of darcs-to-git, but it fails as well:

gour@nitai ~/r/g/darcs-bzr> ruby ../darcs-to-git/darcs-to-git ../../darcs/darcs.net/
Running: ["darcs", "-v"]
Initialising the working area.
Running: ["darcs", "init"]
Running: ["git-init"]
Initialized empty Git repository in .git/
Running: ["darcs", "changes", "--reverse", "--repodir=../../darcs/darcs.net/", "--xml", "--summary"]
../darcs-to-git/darcs-to-git:323:in `darcs_date_to_git_date': Wrong darcs date format (RuntimeError)
 from ../darcs-to-git/darcs-to-git:220:in `initialize'
 from ../darcs-to-git/darcs-to-git:248:in `new'
 from ../darcs-to-git/darcs-to-git:248:in `read_from_repo'
 from ../darcs-to-git/darcs-to-git:244:in `map'
 from ../darcs-to-git/darcs-to-git:244:in `read_from_repo'
 from ../darcs-to-git/darcs-to-git:408

Sincerely,
Gour

Saša Janiška (gour) wrote :

Hello!

Short update...

darcs-to-git fails with the darcs-2 repository format as well...

Otoh, I am happy to report that if you grab the latest tailor-0.9.33, it works flawlessly :-)

The author sent patches to darcs-2; you need the latest darcs-2, including the patches below:

Sat May 17 13:08:20 CEST 2008 <email address hidden>
  * Honour --xml-output when printing the patches in the "will do"/"would do" message

Sat May 17 14:32:24 CEST 2008 David Roundy <email address hidden>
  * automatically include --xml support on commands supporting --dry-run.

and tailor is now able to cope with the encoding problem I had (the bzr back-end in tailor does not support that option), can handle the old darcs date format, and works with the darcs-2 repository format, i.e. it is THE tool for doing darcs(2) --> bzr migration!
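
To illustrate the date-format issue that tripped darcs-to-git, here is a minimal Python sketch of a parser that accepts two layouts. Both layouts are assumptions made for illustration: a compact 14-digit form (YYYYMMDDHHMMSS) and the long form seen in the patch headers quoted in this thread. The function name parse_darcs_date is hypothetical and is not taken from tailor or darcs-to-git.

```python
from datetime import datetime


def parse_darcs_date(text):
    """Parse a darcs patch date in either of two assumed layouts."""
    text = text.strip()
    # Assumed compact layout: 14 digits, YYYYMMDDHHMMSS.
    if len(text) == 14 and text.isdigit():
        return datetime.strptime(text, "%Y%m%d%H%M%S")
    # Assumed long layout, e.g. "Sat May 17 13:08:20 CEST 2008".
    # The zone name is dropped because strptime cannot reliably parse
    # abbreviations such as CEST; the result is a naive datetime.
    parts = text.split()
    if len(parts) == 6:
        weekday, month, day, clock, _zone, year = parts
        return datetime.strptime(
            " ".join([weekday, month, day, clock, year]),
            "%a %b %d %H:%M:%S %Y",
        )
    raise ValueError("unrecognised darcs date: %r" % text)
```

A converter that only expects one of these layouts would raise an error on the other, which is roughly the kind of failure reported above. Note that the sketch ignores the time zone, so a real converter would need to handle it separately.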

Sincerely,
Gour

Saša Janiška (gour) wrote :

Hi!

Just a short update...from today's Haskell-cafe list:

"I'm pleased to announce yet another tool for importing darcs repositories
to git. Unlike darcs2git [1] and darcs-to-git [2], it's written in
Haskell, on top of the darcs2 source code. The result is a much faster
program - it can convert the complete ghc 6.9 branch (without libraries)
in less than 15 minutes on my slightly dated machine (Athlon XP 2500+),
which is quite fast [3]. Incremental updates work, too."

See http://article.gmane.org/gmane.comp.lang.haskell.cafe/40795.

I haven't tried it yet, but it sounds good.

Sincerely,
Gour

Miklos Vajna (vmiklos) wrote :

Just in case anyone is interested in a _fast_ darcs2bzr script:

http://vmiklos.hu/project/darcs-fast-export/

It does not handle darcs2 format yet, though.

Saša Janiška (gour) wrote :

Hi!

Now it does ;)

Sincerely,
Gour

Saša Janiška (gour) wrote :

Hi!

After some testing today, I ran into problems with darcs-fast-export, such as:

[gour@nitai test] darcs-fast-export.py /home/gour/repos/darcs/xmobar |(cd xmobar.bzr; bzr fast-import -)
/home/gour/bin/darcs-fast-export.py:32: DeprecationWarning: The popen2 module is deprecated. Use the subprocess module.
  import popen2
11:26:15 progress [2008-11-12 11:26:14] getting list of patches
Traceback (most recent call last):
  File "/home/gour/bin/darcs-fast-export.py", line 154, in <module>
    buf.append(sock.read())
  File "/usr/lib/python2.6/gzip.py", line 212, in read
    self._read(readsize)
  File "/usr/lib/python2.6/gzip.py", line 284, in _read
    self._read_eof()
  File "/usr/lib/python2.6/gzip.py", line 304, in _read_eof
    hex(self.crc)))
IOError: CRC check failed 0x327ba8f != 0xd257b5eL
11:27:21 Updating branch information ...
  branch xmobar.bzr now has 294 revisions and 4 tags
11:27:21 Imported 294 revisions, updating 1 branch and 0 trees in 0:01:06
To refresh the working tree for a branch, use 'bzr update'.

which makes Tailor the only reliable converter so far.

Sincerely,
Gour

Miklos Vajna (vmiklos) wrote :

Is this repo public?

I don't see exactly what the problem is (since you said on IRC that the repo is not corrupted), so I would like to reproduce it somehow.

Thanks.

Saša Janiška (gour) wrote :

Hi!

Here is the public URL of the repo: darcs get http://code.haskell.org/xmobar/

However, I tried with others and all (except the trivial one consisting of 19 patches) failed with the same error.

It would be nice to make darcs-fast-export functional.

Otoh, I also had a problem with the --export-marks=<file> option. How is it supposed to be used (since I'd like to use it for incremental export of local darcs development to push bzr branches to LP)?

Sincerely,
Gour

Miklos Vajna (vmiklos) wrote : Re: [Bug 232177] Re: Better darcs support needed

On Wed, Nov 12, 2008 at 08:39:40PM -0000, Gour <email address hidden> wrote:
> Here is the public URL of the repo: darcs get
> http://code.haskell.org/xmobar/
>
> However, I tried with others and all (except the trivial one consisting
> of 19 patches) failed with the same error.
>
> It would be nice to make darcs-fast-export functional.
>
> Otoh, I also had a problem with the --export-marks=<file> option. How is it
> supposed to be used (since I'd like to use it for incremental export of
> local darcs development to push bzr branches to LP)?

Here is what I tried:

darcs get http://code.haskell.org/xmobar
mkdir xmobar.bzr
cd xmobar.bzr
bzr init-repo .
dmark="$(pwd)/test2.dfe-marks"
bmark="$(pwd)/test2.bfi-marks"
cd ..
darcs-fast-export --export-marks=$dmark xmobar |(cd xmobar.bzr; bzr fast-import --export-marks=$bmark -)
cd xmobar.bzr/master/
bzr update
cd -
diff --exclude _darcs --exclude .bzr --exclude '*-darcs-backup*' -Naur xmobar.bzr/master xmobar
# good, it does not show any change, so the initial conversion was fine
darcs-fast-export --export-marks=$dmark --import-marks=$dmark xmobar |(cd xmobar.bzr; bzr fast-import --export-marks=$bmark --import-marks=$bmark -)
# no changes, but let's see if it screwed up something
diff --exclude _darcs --exclude .bzr --exclude '*-darcs-backup*' -Naur xmobar.bzr/master xmobar
# fine, still the same

darcs 2.1.0 (release)
Bazaar (bzr) 1.7.1
bzr-fastimport 0.6

(The commands are based on the
http://vmiklos.hu/project/darcs-fast-export/t/test2-bzr-incremental.sh
testcase which tests a darcs2->bzr incremental conversion.)
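
For anyone who wants to script the same sequence, the following is a rough Python sketch of the marks handling used in the commands above: the first run only writes the marks files, and later runs also read them back. The flags are the ones shown in the commands; the helper names (marks_args, build_pipeline, run_pipeline) are hypothetical, and deciding whether to pass --import-marks by checking that the file exists is an assumption of this sketch, not documented behaviour of either tool.

```python
import os
import subprocess


def marks_args(marks_file):
    """Return the marks flags for one side of the pipe."""
    args = ["--export-marks=" + marks_file]
    # Only import marks that an earlier run has already written.
    if os.path.exists(marks_file):
        args.append("--import-marks=" + marks_file)
    return args


def build_pipeline(darcs_repo, dmark, bmark):
    """Build the exporter and importer command lines."""
    exporter = ["darcs-fast-export"] + marks_args(dmark) + [darcs_repo]
    importer = ["bzr", "fast-import"] + marks_args(bmark) + ["-"]
    return exporter, importer


def run_pipeline(darcs_repo, bzr_dir, dmark, bmark):
    """Pipe the exporter into the importer, which runs inside bzr_dir."""
    exporter, importer = build_pipeline(darcs_repo, dmark, bmark)
    export_proc = subprocess.Popen(exporter, stdout=subprocess.PIPE)
    import_proc = subprocess.Popen(
        importer, stdin=export_proc.stdout, cwd=bzr_dir
    )
    # Close our copy of the pipe so the exporter notices if the importer exits.
    export_proc.stdout.close()
    import_status = import_proc.wait()
    export_status = export_proc.wait()
    return export_status, import_status
```

Calling run_pipeline("xmobar", "xmobar.bzr", dmark, bmark) repeatedly, with absolute paths for the two marks files, should mirror the initial and incremental runs shown above.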

Saša Janiška (gour) wrote :

Hi!

Thanks for the reply. It's getting late here, so I'll try tomorrow, but let me just mention that I use

darcs-2.1.0 and
bzr-1.9

Sincerely,
Gour

Miklos Vajna (vmiklos) wrote :

Output with bzr-1.9:

$ darcs-fast-export --export-marks=$dmark xmobar |(cd xmobar.bzr; bzr fast-import --export-marks=$bmark -)
23:59:27 progress [2008-11-12 23:59:27] getting list of patches
00:00:30 Updating branch information ...
         branch master now has 302 revisions and 4 tags
00:00:30 Imported 302 revisions, updating 1 branch and 0 trees in 0:01:03
To refresh the working tree for a branch, use 'bzr update'.

(This is the initial one.)

Saša Janiška (gour) wrote :

Hello Miklos!

I've tried again, actually ran your 'unit test', and it works.

Now, when I look again at the command sequence, I see what the difference is, i.e. you did:

mkdir xmobar.bzr
cd xmobar.bzr
bzr init-repo .

and I did:

mkdir xmobar.bzr
cd xmobar.bzr
bzr init .

but I still cannot say why it worked for some repo(s) and failed for the others.

Sincerely,
Gour

Miklos Vajna (vmiklos) wrote :

'bzr help fast-import' says it should be init-repo, so I guess it's a feature if it works with bzr init as well. ;)

Saša Janiška (gour) wrote :

Hi!

You're right.

Still, even when following the procedure, I got failures with bigger repos (gtk2hs, c2hs, darcs-unstable...)

Have you received/seen the paste I sent in #git yesterday?

Sincerely,
Gour

Saša Janiška (gour) wrote :

Hello!

Maybe it is worth investigating a bit more, but so far my experience is that every darcs-1 format repo fails (I pulled the latest code from the repo), while every darcs-2 one works (I tried with a few smaller ones).

Tomorrow I'll try with some bigger darcs-2 repos (e.g. darcs-unstable).

Do you have success with format-1 repos?

Sincerely,
Gour

Miklos Vajna (vmiklos) wrote :

Hi,

I just pushed fixes for two issues exposed by the darcs-unstable conversion, so that works fine here now. I'll have a look at the other repos you mentioned soon as well.

And no, I did not have issues with big format-1 repos; in the README I refer to a large repo, and that was in the old format.

Miklos Vajna (vmiklos) wrote :

Sadly, you pasted the output to some pastebin and not here, so it has already been deleted, but here is the output I get for c2hs (i.e. no errors):

$ time ../darcs-fast-export.py c2hs | (cd c2hs.git; git fast-import)
progress [2008-11-15 13:30:36] getting list of patches
progress [2008-11-15 13:30:36] starting export, repo has 325 patches
progress [2008-11-15 13:30:45] 100/325 patches
progress [2008-11-15 13:30:53] 200/325 patches
progress [2008-11-15 13:30:59] 300/325 patches
progress [2008-11-15 13:31:03] finished
git-fast-import statistics:
---------------------------------------------------------------------
Alloc'd objects: 5000
Total objects: 3194 ( 51251 duplicates )
      blobs : 1431 ( 43227 duplicates 938 deltas)
      trees : 1430 ( 8024 duplicates 190 deltas)
      commits: 325 ( 0 duplicates 0 deltas)
      tags : 8 ( 0 duplicates 0 deltas)
Total branches: 1 ( 1 loads )
      marks: 1024 ( 325 unique )
      atoms: 215
Memory total: 2255 KiB
       pools: 2098 KiB
     objects: 156 KiB
---------------------------------------------------------------------
pack_report: getpagesize() = 4096
pack_report: core.packedGitWindowSize = 33554432
pack_report: core.packedGitLimit = 268435456
pack_report: pack_used_ctr = 10326
pack_report: pack_mmap_calls = 325
pack_report: pack_open_windows = 1 / 1
pack_report: pack_mapped = 6567785 / 6567785
---------------------------------------------------------------------

real 0m26.691s
user 0m15.433s
sys 0m7.922s

Miklos Vajna (vmiklos) wrote :

And the gtk2hs repo can be converted without any error as well here.

Let me know if you have any further problems with publicly accessible repos. :)

Saša Janiška (gour) wrote :

Hello Miklos!

I've just tried to convert the gtk2hs (http://code.haskell.org/gtk2hs/) repo from darcs to git.

Here is the result:

[gour@gaura-nitai git] time darcs-fast-export --export-marks=$dmark gtk2hs|(cd gtk2hs.git; git fast-import --export-marks=$gmark)
/home/gour/bin/darcs-fast-export:33: DeprecationWarning: The popen2 module is deprecated. Use the subprocess module.
  import popen2
progress [2008-11-15 14:40:22] getting list of patches
progress [2008-11-15 14:40:29] starting export, repo has 1816 patches
progress [2008-11-15 14:41:08] 100/1816 patches
progress [2008-11-15 14:41:39] 200/1816 patches
progress [2008-11-15 14:42:19] 300/1816 patches
progress [2008-11-15 14:43:12] 400/1816 patches
progress [2008-11-15 14:44:12] 500/1816 patches
progress [2008-11-15 14:45:25] 600/1816 patches
progress [2008-11-15 14:46:38] 700/1816 patches
progress [2008-11-15 14:47:53] 800/1816 patches
progress [2008-11-15 14:49:00] 900/1816 patches
progress [2008-11-15 14:50:03] 1000/1816 patches
progress [2008-11-15 14:51:03] 1100/1816 patches
progress [2008-11-15 14:52:07] 1200/1816 patches
progress [2008-11-15 14:53:10] 1300/1816 patches
progress [2008-11-15 14:54:10] 1400/1816 patches
progress [2008-11-15 14:55:21] 1500/1816 patches
progress [2008-11-15 14:56:41] 1600/1816 patches
progress [2008-11-15 14:58:02] 1700/1816 patches
progress [2008-11-15 14:59:31] 1800/1816 patches
progress [2008-11-15 14:59:46] writing export marks
progress [2008-11-15 14:59:46] finished
git-fast-import statistics:
---------------------------------------------------------------------
Alloc'd objects: 20000
Total objects: 18032 ( 1135695 duplicates )
      blobs : 7919 ( 944377 duplicates 5108 deltas)
      trees : 8291 ( 191318 duplicates 1642 deltas)
      commits: 1816 ( 0 duplicates 0 deltas)
      tags : 6 ( 0 duplicates 0 deltas)
Total branches: 1 ( 1 loads )
      marks: 1048576 ( 1816 unique )
      atoms: 1339
Memory total: 2954 KiB
       pools: 2173 KiB
     objects: 781 KiB
---------------------------------------------------------------------
pack_report: getpagesize() = 4096
pack_report: core.packedGitWindowSize = 1073741824
pack_report: core.packedGitLimit = 8589934592
pack_report: pack_used_ctr = 206041
pack_report: pack_mmap_calls = 1816
pack_report: pack_open_windows = 1 / 1
pack_report: pack_mapped = 52923160 / 52923160
---------------------------------------------------------------------

darcs-fast-export --export-marks=$dmark gtk2hs 323,18s user 136,45s system 39% cpu 19:25,26 total
(; cd gtk2hs.git; git fast-import --export-marks=$gmark; ) 61,17s user 6,82s system 5% cpu 19:26,80 total

Now, attempting to do the same with bzr's fast-import gave the following:

[gour@gaura-nitai bzr] darcs-fast-export --export-marks=$dmark gtk2hs |(cd gtk2hs.bzr; bzr fast-import --export-marks=$bmark -)
/home/gour/.bazaar/plugins/search/index.py:22: DeprecationWarning: the md5 modul...

Miklos Vajna (vmiklos) wrote :

Okay, so let me summarize: the gtk2hs repo conversion works fine with git and it is fast; with bzr it works as well, but it's slower by an order of magnitude: "real 6m22.096s" vs "real 77m30.651s"

Given that my part in this party is to provide a fast darcs exporter, I think the slow side in this case is the bzr fast-importer, so in the end, sadly, I can't do too much here.
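
As a quick check of the "order of magnitude" figure, the two "real" values quoted above can be converted to seconds and divided. This is only a small Python sketch; the helper name real_seconds is made up for illustration, and it assumes the plain NmS.SSSs layout printed by the shell's time builtin.

```python
import re


def real_seconds(text):
    """Convert a shell 'time' value such as '6m22.096s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError("unexpected time value: %r" % text)
    return int(match.group(1)) * 60 + float(match.group(2))


git_run = real_seconds("6m22.096s")   # git fast-import run quoted above
bzr_run = real_seconds("77m30.651s")  # bzr fast-import run quoted above
ratio = bzr_run / git_run
print("bzr import took %.1f times as long as git" % ratio)
```

With these two timings the ratio comes out at roughly 12, which fits the rough "order of magnitude" description.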

Saša Janiška (gour) wrote :

Hi Miklos,

yes you're right.

darcs-fast-export does its job nicely, although I will conduct some more tests with bigger repos.

Unfortunately, bzr's side is really slow, but that's not a part you can do anything about. I'll bug bzr's devs ;)

Thank you again.

Sincerely,
Gour

Miklos Vajna (vmiklos) wrote :

Gour,

It seems I finally managed to reproduce your corrupted gzip patches, and it turns out it was a darcs bug:

http://thread.gmane.org/gmane.comp.version-control.darcs.user/15837

Just upgrade your darcs and such bugs will go away. :) (I also updated the darcs version in d-f-e's README to match what I currently have as a working snapshot.)
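
To check whether an existing repository already contains such damaged patch files, a small scan along the following lines may help. It is only a sketch: it assumes the compressed patches are ordinary gzip files whose names end in .gz somewhere under the directory you pass in, and the function name find_corrupt_gzip is hypothetical (it is not part of darcs-fast-export).

```python
import gzip
import os
import zlib


def find_corrupt_gzip(directory):
    """Return (path, error message) pairs for .gz files that fail to decompress."""
    bad = []
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            if not name.endswith(".gz"):
                continue
            path = os.path.join(root, name)
            try:
                with gzip.open(path, "rb") as handle:
                    # Read to the end so the trailing CRC is verified.
                    while handle.read(65536):
                        pass
            except (OSError, EOFError, zlib.error) as error:
                bad.append((path, str(error)))
    return bad
```

Running it on the patches directory of a repository and getting an empty list back would suggest that the files at least decompress cleanly; any entry in the list points at a file that would trigger the same kind of CRC error seen earlier in this report.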

Ian Clatworthy (ian-clatworthy) wrote :

Is the darcs-fast-export script something we should bundle under the exporters directory?

Other than the speed of the importer, is there a reason to keep this issue open?

Saša Janiška (gour) wrote :

Hi Ian,

darcs-fast-export is a very nice piece of software, and I'd be glad if its counterpart, darcs-fast-import, were used at LP to provide support for importing darcs repos - see https://answers.launchpad.net/launchpad-bazaar/+question/41438

Sincerely,
Gour

Miklos Vajna (vmiklos) wrote :

Ian,

Sure, feel free to include it! :-)

The git repo is here: git://vmiklos.hu/darcs-fast-export

The files that are probably interesting for you are darcs-fast-export and darcs-fast-export.txt; the rest is not something that would make sense to include in the exporters directory.

Thanks,
Miklos

tags: added: other-exporters
Changed in bzr-fastimport:
importance: Undecided → Medium
status: New → Confirmed
Ian Clatworthy (ian-clatworthy) wrote :

darcs-fast-import is now bundled. If there are additional issues, let's open separate bugs for those.

Changed in bzr-fastimport:
status: Confirmed → Fix Released