Ability to index hard disks/directories

Bug #825754 reported by upkpk
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
Basenji | Fix Released | Undecided | Patrick Ulbrich |

Bug Description

Hello, I wish Basenji had the ability to index local filesystems, such as fixed hard disks. I have a large audio/video/whatever file collection (music, samples, URL links) that I can't possibly back up due to its size, but I'd like to at least have a portable list to keep track of what is in there.

This would allow checking for duplicates when getting files from another person (e.g. USB hard disk sharing), or when downloading files somewhere the collection is not accessible. In case of a hard disk crash, a stolen laptop or severe data loss, this would make it easier to know what to redownload.

I can simply do a 'tree /media/audio/ > audiolist.txt' but this does not have the metadata extraction feature.
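
For illustration only, a slightly richer list than plain tree output can be built with standard tools. This is a minimal sketch, assuming GNU find (for -printf); the function names make_list and find_known are hypothetical, and it records only file sizes and paths, not the kind of metadata Basenji extracts.

```shell
# Sketch only; assumes GNU find (for -printf). Function names are hypothetical.
TAB=$(printf '\t')

# Print "size<TAB>relative path" for every regular file below directory $1.
make_list() {
    find "$1" -type f -printf '%s\t%P\n' | sort -t "$TAB" -k 2
}

# Print the files below directory $2 whose size and base name
# already occur in the saved list $1 (likely duplicates).
find_known() {
    find "$2" -type f -printf '%s\t%f\t%p\n' |
    while IFS="$TAB" read -r size name path; do
        if awk -F '\t' -v s="$size" -v n="$name" '
            { b = $2; sub(/.*\//, "", b) }       # base name of the listed path
            $1 == s && b == n { found = 1; exit }
            END { exit !found }' "$1"; then
            printf '%s\n' "$path"
        fi
    done
}

# Example (paths are placeholders):
#   make_list /media/audio > audiolist.txt
#   find_known audiolist.txt /media/usb-from-friend
```

Matching on size and base name is only a heuristic; comparing checksums would be more reliable but much slower on a large collection.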

Please let us know if this feature is really wanted and if it is feasible.
Thanks

Patrick Ulbrich (pulb) wrote :

Hi, thanks for your bug report.

Local hard disks should appear in the drive selection dialog, just like removable media.
Even though indexing of individual folders isn't officially supported, you can index any folder by mounting it via "bindfs <path/to/folder> <path/to/mountpoint>". Basenji should then recognize that folder as a drive.

Please let me know if this is a solution for your needs.
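
The bindfs workaround described above could be wrapped roughly as follows. This is a sketch under assumptions: bindfs and fusermount are real tools, but the function names, the DRY_RUN switch and the example paths are made up for illustration.

```shell
# Sketch only. The helper names and the DRY_RUN switch are hypothetical.

# Expose folder $1 at mount point $2 so that it shows up like a drive.
bind_folder() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "bindfs $1 $2"            # only show what would be run
    else
        mkdir -p "$2" && bindfs "$1" "$2"
    fi
}

# Remove the bind mount at $1 again once indexing is done.
unbind_folder() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "fusermount -u $1"
    else
        fusermount -u "$1"
    fi
}

# Example (paths are placeholders):
#   bind_folder /media/audio "$HOME/mnt/audio-index"
#   ... index the mount point in Basenji ...
#   unbind_folder "$HOME/mnt/audio-index"
```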

Patrick Ulbrich (pulb)
Changed in basenji:
assignee: nobody → Patrick Ulbrich (pulb)

Amano (amano) wrote :

I was wondering the same thing and was unaware of the bindfs option. To the OP: if you have a lot of folders and don't want to mount them separately, you can do what I do. I just use chmod to remove read access from a folder, so Basenji skips it when it scans.
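
A rough sketch of the chmod trick mentioned above, assuming GNU stat (for "stat -c %a"); skip_folder and restore_folder are hypothetical helper names. Note that a scan running as root would still be able to read the folder.

```shell
# Sketch only; helper names are hypothetical, GNU stat is assumed.

# Print the current octal mode of folder $1, then drop read permission for everyone.
skip_folder() {
    stat -c %a "$1" && chmod a-r "$1"
}

# Put the saved mode $2 back on folder $1.
restore_folder() {
    chmod "$2" "$1"
}

# Example (path is a placeholder):
#   old_mode=$(skip_folder /media/audio/private)
#   ... scan in Basenji ...
#   restore_folder /media/audio/private "$old_mode"
```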

upkpk (upkpk) wrote :

Hi, and thanks for your reply. Local hard disks do not appear in Basenji's "Add media" dialog for me. I'm using basenji 0.8.0-0~natty3 from the ppa, tried with i386 and amd64 packages on 2 different machines. My disks are standard IDE or SATA drives with ext3 partitions.

$ mount -l -t ext3
/dev/sda1 on / type ext3 (rw,errors=remount-ro,commit=0)
/dev/sda6 on /home type ext3 (rw,commit=0)

Maybe I should rename this bug or open another?

Thanks for the bindfs trick, also.

Patrick Ulbrich (pulb) wrote :

Are your disks mounted and visible in Nautilus? If this is the case Basenji should recognize them as well. Just report back here, no need to open another report.

Patrick Ulbrich (pulb) wrote :

Oh I missed your mount output. Root, home and system partitions are skipped intentionally by Basenji. Though I could probably add a switch to the options...

axel (axel334) wrote :

I checked and Basenji scans USB drives, which is great. But would it be possible to have some synchronize option, so that when files are added, removed, renamed or moved to another folder, I could just plug the USB drive in again and Basenji would compare its database with the actual content and update the database accordingly?
The point is that with large disks, I mean with large collections, it is rather inconvenient to rescan everything (and make thumbnails again). That can be very time consuming. Would it be possible for the program to just check for changes and then apply them to the database?
Maybe I should open another bug for this, but I feel it is very closely related to this one.
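
To illustrate the kind of change detection being asked for (this is not how Basenji works, just a sketch with standard tools), two sorted file listings taken at different times can be compared with comm. The function names snapshot and changes are hypothetical.

```shell
# Sketch only; function names are hypothetical.

# Print a sorted list of all regular files below directory $1 (relative paths).
snapshot() {
    ( cd "$1" && find . -type f | LC_ALL=C sort )
}

# Compare an old listing $1 with a new listing $2.
changes() {
    LC_ALL=C comm -23 "$1" "$2" | sed 's/^/removed: /'   # only in the old listing
    LC_ALL=C comm -13 "$1" "$2" | sed 's/^/added: /'     # only in the new listing
}

# Example (paths are placeholders):
#   snapshot /media/KINGSTON > old.txt
#   ... later, after files changed ...
#   snapshot /media/KINGSTON > new.txt
#   changes old.txt new.txt
```

A renamed or moved file shows up as one "removed" plus one "added" line; telling those apart from real deletions would need extra information such as sizes or checksums.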

Patrick Ulbrich (pulb) wrote : AW: [Bug 825754] Re: Ability to index hard disks/directories

Please have a look at this bugreport and let me know if this works for you. https://bugs.launchpad.net/basenji/+bug/682485

axel (axel334) wrote :

I created the database, made some changes on the USB drive, and then:
axel@axel-945P-DS3 ~ $ volumedb-scanner-daemon /dev/sdb1 ~/.config/volumes.vdb --replace
[10:57] [ INFO ] Waiting for a new volume...
[10:57] [ INFO ] New volume connected: /dev/sdb1
[10:57] [ INFO ] Scanning started.

[10:57] [ INFO ] Options: generate thumbs: yes, extract metadata: no, discard symlinks: no, hashing: no.

[10:58] [ INFO ] Scanning completed successfully.

[10:58] [ INFO ] Found volume(s) with identical titles - removing...
[10:58] [ INFO ] Ejecting volume.
[10:58] [ INFO ] Waiting for a new volume...
Seems ok, but I don't see any changes in Basenji.

From what I see it doesn't recognize specific USB volumes; it treats them all as just another folder, /dev/sdb1.

I wish I could plug in a USB drive, right-click on the volume and choose "synchronize". Maybe some ideas could be borrowed from synchronizing programs like FreeFileSync or luckyBackup, but synchronizing only the information about the files rather than the files themselves. Unity and KDE have programs like Zeitgeist, Nepomuk and Strigi that track and index information about files. Maybe some of their features could be used within Basenji.

I'm aware that you wrote: "Firstly, Basenji has been designed for indexing of read only media such as CD collections,
so managing an up to date index of your frequently changing harddrive data clearly does not fall into its remit."

But the point is that as large USB devices have become more popular for storage than DVDs, users' needs have changed. It would be just wonderful to have a program with this feature. Of course, I don't even know whether it is possible at all from a programming point of view.

Patrick Ulbrich (pulb) wrote :

Sory, the commandline should read volume-scannerdaemon /sev/sdb1/ ~/.config/Basenji/volimes.vdb.

Patrick Ulbrich (pulb) wrote :

Damn phone keyboard.. It's actually volume-scanner-daemon /dev/sdb1/ ~/.config/Basenji/volumes.vdb --replace

axel (axel334) wrote :

I scanned the USB drive with Basenji, saved the database as a file (I mean not in the default location) and closed Basenji. Then I moved one folder on the USB drive into another, so that it became a subfolder, and ran:
volumedb-scanner-daemon /dev/sdb1 ~/.config/Basenji/volumes.vdb --replace
and some strange behavior occurred.
There were no changes in the Basenji database (the one saved as a file), but the folder that I moved was back in its original location, so this terminal command moved it back to where it was before. It was as if it synchronized the real folder with Basenji, but the structure of files and folders in Basenji took priority, I mean the real structure was adjusted to the structure in Basenji. It should work the other way round: Basenji should be adjusted to what has changed in the real structure on the USB drive.

Actually, I don't understand it, and neither do I understand the role of the default ~/.config/Basenji/volumes.vdb.
Perhaps with this volumedb-scanner-daemon command I should give the path to the database that I want the new data to be added to and synchronized with.
And there is also the problem of the different USB device names. Does volumedb-scanner-daemon recognize their individual names?
I guess not. So when I have more than one USB device scanned into one Basenji database, how will volumedb-scanner-daemon know which volume's information should be updated?
That's why I wrote that some new feature should be available in the Basenji GUI: when I click on a volume that is a scan of a USB drive, the update would read from /dev/sdb1 but write the information to the volume that I clicked on, regardless of its individual name/title.

Patrick Ulbrich (pulb) wrote :

Yes, you have to pass the path of the database the daemon should write files into (the location you saved the database to). The daemon cannot alter filestructures on the filesystem, it just reads them. The daemon also recognizes individual disk names and replaces existing indexes of recognized disks in the database.

axel (axel334) wrote :

How to do it correctly? When I type:
axel@axel-945P-DS3 ~ $ volume-scanner-daemon /sev/sdb1/ /media/data/backup/katalog1.vdb --replace
I get something like
volume-scannerdaemon: no command found

> The daemon also recognizes individual disk names and replaces existing indexes of recognized disks in the database.

The point is that all my usb drives are named KINGSTON. I can change that name in Basenji after first scan, but how to refer to it with this volume-scanner-daemon command, so that it knows what volume in Basenji this particular KINGSTON refers to?

Patrick Ulbrich (pulb) wrote :

"No command found" means either that you mistyped the command or that the command isn't installed.

You have to rename the USB drives in both your file manager and Basenji. So if you've renamed your drives to kingston1, kingston2, kingston3, ..., Basenji (and the daemon) can differentiate between them (Basenji and the daemon share the same database backend).

axel (axel334) wrote :

Actually, it is named volumedb-scanner-daemon, from volumedb-tools.
And the correct command for a custom database location is, for example:
volumedb-scanner-daemon /dev/sdb1 /media/data/backup/katalog1.vdb --replace
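
To guard against the "no command found" confusion earlier in this thread, the call can be wrapped in a small check. This is only a sketch: the command name and argument order are taken from this thread, while SCANNER_CMD and rescan_volume are hypothetical names.

```shell
# Sketch only. SCANNER_CMD and rescan_volume are hypothetical names;
# the command name and argument order come from this thread.
SCANNER_CMD=${SCANNER_CMD:-volumedb-scanner-daemon}

# $1 = device, $2 = path to the .vdb database to update
rescan_volume() {
    if ! command -v "$SCANNER_CMD" >/dev/null 2>&1; then
        echo "$SCANNER_CMD not found; is volumedb-tools installed?" >&2
        return 127
    fi
    "$SCANNER_CMD" "$1" "$2" --replace
}

# Example (database path is a placeholder):
#   rescan_volume /dev/sdb1 "$HOME/catalogs/katalog1.vdb"
```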

I changed the USB drive's name with mlabel:
http://techbu.com/2009/06/28/renaming-usb-drive-labels-in-linux
https://help.ubuntu.com/community/RenameUSBDrive
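
For reference, a sketch of the mlabel call for a FAT-formatted drive. mlabel is part of mtools and accepts the "-i <device> ::<label>" form; relabel_cmd is a hypothetical helper that only checks the label length and prints the command, and the device should be unmounted before the printed command is run.

```shell
# Sketch only; relabel_cmd is a hypothetical helper that prints the command
# instead of running it. FAT volume labels are limited to 11 characters.

# $1 = device, $2 = new label
relabel_cmd() {
    if [ -z "$2" ] || [ "${#2}" -gt 11 ]; then
        echo "label must be 1 to 11 characters long" >&2
        return 1
    fi
    echo "sudo mlabel -i $1 ::$2"
}

# Example:
#   relabel_cmd /dev/sdb1 KINGSTON1
```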

Thank you for this feature.
One more thing. I guess you are the author of volumedb-scanner-daemon. Would it be possible to add some progress indication (like ------------- or %) after "Scanning started.", just to show that something is going on?

Can you tell me how this --replace option actually works? Does it first delete what was in the database and simply rescan everything on the volume, or does it compare changes and synchronize the USB drive with the database? Wouldn't that be the same as deleting a volume and re-scanning it in Basenji?

Patrick Ulbrich (pulb) wrote :

I don't think adding a progress indicator to the daemon makes much sense, since it is a daemon and daemons are expected to run invisibly in the background :-) The daemon is supposed to be added to the autostart applications, so that whenever you insert a new medium it is indexed automatically.

Yes, the --replace option works by re-scanning a media completely and deleting the old index in the database. Basenji's architecture was not designed to handle frequently changing storage, so implementing the resync feature in a more efficient manner would involve some major code changes. And as you know, I'm short of time :-)

If you're fine with the current re-scan implementation of the daemon, I could add it to Basenji too. That is, right-clicking a volume in Basenji would show a "Rescan" menu entry. Clicking the menu entry would execute the following steps:
- search all connected drives for a volume with a title identical to the selected volume,
- re-scan it,
- replace the old volume with the freshly scanned one...
- ... and probably also copy over volume info from the replaced volume (archive no., category, description, ...)
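
The first steps of the list above can be approximated outside Basenji. This is a sketch and not Basenji's implementation: it assumes the volume title equals the filesystem label and that udev provides /dev/disk/by-label symlinks; LABEL_DIR and both function names are hypothetical, and copying over the volume info is not covered.

```shell
# Sketch only, not Basenji's implementation. LABEL_DIR and the function
# names are hypothetical; /dev/disk/by-label is assumed to be populated by udev.
LABEL_DIR=${LABEL_DIR:-/dev/disk/by-label}

# Print the device whose label equals the volume title $1.
device_for_title() {
    [ -e "$LABEL_DIR/$1" ] || return 1
    readlink -f "$LABEL_DIR/$1"
}

# $1 = volume title, $2 = database; prints the re-scan command for that volume.
rescan_by_title() {
    dev=$(device_for_title "$1") || {
        echo "no connected volume titled $1" >&2
        return 1
    }
    echo "volumedb-scanner-daemon $dev $2 --replace"
}

# Example (database path is a placeholder):
#   rescan_by_title KINGSTON1 "$HOME/catalogs/katalog1.vdb"
```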

axel (axel334) wrote :

> "Yes, the --replace option works by re-scanning a media completely and deleting the old index in the database."
Well, if this is the case I think there is no need to build the daemon into Basenji, at least for me. It is easy to delete a volume and rescan it, so I don't see any reason to add more options. Don't bother, especially when you don't have time. Synchronization, or better call it update-rescan, is a whole different story. An update-rescan could save time when scanning large volumes and creating thumbnails.
But anyway, even without update-rescan, a regular scan without thumbnails is still pretty fast and can be used to track changes on large volumes.

> "implementing the resync feature in a more efficient manner would involve some major code changes. And as you know, I'm short of time :-)"
I see. Well, if you could ever consider it that would be great. Of course this is not that urgent. :) That would be something original. In fact there are other programs for scanning DVD, even more advanced. There is no tool for update-rescan (that would only search for and introduce changes) with usb or HDD. So, there is a gap in the market. :)

Patrick Ulbrich (pulb) wrote :

Hmm.. maybe I'm gonna implement the Rescan menu anyway, should be pretty easy.

> In fact there are other programs for scanning DVD, even more advanced.
Though you are still interested in this bugreport -> https://bugs.launchpad.net/basenji/+bug/672592
or are you using one of the more advanced tools, now? :-)

axel (axel334) wrote :

> I'm gonna implement the Rescan menu anyway, should be pretty easy.
Great. Go ahead.

> Though you are still interested in this bug report or are you using one of the more advanced tools, now? :-)

I'm still interested. The great side of Linux is choice. In fact I use cdcat and gamcat, and I have reported bugs and ideas to their developers. I don't treat your programs as competition, and I don't think you developers should treat each other as competition either. After all, these programs are free and open source. I use Basenji too. I keep different databases in different programs, like music in one and documents in another, so when I open a program its default database opens and I know which program to open for which database, without having to open different files in one program. But I also keep all the other databases in each program. Besides, using different programs is a kind of insurance that data are scanned correctly (I compare the number of folders and files between programs), and a backup as well. And when one of these programs is no longer developed (which is the case with gwhere or gnome catalog), I will still have my data in a program that is developed, and I will be able to transfer them to another program. Besides, all these programs are slightly different and use different frameworks, so for me it is interesting to see how similar things can be accomplished using different programming languages.

Patrick Ulbrich (pulb) wrote :

I'm closing this bug since Basenji is now able to scan hard drives and directories (via bindfs as described above). As of the current daily build, Basenji is also able to rescan volumes from a popup menu.

Changed in basenji:
milestone: none → 1.0
status: New → Fix Committed
Patrick Ulbrich (pulb)
Changed in basenji:
status: Fix Committed → Fix Released