Beagle uses wrong mime-type
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Beagle | New | Undecided | Unassigned |
Bug Description
Beagle cannot index or find any of the major office file types, such as MS Word, Excel, or OpenOffice documents.
Here is the output of beagle-
--start --
~/Dokumente/search$ beagle-
Filename: file://
Warn: bibparse is not found; bibtex files will not be indexed
Debug: Loaded 64 filters from /usr/lib/
Debug: Verifying filter_cache at /home/tv/
Debug: No filter for file://
Filter: (determined in ,29s)
MimeType: application/
-- end --
The MIME type reported for all office file types is always: application/
With xdg-mime, everything is recognized correctly:
~/Dokumente$ xdg-mime query filetype test_search.doc
application/msword
Thank you for any help
Thomas
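To narrow down where the detection goes wrong, it can help to compare an extension-based guess with a look at the file's leading bytes. The following is only a minimal sketch, not Beagle's own detection logic: the helper names `guess_by_extension`, `sniff_container`, and `describe` are hypothetical, and the magic numbers cover only the formats mentioned in this report.

```python
import mimetypes

# Leading bytes ("magic numbers") of the formats discussed in this report.
PDF_MAGIC = b"%PDF-"
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")  # legacy MS Office container (.doc, .xls)
ZIP_MAGIC = b"PK\x03\x04"                       # OpenDocument files are ZIP archives


def guess_by_extension(path):
    """Guess the MIME type from the file name alone."""
    mime, _encoding = mimetypes.guess_type(path)
    # Fall back to the generic type when the extension is unknown.
    return mime or "application/octet-stream"


def sniff_container(head):
    """Classify a file by its first bytes; returns a coarse label, not a MIME type."""
    if head.startswith(PDF_MAGIC):
        return "pdf"
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def describe(path):
    """Return (extension-based MIME guess, content-based label) for a file on disk."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    return guess_by_extension(path), sniff_container(head)
```

If `describe("test_search.doc")` gives a sensible result for both parts while Beagle still reports a generic type, the file itself is probably fine and the problem is more likely in the MIME database or lookup that Beagle uses.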
I have a very similar problem, only with PDF files.
When I run:
$ beagle-extract-content mockus.pdf
Filename: file:///tmp/mockus.pdf
Debug: Loaded 64 filters from /usr/lib/beagle/Filters/Filters.dll
Debug: Verifying filter_cache at /home/serafim/.beagle/filterver.dat ... cache is dirty ? False
Debug: No filter for file:///tmp/mockus.pdf (/tmp/mockus.pdf) [application/octet-stream]
Filter: (determined in ,36s)
MimeType: application/octet-stream
Properties:
Timestamp = 2009-05-19 11:51:49 (Utc)
On the other hand, when I run:
$ beagle-extract-content --mimetype=application/pdf mockus.pdf
Filename: file:///tmp/mockus.pdf
Debug: Loaded 64 filters from /usr/lib/beagle/Filters/Filters.dll
Debug: Verifying filter_cache at /home/s/.beagle/filterver.dat ... cache is dirty ? False
Filter: Beagle.Filters.FilterPdf (determined in ,40s)
MimeType: application/pdf
Properties:
Timestamp = 2009-05-19 11:51:49 (Utc)
beagle:FileType = document
dc:appname = Acrobat Distiller 3.0 for Power Macintosh
fixme:page-count = 38
Content:
....... ....... ....... ...
Text extracted in 63,79s
Thanks,
Andrej
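Since passing the MIME type explicitly makes the PDF filter kick in, a possible stopgap is to build the command line with the type filled in from the file extension. This is a rough sketch under assumptions: the tool name `beagle-extract-content` is pieced together from the truncated transcript, the `--mimetype=` option is taken from the run shown in this comment, and `build_extract_command` is a hypothetical helper. It only assembles the command and does not execute it.

```python
import mimetypes
import shlex


def build_extract_command(path, tool="beagle-extract-content"):
    """Assemble an extraction command with an explicit MIME type when one can be guessed.

    The tool name and the --mimetype option are assumptions taken from the
    transcript in this report, not from Beagle's documentation.
    """
    mime, _encoding = mimetypes.guess_type(path)
    cmd = [tool]
    if mime is not None:
        # Only override detection when the extension maps to a known type.
        cmd.append("--mimetype=" + mime)
    cmd.append(path)
    return cmd


if __name__ == "__main__":
    # Print the command so it can be inspected before running it by hand.
    print(shlex.join(build_extract_command("mockus.pdf")))
```

This only sidesteps the detection for one-off extraction runs. It does nothing for the indexing daemon, which would still see the generic type.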