Zim

Indexing a notebook with large .txt attachment drives zim crazy

Bug #1162270 reported by mario bezzi on 2013-03-30
This bug affects 4 people
Affects: Zim
Importance: High
Assigned to: Unassigned

Bug Description

Opening a notebook with a large .txt attachment hangs zim for several minutes, apparently while the index is updated. Subsequent operations like page rename and others are also impacted. The problem disappears if the attached file is renamed omitting the .txt suffix.

zim 0.59 under RHEL .4.

Python version is:
Python 2.6.6 (r266:84292, Oct 12 2012, 14:23:48)
[GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] on linux2

The problem can be easily reproduced by creating a notebook with a single page and a single large attachment. In the provided test case it is a 35+ MB text file with 6.5M words.

mario bezzi (mbezzi) wrote :

Typo - it is zim 0.59 under RHEL 6.4

Andreas Wehler (andreas-wehler) wrote :

Apparently the problem is that .txt files are generally handled as pages of the notebook.

* So, is there a plan to differentiate between a) a .txt file that is a page and b) a .txt file that is an attachment?
* Imagine a configurable directory name for attachments of some page (e.g. ".attach")
* This way any file may go into the attachment directory without affecting pages.
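
A minimal sketch of the directory idea above, assuming attachments live in a folder named ".attach" as in the example. The function name iter_page_files and the constant ATTACHMENT_DIR are hypothetical and not taken from the Zim code base; the point is only that pruning the attachment folder during the directory walk means a large file inside it is never opened.

```python
import os

# Hypothetical, configurable name for per-page attachment folders.
ATTACHMENT_DIR = ".attach"


def iter_page_files(notebook_root, attachment_dir=ATTACHMENT_DIR):
    """Yield paths of .txt files that should be treated as pages.

    Anything below a directory called `attachment_dir` is skipped,
    whatever its file extension.
    """
    for dirpath, dirnames, filenames in os.walk(notebook_root):
        # Prune in place so os.walk never descends into attachment folders.
        dirnames[:] = [d for d in dirnames if d != attachment_dir]
        for name in sorted(filenames):
            if name.endswith(".txt"):
                yield os.path.join(dirpath, name)
```

With this layout a 35 MB "big.txt" stored under "Home/.attach/" would not be seen by the indexer, while "Home.txt" and "Home/Sub.txt" would.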

Changed in zim:
status: New → Confirmed
importance: Undecided → High

Some thoughts on a possible solution...

* Do not index a .txt file if line 1 does not contain "Content-Type: text/x-zim-wiki"
  * A Try-Except block for this will probably perform better than If-Then since the exceptions will be rare.

* Provide a menu selection something like "Plain text (not indexed)" that changes line 1 of a file to "Content-Type: text/plain".

(I haven't looked at the Zim code so don't take this as criticism. Zim is my diary, folio, manual log, notebook of documentation and anything else I can think of. It's always open. I couldn't live without it!)
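
The header check suggested above could look roughly like the sketch below. It is an illustration only, not Zim's actual indexing code; the name looks_like_zim_page and the max_bytes limit are assumptions. Only the start of the file is read, so the cost does not depend on the size of the attachment.

```python
ZIM_HEADER = b"Content-Type: text/x-zim-wiki"


def looks_like_zim_page(path, max_bytes=256):
    """Return True if the first line of `path` starts with the Zim page header.

    At most `max_bytes` bytes are read, so a 35 MB text attachment is
    rejected as cheaply as a small one.
    """
    with open(path, "rb") as handle:
        first_line = handle.readline(max_bytes)
    return first_line.strip().startswith(ZIM_HEADER)
```

A plain comparison is used here rather than a try/except block; either would work, since the time is spent reading the file rather than on the test itself.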

On Fri, Jan 17, 2014 at 10:30 AM, Robert C Watson <
<email address hidden>> wrote:

> Some thoughts on a possible solution...
>
> * Do not index a .txt file if line 1 does not contain "Content-Type:
> text/x-zim-wiki"
> * A Try-Except block for this will probably perform better than If-Then
> since the exceptions will be rare.
>
> * Provide a menu selection something like "Plain text (not indexed)"
> that changes line 1 of a file to "Content-Type: text/plain".
>
> (I haven't looked at the Zim code so don't take this as criticism. Zim
> is my diary, folio, manual log, notebook of documentation and anything
> else I can think of. It's always open. I couldn't live without it!)
>

This is indeed what I was planning.

First step will be to ignore text files without a heading.

Second step will be to implement a commandline "--import" that allows
importing text files (by adding the heading).

Third step will be to add a dialog for importing text files that are not
yet imported.

-- Jaap
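
As a rough illustration of the "import by adding the heading" step, a sketch follows. The function import_text_file is hypothetical and is not the planned "--import" implementation; it assumes UTF-8 files and writes only the Content-Type line, whereas real Zim pages carry further header fields.

```python
ZIM_HEADER_LINE = "Content-Type: text/x-zim-wiki"


def import_text_file(path):
    """Prepend the Zim page header to `path` unless it is already there.

    Returns True if the file was modified, False if it already had the header.
    """
    # newline="" keeps the file's original line endings untouched.
    with open(path, "r", encoding="utf-8", newline="") as handle:
        body = handle.read()
    if body.startswith(ZIM_HEADER_LINE):
        return False
    with open(path, "w", encoding="utf-8", newline="") as handle:
        # Header line, a blank separator line, then the original text.
        handle.write(ZIM_HEADER_LINE + "\n\n" + body)
    return True
```

The whole file is read into memory, which should be acceptable for a one-off import but is exactly what the indexer should avoid doing for every unmarked .txt file.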

dagurasu (dagurasu15) wrote :

This is basically the same underlying bug as bug #931901 and bug #705479

dagurasu (dagurasu15) wrote :

The import idea sounds nice as a feature addition (maybe a right click option in the attachment browser), but once the non-zim text files are properly treated as attachments, then people can do this easily enough with cut and paste and the bug part of the bug is solved.
