Zim

Indexing a notebook with large .txt attachment drives zim crazy

Bug #1162270 reported by mario bezzi
This bug affects 4 people
Affects: Zim
Status: Confirmed
Importance: High
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Opening a notebook with a large .txt attachment hangs zim for several minutes, apparently while the index is updated. Subsequent operations, such as renaming a page, are also affected. The problem disappears if the attached file is renamed to omit the .txt suffix.

zim 0.59 under RHEL .4.

Python version is:
Python 2.6.6 (r266:84292, Oct 12 2012, 14:23:48)
[GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] on linux2

The problem can be easily reproduced by creating a notebook with a single page and a single large attachment. In the provided test case it is a 35+ MB text file with 6.5M words.

Tags: index
mario bezzi (mbezzi) wrote :

Typo - it is zim 0.59 under RHEL 6.4

Andreas Wehler (andreas-wehler) wrote :

Apparently the problem is that .txt files are generally handled as pages of the notebook.

* So, is there a plan to differentiate between a) a .txt file that is a page and b) a .txt file that is an attachment?
* Imagine a configurable directory name for attachments of some page (e.g. ".attach")
* This way any file may go into the attachment directory without affecting pages.
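
A minimal sketch of the attachment-directory idea above, not zim's actual indexer. It assumes Python 3 and a hypothetical helper `iter_page_files`. The directory name ".attach" is just the example from this comment. The walk prunes any directory with that name, so .txt files stored there are never offered to the index as pages.

```python
import os


def iter_page_files(notebook_root, attachment_dirname=".attach"):
    """Yield paths of .txt files that should be treated as pages.

    Hypothetical helper: any directory named `attachment_dirname` is
    pruned from the walk, so files inside it are treated as attachments.
    """
    for dirpath, dirnames, filenames in os.walk(notebook_root):
        # Editing dirnames in place tells os.walk not to descend further.
        dirnames[:] = [d for d in dirnames if d != attachment_dirname]
        for name in sorted(filenames):
            if name.endswith(".txt"):
                yield os.path.join(dirpath, name)
```

With a layout like `Page.txt`, `Page/Sub.txt` and `Page/.attach/big.txt`, only the first two paths would be yielded.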

Changed in zim:
status: New → Confirmed
importance: Undecided → High
Robert "DocSalvager" Watson (robertcwatson) wrote :

Some thoughts on a possible solution...

* Do not index a .txt file if line 1 does not contain "Content-Type: text/x-zim-wiki"
  * A Try-Except block for this will probably perform better than If-Then since the exceptions will be rare.

* Provide a menu selection something like "Plain text (not indexed)" that changes line 1 of a file to "Content-Type: text/plain".
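
The header check suggested above could look roughly like this. It is a sketch under assumptions, not code from zim: Python 3, a hypothetical function name `looks_like_zim_page`, and an arbitrary 256-byte read limit. Only the start of the file is read, so a 35 MB attachment costs about the same as a small page. A plain conditional is used for readability; nothing here measures whether try/except would be faster.

```python
# Header string quoted in this report.
ZIM_HEADER = b"Content-Type: text/x-zim-wiki"


def looks_like_zim_page(path, max_bytes=256):
    """Return True if line 1 of `path` contains the zim wiki header.

    Hypothetical helper: reads at most `max_bytes` bytes, so the cost
    does not grow with the size of the file.
    """
    with open(path, "rb") as fh:
        head = fh.read(max_bytes)
    # Keep only the first line of whatever was read.
    first_line = head.split(b"\n", 1)[0]
    return ZIM_HEADER in first_line
```

An indexer could call this before parsing and skip any .txt file for which it returns False.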

(I haven't looked at the Zim code so don't take this as criticism. Zim is my diary, folio, manual log, notebook of documentation and anything else I can think of. It's always open. I couldn't live without it!)

Jaap Karssenberg (jaap.karssenberg) wrote : Re: [Bug 1162270] Re: Indexing a notebook with large .txt attachment drives zim crazy

On Fri, Jan 17, 2014 at 10:30 AM, Robert C Watson <
<email address hidden>> wrote:

> Some thoughts on a possible solution...
>
> * Do not index a .txt file if line 1 does not contain "Content-Type:
> text/x-zim-wiki"
> * A Try-Except block for this will probably perform better than If-Then
> since the exceptions will be rare.
>
> * Provide a menu selection something like "Plain text (not indexed)"
> that changes line 1 of a file to "Content-Type: text/plain".
>
> (I haven't looked at the Zim code so don't take this as criticism. Zim
> is my diary, folio, manual log, notebook of documentation and anything
> else I can think of. It's always open. I couldn't live without it!)
>

This is indeed what I was planning.

First step will be to ignore text files without a heading.

Second step will be to implement a commandline "--import" that allows
importing text files (by adding the heading).

Third step will be to add a dialog for importing text files that have not
yet been imported.
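
The "adding the heading" part of the import step could be sketched as follows. This is not the planned "--import" implementation. It assumes Python 3 and a hypothetical function `import_text_file`, and it writes only the "Content-Type" line quoted in this report; real zim pages may carry further header lines. The file is streamed through a temporary copy, so a large text file is never held in memory.

```python
import os
import shutil
import tempfile

# Header line quoted in this report.
HEADER_LINE = b"Content-Type: text/x-zim-wiki\n"


def import_text_file(path):
    """Prepend the header to `path`; return True if the file was changed.

    Hypothetical helper: a file whose first line already contains the
    header is left untouched.
    """
    target_dir = os.path.dirname(os.path.abspath(path))
    with open(path, "rb") as src:
        # Read at most 256 bytes of line 1 to look for an existing header.
        if HEADER_LINE.strip() in src.readline(256):
            return False
        src.seek(0)
        fd, tmp_path = tempfile.mkstemp(dir=target_dir)
        with os.fdopen(fd, "wb") as dst:
            # Header, a blank separator line, then the original content.
            dst.write(HEADER_LINE + b"\n")
            shutil.copyfileobj(src, dst)
    shutil.copymode(path, tmp_path)  # keep the original permission bits
    os.replace(tmp_path, path)  # swap the new file into place
    return True
```

Calling it a second time on the same file returns False, so repeated imports do not stack headers.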

-- Jaap

dagurasu (dagurasu15) wrote :

This is basically the same underlying bug as bug #931901 and bug #705479.

dagurasu (dagurasu15) wrote :

The import idea sounds nice as a feature addition (maybe a right click option in the attachment browser), but once the non-zim text files are properly treated as attachments, then people can do this easily enough with cut and paste and the bug part of the bug is solved.
