I think we're not going to consider anything that requires a server process. I had the impression when watching it run that it started pretty fast.
--Paul
On Mar 27, 2013, at 3:34 PM, Chris Rossi <email address hidden> wrote:
> Theune says what he showed you was LibreOffice, using the CLI. I'll
> play with it and see how it does. Seems to be a very heavyweight tool;
> a big process needs to load before it can do anything. So we might be
> looking at long extract times that would require us to use a queue and a
> separate thread or process to do the text extraction offline, so we
> don't slow down user HTTP requests. I
>
> --
> You received this bug notification because you are subscribed to KARL3.
> https://bugs.launchpad.net/bugs/1081104
>
> Title:
> Investigate switching from doctotext to a supported extractor
>
> Status in KARL3:
> Confirmed
>
> Bug description:
> Hi!
>
> we see a lot of errors of the following form in our KARL error
> monitor:
>
> Tue Nov 20 05:32:47 2012 ERROR mailin Error converting file
> /tmp/tmp46IR1W
>
> Error converting file /tmp/tmp46IR1W
>
> Traceback (most recent call last):
> File "/srv/multikarl/production/12/eggs/karl-3.99-py2.6.egg/karl/content/models/adapters.py", line 116, in _extract_file_data
> mimetype=context.mimetype)
> File "/srv/multikarl/production/12/eggs/karl-3.99-py2.6.egg/karl/utilities/converters/doc.py", line 39, in convert
> return self.execute('doctotext "%s"' % filename), 'utf-8'
> File "/srv/multikarl/production/12/eggs/karl-3.99-py2.6.egg/karl/utilities/converters/baseconverter.py", line 54, in execute
> close_fds=True)
> File "/usr/lib/python2.6/subprocess.py", line 623, in __init__
> errread, errwrite)
> File "/usr/lib/python2.6/subprocess.py", line 1141, in _execute_child
> raise child_exception
> OSError: [Errno 2] No such file or directory
>
>
> We see this in all sub-projects (Ariadne, Oxfam, Privacy International) and it always affects files written to /tmp/somefile.
>
>
> Best regards,
> Alex
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/karl3/+bug/1081104/+subscriptions
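
For reference, the `OSError: [Errno 2]` at the bottom of the traceback is raised inside `subprocess` while launching the child, which usually means the program itself could not be found. One plausible reading (an assumption, not confirmed anywhere in this thread) is that the `doctotext` binary is missing from `PATH` on the affected hosts, rather than the `/tmp` file being missing. Below is a minimal sketch of a converter call that checks for the executable first and reports a clearer error. The names `extract_text` and `ConverterNotFound` and the `timeout` parameter are hypothetical and are not part of KARL's `baseconverter.py`; the sketch is written for Python 3, not the Python 2.6 shown in the traceback.

```python
import shutil
import subprocess


class ConverterNotFound(RuntimeError):
    """Raised when the external converter program is not installed."""


def extract_text(command, filename, timeout=60):
    """Run an external converter and return its stdout as text.

    `command` is an argv-style list such as ["doctotext"]; the file name
    is appended as the last argument. Hypothetical helper, not KARL code.
    """
    # Resolve the program up front so a missing binary produces a clear
    # message instead of a bare "No such file or directory".
    executable = shutil.which(command[0])
    if executable is None:
        raise ConverterNotFound("converter not found on PATH: %r" % command[0])

    # Passing a list (no shell) avoids quoting problems with file names.
    result = subprocess.run(
        [executable] + list(command[1:]) + [filename],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,  # guards against a slow, heavyweight converter
        check=True,       # non-zero exit raises CalledProcessError
    )
    return result.stdout.decode("utf-8", errors="replace")
```

A caller could catch `ConverterNotFound` once at startup and log it, instead of recording one traceback per uploaded file. The timeout is only a stopgap for slow conversions; it does not replace the queue idea mentioned above.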