ocrfeeder PDF import systematically fails

Bug #1890013 reported by Etienne URBAH
This bug affects 2 people
Affects             Status        Importance  Assigned to     Milestone
ocrfeeder (Ubuntu)  Fix Released  Undecided   Unassigned
Focal               Fix Released  Undecided   William Wilson

Bug Description

[Impact]
 When attempting to import PDF files, a scenario frequently
 occurs in which the import fails with no error message.
 This happens because ocrfeeder does not wait for the call
 to ghostscript to finish running before attempting to
 read its output.

[Test Plan]
 * Run ocrfeeder
 * Click File -> Import PDF
 * Choose a PDF file with many pages, as this will
   be more likely to trigger the bug
 * Observe that none of the PDF pages appear in ocrfeeder
   and it states "No images added" in the lower left corner
 * Install ocrfeeder from -proposed
 * Attempt to import the PDF file again, and observe that
   all of the pages appear in ocrfeeder

[Regression Potential]
 If the ghostscript call hangs indefinitely for any
 reason, the ocrfeeder application will also be hung.
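
 If an unbounded wait were a concern, Python's subprocess.run accepts a timeout argument. The sketch below only illustrates that option; nothing in this report says the patch uses it. run_with_deadline is a hypothetical helper, and a sleeping Python child stands in for a hung ghostscript process.

```python
import subprocess
import sys


def run_with_deadline(cmd, seconds):
    """Hypothetical helper: True if cmd exits in time, False on timeout."""
    try:
        # On timeout, subprocess.run kills the child and raises TimeoutExpired.
        subprocess.run(cmd, check=True, timeout=seconds)
        return True
    except subprocess.TimeoutExpired:
        return False


# A child that sleeps far longer than the deadline stands in for a hung converter.
hung = [sys.executable, "-c", "import time; time.sleep(30)"]
finished = run_with_deadline(hung, seconds=0.5)

# A child that exits immediately completes well within its deadline.
quick = run_with_deadline([sys.executable, "-c", "pass"], seconds=30)
```

 With a deadline in place the caller gets control back and can report the failure to the user, at the cost of having to choose a limit that large PDFs will not exceed.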

[Original Description]
Using the File menu of ocrfeeder:

- Adding a JPG image systematically succeeds,

- Importing a PDF systematically fails (without any error message).

Please fix the handling of PDF by ocrfeeder.

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: ocrfeeder 0.8.1+git20200128.b945089-1
ProcVersionSignature: Ubuntu 5.4.0-42.46-generic 5.4.44
Uname: Linux 5.4.0-42-generic x86_64
ApportVersion: 2.20.11-0ubuntu27.4
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: X-Cinnamon
Date: Sun Aug 2 01:09:44 2020
ExecutablePath: /usr/bin/ocrfeeder
InstallationDate: Installed on 2019-08-26 (340 days ago)
InstallationMedia: Ubuntu 19.04 "Disco Dingo" - Release amd64 (20190416)
InterpreterPath: /usr/bin/python3.8
PackageArchitecture: all
Python3Details: /usr/bin/python3.8, Python 3.8.2, python3-minimal, 3.8.2-0ubuntu2
PythonDetails: /usr/bin/python2.7, Python 2.7.18rc1, python-is-python2, 2.7.17-4
SourcePackage: ocrfeeder
UpgradeStatus: Upgraded to focal on 2020-06-23 (39 days ago)

Etienne URBAH (eurbah) wrote :

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ocrfeeder (Ubuntu):
status: New → Confirmed

William Wilson (jawn-smith) wrote :

Looks like there are two things going on here:

1) A bug in ghostscript that fails to correctly convert the PDF files https://bugs.launchpad.net/ubuntu/+source/ghostscript/+bug/1913656

2) A race condition in ocrfeeder itself. The function convertPdfToImages returns without waiting for the ghostscript command to finish executing. This sometimes leads to getImagesFromFolder being called before the images have been converted and placed in the correct location for ocrfeeder to read them.

I have already done the SRU for the ghostscript bug. The version of ocrfeeder in jammy correctly calls subprocess.run instead of os.popen, so that fix should be SRU'd as well.
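
The difference between the two calls can be sketched as follows. This is a minimal illustration, not the ocrfeeder patch itself: slow_writer_command, convert_and_check and the output file name are hypothetical, and a short Python child process plays the role of ghostscript writing a page image.

```python
import os
import subprocess
import sys
import tempfile


def slow_writer_command(path, delay=0.5):
    """Hypothetical stand-in for the ghostscript call: sleeps, then writes a file."""
    script = (
        "import sys, time; time.sleep(float(sys.argv[2])); "
        "open(sys.argv[1], 'w').write('page')"
    )
    return [sys.executable, "-c", script, path, str(delay)]


def convert_and_check(path):
    """Run the converter to completion, then report whether its output exists."""
    # os.popen would start the command and hand back a pipe object right away,
    # without waiting for the command to exit. subprocess.run blocks until the
    # child has exited, so the output file is complete when the next line runs.
    subprocess.run(slow_writer_command(path), check=True)
    return os.path.exists(path)


with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "page-001.png")
    converted = convert_and_check(out)
```

Because the call only returns after the child exits, any code that lists the output folder afterwards sees the finished files rather than an empty directory.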

William Wilson (jawn-smith) wrote :
description: updated
tags: added: verification-needed verification-needed-focal

William Wilson (jawn-smith) wrote :

This bug is not present in Bionic, as it does not occur with the version of Python shipped there. Hirsute and Impish have a newer version of ocrfeeder in which this bug is fixed.

Dirk F (fieldhouse) wrote :

For historical interest, I note that I don't see (2) in ocrfeeder 0.8.1 under Xenial. Fixing (1) by reverting gs to 9.25 or replacing with a patched 9.50 solved my issue with ocrfeeder in that configuration. I presume it's running under Python 2.7.17.

Changed in ocrfeeder (Ubuntu):
status: Confirmed → Fix Released

Brian Murray (brian-murray) wrote :

 $ dput ocrfeeder_0.8.1+git20200128.b945089-1ubuntu1_source.changes
Trying to upload package to ubuntu
Checking signature on .changes
gpg: /tmp/pkgs/focal/ocrfeeder_0.8.1+git20200128.b945089-1ubuntu1_source.changes: Valid signature from 1E918B66765B3E31
Checking signature on .dsc
gpg: /tmp/pkgs/focal/ocrfeeder_0.8.1+git20200128.b945089-1ubuntu1.dsc: Valid signature from 1E918B66765B3E31
Uploading to ubuntu (via ftp to upload.ubuntu.com):
  Uploading ocrfeeder_0.8.1+git20200128.b945089-1ubuntu1.dsc: done.
  Uploading ocrfeeder_0.8.1+git20200128.b945089-1ubuntu1.debian.tar.xz: done.
  Uploading ocrfeeder_0.8.1+git20200128.b945089-1ubuntu1_source.buildinfo: done.
  Uploading ocrfeeder_0.8.1+git20200128.b945089-1ubuntu1_source.changes: done.
Successfully uploaded packages.

Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Etienne, or anyone else affected,

Accepted ocrfeeder into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ocrfeeder/0.8.1+git20200128.b945089-1ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will help us get this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in ocrfeeder (Ubuntu Focal):
status: New → Fix Committed
Changed in ocrfeeder (Ubuntu Focal):
assignee: nobody → William Wilson (jawn-smith)

William Wilson (jawn-smith) wrote :

The verification passed for focal. The attached image shows the "No images added" message after attempting to import a PDF.

William Wilson (jawn-smith) wrote :

This second image shows a successful PDF import after installing ocrfeeder from -proposed.

tags: added: verification-done verification-done-focal
removed: verification-needed verification-needed-focal

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ocrfeeder - 0.8.1+git20200128.b945089-1ubuntu1

---------------
ocrfeeder (0.8.1+git20200128.b945089-1ubuntu1) focal; urgency=medium

  * d/patches/wait-for-ghostcript.patch: resolve race condition causing
    PDF imports to silently fail. (LP: #1890013)

 -- William 'jawn-smith' Wilson <email address hidden> Mon, 22 Nov 2021 14:11:27 -0600

Changed in ocrfeeder (Ubuntu Focal):
status: Fix Committed → Fix Released

Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for ocrfeeder has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
