Comment 0 for bug 509408

Yannick Voglaire (yannickv) wrote :

Hi,

This script (adapted from the one in bug #316722, "python-poppler doesn't close files") uses more and more memory as it runs:

---------------------------
import os
import poppler
from ctypes import *

glib = CDLL("libgobject-2.0.so")

uri = "file://" + os.path.abspath("test.pdf")
doc = poppler.document_new_from_file(uri, None)

for i in range(1000000):
    page = doc.get_page(0)
---------------------------

For a 24 KB PDF file, it ends up using 173 MB of RAM.
Unreferencing each page with g_object_unref helps, but not as much as we would like: with this script

---------------------------
import os
import poppler
from ctypes import *

glib = CDLL("libgobject-2.0.so")

uri = "file://" + os.path.abspath("test.pdf")
doc = poppler.document_new_from_file(uri, None)

for i in range(1000000):
    page = doc.get_page(0)
    glib.g_object_unref(hash(page))
    del page
---------------------------

we "only" end up with 108 MB used. (To get these numbers, I added the lines
import time
time.sleep(5)
at the end of each script and read the RES column in "top" while it slept. The VIRT column gives similar results.)

This really seems to be caused by python-poppler and not by poppler itself: I ran an equivalent C program (derived from test-poppler-glib.cc in the poppler source) that uses only poppler-glib, and
1) without "g_object_unref (G_OBJECT (page));", memory usage grows steadily (peaking at around 70 MB);
2) with "g_object_unref (G_OBJECT (page));", memory usage stays constant throughout the run, so in this case the unref completely solves the problem.

---------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <poppler.h>

#define FAIL(msg) \
 do { fprintf (stderr, "FAIL: %s\n", msg); exit (-1); } while (0)

int main (int argc, char *argv[])
{
  PopplerDocument *document;
  PopplerPage *page;
  GError *error;

  if (argc != 3)
    FAIL ("usage: test-poppler-glib file://FILE PAGE");

  g_type_init ();

  error = NULL;
  document = poppler_document_new_from_file (argv[1], NULL, &error);
  if (document == NULL)
    FAIL (error->message);

  for (gint i=0; i<=1000000; i++)
  {
    g_print("%d", i);
    page = poppler_document_get_page_by_label (document, argv[2]);
    if (page == NULL)
      FAIL ("page label not found");
    g_object_unref (G_OBJECT (page));
  }

  g_object_unref (G_OBJECT (document));

  return 0;
}
---------------------------
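One detail of the Python variant that may be worth ruling out: unlike the C call above, the unref in the second script goes through ctypes, which by default converts a Python integer argument to a C int and masks it to fit. On a 64-bit system this can truncate an address. The sketch below declares the argument as a pointer instead. The helper name load_gobject is made up for illustration, and whether hash(page) really is the address of the underlying GObject remains an assumption of the script in this report.

```python
import ctypes
import ctypes.util


def load_gobject():
    """Load libgobject-2.0 and give g_object_unref a pointer-sized signature.

    Returns None if the library cannot be found or loaded on this system.
    """
    path = ctypes.util.find_library("gobject-2.0")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    # C prototype: void g_object_unref (gpointer object);
    lib.g_object_unref.argtypes = [ctypes.c_void_p]
    lib.g_object_unref.restype = None
    return lib
```

In the second script this would replace the CDLL line, with the call itself left as glib.g_object_unref(hash(page)).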

I ran these tests with poppler-0.12.0 and python-poppler-0.10.0 on Ubuntu 9.10.