maximum recursion depth exceeded

Bug #1952551 reported by Bob
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
SoundConverter
Confirmed
High
Unassigned

Bug Description

It happens when trying to convert a list of 4000+ FLAC files to MP3.

At some point it stops:

  File "/tmp/soundconverter/soundconverter/gstreamer/converter.py", line 472, in run
    self._conversion_done()
  File "/tmp/soundconverter/soundconverter/gstreamer/converter.py", line 362, in _conversion_done
    self.callback()
  File "/tmp/soundconverter/soundconverter/util/task.py", line 65, in callback_wrapped
    return callback(self)
  File "/tmp/soundconverter/soundconverter/util/taskqueue.py", line 129, in task_done
    self.start_next()
  File "/tmp/soundconverter/soundconverter/util/taskqueue.py", line 142, in start_next
    task.run()

[Repeat a few hundred times (thousands?)]

  File "/tmp/soundconverter/soundconverter/gstreamer/converter.py", line 457, in run
    self.newname = self.name_generator.generate_target_uri(
  File "/tmp/soundconverter/soundconverter/util/namegenerator.py", line 447, in generate_target_uri
    parent_uri = self._get_common_target_uri(sound_file)
  File "/tmp/soundconverter/soundconverter/util/namegenerator.py", line 422, in _get_common_target_uri
    return filename_to_uri(parent)
  File "/tmp/soundconverter/soundconverter/util/fileoperations.py", line 140, in filename_to_uri
    match = split_uri(filename)
  File "/tmp/soundconverter/soundconverter/util/fileoperations.py", line 118, in split_uri
    match = re.match(r'^([a-zA-Z]+://([^/]+?)?)?(/.*)', uri)
  File "/usr/lib/python3.9/re.py", line 191, in match
    return _compile(pattern, flags).match(string)
  File "/usr/lib/python3.9/re.py", line 291, in _compile
    if isinstance(flags, RegexFlag):
RecursionError: maximum recursion depth exceeded while calling a Python object

Since the list of songs to convert can be in the thousands, it really shouldn't be doing recursion...

Debian testing, with version 4.0.3. Reproduced with latest git master.

Revision history for this message
sezanzeb (sezanzeb) wrote :

Seems odd, the queue object (self.pending) should not allow this

>>> from queue import Queue
>>> q = Queue()
>>> q.put(1)
>>> q.put(2)
>>> q.put(3)
>>> q.get()
1
>>> q.get()
2
>>> q.get()
3
>>> q.qsize()
0

afterwards .get() would block.

Revision history for this message
sezanzeb (sezanzeb) wrote (last edit ):

ah, it might be because the call chain is so incredibly long

Revision history for this message
sezanzeb (sezanzeb) wrote (last edit ):

Can you please try this: https://github.com/kassoulet/soundconverter/pull/56 (fork/branch available here: https://github.com/sezanzeb/soundconverter/tree/events-taskqueue)

it uses events instead of starting the next task via the callback, I guess that should do the trick. I haven't tried it with a few thousand sound files yet so please check it out and report back if it works

Revision history for this message
Bob (paul-crapouillou) wrote :

I still get a recursion error, but it's different now:

Traceback (most recent call last):
  File "/tmp/soundconverter/soundconverter/util/taskqueue.py", line 148, in start_next
    task.run()
  File "/tmp/soundconverter/soundconverter/gstreamer/converter.py", line 457, in run
    self.newname = self.name_generator.generate_target_uri(
  File "/tmp/soundconverter/soundconverter/util/namegenerator.py", line 451, in generate_target_uri
    filename = self._get_target_filename(sound_file)
  File "/tmp/soundconverter/soundconverter/util/namegenerator.py", line 430, in _get_target_filename
    self.fill_pattern(sound_file, self.basename_pattern),
  File "/tmp/soundconverter/soundconverter/util/namegenerator.py", line 325, in fill_pattern
    filename = beautify_uri(sound_file.uri)
  File "/tmp/soundconverter/soundconverter/util/fileoperations.py", line 44, in beautify_uri
    match = split_uri(uri)
  File "/tmp/soundconverter/soundconverter/util/fileoperations.py", line 118, in split_uri
    match = re.match(r'^([a-zA-Z]+://([^/]+?)?)?(/.*)', uri)
  File "/usr/lib/python3.9/re.py", line 191, in match
    return _compile(pattern, flags).match(string)
  File "/usr/lib/python3.9/re.py", line 291, in _compile
    if isinstance(flags, RegexFlag):
RecursionError: maximum recursion depth exceeded while calling a Python object

Revision history for this message
sezanzeb (sezanzeb) wrote :

I just converted 7000 test files and didn't get a recursion problem.

Are you really sure you installed the events-taskqueue branch?

Revision history for this message
sezanzeb (sezanzeb) wrote (last edit ):

How many jobs are you running in parallel? Increasing it probably has a positive effect on this. I have set it to 6

Also, on said branch task.run is in line 150. To install it, use

git clone https://github.com/sezanzeb/soundconverter.git
git checkout events-taskqueue
sudo python3 setup.py install
soundconverter

Revision history for this message
Bob (paul-crapouillou) wrote :

I don't know how many jobs, I believe the default is the number of CPU cores? That would be 8 in my case. But it does happen with -j1 as well.

I don't know what I did on last trace, I was definitely in the events-taskqueue branch. My HEAD is commit adcb133 now, and I still get recursion errors:

Traceback (most recent call last):
  File "/tmp/soundconverter/soundconverter/util/taskqueue.py", line 134, in task_done
    self.start_next()
  File "/tmp/soundconverter/soundconverter/util/taskqueue.py", line 150, in start_next
    task.run()
  File "/tmp/soundconverter/soundconverter/gstreamer/converter.py", line 459, in run
    self.newname = self.name_generator.generate_target_uri(
  File "/tmp/soundconverter/soundconverter/util/namegenerator.py", line 451, in generate_target_uri
    filename = self._get_target_filename(sound_file)
  File "/tmp/soundconverter/soundconverter/util/namegenerator.py", line 430, in _get_target_filename
    self.fill_pattern(sound_file, self.basename_pattern),
  File "/tmp/soundconverter/soundconverter/util/namegenerator.py", line 325, in fill_pattern
    filename = beautify_uri(sound_file.uri)
  File "/tmp/soundconverter/soundconverter/util/fileoperations.py", line 44, in beautify_uri
    match = split_uri(uri)
  File "/tmp/soundconverter/soundconverter/util/fileoperations.py", line 118, in split_uri
    match = re.match(r'^([a-zA-Z]+://([^/]+?)?)?(/.*)', uri)
  File "/usr/lib/python3.9/re.py", line 191, in match
    return _compile(pattern, flags).match(string)
  File "/usr/lib/python3.9/re.py", line 291, in _compile
    if isinstance(flags, RegexFlag):
RecursionError: maximum recursion depth exceeded while calling a Python object

Note that I run it with "-e skip" and at this point a good number of my songs are already converted, maybe 3000+, so I get a lot of "output file already exists, skipping". I don't know if that's related.

Revision history for this message
sezanzeb (sezanzeb) wrote :

> That would be 8 in my case. But it does happen with -j1 as well.

with one job it should happen earlier in the queue than with 8. By a factor of 8 actually

Revision history for this message
sezanzeb (sezanzeb) wrote (last edit ):

Try `git checkout 68f52299f6cd587c6e09011e813fb391dab6d9cc` on my fork, that commit has a different approach to the events but was bad design. Maybe that works

I'm not getting recursion errors on any branch, can't reproduce. Even after 10.000 files

Revision history for this message
sezanzeb (sezanzeb) wrote :

It's suspicious that it always complains when compiling the regex. Is that really the case? Or does it also crash at different places every now and then?

Revision history for this message
sezanzeb (sezanzeb) wrote :

In the original post, there was the

self._conversion_done()
self.callback()
return callback(self)
self.start_next()
task.run()
(...)

log. Is this not shown anymore?

Revision history for this message
sezanzeb (sezanzeb) wrote :

I can reproduce it now. It happens when the files already exist, _conversion_done is called directly resulting in a long call chain in that case.

Changed in soundconverter:
milestone: none → 4.0.4
importance: Undecided → High
status: New → Confirmed
Revision history for this message
sezanzeb (sezanzeb) wrote (last edit ):

Please try the latest events-taskqueue commit, it works for me now

Revision history for this message
Bob (paul-crapouillou) wrote :

Seems to work fine now! Thanks!

Revision history for this message
sezanzeb (sezanzeb) wrote (last edit ):

Great! I found an even better way to avoid the problem. GLib.idle_add seems to make GLib start a new callchain/stack/whatever fresh, please try it one more time. Then we can merge this

I also added a unittest to avoid this in the future that does two batch calls for 300 tiny files

Revision history for this message
Bob (paul-crapouillou) wrote :

I tried a build from latest event-taskqueue (3ec5b75) and it works perfectly fine, as long as I also merge your 'no-images' branch (otherwise if I don't it stops after a while, in the middle of the conversion, without eating CPU, just hanging there).

Revision history for this message
sezanzeb (sezanzeb) wrote (last edit ):

ok, cool. That's fine, no-images is merged already into main

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers