Crash in _findHighestSerialFilename

Bug #971255 reported by Jeroen T. Vermeulen
This bug affects 1 person
Affects: Python OOPS Date-dir repository
Status: Fix Released
Importance: Critical
Assigned to: Robert Collins
Milestone: (none listed)

Bug Description

I just got this traceback with version 0.0.17-0ubuntu1: http://paste.ubuntu.com/911037/

The problem happens in this bit of uniquefileallocator.py:

        for filename in os.listdir(directory):
            logid = filename.rsplit('.', 1)[1]
            if not logid.startswith(prefix):
                continue

This would, I suppose, break whenever some file or subdirectory in the log directory has no dot in its name. I do see one thing in the log directory that doesn't have a dot in it: an oops directory seemingly created by python-oops-datedir-repo itself!

What gives?
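To make the failure mode concrete: `rsplit('.', 1)` returns a one-element list when the name contains no dot, so indexing `[1]` raises `IndexError`. The sketch below shows one way the loop could tolerate such names. The helper names `extract_logid` and `matching_logids` are hypothetical and not part of python-oops-datedir-repo; this is an illustration of the guard, not the fix that was eventually released.

```python
def extract_logid(filename):
    """Return the text after the last dot, or None if the name has no dot.

    Hypothetical helper: for dotted names it returns the same value as
    filename.rsplit('.', 1)[1], but it does not raise for dotless names.
    """
    _stem, dot, logid = filename.rpartition('.')
    return logid if dot else None


def matching_logids(filenames, prefix):
    """Yield the logids that start with `prefix`, skipping dotless names."""
    for filename in filenames:
        logid = extract_logid(filename)
        if logid is None or not logid.startswith(prefix):
            continue
        yield logid
```

With this guard, a dotless entry such as a date-named subdirectory is skipped instead of aborting the scan.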

Analysis
========
Our best theory is that two components of MaaS were sharing the same datedir repository and one was using an instanceid and one was not. This should allow trivial reproduction.
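A reproduction along these lines can be sketched without the library itself, by running the quoted loop over a temporary directory that mixes the two kinds of names. The file names used here (`1.T1` for an instanceid-style name, `abcdef0123456789` for a dotless one) are assumptions chosen for illustration; the actual names written by each scheme may differ. `scan_legacy` and `reproduce` are hypothetical names as well.

```python
import os
import tempfile


def scan_legacy(directory, prefix):
    """Run the loop quoted in the bug description over `directory`."""
    found = []
    for filename in os.listdir(directory):
        logid = filename.rsplit('.', 1)[1]  # IndexError if no dot in the name
        if not logid.startswith(prefix):
            continue
        found.append(logid)
    return found


def reproduce():
    """Return True if a mixed directory makes the quoted loop crash."""
    with tempfile.TemporaryDirectory() as directory:
        # Assumed names: one dotted (instanceid style), one without a dot.
        for name in ('1.T1', 'abcdef0123456789'):
            open(os.path.join(directory, name), 'w').close()
        try:
            scan_legacy(directory, 'T')
        except IndexError:
            return True
        return False
```

Because the loop visits every entry, the crash does not depend on the order in which `os.listdir` returns the files.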

We could add code to handle this, or we could delete the instanceid-based code entirely. AFAIK at this point nothing stops us from deleting the code, which would be preferable, as the non-instanceid code was written because the instanceid-based code was architecturally slow and unreliable.

Tags: oops

Robert Collins (lifeless) wrote :

So a few things: - you shouldn't mix other things into the datedir repo - it will happily delete them if you run a pruner, for instance. I think that would avoid you having this error.

That said, you're using the deprecated naming scheme which is less concurrency safe and is much slower.... How are you constricting your repository object?

Changed in python-oops-datedir-repo:
status: New → Incomplete
Jeroen T. Vermeulen (jtv) wrote :

Robert notes on IRC: it looks as if the problem is that we're using an obsolete naming scheme. I don't think we made a conscious choice to do that, but will have to look that up.

I should add that in our test configuration at least, we also have other log files in the log directory which are not managed by the oops tools. This seems to make sense as the datedir repo will create sub-directories for the oopses anyway; so we might as well use the top-level directory for other logs. The code I quote seems designed to cope with exactly that, but will also break if any of those files' names lack a dot.

Jeroen T. Vermeulen (jtv) wrote :

To be honest I have no idea what it means to constrict our repository object; but we create a DateDirRepo with as its arguments an error dir and an instance id. And we have some settings in various places that repeat the error dir, and set a “reporter.” Everything else seems to be left to defaults.

Robert Collins (lifeless) wrote :

*constructing*

Robert Collins (lifeless) wrote :

We talked on IRC. Our best theory is that two components of MaaS were sharing the same datedir repository and one was using an instanceid and one was not. This should allow trivial reproduction.

We could add code to handle this, or we could delete the instanceid-based code entirely. AFAIK at this point nothing stops us from deleting the code, which would be preferable.

description: updated
Changed in python-oops-datedir-repo:
status: Incomplete → Triaged
importance: Undecided → High
importance: High → Critical
tags: added: oops
Robert Collins (lifeless) wrote :

So, since we've now had more than 6 months of API stability, I think it's time to delete the legacy code. Anyone using it can choose not to upgrade for a bit, and then migrate when they do.

Changed in python-oops-datedir-repo:
assignee: nobody → Robert Collins (lifeless)
status: Triaged → Fix Committed
Changed in python-oops-datedir-repo:
status: Fix Committed → Fix Released