mailman oops directory permissions keep getting reset

Bug #300320 reported by Diogo Matsubara
Affects | Status | Importance | Assigned to | Milestone
Launchpad itself | Invalid | High | Diogo Matsubara |

Bug Description

oops-tools crashes when it finds a directory it can't access. I could make oops-tools ignore directories it can't access, but that would hide the OOPS reports for any date whose directory is inaccessible.

Trying to access a URL like https://lp-oops.canonical.com/oops.py/?oopsid=OOPS-1055E1152 returns an error:
<type 'exceptions.OSError'> at /
[Errno 13] Permission denied: '/x/launchpad.net-logs/scripts/forster/mailman-xmlrpc/2008-11-20'
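
There may be a middle ground between crashing and silently ignoring: skip the unreadable directories but record them, so the gap in the reports stays visible. A minimal sketch, assuming a hypothetical `list_oops_dirs` helper and a flat layout of date directories; this is not the actual oops-tools code:

```python
import os


def list_oops_dirs(root):
    """Return (readable, unreadable) date directories under root.

    Hypothetical helper, not part of oops-tools. Unreadable directories
    are reported back instead of raising or being silently dropped.
    """
    readable, unreadable = [], []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if not os.path.isdir(path):
            continue
        try:
            os.listdir(path)  # raises OSError (EACCES) when access is denied
        except OSError:
            unreadable.append(path)
        else:
            readable.append(path)
    return readable, unreadable
```

The caller could then show the `unreadable` list as a warning next to the search results rather than failing the whole request.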

a) investigate why mailman-xmlrpc/2008-11-20 has the following permissions:
drwxrws--- 2 archvsync archvsync 4.0K 2008-11-20 00:01 2008-11-20

b) fix the permissions

c) fix the root cause

Please don't close this bug until the root cause is fixed.
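
For reference, the `drwxrws---` listing in item a) can be decoded with Python's `stat` module. The sketch below assumes (this is not stated in the report) that the process serving the OOPS pages runs as neither the archvsync user nor a member of the archvsync group, so only the "other" bits apply to it. `other_can_list` is a hypothetical helper name.

```python
import stat

# Mode bits corresponding to the listing "drwxrws---":
# directory, setgid, rwx for owner and group, nothing for others.
mode = stat.S_IFDIR | stat.S_ISGID | 0o770


def other_can_list(mode):
    """True if users outside the owner and group can read and traverse the directory."""
    needed = stat.S_IROTH | stat.S_IXOTH
    return mode & needed == needed


print(stat.filemode(mode))   # drwxrws---
print(other_can_list(mode))  # False
```

Under that assumption, any attempt to list the directory fails with `[Errno 13] Permission denied`, which matches the error above.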

Revision history for this message
Diogo Matsubara (matsubara) wrote :

I'm setting this to high because it blocks the QA team and wastes the LOSAs' time, since they have to fix the permissions on devpad.

Changed in launchpad-foundations:
importance: Undecided → High
Revision history for this message
Diogo Matsubara (matsubara) wrote :

<stub> The OOPS dir permissions are probably being reset by rsync, so the fix is to ensure permissions are set correctly on forster or to tell rsync to make everything locally world readable
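
One way to read the second option is a pass over the local copy after each sync that adds read access for others. A sketch, assuming a hypothetical `make_world_readable` helper run by the owner of the synced files; rsync's own `--chmod` option might achieve the same during the transfer, and nothing in this thread establishes which approach fits the actual setup:

```python
import os
import stat


def make_world_readable(root):
    """Add o+r to files and o+rx to directories under root (hypothetical helper)."""
    dir_bits = stat.S_IROTH | stat.S_IXOTH
    for dirpath, dirnames, filenames in os.walk(root):
        # Existing bits (including setgid) are preserved; only "other" access is added.
        current = stat.S_IMODE(os.stat(dirpath).st_mode)
        os.chmod(dirpath, current | dir_bits)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # leave symlinks alone
            current = stat.S_IMODE(os.stat(path).st_mode)
            os.chmod(path, current | stat.S_IROTH)
```

This only works around the symptom on the receiving side; it does not explain why the permissions change on the source.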

Revision history for this message
Herb McNew (herb) wrote :

The permissions are set correctly on forster but somehow get changed, presumably by the app server when it's creating the oops directory. I would like to fix the problem at its root, rather than special-casing the way we sync logs for the xmlrpc server.

Stuart Bishop (stub)
Changed in launchpad-foundations:
status: New → Triaged
Revision history for this message
Gary Poster (gary) wrote :

Diogo, could you investigate this, asking Stuart and myself for direction as needed?

Changed in launchpad-foundations:
milestone: none → 3.0
assignee: nobody → Diogo Matsubara (matsubara)
Revision history for this message
Diogo Matsubara (matsubara) wrote :

I investigated this with spm's help. It's difficult to debug since we no longer have the evidence and, AFAICT, it's been a long time since it last happened.

While investigating, spm pointed out that bug 325962 was recently fixed. That bug could be causing some strange interaction since it was actually starting an additional app server.

I thought this permission problem could be caused by some race condition with the oops pruner. spm identified that the oops pruner script is not run on forster (or edge, fwiw), only on lpnet. spm filed RT 35400 to have the script enabled on all machines.

I think we could close this as invalid for now, until it happens again. If it happens again, we shouldn't apply the workaround; instead we should leave the evidence in place to help debug. I'll email the losas about this.

Revision history for this message
Diogo Matsubara (matsubara) wrote :

I forgot to mention: Gary/Stuart, if you have other ideas about what might be causing this, let me know and I'll pursue them.

Revision history for this message
Gary Poster (gary) wrote :

Unless Stuart has any thoughts, closing this as invalid sounds like the right thing to do. Thank you!

Changed in launchpad-foundations:
status: Triaged → Invalid