OperationalError: FATAL: the database system is shutting down

Bug #926169 reported by Nat Katin-Borland
This bug affects 1 person
Affects: KARL3
Status: Invalid
Importance: High
Assigned to: Unassigned
Milestone:

Bug Description

Several serious errors popped up in the monitor this morning. KARL was completely down for around 1 minute, but seemed to have popped back quickly. Please investigate ASAP!

Fri Feb 3 11:03:49 2012 Error in daemon process
Traceback (most recent call last):
  File "/srv/osfkarl/production/39/eggs/karl-3.80-py2.6.egg/karl/scripting.py", line 94, in run_daemon
    func()
  File "/srv/osfkarl/production/39/eggs/karlserve-1.12-py2.6.egg/karlserve/scripts/main.py", line 160, in run
    func(args)
  File "/srv/osfkarl/production/39/eggs/karlserve-1.12-py2.6.egg/karlserve/scripts/mailin.py", line 26, in main
    mailin(args, instance)
  File "/srv/osfkarl/production/39/eggs/karlserve-1.12-py2.6.egg/karlserve/scripts/mailin.py", line 47, in mailin
    runner = MailinRunner2(root, zodb_uri, zodb_path, queue)
  File "/srv/osfkarl/production/39/eggs/karl-3.80-py2.6.egg/karl/utilities/mailin.py", line 194, in __init__
    self.queue, self.closer = factory(zodb_uri, queue_name, zodb_path)
  File "/srv/osfkarl/production/39/eggs/repoze.postoffice-0.16-py2.6.egg/repoze/postoffice/queue.py", line 20, in open_queue
    db = db_from_uri(zodb_uri)
  File "/srv/osfkarl/production/39/eggs/repoze.zodbconn-0.11-py2.6.egg/repoze/zodbconn/uri.py", line 19, in db_from_uri
    db = dbfactory()
  File "/srv/osfkarl/production/39/eggs/ZODB3-3.10.1-py2.6-linux-i686.egg/ZODB/config.py", line 101, in open
    storage = section.storage.open()
  File "/srv/osfkarl/production/39/eggs/RelStorage-1.5.1-py2.6.egg/relstorage/config.py", line 33, in open
    return RelStorage(adapter, name=config.name, options=options)
  File "/srv/osfkarl/production/39/eggs/RelStorage-1.5.1-py2.6.egg/relstorage/storage.py", line 167, in __init__
    self._adapter.schema.prepare()
  File "/srv/osfkarl/production/39/eggs/RelStorage-1.5.1-py2.6.egg/relstorage/adapters/schema.py", line 951, in prepare
    self.connmanager.open_and_call(callback)
  File "/srv/osfkarl/production/39/eggs/RelStorage-1.5.1-py2.6.egg/relstorage/adapters/connmanager.py", line 73, in open_and_call
    conn, cursor = self.open()
  File "/srv/osfkarl/production/39/eggs/RelStorage-1.5.1-py2.6.egg/relstorage/adapters/postgresql.py", line 192, in open
    conn = Psycopg2Connection(self._dsn)
OperationalError: FATAL: the database system is shutting down

Fri Feb 3 11:03:32 2012 Exception when processing http://osfkarl10.gocept.net:8080/osf/error_monitor_status.txt
Referer: None
Traceback (most recent call last):
  File "/srv/osfkarl/production/39/eggs/karl-3.80-py2.6.egg/karl/errorlog.py", line 18, in middleware
    return app(environ, start_response)
  File "/srv/osfkarl/production/39/eggs/repoze.retry-0.9.4-py2.6.egg/repoze/retry/__init__.py", line 88, in __call__
    app_iter = self.application(environ, replace_start_response)
  File "/srv/osfkarl/production/39/eggs/repoze.zodbconn-0.11-py2.6.egg/repoze/zodbconn/connector.py", line 21, in __call__
    result = self.next_app(environ, start_response)
  File "/srv/osfkarl/production/39/eggs/repoze.tm2-1.0a5-py2.6.egg/repoze/tm/__init__.py", line 23, in __call__
    result = self.application(environ, save_status_and_headers)
  File "/srv/osfkarl/production/39/eggs/repoze.who-1.0.15-py2.6.egg/repoze/who/middleware.py", line 107, in __call__
    app_iter = app(environ, wrapper.wrap_start_response)
  File "/srv/osfkarl/production/39/eggs/repoze.urchin-0.2-py2.6.egg/repoze/urchin/__init__.py", line 53, in __call__
    resp = req.get_response(self.app)
  File "/srv/osfkarl/production/39/eggs/WebOb-1.1.1-py2.6.egg/webob/request.py", line 1086, in get_response
    application, catch_exc_info=False)
  File "/srv/osfkarl/production/39/eggs/WebOb-1.1.1-py2.6.egg/webob/request.py", line 1055, in call_application
    app_iter = application(self.environ, start_response)
  File "/srv/osfkarl/production/39/eggs/pyramid-1.2.1-py2.6.egg/pyramid/router.py", line 176, in __call__
    response = self.handle_request(request)
  File "/srv/osfkarl/production/39/eggs/pyramid-1.2.1-py2.6.egg/pyramid/tweens.py", line 17, in excview_tween
    response = handler(request)
  File "/srv/osfkarl/production/39/eggs/pyramid-1.2.1-py2.6.egg/pyramid/router.py", line 153, in handle_request
    response = view_callable(context, request)
  File "/srv/osfkarl/production/39/eggs/pyramid-1.2.1-py2.6.egg/pyramid/config/views.py", line 319, in viewresult_to_response
    result = view(context, request)
  File "/srv/osfkarl/production/39/eggs/karl-3.80-py2.6.egg/karl/views/admin.py", line 832, in error_monitor_status_view
    queue, closer = _get_postoffice_queue(request.context)
  File "/srv/osfkarl/production/39/eggs/karl-3.80-py2.6.egg/karl/views/admin.py", line 891, in _get_postoffice_queue
    return open_queue(zodb_uri, queue_name)
  File "/srv/osfkarl/production/39/eggs/repoze.postoffice-0.16-py2.6.egg/repoze/postoffice/queue.py", line 20, in open_queue
    db = db_from_uri(zodb_uri)
  File "/srv/osfkarl/production/39/eggs/repoze.zodbconn-0.11-py2.6.egg/repoze/zodbconn/uri.py", line 19, in db_from_uri
    db = dbfactory()
  File "/srv/osfkarl/production/39/eggs/ZODB3-3.10.1-py2.6-linux-i686.egg/ZODB/config.py", line 101, in open
    storage = section.storage.open()
  File "/srv/osfkarl/production/39/eggs/RelStorage-1.5.1-py2.6.egg/relstorage/config.py", line 33, in open
    return RelStorage(adapter, name=config.name, options=options)
  File "/srv/osfkarl/production/39/eggs/RelStorage-1.5.1-py2.6.egg/relstorage/storage.py", line 167, in __init__
    self._adapter.schema.prepare()
  File "/srv/osfkarl/production/39/eggs/RelStorage-1.5.1-py2.6.egg/relstorage/adapters/schema.py", line 951, in prepare
    self.connmanager.open_and_call(callback)
  File "/srv/osfkarl/production/39/eggs/RelStorage-1.5.1-py2.6.egg/relstorage/adapters/connmanager.py", line 73, in open_and_call
    conn, cursor = self.open()
  File "/srv/osfkarl/production/39/eggs/RelStorage-1.5.1-py2.6.egg/relstorage/adapters/postgresql.py", line 192, in open
    conn = Psycopg2Connection(self._dsn)
OperationalError: FATAL: the database system is shutting down

Changed in karl3:
importance: Undecided → High
Paul Everitt (paul-agendaless) wrote : Re: [Bug 926169] [NEW] OperationalError: FATAL: the database system is shutting down

That's one to start with gocept on. If it was completely down, then something went wrong at gocept.

--Paul

On Feb 3, 2012, at 11:24 AM, Nat Katin-Borland wrote:

> Public bug reported:
>

Nat Katin-Borland (nborland) wrote : RE: [Bug 926169] [NEW] OperationalError: FATAL: the database system is shutting down

By down I meant General Error, so I think that means something on the side...

--
Nathaniel Katin-Borland
Support Specialist
Knowledge Management Initiative
KARL Support Team

Open Society Foundations - New York Office
400 West 59th Street
New York, NY 10019
Email: <email address hidden>
Phone: 212-547-6984
http://www.soros.org/
http://www.karlproject.org

-----Original Message-----
From: <email address hidden> [mailto:<email address hidden>] On Behalf Of Paul Everitt
Sent: Friday, February 03, 2012 12:26 PM
To: Nathaniel Katin-Borland
Subject: Re: [Bug 926169] [NEW] OperationalError: FATAL: the database system is shutting down

That's one to start with gocept on. If it was completely down, then something went wrong at gocept.

--Paul

On Feb 3, 2012, at 11:24 AM, Nat Katin-Borland wrote:

> Public bug reported:
>

Paul Everitt (paul-agendaless) wrote :

I'll take this ticket while waiting to hear from gocept why the database server went down.

Changed in karl3:
milestone: none → m89
assignee: nobody → Paul Everitt (paul-agendaless)
Paul Everitt (paul-agendaless) wrote :

This wasn't a software issue; it was a hosting issue at gocept. From gocept:

"""
before going deeper into the issue: yes, the shutdown was caused by a
reconfiguration. However, the interaction seems non-trivial as the restart
seems to have been triggered by a link-level dependency on OpenLDAP libraries
which I can't see why it was re-built right away.

We'll get a more in-depth analysis next week - the server didn't crash and burn
and should run fine as it did before.

I'm also surprised that this change did trigger a PostgreSQL restart - we
didn't expect that at all.

The downtime in the log is shown between 11:02:52 CET (start of shutdown) -
11:03:55 (accepting connections again).
"""
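
As a quick cross-check of the window gocept reports against the two errors in the bug description, here is a minimal sketch. It assumes the monitor timestamps (11:03:32 and 11:03:49) are in the same timezone as gocept's log (CET); the ticket does not state that, and the variable names are only illustrative.

```python
from datetime import datetime

FMT = "%H:%M:%S"

# Window reported by gocept (CET): start of shutdown, accepting connections again.
shutdown_start = datetime.strptime("11:02:52", FMT)
accepting_again = datetime.strptime("11:03:55", FMT)

# Timestamps of the two errors in the bug description.
# Assumption: the monitor logged them in the same timezone as gocept's log.
error_times = [datetime.strptime(t, FMT) for t in ("11:03:32", "11:03:49")]

# Length of the outage in seconds.
downtime_seconds = int((accepting_again - shutdown_start).total_seconds())

# True only if every error timestamp falls inside the reported window.
errors_in_window = all(shutdown_start <= t <= accepting_again for t in error_times)

print(downtime_seconds)   # 63
print(errors_in_window)   # True
```

Under that assumption the outage lasted 63 seconds, which matches the "around 1 minute" in the original report, and both errors fall inside the restart window.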

Changed in karl3:
assignee: Paul Everitt (paul-agendaless) → nobody
status: New → Invalid