Comment 2 for bug 1882094

melanie witt (melwitt) wrote :

We have been discussing this on the associated downstream bug [1] and what we have found from the nova perspective is that it wouldn't be the best approach to try to make all of nova's global state "reloadable" to allow it to endure re-running of the WSGI script *without* actually restarting the python interpreter. Nova has A Lot of global state within it (unlike placement) and an approach involving reload-ifying all global data would be an exercise in whack-a-mole and vulnerable to future issues if/when more global state is added.

This mod_wsgi doc about considerations when reloading WSGI scripts in Embedded Mode [2] describes the problem we have here fairly well: "The first issue is that when the script file is imported, if the code makes modifications to sys.path or other global data structures and the changes are additive, checks should first be made to ensure that the change has not already been made, else duplicate data will be added every time the script file is reloaded."
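To make the quoted point concrete, here is a minimal sketch of a "check before you add" guard. The helper name add_path_once and the paths are made up for illustration, and the loop only simulates a script file being loaded several times in one interpreter; it is not nova or mod_wsgi code.

```python
import sys


def add_path_once(path, search_path=None):
    """Append path only if it is not already present.

    Returns True when the path was added, False when it was already there.
    search_path defaults to sys.path; it is a parameter only so the
    behaviour can be shown on a throwaway list.
    """
    if search_path is None:
        search_path = sys.path
    if path not in search_path:
        search_path.append(path)
        return True
    return False


# Simulate the script file being (re)loaded three times in the same
# interpreter. Without the membership check the entry would appear
# three times.
demo_path = ["/usr/lib/python3"]
for _ in range(3):
    add_path_once("/opt/example/app", demo_path)
```

The same pattern would have to be repeated for every additive piece of global state, which is why it scales poorly for a code base with a lot of it.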

Now, in our poorly behaving scenario, we are *not* running in Embedded Mode but in Daemon Mode. However, what we are observing in our downstream apache + mod_wsgi environment is that if the WSGI script fails to load for any reason (in this case a DBConnectionError, which is expected because the DB is rebooting while nova-api is coming up, and we establish the DB connection as part of the WSGI init script), something (mod_wsgi?) will re-run the script *without* reloading the daemon process at all. That is where we run into the blowups with the various pieces of global state.

Instead of chasing each piece of global data and making it reloadable, we're thinking there are a couple of other options:

(1) Move the DB connection establishment out of our WSGI init script so that the init script is dead simple and global data access stays outside in normal python modules

or

(2) Instead of letting exceptions go uncaught in our WSGI init script, catch them and log a message and sys.exit() the python process as we have failed to do the bare minimum of initialization needed for nova-api to work properly
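
As a rough sketch of what option (2) could look like: the function name init_or_exit and the injectable exit_func parameter are hypothetical and exist only so the behaviour can be exercised outside a real server. Whether sys.exit() actually takes down the daemon process under mod_wsgi is an assumption that still needs testing.

```python
import logging
import sys

LOG = logging.getLogger(__name__)


def init_or_exit(init_func, exit_func=sys.exit):
    """Run init_func; on any failure, log it and exit instead of re-raising.

    exit_func defaults to sys.exit. It is injectable (an assumption made
    for this sketch) so the failure path can be checked without killing
    the calling interpreter.
    """
    try:
        return init_func()
    except Exception:
        # Log the full traceback so the reason for the exit is visible.
        LOG.exception("WSGI initialization failed, exiting process")
        exit_func(1)


def failing_init():
    # Stand-in for an init step that cannot reach the database.
    raise RuntimeError("database not reachable")


# Demonstrate the failure path with a recording stand-in for sys.exit.
exit_codes = []
init_or_exit(failing_init, exit_func=exit_codes.append)
```

With the default exit_func, the failure path raises SystemExit(1) rather than leaving a half-initialized interpreter behind for the script to be re-run in.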

For now I'm favoring option (1) as is also described in [2]: "One should therefore be cautious of what data is kept in a script file. Preferably the script file should only act as a bridge to code and data residing in a normal Python module imported from an entirely different directory."
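
A minimal sketch of the "script as a bridge" idea, under the assumption that the function below lives in a normal module and the WSGI script only calls it. The names get_application, build_app and flaky_build are hypothetical, not nova's actual code. Because a normally imported module is cached in sys.modules, its globals are not reset when only the script file is run again.

```python
import threading

# Module-level state: lives in a normal module, so re-running the WSGI
# script file does not reset it.
_lock = threading.Lock()
_application = None


def get_application(build_app):
    """Build the WSGI app once per interpreter.

    If build_app raises, nothing is cached, so a later call (for example
    from a re-run of the script file) simply tries again.
    """
    global _application
    with _lock:
        if _application is None:
            _application = build_app()
        return _application


# Simulate a builder that fails once (DB still rebooting), then succeeds.
attempts = []


def flaky_build():
    attempts.append(1)
    if len(attempts) == 1:
        raise ConnectionError("database not ready")
    return lambda environ, start_response: [b"ok"]


try:
    get_application(flaky_build)  # first script run fails
except ConnectionError:
    pass
app_a = get_application(flaky_build)  # re-run succeeds and caches the app
app_b = get_application(flaky_build)  # later runs reuse the cached app
```

In this arrangement the script file itself would shrink to an import plus one line such as application = get_application(...), with no global data of its own.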

But I'm not 100% sure how this will behave, so I need to do some local testing with it. I will try some things out and comment again afterward.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1843798
[2] https://modwsgi.readthedocs.io/en/develop/user-guides/reloading-source-code.html#reloading-in-embedded-mode