also, just to leave myself a note for writing the patch in a few hours: it looks like one reason the issue happens so frequently with the short-lived connection for the neutron_pg_drop creation is that we end up sending the transaction *before* we even send the initial monitor request.
jlibosvar has traced some other issues down to a particular python-ovs commit that I believe was the one where they added support for clustered dbs. part of that change (if I am remembering correctly) added connecting to the _Server db. ovsdbapp had a hack to wait until we'd gotten the initial db contents, idlutils.wait_for_change(). since there are now several different things going on, wait_for_change() probably doesn't do what we really want, which is "wait until we get the initial dump of the OVN dbs". so we sometimes start sending transactions too early.
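rough sketch of the difference I mean, not the actual patch and not the real python-ovs or ovsdbapp API. FakeIdl, wait_for_any_change(), wait_for_initial_dump() and the "OVN_Northbound" name are all made up for illustration; the only assumption carried over from above is that an update from the _Server db can bump the change counter before the OVN db contents have arrived.

```python
import time


class FakeIdl:
    """Hypothetical stand-in for an IDL connection (NOT the python-ovs class)."""

    def __init__(self, arrivals):
        # arrivals: db names whose initial contents show up, one per run() call
        self._arrivals = list(arrivals)
        self.change_seqno = 0
        self.initial_dump_seen = set()

    def run(self):
        # Process one pending update, if any, and bump the change counter.
        if self._arrivals:
            self.initial_dump_seen.add(self._arrivals.pop(0))
            self.change_seqno += 1


def wait_for_any_change(idl, timeout):
    """Return on the first seqno bump, whatever caused it."""
    start = idl.change_seqno
    deadline = time.monotonic() + timeout
    while idl.change_seqno == start:
        if time.monotonic() > deadline:
            raise TimeoutError("no change seen")
        idl.run()


def wait_for_initial_dump(idl, db_name, timeout):
    """Return only once the named db's initial contents have been received."""
    deadline = time.monotonic() + timeout
    while db_name not in idl.initial_dump_seen:
        if time.monotonic() > deadline:
            raise TimeoutError("no initial dump for %s" % db_name)
        idl.run()


# _Server answers first, the OVN db second (an assumed ordering).
idl = FakeIdl(["_Server", "OVN_Northbound"])
wait_for_any_change(idl, timeout=1.0)
too_early = "OVN_Northbound" not in idl.initial_dump_seen  # woke up on _Server
wait_for_initial_dump(idl, "OVN_Northbound", timeout=1.0)
ready = "OVN_Northbound" in idl.initial_dump_seen
```

the point is just that the wait condition should name the thing we need (the OVN db contents) instead of "anything changed"; how the real code detects that is still to be worked out in the patch.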