Comment 4 for bug 1424108

clayg (clay-gerrard) wrote :

ok, so, there's more than one bug here :P

On one hand, it's hard to get into this state, and it tends to self-heal if there's *any* write activity in the account, or pending uncommitted data somewhere, but there are a number of things we could fix.

1) the account-auditor does a get_policy_stats call with do_migrations=True, but that only applies that weird add-container-count migration (remember how we forgot container count in the policy_stat table!?). It definitely doesn't alter the containers table (that's normally handled in _migrate_add_storage_policy_index), and I'm not even sure it will add the policy_stat table if it doesn't exist.

2) in _really_merge_items we could just setdefault the records shipped across the wire in replication and avoid the KeyError - same as the container.backend does.
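
As a rough illustration of idea 2, here is a minimal sketch of the setdefault approach. It is not the actual Swift code: fill_missing_policy_index is a hypothetical helper, and both the missing key (storage_policy_index) and the default of 0 are assumptions based on the migration discussed above.

```python
def fill_missing_policy_index(item_list, default_index=0):
    """Give every incoming record a storage_policy_index key.

    Hypothetical helper, not part of Swift. Records sent by a replica
    whose schema was never migrated may lack the key, so fill in an
    assumed default instead of letting a later lookup raise KeyError.
    """
    for rec in item_list:
        # setdefault only writes the key when it is absent
        rec.setdefault('storage_policy_index', default_index)
    return item_list


# Made-up records: one from an unmigrated db, one from a migrated db.
records = [
    {'name': 'old-container', 'object_count': 3},
    {'name': 'new-container', 'object_count': 1, 'storage_policy_index': 2},
]
fill_missing_policy_index(records)
```

Records that already carry the key are left alone, so only the rows from the unmigrated replica pick up the default.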

Omar,

If you still have some accounts that are acting like this, it's because they were in the strange state of being out of sync while also not having any recent activity (it's harder to be out of sync if there's no activity), so maybe a node was offline for a while? Either way, if replication doesn't fix it on its own after a container-updater pass, then none of the three replicas of the account have had their schema migrated and there's no new container info that will force it.

You could force it by making some activity on the affected account(s), either by uploading an object (which will cause a container updater to send new rows up to the account) or just by creating and deleting a container.

If you don't want to deal with putting any data in them, or you have more than one and don't want to track them all down, a script to get the db connection and call _migrate_add_storage_policy_index would work:

    """
    Usage:

    python migrate.py <path/to/db/file>
    """
    import sys
    from swift.account.backend import AccountBroker

    db_path = sys.argv[1]
    broker = AccountBroker(db_path)
    with broker.get() as conn:
        try:
            broker._migrate_add_storage_policy_index(conn)
        except Exception as e:
            # a 'duplicate' error means the db was already migrated;
            # anything else is a real problem
            if 'duplicate' not in str(e):
                raise
            print('Already Migrated!')
        else:
            print('Success!')

You could run it like this:

    for db in $(find /srv/node*/sdb*/account* -name \*.db); do python /vagrant/.scratch/migrate.py $db; done

There's a lot of overhead in starting a new python process for each db, so you might need to pull the os.walk into the script if you have a large number of accounts - or just be patient ;)
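
Pulling the walk into the script might look something like the sketch below. It's untested against a real cluster: find_account_dbs, migrate_db and main are names made up for illustration, and it assumes you pass it the account directories (so it doesn't pick up container dbs), the same way the find command above is limited to account*.

```python
import os
import sys


def find_account_dbs(root):
    """Yield the path of every *.db file found under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            # skips things like *.db.pending, same as find -name \*.db
            if filename.endswith('.db'):
                yield os.path.join(dirpath, filename)


def migrate_db(db_path):
    """Run the same migration as the one-db script on a single db."""
    # imported here so the directory walk can be tried without swift
    from swift.account.backend import AccountBroker
    broker = AccountBroker(db_path)
    with broker.get() as conn:
        try:
            broker._migrate_add_storage_policy_index(conn)
        except Exception as e:
            if 'duplicate' not in str(e):
                raise
            return 'Already Migrated!'
    return 'Success!'


def main(roots):
    # one python process for all the dbs under every root given
    for root in roots:
        for db_path in find_account_dbs(root):
            print('%s: %s' % (db_path, migrate_db(db_path)))


if __name__ == '__main__':
    main(sys.argv[1:])
```

Saved as, say, migrate_all.py (a made-up name), you would pass it the account directories as arguments, and it does all the migrations in a single process instead of starting one per db.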

Hopefully we'll get it fixed up better for the next guy in the next version - thanks for the bug report!