Tom Fifield (fifieldt) wrote:

The notes below are from NeCTAR's upgrade this week. All of the smarts are in puppet (https://github.com/NeCTAR-RC/), but the general flow might help in solving this:

=Nova=
    API: stop nova-cells
    CC: stop nova-cells; stop nova-scheduler; stop nova-consoleauth; stop nova-novncproxy
    NODES: stop nova-compute; stop nova-network; stop nova-api-metadata
    Upgrade RabbitMQ servers (see section below)
    CC: puppet agent -t
    CC: Take a backup of the nova DB (see the sketch after this list)
    CC: apt-get upgrade
    CC: stop nova-cells; stop nova-scheduler; stop nova-consoleauth; stop nova-novncproxy
    CC: /etc/init.d/memcached restart
    CC: nova-manage db sync
    CC: start nova-scheduler; start nova-consoleauth; start nova-novncproxy
    NODES: puppet agent -t
    NODES: apt-get upgrade
    CC: start nova-cells
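
For the DB backup and sync steps above, a minimal sketch, assuming a MySQL-backed nova database (the database name and output path are placeholders):

    # Dump the nova database before running the Grizzly migrations
    mysqldump --single-transaction nova > /root/nova-folsom-backup.sql
    # Apply the new schema
    nova-manage db sync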

Post upgrade:
    Need to add availability_zone metadata to your production aggregate.
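
A hedged example of that step, with the aggregate id (1) and zone name (production) as placeholders:

    # Attach availability_zone metadata to the existing host aggregate
    nova aggregate-set-metadata 1 availability_zone=production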

=RabbitMQ=
    RabbitMQ is moving to version 3.1.3. All servers in the cluster need to be stopped to upgrade.
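
A rough sketch, assuming Ubuntu packaging, and remembering that in a RabbitMQ cluster the last node stopped should be the first one started:

    # On every node in the cluster, in order
    service rabbitmq-server stop
    # Upgrade the package on every node
    apt-get install rabbitmq-server
    # Start the last-stopped node first, then the others, and verify
    service rabbitmq-server start
    rabbitmqctl cluster_status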

= Swift =
* Upgrade swift nodes first, then proxies

==== Prepare ====
* Stop puppet everywhere
* Back up your ring files
* Stop all background processes on your storage nodes (everything except (container|object|account)-server); see the sketch after this list
* Set puppet to noop the service resource
* Change openstack version to grizzly
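
For the ring backup and background-process steps above, a sketch assuming the default /etc/swift layout and that swift-init's "rest" group covers the background daemons on your version:

    # Keep a copy of the ring and builder files
    tar czf /root/swift-rings-backup.tar.gz /etc/swift/*.ring.gz /etc/swift/*.builder
    # On each storage node: stop replicators, auditors, updaters, etc.,
    # leaving the account/container/object servers running
    swift-init rest stop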
==== Upgrade storage nodes ====
One by one do:
* puppet agent -t
* apt-get upgrade
* reboot
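
Before moving to the next node, it may be worth checking the node came back cleanly; one way, assuming swift-recon is available on a proxy:

    # Replication progress and unmounted-disk report across the cluster
    swift-recon --replication --unmounted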
==== Upgrade proxy servers ====
One by one do:
* puppet agent -t
* apt-get upgrade
* reboot
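
After each proxy returns, a quick check that it is serving requests, assuming the healthcheck middleware is enabled and the proxy listens on 8080:

    # Should return "OK"
    curl http://localhost:8080/healthcheck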
==== Finish off ====
* Set puppet to no longer noop the service resource
* Start puppet everywhere

= glance-api =
* puppet agent -t
* apt-get upgrade
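
A possible sanity check afterwards, assuming admin credentials are sourced in the environment:

    # Confirm glance-api is answering after the upgrade
    glance image-list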

= Known issues =
==== nova-compute won't start with: KeyError: 'instance_type_memory_mb' ====
* This could be due to a stale VM in the shutoff state. The log should contain the instance UUID; check the DB to see whether the instance has been deleted. If so, you should be able to do a virsh undefine <id>.
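
A sketch of that check, assuming a MySQL nova database and a libvirt instance name taken from the log (instance-0000abcd is a placeholder):

    # Is the instance marked deleted in the nova DB?
    mysql nova -e "SELECT uuid, vm_state, deleted FROM instances WHERE uuid='<uuid from log>';"
    # If so, remove the stale libvirt domain on the compute node
    virsh undefine instance-0000abcd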