a race in service report count update

Bug #1458919 reported by Liang Chen
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Confirmed
Importance: Low
Assigned to: Unassigned
Milestone: (none)

Bug Description

The update of report_count is not protected by a transaction (see https://github.com/openstack/nova/blob/185e00e/nova/servicegroup/drivers/db.py#L88 ). When multiple service worker processes are used, they may overwrite each other's report_count updates.

Tags: db
jichenjc (jichenjc) wrote :

It seems service is an object, so why does it need protection?

It also seems report_count is only ever incremented. Where is it actually used?

Liang Chen (cbjchen) wrote :

It's a looping call, added at https://github.com/openstack/nova/blob/master/nova/servicegroup/drivers/db.py#L53 .
Every time a worker process is created, it joins the service group ( https://github.com/openstack/nova/blob/master/nova/service.py#L198 ), thus creating that looping call.
When multiple worker processes are configured, each of them fetches report_count from the database and increments it, regardless of whether an increment from another worker process is already in flight. They end up overwriting each other's updates, so only the last update stays in the database.
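
The lost update described above can be reproduced in miniature. The following is only a sketch under stated assumptions, not nova's code: it uses a hypothetical single-row services table in an in-memory SQLite database, and the helper names (make_db, racy_round, atomic_round) are invented for illustration. It contrasts a read-then-write increment with an increment performed by the database in a single UPDATE statement.

```python
import sqlite3


def make_db():
    """Create a hypothetical one-row services table (not nova's schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE services (id INTEGER PRIMARY KEY, "
        "report_count INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO services (id, report_count) VALUES (1, 0)")
    conn.commit()
    return conn


def read_count(conn):
    row = conn.execute(
        "SELECT report_count FROM services WHERE id = 1"
    ).fetchone()
    return row[0]


def racy_round(conn, workers):
    """Simulate the race: every worker reads before any worker writes."""
    seen = [read_count(conn) for _ in range(workers)]
    for value in seen:
        # Each worker writes its own stale value + 1, clobbering the others.
        conn.execute(
            "UPDATE services SET report_count = ? WHERE id = 1", (value + 1,)
        )
    conn.commit()


def atomic_round(conn, workers):
    """Let the database do the increment, so no worker uses a stale read."""
    for _ in range(workers):
        conn.execute(
            "UPDATE services SET report_count = report_count + 1 WHERE id = 1"
        )
    conn.commit()


racy = make_db()
racy_round(racy, workers=2)
racy_count = read_count(racy)      # two increments attempted, one survives

atomic = make_db()
atomic_round(atomic, workers=2)
atomic_count = read_count(atomic)  # both increments are kept

print(racy_count, atomic_count)
```

The interleaving in racy_round is forced deterministically here; in a real deployment it would only happen when the workers' report intervals overlap.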

tags: added: db
Roman Podoliaka (rpodolyaka) wrote :

There is indeed a race between forks of the same service process trying to update the service record, but it's a harmless one: we are not interested in the report_count value itself, but rather in the updated_at column value, based on which services are considered to be either alive or dead.

And nova-conductor seems to be the only forking service which has a corresponding services table entry.
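
To illustrate why a lost report_count increment would not affect liveness, here is a minimal sketch of a timestamp-based check. It is an assumption-laden illustration rather than nova's implementation: the is_up helper, the SERVICE_DOWN_TIME constant, its 60-second value, the record dictionaries, and the fixed timestamp are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold in seconds; a hypothetical value chosen for this example.
SERVICE_DOWN_TIME = 60


def is_up(updated_at, now, down_time=SERVICE_DOWN_TIME):
    """Treat a service as alive if its record was touched recently enough.

    Only the timestamp is consulted; report_count plays no part.
    """
    elapsed = (now - updated_at).total_seconds()
    return abs(elapsed) <= down_time


# Arbitrary fixed "current time" so the example is deterministic.
now = datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

# A low counter with a fresh timestamp versus a high counter with a stale one.
fresh = {"report_count": 3, "updated_at": now - timedelta(seconds=10)}
stale = {"report_count": 9000, "updated_at": now - timedelta(seconds=300)}

fresh_alive = is_up(fresh["updated_at"], now)
stale_alive = is_up(stale["updated_at"], now)

print(fresh_alive, stale_alive)
```

Under this model, any write that refreshes updated_at keeps the service alive, whichever worker's counter value ends up stored.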

Changed in nova:
status: New → Confirmed
importance: Undecided → Low
stgleb (gstepanov)
Changed in nova:
assignee: nobody → stgleb (gstepanov)
status: Confirmed → In Progress
stgleb (gstepanov)
description: updated
Sean Dague (sdague) wrote :

There are currently no open reviews on this bug, so I am changing the status back to its previous state and unassigning it. If there are active reviews related to this bug, please include links in the comments.

Changed in nova:
status: In Progress → Confirmed
assignee: stgleb (gstepanov) → nobody