privsep.daemon takes 9 seconds during taking volume attaching operation

Bug #1735663 reported by Zhijiang Hu
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Expired
Undecided
Unassigned

Bug Description

When we did volume attaching operation (cinder/ceph as backend), cinder took 5 seconds to finish its job, but nova took another 9s for running privsep helper before calling into cinder, cause the total amount time for attaching operation becomes ~15 seconds which seems too long for us.

Here is the log

2017-12-01 03:46:23.028 7 INFO nova.compute.manager [req-0e8c7f34-c0bd-4911-9022-a8747ebc03b8 a942d72fde65421f9d1897f1c23e6c34 289dbde7038f454396fde8b136414701 - default default] [instance: 04b47cd2-600b-40c1-9c3b-43e71898d1b9] Attaching volume 15ed8c33-cc00-4445-8d02-43807abc85b8 to /dev/vdb
2017-12-01 03:46:27.264 7 INFO oslo.privsep.daemon [req-0e8c7f34-c0bd-4911-9022-a8747ebc03b8 a942d72fde65421f9d1897f1c23e6c34 289dbde7038f454396fde8b136414701 - default default] Running privsep helper: ['sudo', 'nova-rootwrap', '/etc/nova/rootwrap.conf', 'privsep-helper', '--config-file', '/usr/share/nova/nova-dist.conf', '--config-file', '/etc/nova/nova.conf', '--privsep_context', 'os_brick.privileged.default', '--privsep_sock_path', '/tmp/tmpZkKVsQ/privsep.sock']
2017-12-01 03:46:28.160 7 INFO oslo.privsep.daemon [req-0e8c7f34-c0bd-4911-9022-a8747ebc03b8 a942d72fde65421f9d1897f1c23e6c34 289dbde7038f454396fde8b136414701 - default default] Spawned new privsep daemon via rootwrap
2017-12-01 03:46:28.106 1345 INFO oslo.privsep.daemon [-] privsep daemon starting
2017-12-01 03:46:28.109 1345 INFO oslo.privsep.daemon [-] privsep process running with uid/gid: 0/0
2017-12-01 03:46:28.111 1345 INFO oslo.privsep.daemon [-] privsep process running with capabilities (eff/prm/inh): CAP_SYS_ADMIN/CAP_SYS_ADMIN/none
2017-12-01 03:46:28.112 1345 INFO oslo.privsep.daemon [-] privsep daemon running as pid 1345
2017-12-01 03:46:32.119 7 INFO nova.compute.resource_tracker [req-3b3a701f-d6a6-4511-acf9-35877317a2df - - - - -] Final resource view: name=host-192-168-11-5 phys_ram=65500MB used_ram=768MB phys_disk=822GB used_disk=10GB total_vcpus=40 used_vcpus=1 pci_stats=[]

Any idea here?

Revision history for this message
Matt Riedemann (mriedem) wrote :

Added Michael Still to this bug report to see if he has any ideas on the performance degradation.

tags: added: privsep volumes
tags: added: libvirt
tags: added: performance
Revision history for this message
Michael Still (mikal) wrote :

I don't see anything in the logs supporting that this took 9 seconds?

2017-12-01 03:46:27.264 7 INFO oslo.privsep.daemon ... Running privsep helper
2017-12-01 03:46:28.112 1345 INFO oslo.privsep.daemon [-] privsep daemon running as pid 1345

So, it took 28.112 - 27.264 = 0.848 seconds to start the privsep daemon. That doesn't sound too bad for something which should only happen once.

What am I missing?

Revision history for this message
sean mooney (sean-k-mooney) wrote :

from the above i think the 9 seconds comes from the fact that at 03:46:23 we attach the volume
then at 03:46:27 we spawn the privsep deamon, then the next message that is not form the privsep deamon is at 03:46:32 which is an info message from the resouce tracker.

so 03:46:32 - 03:46:23 is 9 seconds.

however that is not the time it takes to spawn the privsep deamon which starts
at 03:46:27.264 ands ends at 03:46:28.112 taking 0.848 seconds as michael said.

there just happens to be ~4 seconds before privsep is started after volume attach where there is no logs and ~4 seconds after before the the next log message after privsep starts.
add that up with the 0.8 seconds it takes to start (4+4+.8) and you get the ~9 seconds between
attaching the voluemn at 03:46:23 and updating the resouce tracker at 03:46:32 but the time inbetween does not seam to be used by privsep based on that log snipit.

Revision history for this message
Zhijiang Hu (hu-zhijiang) wrote :

thanks for the reply. It seems need to reproduce it with debug enabled. I will try.

Lee Yarwood (lyarwood)
Changed in nova:
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.