router_info's _get_existing_devices execution time is O(n)

Bug #1494961 reported by Ryan Moats
This bug affects 1 person
Affects   Status   Importance  Assigned to  Milestone
neutron   Invalid  Low         Unassigned

Bug Description

router_info's _get_existing_devices execution time increases as the number of routers scheduled to a network node increases. Ideally, this execution time should be O(1) if possible.

Changed in neutron:
importance: Undecided → Medium
tags: added: l3-ipam-dhcp
Brian Haley (brian-haley) wrote :

I'm not saying that _get_existing_devices() or get_devices() in ip_lib isn't slow, but I know there can be issues with a large number of network namespaces, sudo, and /sbin/ip, such that 'sudo ip netns exec...' can be painfully slow. Later kernels and iproute code have fixed this. Some info is here: https://etherpad.openstack.org/p/neutron-agent-exec-performance

Just wanted to make sure you didn't look into something that might already be fixed.

Ryan Moats (rmoats) wrote :

Thanks for the pointer Brian - I'll keep that in mind

Ryan Moats (rmoats) wrote :

Using a 3.19 kernel improves this, but there is still some O(n) behaviour to chase down.

Ryan Moats (rmoats)
Changed in neutron:
importance: Medium → Low
tags: added: performance
Brad Behle (behle)
Changed in neutron:
assignee: nobody → Brad Behle (behle)
Brad Behle (behle)
Changed in neutron:
status: New → In Progress
Ryan Moats (rmoats)
tags: added: loadimpact
removed: performance
Brad Behle (behle) wrote :

I haven't been able to find a way to improve this. The full command being run is in neutron/agent/linux/ip_lib.py, IPWrapper.get_devices():

output = utils.execute(['ip', 'netns', 'exec', self.namespace,
                        'find', SYS_NET_PATH, '-maxdepth', '1',
                        '-type', 'l', '-printf', '%f '],
                       run_as_root=True,
                       log_fail_as_error=self.log_fail_as_error
                       ).split()

with the router namespace ("qrouter-<uuid>") being used for self.namespace. Using "ls" or "ip addr list" to get the list of devices is no faster than find, and since this code path wants only the devices in a given router's namespace, we can't use os.listdir in Python or anything similar, because the calling process is not running inside that namespace.
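
For anyone who wants to experiment with the command outside neutron, here is a rough standalone sketch of the same call. It is not neutron's code: the helper names (build_find_command, parse_device_names, list_namespace_devices) are hypothetical, the value of SYS_NET_PATH is assumed to be /sys/class/net, and subprocess stands in for utils.execute. Actually running it needs root and an existing namespace.

```python
import subprocess

# Assumption: SYS_NET_PATH is the sysfs network directory.
SYS_NET_PATH = '/sys/class/net'


def build_find_command(namespace, sys_net_path=SYS_NET_PATH):
    """Return the argv that lists symlink entries under sys_net_path
    from inside the given network namespace (hypothetical helper)."""
    return ['ip', 'netns', 'exec', namespace,
            'find', sys_net_path, '-maxdepth', '1',
            '-type', 'l', '-printf', '%f ']


def parse_device_names(output):
    """Split the space-separated names produced by find's '%f ' format."""
    return output.split()


def list_namespace_devices(namespace):
    """Spawn one 'ip netns exec' process and return the device names.

    Requires root privileges and an existing namespace.
    """
    result = subprocess.run(build_find_command(namespace),
                            check=True, capture_output=True, text=True)
    return parse_device_names(result.stdout)
```

Wrapping list_namespace_devices() in time.perf_counter() calls while varying the number of namespaces on the host would be one way to reproduce the measurement discussed in this bug.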

Brad Behle (behle) wrote :

Marked this as Invalid since this isn't really a bug, just a request to investigate improving this code. I've done the investigation and couldn't find a good way to improve it.

Changed in neutron:
status: In Progress → Invalid
Brad Behle (behle)
Changed in neutron:
assignee: Brad Behle (behle) → nobody