Comment 114 for bug 19171

Scott James Remnant (Canonical) (canonical-scott) wrote :

So, here's what's happening here then, folks:

Your system boots up and gets to the S:S20modules-init-tools stage, that's where
we read /etc/modules and modprobe the modules in order. Now modprobe is
basically just a kernel request, and these days tends to return pretty quickly to
userspace without blocking for everything to happen.

Deep Black Magic happens inside the kernel, and once it's done it generates a
series of hotplug events which it passes back to userspace through two means; by
running the program specified in /proc/sys/kernel/hotplug with interesting
environment; and also through a netlink socket.

/proc/sys/kernel/hotplug is "udevsend", a tool that gathers up this environment
and sends it over a local socket to the "udevd" process that marshals all of
these events. If there's no daemon listening it tries to start one up, and will
retry sending the event for a while until it gets to the other end.

Now we have a whole bunch of udevsend processes all running at pretty much the
same time. All of these try to start up udevd, and all of the udevd processes
try to bind to the local socket to receive events on. One of them wins; the
rest die and go away. A little time passes, by which time all of the running
udevsend processes will have dispatched their events to the udevd that will
marshal them.
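
To make the "one of them wins" step concrete, here is a minimal Python sketch, not udevd's actual code. It assumes the local socket is a plain path-based Unix datagram socket (the path name is made up), and it runs the contenders one after another rather than truly in parallel; the point is only that the second and later bind attempts fail.

```python
import os
import socket
import tempfile


def try_become_daemon(path):
    """Try to bind the event socket.

    Returns the bound socket on success, or None if something else
    already holds the address (that contender would then exit).
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sock.bind(path)
    except OSError:
        # Address already in use: another daemon got there first.
        sock.close()
        return None
    return sock


def race(contenders=5):
    """Let several would-be daemons try to bind; return the winners' indices."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "udevd.sock")  # hypothetical socket path
        socks = [try_become_daemon(path) for _ in range(contenders)]
        winners = [i for i, s in enumerate(socks) if s is not None]
        for s in socks:
            if s is not None:
                s.close()
        return winners


if __name__ == "__main__":
    print(race(5))
```

Only the first contender ends up holding the socket; everybody else has to back off and retry sending to the winner.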

This udevd _also_ begins listening on the netlink socket, as it's a better way
to get events from the kernel than having it execute something which mucks
around with IPC to get it to us.

Meanwhile the kernel is happily generating both /proc/sys/kernel/hotplug and
netlink events for what's happening on the box, in fact it's been doing this all
the time udevd has been getting its clothes on.

If the module sequence loaded is something like "psmouse, mousedev, ..., lp"
(exactly as it is in breezy machines that have been upgraded from warty/hoary)
you may find that the first netlink event you receive is actually for the
printer port.

But that's ok, we had udevsend events for the rest...

Well, that's the theory; sadly here's the practice.

On receiving the netlink event for the printer port, udevd disables receipt of
any "sequence numbered" events from udevsend (i.e. those that will almost
certainly be duplicated over the netlink socket). Unfortunately this includes
all the udevsend events we're about to receive from the processes that backed
off for a second or so while fighting over who got to start udevd.

These udevsend processes deliver their events to udevd, which cheerfully ignores
them because it thinks it's going to get another copy over the netlink socket
any second now. Unfortunately the netlink event has already been and gone, and
we just ignored an event we weren't supposed to.
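
The failure can be modelled in a few lines of Python. This is a toy model of the behaviour described above, not the udevd source: the sequence numbers, the event names and the function name are all invented for illustration.

```python
def receive_on_arrival(arrivals):
    """Model the receipt-time filter.

    arrivals is a list of (source, seqnum, name) tuples in the order
    they reach the daemon. Once any netlink event has been seen,
    sequence-numbered udevsend events are dropped on receipt.
    """
    netlink_seen = False
    accepted, dropped = [], []
    for source, seqnum, name in arrivals:
        if source == "netlink":
            netlink_seen = True
            accepted.append((seqnum, name))
        elif netlink_seen and seqnum is not None:
            # Assumed to be a duplicate of a netlink event "coming soon".
            dropped.append((seqnum, name))
        else:
            accepted.append((seqnum, name))
    return accepted, dropped


# Hypothetical arrival order: the daemon only started listening on
# netlink in time for the lp event, then the delayed udevsend
# processes deliver theirs.
ARRIVALS = [
    ("netlink", 3, "lp"),
    ("udevsend", 1, "psmouse"),
    ("udevsend", 2, "mousedev"),
    ("udevsend", 3, "lp"),
]

if __name__ == "__main__":
    accepted, dropped = receive_on_arrival(ARRIVALS)
    print("accepted:", accepted)
    print("dropped: ", dropped)
```

In this model only the lp event is accepted. The psmouse and mousedev events are thrown away even though no netlink copy of them will ever arrive.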

The two problems as I see them are:

1) The fact that receiving a netlink event disables sequence numbered udevsend
events, when there's already code to deal with de-duping events anyway. Is
there actually any need for this additional check? Can't we just queue both
events and have the duplicates ignored by msg_queue_insert()?

2) That this ignoring of events is done at receipt, rather than in queue order.
This means that the "later" parport_pc netlink event is able to disable
queueing of udevsend events with a lower sequence number.

I can envisage that #1 is necessary in case the time between receiving the
udevsend and netlink events is so long that we've already processed and removed
one of the events by the time the second is queued. In which case the problem
becomes fixing #2. However, that can't be done reliably unless the kernel
promises strict ordering of events over the netlink socket, which I doubt
(otherwise it wouldn't need sequence numbers).
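
Here's a rough Python sketch of why de-duping at queue insertion alone may not be enough. The EventQueue class is hypothetical and is not what msg_queue_insert() actually does; it just keeps pending events in sequence order and refuses a duplicate while the first copy is still queued.

```python
import bisect


class EventQueue:
    """Hypothetical pending-event queue kept in sequence-number order."""

    def __init__(self):
        self._pending = []  # sorted list of (seqnum, name)

    def insert(self, seqnum, name):
        """Queue an event unless the same seqnum is still pending."""
        i = bisect.bisect_left(self._pending, (seqnum,))
        if i < len(self._pending) and self._pending[i][0] == seqnum:
            return False  # duplicate of something still in the queue
        self._pending.insert(i, (seqnum, name))
        return True

    def pop_next(self):
        """Remove and return the lowest-numbered pending event."""
        return self._pending.pop(0)


def demo():
    q = EventQueue()
    first = q.insert(3, "lp")        # netlink copy arrives first
    q.insert(1, "psmouse")           # udevsend copy, lower seqnum
    dup_while_pending = q.insert(3, "lp")
    order = [q.pop_next(), q.pop_next()]
    # The queue has now forgotten seqnum 3, so a late copy gets through.
    dup_after_processing = q.insert(3, "lp")
    return first, dup_while_pending, order, dup_after_processing


if __name__ == "__main__":
    print(demo())
```

The duplicate is rejected while the first copy is still pending, but a copy that turns up after the event has been processed and removed is queued a second time.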

I suspect the right solution is actually to implement history of what events
we've already processed, and de-dupe them that way; rather than ignoring
messages on receipt.
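
As a sketch of what I mean by a history, assuming sequence numbers uniquely identify an event across both transports (the class name and the event data are invented, and a real implementation would need to bound the history rather than let it grow forever):

```python
class DedupingReceiver:
    """Accept events from either source; drop a seqnum already handled."""

    def __init__(self):
        self._seen = set()   # sequence numbers already accepted
        self.processed = []  # (seqnum, name) in acceptance order

    def receive(self, source, seqnum, name):
        """Return True if the event was accepted, False if a duplicate."""
        if seqnum is not None:
            if seqnum in self._seen:
                return False
            self._seen.add(seqnum)
        self.processed.append((seqnum, name))
        return True


# Same hypothetical arrival order as the failure case: netlink lp
# first, then the delayed udevsend events.
ARRIVALS = [
    ("netlink", 3, "lp"),
    ("udevsend", 1, "psmouse"),
    ("udevsend", 2, "mousedev"),
    ("udevsend", 3, "lp"),
]

if __name__ == "__main__":
    receiver = DedupingReceiver()
    results = [receiver.receive(*event) for event in ARRIVALS]
    print(results)
    print(receiver.processed)
```

With this approach the psmouse and mousedev events survive, and only the second copy of the lp event is discarded, regardless of which transport delivered which copy first.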

In breezy for release, I think I'm just going to comment out the entire netlink
code -- the kernel still runs udevsend for us, so we don't lose anything other
than a sexier kernel/userspace IPC mechanism that we can bring back in dapper
when it's fixed.