Comment 16 for bug 1764853

msaxl (saxl) wrote :

I think I found the problem.

At some point winbindd changes its uid to the target uid in order to create the user's Kerberos credential cache.
If keytab method contains system keytab (it does in my configuration), fill_mem_keytab_from_system_keytab in gse_krb5.c calls name_to_fqdn. That function uses getaddrinfo to get the machine's FQDN, which in turn connects to the system dbus (not as uid 0!). The system dbus has not yet cached this uid's "credentials" (there seems to be a hash table, see dbus-userdb.c line 148), so it uses the nsswitch configuration to look them up, which means the system dbus now connects to winbind. But winbind seems to be blocking in this case, so the system dbus is now blocked too.
As soon as pam_winbind times out, the deadlock is broken: the needed information is returned to the system dbus and stored in its hash table, and dbus is no longer blocked.

The second time, the info is already in dbus's hash table, so the deadlock does not happen (this also explains why I get the system's FQDN the second time but not the first).

Keep in mind that this means calling getaddrinfo in winbind is only safe as uid 0. I suggest the following (though it may be better to discuss this upstream):

Insert an if (getuid() == 0) { ... } around lines 597 and 602 in gse_krb5.c (https://git.samba.org/?p=samba.git;a=blob;f=source3/librpc/crypto/gse_krb5.c;h=4dd39eaf08d8f492b6b332cfb5b2f30e4c1ab575;hb=4dd39eaf08d8f492b6b332cfb5b2f30e4c1ab575#l597)
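
To illustrate the idea, here is a rough standalone sketch, not the actual Samba code. The names fqdn_lookup_allowed, lookup_fqdn and guarded_fqdn are made up for this example; lookup_fqdn only stands in for name_to_fqdn, and falling back to the short name when the lookup is skipped is my assumption about what would be acceptable there.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <netdb.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: the resolver may only be used as uid 0. */
static bool fqdn_lookup_allowed(uid_t uid)
{
	return uid == 0;
}

/*
 * Hypothetical stand-in for name_to_fqdn(): ask getaddrinfo() for the
 * canonical name of "name". Returns true and fills "out" on success.
 */
static bool lookup_fqdn(const char *name, char *out, size_t outlen)
{
	struct addrinfo hints;
	struct addrinfo *res = NULL;
	bool ok = false;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = AI_CANONNAME;

	if (getaddrinfo(name, NULL, &hints, &res) != 0) {
		return false;
	}
	if (res->ai_canonname != NULL) {
		snprintf(out, outlen, "%s", res->ai_canonname);
		ok = true;
	}
	freeaddrinfo(res);
	return ok;
}

/*
 * Guarded variant: skip the resolver entirely unless running as uid 0,
 * so no nsswitch/dbus round trip is triggered under the user's uid.
 * Assumption: falling back to the short name is acceptable.
 */
static bool guarded_fqdn(uid_t uid, const char *name, char *out, size_t outlen)
{
	assert(name != NULL && out != NULL && outlen > 0);

	if (fqdn_lookup_allowed(uid) && lookup_fqdn(name, out, outlen)) {
		return true;
	}
	snprintf(out, outlen, "%s", name);
	return false;
}
```

A caller would use it as guarded_fqdn(getuid(), hostname, buf, sizeof(buf)). If winbindd only switches its effective uid, geteuid() might be the right check instead; I have not verified which one applies.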