Cannot resolve users without an existing /etc/krb5.conf

Bug #1893438 reported by Jean-Baptiste Lallement
This bug affects 1 person
Affects: sssd (Ubuntu)
Status: Triaged
Importance: Medium
Assigned to: Andreas Hasenack

Bug Description

Tested on Ubuntu Desktop 20.04.1 and Groovy up-to-date.

The setup is one AD controller (ADC) running Windows Server 2019 and three clients: Ubuntu Desktop 20.04.1, Ubuntu Desktop Groovy, and Fedora 32.

On Ubuntu clients, after following the documentation at https://discourse.ubuntu.com/t/service-sssd/11579 to connect AD with sssd and realmd, it is not possible to resolve users (id, getent, login, ...) without creating the file /etc/krb5.conf manually.

The documentation mentions that realmd should take care of the configuration. The sssd configuration is generated correctly and identical to Fedora.

Joining the domain with "realm join" works fine and the temporary kerberos config file created by realmd is correct.

The logs show two errors, likely linked, indicating that the AD provider is offline.
""""

[sssd[be[warthogs.biz]]] [sasl_bind_send] (0x0080): Extended failure message: [SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Configuration file does not specify default realm)]

""""

[sssd[be[warthogs.biz]]] [get_server_status] (0x1000): Status of server 'adc01.warthogs.biz' is 'name resolved'
[sssd[be[warthogs.biz]]] [get_port_status] (0x1000): Port status of port 0 for server 'adc01.warthogs.biz' is 'not working'
[sssd[be[warthogs.biz]]] [get_port_status] (0x0080): SSSD is unable to complete the full connection request, this internal status does not necessarily indicate network port issues.
[sssd[be[warthogs.biz]]] [fo_resolve_service_send] (0x0020): No available servers for service 'AD'
[sssd[be[warthogs.biz]]] [sdap_id_release_conn_data] (0x4000): releasing unused connection
[sssd[be[warthogs.biz]]] [be_resolve_server_done] (0x1000): Server resolution failed: [5]: Input/output error
[sssd[be[warthogs.biz]]] [sdap_id_op_connect_done] (0x0020): Failed to connect, going offline (5 [Input/output error])
[sssd[be[warthogs.biz]]] [be_mark_offline] (0x2000): Going offline!

""""

It works fine on Fedora 32, where joining with realmd is enough to resolve the users. Besides, the errors mentioned above do not appear in the Fedora logs.

User resolution works on Ubuntu after creating a file /etc/krb5.conf containing only this:

"""""""
[libdefaults]
        default_realm = WARTHOGS.BIZ
"""""""
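To check quickly whether a given krb5.conf carries the setting that the GSSAPI error complains about, something like the following could be used. It is only a minimal sketch: `find_default_realm` is a hypothetical helper name, and the parser handles just plain `key = value` lines under `[libdefaults]`, not the full krb5.conf syntax (includes, nested braces).

```python
def find_default_realm(krb5_conf_text):
    """Return the default_realm value from [libdefaults], or None if absent.

    Simplified parser (assumption): only flat "key = value" lines are
    understood; include directives and brace blocks are ignored.
    """
    section = None
    for raw in krb5_conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        # Track which [section] we are currently in.
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section == "libdefaults" and "=" in line:
            key, _, value = line.partition("=")
            if key.strip() == "default_realm":
                return value.strip()
    return None
```

Feeding it the contents of /etc/krb5.conf (or an empty string when the file does not exist) tells whether the "Configuration file does not specify default realm" condition would apply.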

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: sssd 2.2.3-3
ProcVersionSignature: Ubuntu 5.4.0-42.46-generic 5.4.44
Uname: Linux 5.4.0-42-generic x86_64
ApportVersion: 2.20.11-0ubuntu27.8
Architecture: amd64
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
Date: Fri Aug 28 10:05:25 2020
InstallationDate: Installed on 2020-08-27 (0 days ago)
InstallationMedia: Ubuntu 20.04.1 LTS "Focal Fossa" - Release amd64 (20200731)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: sssd
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :
description: updated
Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Taking a look

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

I repeated the steps in a bionic lxd container. I had to install packagekit, which you already have on a desktop, but in the end it's working, and I have no /etc/krb5.conf file at all:

ubuntu@bionic-sssd-desktop-team:~$ id <email address hidden>
uid=1725801106(<email address hidden>) gid=1725800513(domain <email address hidden>) groups=1725800513(domain <email address hidden>),1725801118(<email address hidden>)

I noticed I'm using fully qualified names, but you have "use_fully_qualified_names = False" in your sssd config. The other difference is that realmd (or adcli) added "ldap_sasl_authid = BIONIC-SSSD-DES$" to my sssd.conf

Finally, you also have "ad_server = adc01.warthogs.biz" which I didn't need. I wonder if in my case the client is fetching some configuration from the DNS server? Did you also install DNS on your AD, and integrate it together?

Your log shows "Configuration file does not specify default realm", which is definitely something that would live in /etc/krb5.conf, but it's also set in sssd.conf via "krb5_realm".

Actually, let me take a look at your full debug log, as I was looking just at what you added to the bug description.

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Also, do you get a /etc/krb5.conf created when using realm to join the domain on fedora?

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Hm, sorry, I tried on bionic, I don't know why. Trying again on focal and groovy too.

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

I repeated it with focal, and right after the join, id user@<REALM> worked, and I have no /etc/krb5.conf. There must be something else going on over there.

Can you please make these changes:
- sudo apt install sssd-dbus (if not already installed)
- /etc/sssd/sssd.conf:

[sssd]
services = nss, pam, ifp <--- add "ifp"
debug_level = 6 <--- add

[nss] <--- add
debug_level = 6 <--- add

[pam] <--- add
debug_level = 6 <--- add

[domain/...]
debug_level = 6 <--- add

Then restart sssd: sudo systemctl restart sssd

The /var/log/sssd/sssd_nss.log file should now contain debug info.
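The sssd.conf changes listed above can also be applied programmatically. This is a rough sketch using Python's `configparser`, not an sssd tool; `enable_debug` is a hypothetical name, and note that rewriting the file this way drops any comments it contained. The result still has to be written back to /etc/sssd/sssd.conf (mode 0600) and sssd restarted.

```python
import configparser
import io


def enable_debug(sssd_conf_text, level=6):
    """Return sssd.conf text with the ifp service and debug_level added."""
    # interpolation=None keeps values such as "/home/%u@%d" untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve option names exactly as written
    parser.read_string(sssd_conf_text)

    if not parser.has_section("sssd"):
        parser.add_section("sssd")

    # Make sure nss, pam and ifp are all listed in "services".
    current = parser.get("sssd", "services", fallback="")
    services = [s.strip() for s in current.split(",") if s.strip()]
    for wanted in ("nss", "pam", "ifp"):
        if wanted not in services:
            services.append(wanted)
    parser.set("sssd", "services", ", ".join(services))

    # The [nss] and [pam] sections may not exist yet.
    for name in ("nss", "pam"):
        if not parser.has_section(name):
            parser.add_section(name)

    # Set debug_level in [sssd], [nss], [pam] and every [domain/...] section.
    for name in parser.sections():
        if name in ("sssd", "nss", "pam") or name.startswith("domain/"):
            parser.set(name, "debug_level", str(level))

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```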

With the "ifp" service, you can now use sssctl commands like these:
root@focal-sssd-desktop-team:~# sssctl domain-list
ad1.example.com
ad2.example.com

root@focal-sssd-desktop-team:~# sssctl domain-status ad1.example.com
Online status: Online

Active servers:
AD Global Catalog: not connected
AD Domain Controller: server1.ad1.example.com

Discovered AD Global Catalog servers:
None so far.
Discovered AD Domain Controller servers:
- server1.ad1.example.com

root@focal-sssd-desktop-team:~# sssctl user-checks <email address hidden>
user: <email address hidden>
action: acct
service: system-auth

SSSD nss user lookup result:
 - user name: <email address hidden>
 - user id: 1725801106
 - group id: 1725800513
 - gecos: John Smith
 - home directory: /<email address hidden>
 - shell: /bin/bash

SSSD InfoPipe user lookup result:
 - name: <email address hidden>
 - uidNumber: 1725801106
 - gidNumber: 1725800513
 - gecos: John Smith
 - homeDirectory: not set
 - loginShell: not set

testing pam_acct_mgmt

pam_acct_mgmt: Permission denied

PAM Environment:
 - no env -

root@focal-sssd-desktop-team:~# sssctl user-show <email address hidden>
Name: <email address hidden>
Cache entry creation date: 08/28/20 18:37:19
Cache entry last update time: 08/28/20 18:47:32
Cache entry expiration time: 08/28/20 20:17:32
Initgroups expiration time: 08/28/20 20:17:32
Cached in InfoPipe: No

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Finally, just in case you were using it, lxd is not the best test environment for this, because of the high uids chosen by sssd which fall outside the range set in /etc/subuid and /etc/subgid. A VM is best to avoid headaches and hard-to-debug issues.

Changed in sssd (Ubuntu):
status: New → Triaged
assignee: nobody → Andreas Hasenack (ahasenack)
importance: Undecided → Medium
Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :

Thanks for looking into this.

Testing is done in VMs, using dnsmasq for name resolution.

I set use_fully_qualified_names = False because I suspected a name resolution issue and wanted to try a setting other than the default that realm writes when it creates the configuration file. True or False doesn't make any difference.

I'll enable the DNS on AD, but it wouldn't explain why it works with Fedora and not Ubuntu on the same setup (i.e. no DNS enabled on AD).

On Fedora, /etc/krb5.conf is not created by realm.

I'll enable debugging and provide more logs.

Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :

With debugging enabled, we see that the domain is marked offline.

# sssctl domain-list
warthogs.biz

# sssctl domain-status warthogs.biz
Online status: Offline

Active servers:
AD Global Catalog: not connected
AD Domain Controller: adc01.warthogs.biz

Discovered AD Global Catalog servers:
- adc01.warthogs.biz

Discovered AD Domain Controller servers:
- adc01.warthogs.biz

Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :

Doing a reverse DNS lookup on Ubuntu returns:
root@adclient01:~# host 192.168.122.250
250.122.168.192.in-addr.arpa domain name pointer adc01.

While on Fedora it returns the name with the domain:
[root@localhost-live ~]# host 192.168.122.250
250.122.168.192.in-addr.arpa domain name pointer adc01.
250.122.168.192.in-addr.arpa domain name pointer adc01.warthogs.biz.

I suspect it fails because the client cannot associate the AD server's name with the server's domain. It's very likely that moving the DNS to the AD server will fix this.
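The difference between the two `host` outputs above can be checked mechanically. A minimal sketch, assuming the output format shown in this comment; `ptr_targets` and `has_fqdn_in_domain` are hypothetical helper names:

```python
def ptr_targets(host_output):
    """Extract the PTR targets from the output of `host <ip>`."""
    marker = "domain name pointer"
    targets = []
    for line in host_output.splitlines():
        if marker in line:
            # Everything after the marker is the pointed-to name.
            targets.append(line.split(marker, 1)[1].strip())
    return targets


def has_fqdn_in_domain(host_output, domain):
    """True if at least one PTR target is a name inside `domain`."""
    suffix = "." + domain.lower().strip(".")
    return any(
        target.lower().rstrip(".").endswith(suffix)
        for target in ptr_targets(host_output)
    )
```

With the outputs quoted here, the Ubuntu result (only `adc01.`) yields False for `warthogs.biz`, while the Fedora result (which also lists `adc01.warthogs.biz.`) yields True.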

Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :

On Ubuntu I changed the configuration of the resolver in /etc/resolv.conf to use the DNS directly instead of the local systemd-resolved from "nameserver 127.0.0.53" to "nameserver 10.148.231.1" and it fixes the issue.

This narrows down the issue to a name resolution problem.
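For anyone comparing machines, here is a small sketch that reports which nameservers /etc/resolv.conf points at and whether they are all loopback addresses such as the systemd-resolved stub at 127.0.0.53. The helper names are hypothetical, and only `nameserver` lines are interpreted.

```python
def nameservers(resolv_conf_text):
    """Return the addresses from the "nameserver" lines of resolv.conf text."""
    servers = []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        # resolv.conf comments start with "#" or ";".
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def uses_only_local_stub(resolv_conf_text):
    """True if every configured nameserver is a loopback address."""
    servers = nameservers(resolv_conf_text)
    return bool(servers) and all(
        s.startswith("127.") or s == "::1" for s in servers
    )
```

Before the change described above the second function would return True; after switching to "nameserver 10.148.231.1" it returns False.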

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

I've seen in the freeipa install docs that freeipa expects `hostname` to return the FQDN of the host, and not just the hostname. I always found that odd. Maybe this is what's needed here. Try setting /etc/hostname to the fqdn, with the domain part. Then test with `hostname` and `hostname -f` and both should return the fqdn, and try again.
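A quick way to see what the system reports, sketched in Python. `socket.gethostname()` returns the configured host name, and `socket.getfqdn()` goes through the resolver, which is roughly (not exactly) what `hostname -f` does. The helper names are made up for this example.

```python
import socket


def looks_fully_qualified(name):
    """True if the name has at least two non-empty dot-separated labels."""
    labels = name.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)


def hostname_report():
    """Collect the host name, the resolver's FQDN, and whether they differ."""
    short = socket.gethostname()   # comparable to `hostname`
    fqdn = socket.getfqdn()        # resolver lookup, may be slow or short
    return {
        "hostname": short,
        "fqdn": fqdn,
        "hostname_is_fqdn": looks_fully_qualified(short),
    }
```

If `hostname_is_fqdn` is False, the domain part is only available through a lookup, which is the situation the suggestion above tries to avoid.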

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

> On Ubuntu I changed the configuration of the resolver in /etc/resolv.conf
> to use the DNS directly instead of the local systemd-resolved from
> "nameserver 127.0.0.53" to "nameserver 10.148.231.1" and it fixes the issue.

Could you also check what the status of the systemd resolver was?

sudo systemd-resolve --status

In particular which DNS servers it had available and was using for each network interface, and globally.

FWIW, for a few Ubuntu releases now I've seen a 5s delay in some name resolutions that happen for the first time: https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1765477

So maybe sssd was doing the DNS query to obtain the FQDN of the host, but it took too long and it gave up? Hence freeipa's suggestion to set `hostname` to the fqdn, instead of relying on `hostname -f`, which does a DNS query?

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

There is a krb5.conf setting that I see suggested in a few guides and that points to DNS problems as well, and that is `rdns`. It is suggested to set it to `false`.

If there is no /etc/krb5.conf file, then one has to be created just for this:

[libdefaults]
    rdns = false

And, while at it, maybe also add "default_realm = <yourrealm>"
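Putting the two settings together, a minimal file could be generated like this. It is only a sketch: `minimal_krb5_conf` is a hypothetical helper, and uppercasing the realm follows the usual Kerberos convention rather than anything realmd does.

```python
def minimal_krb5_conf(realm, rdns=False):
    """Build a minimal krb5.conf with only a [libdefaults] section."""
    lines = [
        "[libdefaults]",
        # Kerberos realms are conventionally written in upper case.
        "    default_realm = " + realm.upper(),
        "    rdns = " + ("true" if rdns else "false"),
    ]
    return "\n".join(lines) + "\n"
```

Calling `minimal_krb5_conf("warthogs.biz")` produces a three-line file that could be saved as /etc/krb5.conf for testing.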
