Setting locale breaks sss_ssh_authorizedkeys: set_locale() failed (5): Input/output error

Bug #1675118 reported by Graham Leggett
This bug affects 1 person
Affects        Status   Importance  Assigned to  Milestone
sssd (Ubuntu)  Expired  Undecided   Unassigned

Bug Description

Configure an Ubuntu Trusty machine with sssd against an LDAP domain. This fails as follows:

ubuntu@bastion01:~$ /usr/bin/sss_ssh_authorizedkeys [username]
(Wed Mar 22 17:46:15:940434 2017) [/usr/bin/sss_ssh_authorizedkeys] [main] (0x0020): set_locale() failed (5): Input/output error
Error setting the locale

Further login with LDAP users is impossible.

This appears to be a recent regression: a similar machine was deployed on 15 March 2017 using the same orchestration and the same configuration without a problem.

ubuntu@bastion01:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty

It looks like this bug or a bug similar to this was fixed in sssd recently, but this doesn't explain the sudden unexpected failure:

https://pagure.io/SSSD/sssd/issue/2785

Seth Arnold (seth-arnold) wrote :

Hi Graham, this is unlikely to be an AppArmor issue: the usual result of AppArmor blocking access to a resource is a Permission Denied response rather than an Input/output error.

Just in case I'm wrong though, please paste in any DENIED messages from AppArmor in your kernel's dmesg buffer.

Thanks

Steve Beattie (sbeattie) wrote :

Hi Graham, thanks for reporting your issue, sorry you're having problems.

First, this is not an apparmor issue; likely it's an issue with sssd (so adjusting there).

Second, sssd in trusty did have an update published for it on 15 March 2017 https://launchpad.net/ubuntu/+source/sssd/1.11.8-0ubuntu0.5 (an update from https://launchpad.net/ubuntu/+source/sssd/1.11.8-0ubuntu0.3), so perhaps your previous successful deployment was before that update made it out.

There was also a recent glibc security update released, but that did not touch locale processing code directly. It would be good to verify that reverting to https://launchpad.net/ubuntu/+source/eglibc/2.19-0ubuntu6.9 does not address the issue.

The primary commits referenced in the upstream bug report are (for sssd 1.13 branch):
https://pagure.io/SSSD/sssd/c/4815471669a25566f6772c228c104a206ffa37f7?branch=sssd-1-13
https://pagure.io/SSSD/sssd/c/76ab3eb947f4d6fe6555d8ea0ae97dc3966f02ac?branch=sssd-1-13

affects: apparmor (Ubuntu) → sssd (Ubuntu)
Graham Leggett (minfrin-y) wrote :

I thought I set this as an sssd bug, sorry about that.

I have many machines (in the tens to hundreds) across an estate, all of which have the same sssd configuration against LDAP. Machines that came up this morning worked with respect to LDAP; the machine that was brought up yesterday and referenced in this bug report still does not. All machines are orchestrated in the same way by the same version-controlled mechanism.

It looks like in my case there are a number of bugs happening simultaneously:

- Running sss_ssh_authorizedkeys from the command line fails with "set_locale() failed (5): Input/output error" when it should not fail. Backporting the fixes above should fix this.

- There are errors in auth.log saying "error: AuthorizedKeysCommand /usr/bin/sss_ssh_authorizedkeys returned status 1"

- sss_ssh_authorizedkeys is broken in that it does not log anything to syslog on failure OR ssh is broken in that when it invokes sss_ssh_authorizedkeys the output of sss_ssh_authorizedkeys is lost. For this reason it is impossible to tell for certain whether the "status 1" is caused by the set_locale failed, or something else.

For a start, it looks like sss_ssh_authorizedkeys needs to log properly, something it does not do now.
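
One way to narrow down the "status 1" ambiguity is to run the command through a small wrapper that records its stderr and exit status somewhere persistent. The following is a minimal sketch, not something sssd or sshd provides: the function name run_and_record and the log path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Run a command, leave its stdout untouched, and on failure append its
# stderr plus its exit status to a log file so the reason is not lost.
run_and_record() {
    log_file=$1
    shift
    err_file=$(mktemp) || return 1
    "$@" 2>"$err_file"
    status=$?
    if [ "$status" -ne 0 ]; then
        {
            printf '%s exited with status %s\n' "$1" "$status"
            cat "$err_file"
        } >>"$log_file"
    fi
    rm -f "$err_file"
    return "$status"
}

# Example (hypothetical log path):
#   run_and_record /tmp/authorizedkeys-debug.log \
#       /usr/bin/sss_ssh_authorizedkeys minfrin
```

Because stdout passes through unchanged, any keys the command prints are still visible to the caller; only the failure details are diverted to the log.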

Christian Ehrhardt  (paelzer) wrote :

> "Machines that came up this morning worked with respect to LDAP"

Very interesting, but that makes it hard to tell whether this was a transient issue that happened just once, or whether on the given day the combination of packages was in a state to cause this :-/

> "the machine that was brought up yesterday and referenced in this bug report still does not."

So at least we can debug this one for as long as it stays broken.

The code throwing that msg is
    ret = set_locale();
    if (ret != EOK) {
        DEBUG(SSSDBG_CRIT_FAILURE,
              "set_locale() failed (%d): %s\n", ret, strerror(ret));
        ERROR("Error setting the locale\n");
        ret = EXIT_FAILURE;
        goto fini;
    }

So it actually logs about as much as it can which makes up the error message you reported.

The fix referenced seems to apply back to trusty, but since all other machines work fine we have the problem that as soon as we change *anything* we might consider it being the fix while it is not.

Also, since neither you nor I can recreate this on a new system to debug, I'd much rather keep the broken system broken and debug there for now.

Could you follow [1] to add debug symbols?
Then install at least sssd-common-dbgsym (installing all sssd-*-dbgsym packages might be safer, to catch everything), as well as libc6-dbg libc6-dbgsym.

I read that a direct call to /usr/bin/sss_ssh_authorizedkeys fails, so you could now run
$ gdb /usr/bin/sss_ssh_authorizedkeys
b set_locale
run <yourusername>

This will get you to sssd's set_locale function.
It does three real things; step #1 is to find which one is failing for you.
Just enter "n" to step through.
One of the three calls (setlocale, bindtextdomain, textdomain) will be the one throwing the I/O error.

I don't see how the setlocale would fail, but the two textdomain related calls at least do I/O.

Once you know which one fails, you can start over and use "s" to step into it.
Some gdb basics are at [2], but this can quickly get complex depending on your experience with it. Go as deep as you can; this should bring us closer to understanding what your I/O error is actually about.

If anything, I'd see LOCALEDIR as being in scope for the I/O error, but that should be /usr/share/locale for you.
Just to be sure there is no random I/O or path issue, could you check what an ls of that path gives you, and whether there is a subdir for your locale?
I love that sssd does not even install its own bits there :-/ at least from what I see with:
$ dpkg -L $(dpkg -l '*sssd*' | awk '/ii/ {print $2}' | xargs) | grep locale

[1]: https://wiki.ubuntu.com/Debug%20Symbol%20Packages
[2]: https://sourceware.org/gdb/onlinedocs/gdb/Continuing-and-Stepping.html#Continuing-and-Stepping

Changed in sssd (Ubuntu):
status: New → Incomplete
Graham Leggett (minfrin-y) wrote :

Followed instructions to add debug symbols.

The following two packages clashed with one another:

ubuntu@bastion01:~$ sudo apt-get install libc6-dbg libc6-dbgsym
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  libc6-dbg libc6-dbgsym
0 upgraded, 2 newly installed, 0 to remove and 4 not upgraded.
Need to get 6796 kB of archives.
After this operation, 45.0 MB of additional disk space will be used.
Get:1 http://ddebs.ubuntu.com/ trusty-updates/main libc6-dbgsym amd64 2.19-0ubuntu6.11 [3328 kB]
Get:2 http://ap-southeast-1.ec2.archive.ubuntu.com/ubuntu/ trusty-updates/main libc6-dbg amd64 2.19-0ubuntu6.11 [3468 kB]
Fetched 6796 kB in 5s (1330 kB/s)
Selecting previously unselected package libc6-dbg:amd64.
(Reading database ... 82969 files and directories currently installed.)
Preparing to unpack .../libc6-dbg_2.19-0ubuntu6.11_amd64.deb ...
Unpacking libc6-dbg:amd64 (2.19-0ubuntu6.11) ...
Selecting previously unselected package libc6-dbgsym:amd64.
Preparing to unpack .../libc6-dbgsym_2.19-0ubuntu6.11_amd64.ddeb ...
Unpacking libc6-dbgsym:amd64 (2.19-0ubuntu6.11) ...
dpkg: error processing archive /var/cache/apt/archives/libc6-dbgsym_2.19-0ubuntu6.11_amd64.ddeb (--unpack):
 trying to overwrite '/usr/lib/debug/usr/lib/x86_64-linux-gnu/audit/sotruss-lib.so', which is also in package libc6-dbg:amd64 2.19-0ubuntu6.11
dpkg-deb: error: subprocess paste was killed by signal (Broken pipe)
Errors were encountered while processing:
 /var/cache/apt/archives/libc6-dbgsym_2.19-0ubuntu6.11_amd64.ddeb
E: Sub-process /usr/bin/dpkg returned an error code (1)

Gdb runs fine, but the source code is missing (on Redhat the source is part of the debuginfo packages; I don't have as much Ubuntu experience, so can you confirm in which package the source can be found?)

ubuntu@zonza-ap-southeast-1-bastion01:~$ gdb /usr/bin/sss_ssh_authorizedkeys
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
[snip]
Reading symbols from /usr/bin/sss_ssh_authorizedkeys...Reading symbols from /usr/lib/debug/.build-id/71/94b339876fa286ea083f67fd87bfcf1a33461e.debug...done.
done.
(gdb) b set_locale
Breakpoint 1 at 0x4030e0: file ../src/sss_client/ssh/sss_ssh_client.c, line 48.
(gdb) run minfrin
Starting program: /usr/bin/sss_ssh_authorizedkeys minfrin
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, set_locale () at ../src/sss_client/ssh/sss_ssh_client.c:48
48 {
(gdb) n
51 c = setlocale(LC_ALL, "");
(gdb)
52 if (c == NULL) {
(gdb)
53 return EIO;
(gdb)
52 if (c == NULL) {
(gdb)
69 }
(gdb)
main (argc=2, argv=0x7fffffffe428) at ../src/sss_client/ssh/sss_ssh_authorizedkeys.c:54
54 if (ret != EOK) {
(gdb)
53 ret = set_locale();
(gdb)
54 if (ret != EOK) {
(gdb)
55 DEBUG(SSSDBG_CRIT_FAILURE,
(gdb)
(Mon Mar 27 17:14:15:444385 2017) [/usr/bin/sss_ssh_authorizedkeys] [main] (0x0020): set_locale() failed (5): Input/output error
57 ERROR("Error setting the locale\n");

So, the failure is triggered by the following:

51 c = setlocale(LC_ALL, "");
(gdb) ...


Graham Leggett (minfrin-y) wrote :

Looking further into the manpage for setlocale(), it says the following:

"For glibc,
       first (regardless of category), the environment variable LC_ALL is inspected, next the environment variable with the same name as the category (LC_COLLATE, LC_CTYPE,
       LC_MESSAGES, LC_MONETARY, LC_NUMERIC, LC_TIME) and finally the environment variable LANG."

It seems that although the environment variable LANG is set, setlocale() still fails.
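
The lookup order quoted from the manpage can be sketched in shell to see which locale name a given category would resolve to, and whether that name appears in a list of installed locales. This is a minimal sketch under stated assumptions: the function names effective_locale and locale_in_list are hypothetical, an empty variable is treated the same as an unset one, and the name comparison (lowercasing, dropping hyphens) only approximates how glibc matches codeset spellings such as UTF-8 and utf8.

```shell
#!/bin/sh
# Resolve the locale name that would be chosen for one category when a
# program calls setlocale(category, ""): LC_ALL first, then the category's
# own variable, then LANG. Prints "C" when none of them is set.
effective_locale() {
    category=$1                          # e.g. LC_MESSAGES
    eval "category_value=\${$category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$category_value" ]; then
        printf '%s\n' "$category_value"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf 'C\n'
    fi
}

# Succeed if the locale name in $1 appears in the list read from stdin
# (one name per line, as printed by "locale -a"). Both sides are lowercased
# and stripped of hyphens first; this is an approximation, see above.
locale_in_list() {
    wanted=$(printf '%s\n' "$1" | tr 'A-Z' 'a-z' | tr -d '-')
    tr 'A-Z' 'a-z' | tr -d '-' | grep -Fqx -- "$wanted"
}

# Example:
#   locale -a | locale_in_list "$(effective_locale LC_MESSAGES)" \
#       || echo "resolved locale is not installed"
```

Running the example on a working and a broken machine would show whether the two resolve to the same name and whether that name is installed on both.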

Christian Ehrhardt  (paelzer) wrote : Re: [Bug 1675118] Re: Setting locale breaks sss_ssh_authorizedkeys: set_locale() failed (5): Input/output error

On Mon, Mar 27, 2017 at 7:33 PM, Graham Leggett <email address hidden> wrote:

> [...]
>

> Gdb runs fine, but the source code is missing (on Redhat the source is
> part of the debuginfo packages, don't have as much ubuntu experience,
> can you confirm what package the source can be found?)
>

You can use "pull-lp-source" from the package ubuntu-dev-tools to get the
source.

[...]

> So, the failure is triggered by the following:
>
> 51 c = setlocale(LC_ALL, "");
> (gdb)
> 52 if (c == NULL) {
> (gdb)
> 53 return EIO;
>

Ack, thanks for tracking that down with me.

> Which is in turn fixed by https://pagure.io/SSSD/sssd/issue/2785 (or
> something similar to that).
>

Now that we understand better what is going on I can agree.
Out of that issue the "fix" is this commit:
https://pagure.io/SSSD/sssd/c/43e06ff39584570817949dc5de118d2b7ca854c1?branch=master
The rest is adding testcases, debug messages and so on.

The actual fix is just to report a message and ignore the error if the
set_locale failed.
So not rocket science, but I agree that we have to wonder why it happens on
your box suddenly.

> According to the man page of setlocale(), NULL is returned when the
> locale cannot be found.
>

[...]

The env check was good. Is the output of just calling "locale" the same on
good & bad systems as well?

We could try to regen-the locale by calling:

$ sudo locale-gen en_US.UTF-8

Might that fix it for you?

Also since you auto-deployed the content should be the same, so we might
just as well check all of /usr/share/locale if the files match.

$ find /usr/share/locale/ | sort | md5sum

Is that the same on both systems? If not, what is different?
If the former matches, some of the file contents might still differ; you
could run the (lengthy) comparison:

$ find /usr/share/locale/ -type f -exec md5sum {} + | sort -k 2
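
To turn the checksum comparison into something that can be diffed between the good and the bad machine, each host can write a sorted manifest to a file. A minimal sketch, assuming md5sum and find are available; the function name locale_manifest and the manifest file names are hypothetical:

```shell
#!/bin/sh
# Print "checksum  ./relative/path" for every regular file under a
# directory, sorted by path, so two manifests can be compared with diff.
locale_manifest() {
    dir=$1
    ( cd "$dir" && find . -type f -exec md5sum {} + | sort -k 2 )
}

# On each machine (file names are placeholders):
#   locale_manifest /usr/share/locale > manifest.good   # or manifest.bad
# Then, with both files copied to one machine:
#   diff manifest.good manifest.bad
```

Using paths relative to the directory keeps the manifests comparable even if the tree lives in a different location on each host.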

> It is possible that the broken locale is a red herring, and the cause of
> the problem is something else. The /var/log/auth.log shows this on the
> broken machine:
>
> Mar 27 17:24:43 yyy-bastion01 sshd[1624]: pam_sss(sshd:account): Access
> denied for user minfrin: 6 (Permission denied)
> Mar 27 17:24:43 yyy-bastion01 sshd[1624]: fatal: Access denied for user
> minfrin by PAM account configuration [preauth]
>

It could of course be a red-herring, yet one that we can try to understand.
It could just as well be that due to that similar issue with the locale on
the auth it takes an exit path with a bad rc which is reported as
permission denied.
So the herring doesn't have to be red.

On your check with LANG set: could you try whether the following works
instead, and report back?

$ LC_ALL="en_US.UTF-8" /usr/bin/sss_ssh_authorizedkeys minfrin

Finally - after all the tests above to destroy our testcase as late as
possible - you can try this ppa.
It contains an sssd with the related fix that ignores the set_locale fail
and just goes on.
That should at least help to find if fixing the locale issue was indeed a
red herring or not.

=> https://launchpad.net/~ci-train-ppa-service/+archive/ubuntu/2655

Christian Ehrhardt  (paelzer) wrote :

FYI - There is an unrelated normal sssd update in the unapproved queue which, once accepted and through proposed, might clash with the version in the test ppa. So far nothing has changed, but if it does you can use apt preferences [1] if that is an issue.

Let me know if you need help.

[1]: https://wiki.debian.org/AptPreferences

Graham Leggett (minfrin-y) wrote :

> We could try to regen-the locale by calling:
>
> $ sudo locale-gen en_US.UTF-8
>
> Might that fix it for you?

Not seen a change:

ubuntu@bastion01:~$ sudo locale-gen en_US.UTF-8
Generating locales...
  en_US.UTF-8... done
Generation complete.
ubuntu@bastion01:~$ exit
logout
Connection to bastion.x closed.
Little-Net:httpd-2.4.x minfrin$ ssh -A bastion.x
Authentication failed.

Although this does seem different (it now works even without an explicit locale):

ubuntu@bastion01:~$ LC_ALL="en_US.UTF-8" /usr/bin/sss_ssh_authorizedkeys minfrin
ssh-rsa ...
ubuntu@bastion01:~$ /usr/bin/sss_ssh_authorizedkeys minfrin
ssh-rsa ...

Applying the sssd ppa update made no difference:

Little-Net:httpd-2.4.x minfrin$ ssh -A bastion.x
Authentication failed.

What does work now is that there is no longer a crash when the locale is missing:

ubuntu@bastion01:~$ unset LANG
ubuntu@bastion01:~$ /usr/bin/sss_ssh_authorizedkeys minfrin
ssh-rsa ....

Christian Ehrhardt  (paelzer) wrote :

So, to make sure I understand you correctly, the fix in the ppa does:
- fix the crash
- but does not get the auth to work

What gets the auth to work is the explicit locale, but it did that before as well.

Hrm, I'm afraid I'm not enough of an sssd expert to see more here. Did anything show up in the meantime to clarify why this one node fails and others don't?

I'm not sure: in your opinion, would it be worth pushing the fix that avoids the crash for you (and others), even though it does not fix the auth issue itself? It is a fix after all, just not the full scope we actually want for your case.

Launchpad Janitor (janitor) wrote :

[Expired for sssd (Ubuntu) because there has been no activity for 60 days.]

Changed in sssd (Ubuntu):
status: Incomplete → Expired
Kolyo Raychinov (edin.tam) wrote :

I had a similar issue, as described here: https://answers.launchpad.net/ubuntu/+source/sssd/+question/665654

In my case, the locale showed "LC_ALL = (unset)". After I ran $ update-locale LC_ALL="en_GB.UTF-8", I was able to ssh from the server on the next login.
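
On Debian and Ubuntu, update-locale stores such settings in /etc/default/locale. A minimal sketch for checking what a defaults file of that form sets for a given variable, assuming simple NAME=value lines with optional double quotes; the function name locale_default is hypothetical:

```shell
#!/bin/sh
# Print the value assigned to a variable in a NAME=value style defaults
# file, or nothing if the variable is absent. If the variable is assigned
# more than once, the last assignment is printed.
locale_default() {
    file=$1
    name=$2
    sed -n "s/^${name}=//p" "$file" | tail -n 1 | tr -d '"'
}

# Example:
#   locale_default /etc/default/locale LC_ALL
```

An empty result for both LC_ALL and LANG on the affected host, compared with a working one, would point at the system-wide defaults rather than at sssd itself.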
