winbind does not refresh kerberos tickets

Bug #1037055 reported by Ian Gordon on 2012-08-15
This bug affects 3 people
Affects            Status         Importance   Assigned to   Milestone
samba              Fix Released   Medium
samba (Ubuntu)     Fix Released   Medium       Unassigned
  Precise          Fix Released   Low          Unassigned

Bug Description

[Impact]
* If it happens on the client, the client can't authenticate to any kerberised servers (Windows or Linux).
* If it happens on the server, all clients (Windows or Linux) are unable to connect to that server any more.
* The main impact is very flaky network authentication on an LTS release that we will have to live with for a few more years.

[Workaround]
On the desktop run kinit to create a new ticket cache, or on a server restart the winbind daemon after logging in with a local account. This usually needs to be done once or twice a week on my desktop, but less frequently on servers.

[Test Case]
Requires an AD (or Samba 4?) domain with winbind configured to use it.
Set winbind refresh tickets = true in smb.conf.
Set cached_login for pam_winbind.
Log onto a domain member using a domain account.
Winbind will create a standard Kerberos credential cache containing a TGT (Ticket Granting Ticket - eg something like krbtgt/REALM@REALM).
The klist command will verify the existence of the cache and the TGT in it.
At some point before the renewal lifetime is up, the credential cache will disappear preventing Kerberos apps from working. It is often at about 25-50% of the renewal lifetime, but not always.
The klist command will now report that it can't find the ccache.
With the bugfix, the ccache never disappears and Winbind will successfully renew the TGT.
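
The point at which the cache disappears can be watched for instead of discovered by accident. What follows is a minimal sketch in Python, assuming an MIT Kerberos klist whose -s option exits with status 0 only when the default credential cache is readable and unexpired. The names ccache_is_valid and wait_for_cache_loss and the injectable run parameter are hypothetical, added so the check can be exercised without a real cache.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence


def _run_klist(argv: Sequence[str]) -> int:
    """Run the command and return its exit status."""
    return subprocess.run(list(argv)).returncode


def ccache_is_valid(run: Callable[[Sequence[str]], int] = _run_klist) -> bool:
    """True if `klist -s` reports a readable, unexpired credential cache."""
    return run(["klist", "-s"]) == 0


def wait_for_cache_loss(
    checks: int,
    interval_s: float = 0.0,
    run: Callable[[Sequence[str]], int] = _run_klist,
) -> Optional[int]:
    """Poll up to `checks` times.

    Returns the index of the first failed check, or None if the cache
    stayed valid for every check.
    """
    for i in range(checks):
        if not ccache_is_valid(run):
            return i
        time.sleep(interval_s)
    return None
```

On a real domain member one would call something like wait_for_cache_loss(checks=2000, interval_s=300) in the logged-in session and note the time at which it returns, which gives a rough figure for how far into the renewal lifetime the cache was lost.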

[Original Description]

winbindd will renew kerberos tickets until their renewal limit is reached, but it seems unable to obtain a fresh ticket once that limit is hit.

I have the following in smb.conf:

winbind refresh tickets = true

and have cached_login set for pam_winbind.

After 7 days (the renewal limit on AD kerberos tickets) the ticket expires and I lose access to my NFS home directory, which uses sec=krb5.

I have tried to debug why this is happening and have come to the conclusion that there are two important variables for ticket refreshing to work (both in winbind/winbindd_cred_cache.c):

ccache_list
memory_creds_list

and that the function that stores the password for later refreshing use is called

winbindd_add_memory_creds

This function though requires that the user is in ccache_list before it stores the password in a way it can be used by the rekinit part of the function krb5_ticket_refresh_handler.

The problem as I see it is that winbind forks and the parent populates ccache_list and the child populates memory_creds_list.
This leads to the password not being stored in a way that can be used by the rekinit code in krb5_ticket_refresh_handler.
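
The effect described here is ordinary fork behaviour: each process gets its own copy of the lists, and later changes in one are invisible to the other. The sketch below is only an illustration of that hypothesis in Python, not Samba code. The two local lists are stand-ins named after the winbindd variables, and fork_demo is a hypothetical helper. It needs a Unix-like system, since it uses os.fork.

```python
import os


def fork_demo(user: str = "alice") -> tuple:
    """Fork once; the parent fills ccache_list, the child fills
    memory_creds_list.

    Returns (child_saw_ccache_entry, parent_saw_memory_creds).
    """
    ccache_list = []        # stand-in for winbindd's ccache_list
    memory_creds_list = []  # stand-in for winbindd's memory_creds_list

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: works on its own copy of both lists from the fork onwards.
        os.close(read_fd)
        memory_creds_list.append(user)
        saw = user in ccache_list  # the parent's append never shows up here
        os.write(write_fd, b"1" if saw else b"0")
        os._exit(0)

    # Parent: records the ccache entry but never sees the child's append.
    os.close(write_fd)
    ccache_list.append(user)
    child_saw = os.read(read_fd, 1) == b"1"
    os.close(read_fd)
    os.waitpid(pid, 0)
    return child_saw, user in memory_creds_list
```

Neither process ends up with the user in both lists, which matches the situation where the stored password cannot be found by the code that needs it for the rekinit.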

As a dirty hack (attached) I tried populating memory_creds_list from the same location where ccache_list gets populated (winbindd_raw_kerberos_login in winbind/winbindd_pam.c).

This hack "fixes" the problem.

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: winbind 2:3.6.3-2ubuntu2.3
ProcVersionSignature: Ubuntu 3.2.0-27.43-generic 3.2.21
Uname: Linux 3.2.0-27-generic x86_64
ApportVersion: 2.0.1-0ubuntu12
Architecture: amd64
Date: Wed Aug 15 11:30:27 2012
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Release amd64 (20120425)
ProcEnviron:
 LANGUAGE=en_GB:en
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_GB.UTF-8
 SHELL=/bin/bash
SambaClientRegression: No
SourcePackage: samba
UpgradeStatus: No upgrade log present (probably fresh install)
mtime.conffile..etc.default.winbind: 2012-07-06T14:00:57
mtime.conffile..etc.init.d.winbind: 2012-07-06T14:00:57

Ian Gordon (ian-gordon) on 2012-08-15
description: updated
Robie Basak (racb) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. I appreciate the quality of this bug report and I'm sure it'll be helpful to others experiencing the same issue.

This sounds like an upstream bug to me. Please can you verify this by building directly from the latest upstream source? If this can be confirmed as an upstream bug, the best route to getting it fixed in Ubuntu in this case would be to file an upstream bug if you're able to do that. Otherwise, I'm not sure what we can do directly in Ubuntu to fix the problem.

If you do end up filing an upstream bug, please link to it from here. Thanks!

Changed in samba (Ubuntu):
importance: Undecided → Critical
importance: Critical → Medium

The attachment "diry hack to "fix" issue" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch, you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are a member of the ubuntu-reviewers team, please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Ian Gordon (ian-gordon) wrote :

I compiled 3.6.7 from the original source and it does not seem to refresh kerberos tickets either.

I have reported this upstream.

See https://bugzilla.samba.org/show_bug.cgi?id=9098

Alexander Lazarević (e11bits) wrote :

Thanks for reporting this bug and any supporting documentation. Since this bug has enough information provided for a developer to begin work, I'm going to mark it as confirmed and let them handle it from here. Thanks for taking the time to make Ubuntu better!

Changed in samba (Ubuntu):
status: New → Confirmed
Changed in samba:
importance: Unknown → Medium
status: Unknown → Fix Released
Philippe Clérié (pclerie) wrote :

It looks like I am being hit by this. I have Windows users being periodically unable to access shares on a Samba server. I believe Samba put in a patch in 3.6.7 or thereabouts. Could we please update?

Thanks

Robie Basak (racb) wrote :

This was fixed upstream in 3.6.8 with commit 02c4886863e9a4066b89f2dcb8ff853bfbda7e86. Raring is on 2:3.6.9-1ubuntu1 so already contains the fix.

It looks like it'll be trivial to backport a fix to 12.04 if anybody needs this. But for this to happen, we need a well-defined test case and other information (see https://wiki.ubuntu.com/StableReleaseUpdates#SRU_Bug_Template), along with a commitment for somebody to test the fix when it lands in precise-proposed. Without this commitment, the update won't land in -updates.

Can somebody commit to testing a proposed update and write the impact statement and test case, please?

styro (anton-list) wrote :

I'm also hit by what seems to be the same bug on 12.04. This happens on both desktops and servers using winbind (pam_winbind) to manage kerberos keytabs and ticket caches.

We are authenticating against an Active Directory domain controller (2008R2).
We use the winbind/kerberos combo for:
* logging into Ubuntu desktops,
* transparent SSH access (via GSSAPI) to other Ubuntu/Debian machines,
* single sign on for webapps running on both Linux and Windows servers,
* and authenticating access to file shares (both Samba and Windows)

We often find our kerberos credential caches disappearing. This stops kerberos authentication working for eg SSH, HTTP(S), CIFS etc. Things work very well otherwise.

Impact:
* If it happens on the client, the client can't authenticate to any kerberised servers (Windows or Linux).
* If it happens on the server, all clients (Windows or Linux) are unable to connect to that server any more.
* The main impact is very flaky network authentication on an LTS release that we will have to live with for a few more years.

Workaround:
On the desktop run kinit to create a new ticket cache, or on a server restart the winbind daemon after logging in with a local account. This usually needs to be done once or twice a week on my desktop, but less frequently on servers.

Test case:
I don't have a good understanding of how to reliably reproduce it apart from waiting several days for it to stop authenticating. But the earlier posters above seem to have a better handle on that part.

I will commit to testing any proposed updates.

Robie Basak (racb) wrote :

I've uploaded a test fix to my experimental PPA (https://launchpad.net/~racb/+archive/experimental). Since I can't verify the fix myself, please can you test the package available from here before I request an archive upload? Once this is checked, I will request the fix be uploaded officially. The final proposed package will then need to be verified again before it can enter precise-updates.

styro (anton-list) wrote :

Thanks Robie, I've installed your PPA for testing.

As soon as I know if it has fixed anything (or if it hasn't), I'll report back.

styro (anton-list) wrote :

I've done some testing on machines with and without the new packages.

Conclusion: I think things have improved with the new packages.

More details:

It is hard to tell for sure as there are various things (eg using sudo, or unlocking the desktop etc) other than winbind that will refresh the Ticket Granting Ticket (TGT) and update/recreate the credentials cache. This can mask the original problem.

I managed to shorten the Active Directory ticket lifetimes (1 hour) and renewal periods (1 day) to the minimum to speed up testing. But after this I noticed that tickets were no longer being renewed at all, and expired tickets would stay in the credentials cache breaking authentication. This was worse than the original problem.

On a machine without the updates installed, the original problem was still happening even with the shorter ticket lifetimes, ie the credentials cache and Ticket Granting Ticket disappearing before the TGT reached its renewal time limit. This problem never happened with the updated packages though.

Suspecting that the expired ticket problem was caused by the extremely short ticket lifetimes, I extended Active Directory ticket settings to 5hr expiry and 2 day renewal periods. This has slowed down testing a bit, but seems to have made that new expired ticket problem go away, ie tickets are now renewing properly again, and I haven't noticed the cache disappearing before the TGT's renewal period was up.

So - things do seem improved with the new packages (provided stupidly short ticket lifetimes aren't in use). The problem I encountered with very short lifetimes is unrelated to this bug report.

But without a reliable way to reproduce the original problem, I still can't be 100% certain that absence of evidence (not seeing the bug so far) equates to evidence of absence (the bug has been fixed).

styro (anton-list) wrote :

After further testing, I'm certain the updated packages have fixed the bug.

Leaving two machines running logged in and idle over the weekend, the unpatched machine lost its credential cache (again) while the patched one successfully renewed its TGT all weekend. And it also successfully got a new one after the renewal limit was reached.

Thanks. It would be great if these updates could make their way into precise and quantal. I gather raring already has them from upstream.

styro (anton-list) wrote :

Just checking in...

These PPA updates have been solid for me still.

Is there any more testing or anything that needs doing to progress this further?

Robie Basak (racb) wrote :

@styro

Thanks for testing my package and sorry I haven't taken this further yet. I need to prepare an SRU. But this week is UDS, Linaro Connect and Raring Feature Freeze so I'm a bit tied up. I appreciate the reminder and please do poke me again if I haven't done anything by next week.

styro (anton-list) wrote :

Just a gentle prod...

:)

Q: Will updates be published for both precise and quantal? And will I need to further test both?

Robie Basak (racb) on 2013-03-20
description: updated
Robie Basak (racb) wrote :

@styro

I've prepared updates for both precise and quantal. Now awaiting a sponsor.

In the meantime, please could you fix up the test case? It needs to contain steps to reproduce the problem such that others are able to perform the same steps that you are.

Someone will need to further verify both fixes once they have been accepted into -proposed in order for them to get to -updates.

styro (anton-list) on 2013-03-20
description: updated
Changed in samba (Ubuntu):
status: Confirmed → Fix Released
Changed in samba (Ubuntu Precise):
importance: Undecided → Low
status: New → Triaged
Sebastien Bacher (seb128) wrote :

Thanks, I've verified that the fix is in raring and I'm sponsoring the fix to precise, subscribing ubuntu-sru.

I'm also going to skip quantal unless somebody makes a strong case for getting the bug fixed there. That's a non-LTS version and the SRU and QA teams are already stretched, so we are trying to create extra load on those series only when really required.

Changed in samba (Ubuntu Precise):
status: Triaged → In Progress
styro (anton-list) wrote :

Although inconvenient, personally I'm ok with quantal being skipped.

Thanks.

Hello Ian, or anyone else affected,

Accepted samba into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/samba/2:3.6.3-2ubuntu2.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in samba (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
styro (anton-list) wrote :

Thanks Brian, I've installed the winbind, libpam-winbind, libwbclient0, samba-common, smbclient packages (3.6.3-2ubuntu2.5) from proposed.

I'll keep you posted. It might take a week or two before I'm confident they are working correctly.

styro (anton-list) wrote :

Just an update...

3.6.3-2ubuntu2.5 is still working fine for me, and has not had any of the problems listed above reappear.

tags: added: verification-done
removed: verification-needed

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package samba - 2:3.6.3-2ubuntu2.5

---------------
samba (2:3.6.3-2ubuntu2.5) precise; urgency=low

  * d/patches/winbind-kerberos-refresh.patch: correctly cache credentials for
    automatic Kerberos ticket renewal (LP: #1037055).
 -- Robie Basak <email address hidden> Wed, 20 Mar 2013 07:48:57 +0000
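
For anyone checking whether a precise machine already carries this fix, the installed version can be compared against the one in the changelog entry. This is a minimal sketch in Python, assuming a Debian or Ubuntu system where dpkg-query -W and dpkg --compare-versions are available. The has_fix helper, its run parameter and the FIXED_VERSION constant are hypothetical names for illustration.

```python
import subprocess
from typing import Sequence

FIXED_VERSION = "2:3.6.3-2ubuntu2.5"  # version named in the changelog entry


def _run(argv: Sequence[str]) -> subprocess.CompletedProcess:
    """Run a command, capturing its output as text."""
    return subprocess.run(list(argv), capture_output=True, text=True)


def has_fix(package: str = "winbind", run=_run) -> bool:
    """True if the installed version of `package` is >= FIXED_VERSION."""
    query = run(["dpkg-query", "-W", "-f=${Version}", package])
    installed = (query.stdout or "").strip()
    if query.returncode != 0 or not installed:
        return False  # package not installed or not known to dpkg
    # dpkg --compare-versions exits with status 0 when the relation holds.
    cmp = run(["dpkg", "--compare-versions", installed, "ge", FIXED_VERSION])
    return cmp.returncode == 0
```

The comparison is delegated to dpkg because Debian version strings (epoch, upstream version, revision) do not sort correctly as plain strings.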

Changed in samba (Ubuntu Precise):
status: Fix Committed → Fix Released