openafs-modules-dkms 1.6.1-1+ubuntu0.2: module FTBFS on 3.8.0

Bug #1206387 reported by Luke Faraone on 2013-07-30
This bug affects 28 people
Affects             Importance   Assigned to
Precise Backports   Undecided    Unassigned
openafs (Ubuntu)    High         Unassigned
Precise             High         Unassigned

Bug Description

[Impact]
Since the backported Raring kernel 3.8 was released into Precise and is installed by default, OpenAFS cannot be installed on Precise. The out-of-tree OpenAFS kernel module needs to be upgraded to support kernel 3.8.

This happened before for the backported Quantal kernel 3.5 (bug 1015925), but this time so many patches are needed that it’s less risky to take a new upstream stable release, which is already well-tested, than to try to decide which patches to cherry-pick.

[Test Case]
apt-get install openafs-modules-dkms
(This should succeed on all supported Precise kernels, 3.2, 3.5, and 3.8.)

[Regression Potential]
OpenAFS 1.6.5 has been well-tested in the OpenAFS PPA https://launchpad.net/~openafs/+archive/stable, which has many users at MIT. The 1.6.x series is focused on important bug fixes and new kernel support (with main development happening on the master branch that will become 1.8.x, and Windows development happening on 1.7.x).

ProblemType: Package
DistroRelease: Ubuntu 12.04
Package: openafs-modules-dkms 1.6.1-1+ubuntu0.2
ProcVersionSignature: Ubuntu 3.5.0-37.58~precise1-generic 3.5.7.16
Uname: Linux 3.5.0-37-generic x86_64
NonfreeKernelModules: openafs
ApportVersion: 2.0.1-0ubuntu17.3
Architecture: amd64
DKMSKernelVersion: 3.8.0-27-generic
Date: Tue Jul 30 02:32:14 2013
InstallationMedia: Ubuntu 12.04.2 LTS "Precise Pangolin" - Release amd64 (20130214)
MarkForUpload: True
PackageArchitecture: all
PackageVersion: 1.6.1-1+ubuntu0.2
SourcePackage: openafs
Title: openafs-modules-dkms 1.6.1-1+ubuntu0.2: openafs kernel module failed to build
UpgradeStatus: No upgrade log present (probably fresh install)

Luke Faraone (lfaraone) wrote :
tags: removed: need-duplicate-check
Anders Kaseorg (andersk) wrote :

Hmm. We’re going to need to cherry-pick _a lot_ of patches to backport kernel 3.8 support to OpenAFS 1.6.1. I count at least 22 (on top of the ones we already took for kernel 3.5):

5842f85 afsd: include sys/resource.h in afsd_kernel.c
54db9af Linux: bypass: consolidate copy_page macros into a single function
76ab286 Linux 3.6: kmap_atomic API change
1bba976 Linux 3.6: dentry_open API change
b5a66fb Linux 3.6: d_alias and i_dentry are now hlists
aecd183 Linux: fix variable used to test for the iop create API
5210d97 Linux 3.6: create inode operation API change
4ab59d7 Linux 3.6: revalidate dentry op API change
6c22f2e Linux 3.6: lookup inode operation API change
0506af9 Linux: osi_vcache: Fix loop for the hlist case
5aae6e0 Linux 3.7: putname is no longer exported
cf33252 Linux: fix afs_putname wrapper for pre-3.7 kernels
ca94c83 Linux: Rework handling of names in the lookup functions
c21fded Linux: change test for new putname API
bbc6ee9 Linux 3.7: key instantiate API change
68c8f30 Linux 3.7: remove use of param.h and ioctl.h
cd91bba libafs: use kthread_run when available
5a21be4 LINUX: Indent osi_machdep.h maze
314fcfa Linux 3.8: session_keyring changes
bf9bcd0 Linux 3.8: vmtruncate removal
b0a1060 Linux: setpag() may replace credentials
a71cc55 Linux: osi_TryEvictVCache: Don’t skip the first dentry if D_ALIAS_IS_HLIST

But I’m not even a little bit confident that I’ve correctly identified all the right patches. Would it be totally crazy to sync OpenAFS 1.6.5 into precise instead?

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in openafs (Ubuntu):
status: New → Confirmed
André Rummler (andre-rummler) wrote :

I just installed the Raring HWE stack, which updated me to the 3.8 kernel. Now I am also affected by this incompatibility. Is it possible to get OpenAFS 1.6.5 into the standard repo? Or is there a recommended third-party repo for OpenAFS?

Anders Kaseorg (andersk) wrote :

https://launchpad.net/~openafs/+archive/stable has the current version for all supported Ubuntu releases.

But yeah, I think the only real solution is to get OpenAFS 1.6.5 into precise universe. This is a patch for backporting the current OpenAFS package to precise.

The attachment "openafs_1.6.5-1ubuntu0.12.04.1.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch

Open a bug task for precise.

Changed in openafs (Ubuntu Precise):
status: New → Confirmed
importance: Undecided → High
Changed in openafs (Ubuntu):
status: Confirmed → Invalid
Luke Faraone (lfaraone) wrote :

I endorse the backporting of 1.6.5 as an SRU. This version of AFS has been widely tested and deployed, and looking over the changelog there do not appear to be material changes that would cause compatibility issues.

No packages depend on OpenAFS libraries, and the semantics for interacting with AFS have not changed.

Anders Kaseorg (andersk) on 2013-10-01
description: updated
Anders Kaseorg (andersk) on 2013-10-08
tags: added: regression-update
Anders Kaseorg (andersk) wrote :

Updated backport for saucy’s 1.6.5-1ubuntu3, which adds one patch for kernel 3.11 support (bug 1222376), so we won’t have to go through this yet again for the 12.04.4 cycle. No changes to the backport diff itself, just the debian/changelog context.

Anders Kaseorg (andersk) wrote :
Luke Faraone (lfaraone) on 2013-11-11
Changed in openafs (Ubuntu Precise):
status: Confirmed → In Progress
assignee: nobody → Luke Faraone (lfaraone)
Luke Faraone (lfaraone) wrote :

I've uploaded 1.6.5.1-1ubuntu0.12.04.1, and it is currently sitting in the queue for precise-proposed.

Changed in openafs (Ubuntu Precise):
status: In Progress → Confirmed
assignee: Luke Faraone (lfaraone) → nobody
Luke Faraone (lfaraone) wrote :

I have validated that the package pending acceptance in proposed (1.6.5.1-1~ubuntu0.12.04.1) builds and functions as an OpenAFS client on all four supported kernels in Precise: 3.2.x, 3.5.x, 3.8.x, and 3.11.x.

This was on a system with Debathena packages that installed MIT's OpenAFS and Kerberos configs and related Athena helper utilities, but was otherwise pristine.

Testing done:
 1. sudo dkms build -m openafs -v 1.6.5.1 -k <version>
 2. sudo dkms install […]
 3. Reboot into relevant kernel
 4. kinit <email address hidden>
 5. aklog
 6. ls /mit/lfaraone (symlink to /afs/athena.mit.edu/user/l/f/lfaraone)
 7. Verify file list is as expected
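
The first step of the procedure above has to be repeated once per kernel. A minimal Python sketch of generating those commands follows. The function name and the dry-run printing are illustrative assumptions, not part of the original test. Only the `dkms build` flags quoted in step 1 are used, and the two kernel version strings are taken from the apport data in the bug description. Step 2 is left out because its arguments are elided in the comment.

```python
def dkms_build_commands(kernels, module="openafs", version="1.6.5.1"):
    """Return one `dkms build` argv list per kernel, mirroring step 1 above.

    The helper itself is hypothetical; the flags (-m, -v, -k) are the ones
    quoted in the test procedure.
    """
    return [
        ["sudo", "dkms", "build", "-m", module, "-v", version, "-k", kernel]
        for kernel in kernels
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in dkms_build_commands(["3.5.0-37-generic", "3.8.0-27-generic"]):
        print(" ".join(argv))
```

Each returned list could be handed to `subprocess.run(argv, check=True)` on a machine where dkms is installed; the sketch only prints, so it is safe to run anywhere.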

Steve Langasek (vorlon) wrote :

I don't think we can accept this backport as an enablement SRU. The openafs package doesn't just include a kernel module, it also includes all the related userspace components (as you know), including libraries and pam modules, for which you've included no test case; and there are extensive packaging changes in this version vs. the version in precise that are nearly impossible to review.

Since cherry-picking is a concern, I would suggest taking a wholesale backport of only the kernel module part of the upstream source, and validate that against an unchanged 1.6.1 userspace.

I am regrettably rejecting the uploaded package from the precise queue.

Jonathan Reed (jdreed) wrote :

Running a 1.6.5 module against a 1.6.1 userspace seems like a worse idea than taking 1.6.5 as an SRU. Would you accept the backport if we provided additional userland test cases?

I'm not interested in starting a debate about the HWE process, but from our perspective, the HWE stack broke a package in universe (albeit one with a small userbase compared to, say, Xorg). From our perspective, any working package is better than a broken package. Because of AFS' small userbase, this SRU should also have far less of an impact than one for, say, Xorg.

Russ Allbery (rra-debian) wrote :

Jonathan Reed <email address hidden> writes:

> Running a 1.6.5 module against a 1.6.1 userspace seems like a worse idea
> than taking 1.6.5 as an SRU.

It *should* be harmless, but I think it's also safe to say that it's not
something anyone's likely intentionally testing.

You could pull up only the necessary changes for Linux kernel portability,
I think there were quite a few, but that's probably still easier than the
solution that Steve recommended. Transplanting the whole kernel source
tree will also require substantial changes to the Autoconf macros to probe
for and set the new defines, at which point you're doing all the work that
you had to do in order to cherry-pick the required changes anyway, but
doing it in a fairly unstable way.

--
Russ Allbery (<email address hidden>) <http://www.eyrie.org/~eagle/>

Steve Langasek (vorlon) wrote :

> You could pull up only the necessary changes for Linux kernel portability,
> I think there were quite a few, but that's probably still easier than the
> solution that Steve recommended.

Indeed, that would be our first choice from an SRU POV. I was just offering
the "upgrade only the kernel bits" option as a compromise given the concerns
that cherry-picking the kernel fixes would be too difficult.

I don't think we would want to accept a wholesale update, even with added
userspace test cases. The previous SRU upload was based on a package *not*
intended for a stable release update; it includes many changes that are
clearly appropriate to make in a development release in preparation for the
next stable release, but just identifying an appropriate set of test cases
for all of the userspace changes (including the packaging changes) would be
far more time-consuming than just cherry-picking the necessary kernel
changes.

Russ Allbery (rra-debian) wrote :

Steve Langasek <email address hidden> writes:

> I don't think we would want to accept a wholesale update, even with
> added userspace test cases. The previous SRU upload was based on a
> package *not* intended for a stable release update; it includes many
> changes that are clearly appropriate to make in a development release in
> preparation for the next stable release, but just identifying an
> appropriate set of test cases for all of the userspace changes
> (including the packaging changes) would be far more time-consuming than
> just cherry-picking the necessary kernel changes.

I will say that this is the serious problem with accepting new kernels
into a stable release. If you accept a new kernel version but don't
accept new upstream releases of all the separately-packaged kernel
modules, you basically break all those packages for stable users. I had
that concern when Debian was talking about doing the same thing in stable
releases. You can require backporting of just the kernel compilation
fixes, but often that's quite a lot of work and it ends up just not
happening, so the packages just stay broken for users.

It doesn't affect me directly, of course, since I don't use Ubuntu, and
y'all should certainly feel free to decide on the strategy that works for
your community, but it might be an interesting data point that this was
one of my arguments against supporting Ubuntu internally in my group when
we had that discussion internally a couple of weeks ago.

--
Russ Allbery (<email address hidden>) <http://www.eyrie.org/~eagle/>

Steve Langasek (vorlon) wrote :

On Mon, Dec 02, 2013 at 10:31:26PM -0000, Russ Allbery wrote:
> I will say that this is the serious problem with accepting new kernels
> into a stable release. If you accept a new kernel version but don't
> accept new upstream releases of all the separately-packaged kernel
> modules, you basically break all those packages for stable users. I had
> that concern when Debian was talking about doing the same thing in stable
> releases. You can require backporting of just the kernel compilation
> fixes, but often that's quite a lot of work and it ends up just not
> happening, so the packages just stay broken for users.

The kernel hardware enablement policy takes this into consideration, and
does not push new upstream versions of the kernel to users who installed
from older media. Only if you use the new point release media, or
explicitly opt in to the new enablement stack, do you get the newer kernels.

  https://wiki.ubuntu.com/Kernel/LTSEnablementStack

Even without considering third-party modules, we certainly don't have the
resources to guarantee that a new upstream kernel version will cause no
regressions for our users - every new kernel upstream release is likely to
regress support for *some* subset of older hardware. The compromise, in
order to enable newer hardware on older LTS releases, is to continue to
support users installing from older media.

If you have newer hardware that requires the newer enablement stack, it's
better to have it available than not, even if that means some out-of-tree
modules are not available. If you don't need the new enablement stack, then
it's recommended that you use the LTS kernel.

And if there's sufficient demand for openafs support on top of the hwe
kernels, then we're happy to facilitate that, but it needs to be done in a
way that is equally low risk for users of the LTS kernels.

Jonathan Reed (jdreed) wrote :

On Dec 2, 2013, at 6:11 PM, Steve Langasek wrote:

> Only if you use the new point release media, or
> explicitly opt in to the new enablement stack, do you get the newer kernels.
>
> [...]
>
> If you have newer hardware that requires the newer enablement stack, it's
> better to have it available than not, even if that means some out-of-tree
> modules are not available. If you don't need the new enablement stack, then
> it's recommended that you use the LTS kernel.

Not to get too off topic (and I tried to bring this up in UDS chats when rolling releases/HWE became a thing), but that really only works in homogenous environments, which are not really a thing in academia or other large non-enterprise environments. In our case (and I suspect Russ speaks for a similar environment), we replace our workstations on a rolling 3-year cycle (meaning we get new hardware whenever Dell feels like changing the chipset). So some machines require HWE, others don't.

The other challenge is that, to the average end user, it is not at all clear that installing from point release media means getting an HWE stack, while installing from the original media and upgrading to the point release doesn't. We had a terrible time when the Quantal HWE came out: whether or not users had working OpenAFS was determined by whether they had installed from an ISO image or from a net install.
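
One way to make that distinction visible is to look at the running kernel series. The sketch below is a rough illustration, not an official Ubuntu tool: the helper names are made up, and the series-to-stack table is assembled from the kernel versions mentioned in this thread (3.2 for Precise, 3.5 for Quantal, 3.8 for Raring, 3.11 for Saucy). The 3.13/trusty pairing is not stated in the thread and is included as an assumption.

```python
import platform
import re

# Kernel series -> stack on Ubuntu 12.04, as discussed in this thread.
# The 3.13 entry is an assumption that the thread does not state directly.
HWE_STACKS = {
    "3.2": "original 12.04 kernel (no enablement stack)",
    "3.5": "quantal enablement stack",
    "3.8": "raring enablement stack",
    "3.11": "saucy enablement stack",
    "3.13": "trusty enablement stack",
}


def kernel_series(release):
    """Extract 'major.minor' from a string such as '3.8.0-27-generic'."""
    match = re.match(r"(\d+\.\d+)", release)
    return match.group(1) if match else None


def describe_stack(release):
    """Map a kernel release string to the stack it most likely belongs to."""
    return HWE_STACKS.get(kernel_series(release), "unknown kernel series")


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    print(running, "->", describe_stack(running))
```

A site could run something like this at install time to decide whether the stock openafs-modules-dkms package is expected to build.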

Steve Langasek (vorlon) wrote :

On Tue, Dec 03, 2013 at 12:04:52AM -0000, Jonathan Reed wrote:
> Not to get too off topic (and I tried to bring this up in UDS chats when
> rolling releases/HWE became a thing), but that really only works in
> homogenous environments, which are not really a thing in academia or
> other large non-enterprise environments. In our case (and I suspect
> Russ speaks for a similar environment), we replace our workstations on a
> rolling 3-year cycle (meaning we get new hardware whenever Dell feels
> like changing the chipset). So some machines require HWE, others don't.

I understand, but the point is that this is not something that would have
been addressed by *not* offering the HWE stacks. Not having HWE stacks
would have simply meant that the hardware in question wouldn't work with the
LTS at all.

Micheal Waltz (ecliptik) wrote :

Is it possible to get the OpenAFS 1.6.5 packages into precise-backports without replacing the original 1.6.1 packages? I thought the backports repository was built for situations like this, where newer versions can be installed on older LTS releases.

Changed in openafs (Ubuntu Precise):
status: Confirmed → In Progress
assignee: nobody → Rafael David Tinoco (inaddy)
Changed in openafs (Ubuntu):
assignee: nobody → Rafael David Tinoco (inaddy)

Rafael David Tinoco (inaddy) wrote :

After this discussion (and requests from other customers/users on the same bug), and knowing that the 1.6.5 backport was not eligible for an SRU (as Steve pointed out), I started to cherry-pick code from openafs 1.6.5 into openafs 1.6.1 so that the openafs dkms module could compile on HWE kernels.

In doing so, I could see that this approach would not result in an eligible SRU either. There are too many changes from kernel 3.2 to kernel 3.11, and openafs has a huge amount of conditional code that depends on these changes (as Anders rightly pointed out).

After fixing some wrongly auto-generated includes, I could see that there were 2 approaches for bringing 1.6.5 behavior to 1.6.1:

1) Remove the "STRUCT_TASK_STRUCT_HAS_CRED" define. Autotools correctly checks for the existence of a "credentials" structure inside task_struct (kernel). Since newer kernels (3.x) have this structure, the code defines STRUCT_TASK_STRUCT_HAS_CRED and accesses all credential fields directly through the kernel-defined structures (includes). Removing the define would make openafs behave as it did on older (2.6.x) kernels, and would mean fixing every use of the "current_task"->cred structure (changing upstream code in that specific version). Of course, even more changes would be expected after that.

-> not a good approach

2) Cherry-pick code from 1.6.5 to 1.6.1. There are 398 commits between these two versions and, taking into consideration only the dkms module, this could be a reasonable direction. The problem is that, since there is a huge number of changes in the kernel structures for process, sched, and security (between v3.2 and v3.11), the changes wouldn't be acceptable as an SRU.

Some of the needed changes would be: the new dentry_open prototype, the new kmap_atomic prototype, the removal of vmtruncate, the new location of session_keyring in the task_struct cred, the new proc_create function in the module, and so on...

-> also not a good approach

* see next comment

Rafael David Tinoco (inaddy) wrote :

IMHO, backporting OpenAFS from trusty to precise (-backports) would be the best option in this case, and would probably meet customers' and users' requirements and expectations (as Micheal pointed out).

With that in mind,

I have backported openafs from trusty (1.6.7-1) to precise (1.6.7-1ubuntu1) and made the packages available in a PPA that customers and users can access. Of course, just as in the universe repository (where openafs lives), support here is on a best-effort, community basis.

https://launchpad.net/~inaddy/+archive/lp1206387

I do need these packages to be tested, since I'm asking for them to be made available in the -backports repository as well. So far, I have been able to compile the openafs dkms module successfully on different kernels (3.5, 3.8, 3.11, 3.13) running on precise.

inaddy@12-04-precise-lts-amd64:/var/lib/dkms/openafs/1.6.7$ find . -name openafs.ko
./3.13.0-27-generic/x86_64/module/openafs.ko
./3.8.0-41-generic/x86_64/module/openafs.ko
./3.11.0-22-generic/x86_64/module/openafs.ko
./3.5.0-49-generic/x86_64/module/openafs.ko
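
The same check can be scripted so that a missing build is easy to spot. This is a minimal Python sketch, assuming the directory layout shown in the listing above (`<kernel>/<arch>/module/openafs.ko` under the per-version DKMS directory). The function names are hypothetical.

```python
from pathlib import Path


def kernels_with_module(dkms_root, module="openafs.ko"):
    """Return the sorted kernel directory names that contain a built module.

    Assumes the layout <dkms_root>/<kernel>/<arch>/module/<module>,
    as seen in the `find` output above.
    """
    root = Path(dkms_root)
    found = {
        path.relative_to(root).parts[0]
        for path in root.glob(f"*/*/module/{module}")
    }
    return sorted(found)


def missing_kernels(dkms_root, expected):
    """Return the expected kernels for which no module has been built."""
    built = set(kernels_with_module(dkms_root))
    return [kernel for kernel in expected if kernel not in built]


if __name__ == "__main__":
    root = "/var/lib/dkms/openafs/1.6.7"
    print("built for:", kernels_with_module(root))
    print("missing:", missing_kernels(root, ["3.8.0-41-generic", "3.13.0-27-generic"]))
```

On a machine without that directory both lists simply come back empty or fully "missing", so the sketch does not fail when DKMS is absent.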

Hope this helps... I am attaching the source in the next comments.
Please provide feedback on test results.

Micheal Waltz (ecliptik) wrote :

Thank you very much Rafael. I'll test the packages from the PPA today on our HWE installs and provide any feedback we may have.

Anders Kaseorg (andersk) wrote :

The version number is too high. Packages in -backports need to have a smaller version number than the releases they were backported from; otherwise release upgrades will not work as expected. Usually this is accomplished with a decreasing ‘~’, as in 1.6.7-1~precise1 or 1.6.7-1~ubuntu8.04.1.
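
To see why the ‘~’ matters, here is a simplified Python sketch of Debian-style version ordering. It is an illustration written for this thread, not dpkg's own code, and it assumes well-formed version strings. On a real system, `dpkg --compare-versions` is the authoritative check.

```python
import re
from itertools import zip_longest


def _order(char):
    """Sort weight of one non-digit character: '~' sorts before everything,
    end-of-string (None) next, then letters, then other characters."""
    if char is None:
        return 0
    if char == "~":
        return -1
    return ord(char) if char.isalpha() else ord(char) + 256


def _compare_part(a, b):
    """Compare two upstream or revision strings piece by piece."""
    pairs_a = re.findall(r"(\D*)(\d*)", a)
    pairs_b = re.findall(r"(\D*)(\d*)", b)
    for (text_a, num_a), (text_b, num_b) in zip_longest(
        pairs_a, pairs_b, fillvalue=("", "")
    ):
        for ca, cb in zip_longest(text_a, text_b):
            if _order(ca) != _order(cb):
                return -1 if _order(ca) < _order(cb) else 1
        na, nb = int(num_a or 0), int(num_b or 0)
        if na != nb:
            return -1 if na < nb else 1
    return 0


def _split(version):
    """Split '[epoch:]upstream[-revision]' into its three components."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Return -1, 0 or 1 if version a sorts before, equal to, or after b."""
    ea, ua, ra = _split(a)
    eb, ub, rb = _split(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_part(ua, ub) or _compare_part(ra, rb)


if __name__ == "__main__":
    for candidate in ("1.6.7-1ubuntu1", "1.6.7-1~precise1"):
        print(candidate, "vs 1.6.7-1:", compare_versions(candidate, "1.6.7-1"))
```

With this ordering, 1.6.7-1ubuntu1 sorts after trusty's 1.6.7-1, while 1.6.7-1~precise1 sorts before it, which is what lets a later release upgrade replace the backport.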

Anders Kaseorg (andersk) wrote :

(Er, I meant ~ubuntu10.04.1, of course.)

Changed in openafs (Ubuntu):
assignee: Rafael David Tinoco (inaddy) → nobody

Rafael David Tinoco (inaddy) wrote :

Of course, Anders. You're absolutely right; my fault.

I've updated the package version and fixed the PPA repository.
I'm also attaching the new package source.

Thanks!

PS: Functional tests are still needed.

openafs_1.6.7-1~precise1.debian.tar.xz

openafs_1.6.7-1~precise1_source.changes

Stephen Corbin (bigredsshop) wrote :

This system has been removed and the files all backed up for future use on a new install. Thanks for the support, Rafael.

Stephen Corbin
<email address hidden>

-----Original Message-----
From: Rafael David Tinoco <email address hidden>
To: bigredsshop <email address hidden>
Sent: Tue, May 27, 2014 8:11 am
Subject: [Bug 1206387] Re: openafs-modules-dkms 1.6.1-1+ubuntu0.2: module FTBFS on 3.8.0


Micheal Waltz (ecliptik) wrote :

I installed the openafs 1.6.7-1~precise1 packages from the provided PPA, and they built their modules via DKMS against the HWE linux-generic-lts-saucy kernel on a 12.04 system.

Currently we can use these packages from a separate repo that we add to systems at install time before running apt-get update. However, having openafs 1.6.7 or 1.6.5 in the backports repository for precise would save some extra steps and keep us up to date with upstream package updates.

Rafael David Tinoco (inaddy) wrote :

Micheal,

I have filed the -backports request, as the guidelines require, in a public bug:

https://bugs.launchpad.net/precise-backports/+bug/1324288

Awaiting a response from ubuntu-backporters.

Sorry, the correct LP number is LP #1324288.

Changed in openafs (Ubuntu Precise):
status: In Progress → Fix Released
Anders Kaseorg (andersk) wrote :

This is fixed in precise-backports, not precise. Can you please unmark this as fixed in precise, if nothing else so that affected users can find this report?

Changed in precise-backports:
status: New → Fix Released
Changed in openafs (Ubuntu Precise):
assignee: Rafael David Tinoco (inaddy) → nobody
Anders Kaseorg (andersk) wrote :

Since I don’t have the ability to unmark this as fixed in Precise (which it is not), as opposed to Precise Backports (which it is), I’m reopening it in Ubuntu. Users are still running into this problem because most users do not have backports enabled. For example, bug 1325352 was just filed today.

If someone with the appropriate permissions can set this back to Confirmed in Precise, that would be appreciated.

Changed in openafs (Ubuntu):
status: Invalid → Confirmed
Felix Geyer (debfx) wrote :

> If someone with the appropriate permissions can set this back to Confirmed in Precise, that would be appreciated.

Done.

Changed in openafs (Ubuntu Precise):
status: Fix Released → Confirmed
Anders Kaseorg (andersk) wrote :

Thanks, Felix.

Changed in openafs (Ubuntu):
status: Confirmed → Fix Released