update from natty to oneiric hangs on libpam0g upgrade and cups reload

Bug #854490 reported by Jean-Baptiste Lallement
38
This bug affects 8 people
Affects Status Importance Assigned to Milestone
apt (Ubuntu)
Invalid
Medium
Unassigned
Oneiric
Won't Fix
Medium
Michael Vogt
Precise
Invalid
Medium
Unassigned
cups (Ubuntu)
Fix Released
High
Martin Pitt
Oneiric
Fix Released
High
Martin Pitt
Precise
Invalid
Undecided
Unassigned

Bug Description

TEST CASE:
1. Install Natty and apply updates
2. run update-manager -d and proceed with the upgrade to 11.10

Result:
The upgrade hangs at libpam0g upgrade when cups is reloaded. The last message displayed on console and terminal log is:
Restarting services possibly affected by the upgrade:
  gdm: reloading...done.
  cups: stopping...starting...

The following message is displayed and repeated in syslog
Sep 20 09:19:36 ubuntu-VirtualBox kernel: [ 1893.528204] type=1400 audit(1316503176.331:210): apparmor="STATUS" operation="profile_replace" name="/usr/lib/cups/backend/cups-pdf" pid=18386 comm="apparmor_parser"
Sep 20 09:19:36 ubuntu-VirtualBox kernel: [ 1893.528494] type=1400 audit(1316503176.331:211): apparmor="STATUS" operation="profile_replace" name="/usr/sbin/cupsd" pid=18386 comm="apparmor_parser"
Sep 20 09:19:36 ubuntu-VirtualBox init: cups main process (18387) terminated with status 127
Sep 20 09:19:36 ubuntu-VirtualBox init: cups main process ended, respawning

The following cups processes are running:
root 16275 0.0 0.0 2028 512 pts/2 S+ 09:11 0:00 /bin/sh -e /etc/init.d/cups start
root 16279 0.0 0.1 4944 1292 pts/2 S+ 09:11 0:00 start cups

Killing process 16275 allows to continue the upgrade.

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: update-manager 1:0.150.3
ProcVersionSignature: Ubuntu 2.6.38-11.48-generic 2.6.38.8
Uname: Linux 2.6.38-11-generic i686
Architecture: i386
Date: Tue Sep 20 09:13:29 2011
GConfNonDefault:
 /apps/update-manager/check_new_release_ignore=
 /apps/update-manager/first_run=false
 /apps/update-manager/window_size=(600,600)
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Release i386 (20110427.1)
PackageArchitecture: all
ProcEnviron:
 LANGUAGE=en_US:en
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: update-manager
UpgradeStatus: Upgraded to natty on 2011-09-20 (0 days ago)

Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :
Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :
Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :
Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :
Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :
Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :
affects: update-manager (Ubuntu Oneiric) → cups (Ubuntu Oneiric)
Changed in cups (Ubuntu Oneiric):
importance: Undecided → Critical
assignee: nobody → Canonical Desktop Team (canonical-desktop-team)
tags: added: iso-testing
Revision history for this message
Till Kamppeter (till-kamppeter) wrote :

pitti, can you look into it? Looks like an upstart issue.

Revision history for this message
Michael Vogt (mvo) wrote :

I see this in my upgrade test as well, syslog tells me that cups exits with status 127 and is started again by init. This goes on forever. I re-run this now with more debug options.

Changed in cups (Ubuntu Oneiric):
status: New → Confirmed
Revision history for this message
Till Kamppeter (till-kamppeter) wrote :

Can you set CUPS to extended debug mode by editing /etc/cups/cupsd.conf, making sure it contains a line "LogLevel debug2" and no other LogLevel line, then repeat the test and after that post the /var/log/cups/error_log file? Thanks.

Changed in cups (Ubuntu Oneiric):
status: Confirmed → Incomplete
Revision history for this message
Michael Vogt (mvo) wrote :

Ok, turns out the problem is that when init tries to launch cupsd there is a missing symbol: "_cups_strncasecmp".

Revision history for this message
Michael Vogt (mvo) wrote :

# /usr/sbin/cupsd
/usr/sbin/cupsd: symbol lookup error: /usr/lib/i386-linux-gnu/libcupsmime.so.1: undefined symbol: _cups_strncasecmp

This happens very early during the upgrade in my i386 test VM.

Changed in cups (Ubuntu Oneiric):
status: Incomplete → Confirmed
Revision history for this message
Michael Vogt (mvo) wrote :

<tkamppeter> mvo, this look like the CUPS library and the CUPS daemon being updated in the wrong order, the daemon being restarted with a non-matching CUPS library being present.
<mvo> tkamppeter: right, at this point libcupsmime1 is unpacked but not configured yet (according to the logs)

Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :

cups error log when upgrade is frozen.

Revision history for this message
Martin Pitt (pitti) wrote :

I think the error is in libpam0g.postinst. It tries to restart services while cups (and its libraries) are unconfigured and not running, which is prone to fail, not only for cups, also for all other services.

It looks like libpam0g.postinst makes no effort to only restart services which are actually running, it just tries to restart them all. The only special case is "gdm" where it checks for "idl status | grep -q stop/waiting". IMHO it should do this for all services.

Steve, can you please have a look at this? Does this make sense?

affects: cups (Ubuntu Oneiric) → pam (Ubuntu Oneiric)
Changed in pam (Ubuntu Oneiric):
assignee: Canonical Desktop Team (canonical-desktop-team) → Steve Langasek (vorlon)
Revision history for this message
Martin Pitt (pitti) wrote :

Steve, while you are at it, can you please replace "postgresql-8.2 postgresql-8.3" with "postgresql"?

tags: added: rls-mgr-o-tracking
Revision history for this message
Martin Pitt (pitti) wrote :

I'll have a look why cups hangs indefinitely here. It can't succeed, as the package and it's dependencies are not configured, but it should not hang forever.

Changed in cups (Ubuntu Oneiric):
assignee: nobody → Martin Pitt (pitti)
importance: Undecided → High
status: New → In Progress
Revision history for this message
Steve Langasek (vorlon) wrote :
Download full text (3.1 KiB)

This is a cups-specific failure, and indicates a real bug in the cups upstart job. While we could treat cups differently in the pam postinst, this would be a workaround as pam *already* checks for whether the service is installed+configured before trying to restart it. So dpkg reports cups is installed, which is why pam tries to restart it, and the fact that it doesn't restart is a cups bug.

> <mvo> tkamppeter: right, at this point libcupsmime1 is unpacked but not configured yet (according to the logs)

However, *cups* was not deconfigured when the new libcupsmime1 was unpacked. If the new libcupsmime1 doesn't work with the old cups, there's a bug in the package dependencies here. It makes no difference if libcupsmime1 is configured vs. unpacked, because configuring libcupsmime1 would not change the missing symbols.

Here's the ldd output from natty cupsd binary on an oneiric system, showing the issue:

$ ldd -d -r cups/usr/sbin/cupsd > /dev/null
undefined symbol: _pwgGetPageSize (cups/usr/sbin/cupsd)
undefined symbol: _pwgDestroy (cups/usr/sbin/cupsd)
undefined symbol: _pwgGetOutputBin (cups/usr/sbin/cupsd)
undefined symbol: _pwgGetType (cups/usr/sbin/cupsd)
undefined symbol: _pwgGetSize (cups/usr/sbin/cupsd)
undefined symbol: _pwgGetInputSlot (cups/usr/sbin/cupsd)
undefined symbol: _pwgGetSource (cups/usr/sbin/cupsd)
undefined symbol: _pwgCreateWithPPD (cups/usr/sbin/cupsd)
undefined symbol: _pwgCreateWithFile (cups/usr/sbin/cupsd)
undefined symbol: _pwgWriteFile (cups/usr/sbin/cupsd)
undefined symbol: _pwgGetMediaType (cups/usr/sbin/cupsd)
$

It looks like these symbols came from libcups2, not from libcupsmime. When these symbols were dropped, libcups *should* have had an soname change upstream. Since it didn't, the next best fix would have been to change the package name. But it's probably too late in the cycle for this now as well, which means that the new libcups2 needs to add a Breaks: against the old cups - and against any other packages which could have been using these symbols (which I assume were considered "private" symbols for cups internal use only).

The other part of this bug is that the cups upstart job is written in such a way that it evades the upstart respawn limit. The default respawn limit in upstart marks a job as failed if it respawns 10 times in 5 seconds; but because the cups job is failing *after* the main process is considered "running" (due to the use of a foregrounded job instead of 'expect daemon' or 'expect fork', which prevents upstart from catching the symbol error when it happens), the post-start script has already been spawned, and the post-start script takes 10 seconds on its own to time out. So the respawns are slow - one every 10 seconds - and upstart never gives up!

(An argument can be made that upstart should have some way of detecting when a job consistently dies before the post-start runs and handle this case the same as hitting the respawn limit if it happens COUNT times in a row.)

Between the foregrounded service process and the polling in the post-start script, this job seems to actually do everything in its power to be slow and fragile. ;) I think we would have better results with an...

Read more...

Revision history for this message
Steve Langasek (vorlon) wrote :

Attached is a cups upstart job that avoids the need for polling in the post-start script by using 'expect daemon', which is how this *should* work. However, there seem to be a pair of bugs preventing this from working as designed. First, upstart does not wait until cups is daemonized before considering the main process started; this may have to do with the fact that cups re-execs itself with different arguments (strange?). Second, even though cupsd immediately dies with a symbol error, upstart marks the job as "running" with no PID.

We still need to fix the libcups2/cups package relationships regardless, and we should probably just leave the cups upstart job alone for now. I'll raise bugs on upstart for the discovered issues.

Revision history for this message
Steve Langasek (vorlon) wrote :
Changed in pam (Ubuntu Oneiric):
status: Confirmed → Invalid
Revision history for this message
Steve Langasek (vorlon) wrote :

Ah, no, libcups2 already Breaks: old cups, so that's not the issue.

The issue is that the new libcupsmime has been unpacked and it has a dependency on the new libcups2 which has not yet been unpacked. But apt doesn't deconfigure reverse-dependencies of libraries when unpacking the new library, even though that library is not in a state that it could be configured because its dependencies have not been unpacked yet. :/

Could apt do something here to improve the ordering? Otherwise, a workaround in pam is the only option, since we still have the old version of cups unpacked, so an upstart job fix is insufficient.

affects: pam (Ubuntu Oneiric) → apt (Ubuntu Oneiric)
Changed in apt (Ubuntu Oneiric):
assignee: Steve Langasek (vorlon) → Michael Vogt (mvo)
status: Invalid → New
Revision history for this message
Naba Kumar (naba) wrote :

For those interested to get past this stuck state to continue 11.10 installation can do following:

1. Edit the file /etc/init.d/cups, and add "exit 0" right in the beginning as first statement. This will bypass the failures in starting cups.
2. Find out PID of current running process of this script with "ps ax | grep init.d" and kill it with "kill -9 <PID>". After this, installation will continue.
3. Revert back step 1 once installation is done, and wait for cups related updates to arrive :)

Revision history for this message
Martin Pitt (pitti) wrote :
Changed in cups (Ubuntu Oneiric):
status: In Progress → Fix Committed
Changed in apt (Ubuntu Oneiric):
importance: Critical → Medium
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cups - 1.5.0-6bzr1

---------------
cups (1.5.0-6bzr1) oneiric; urgency=low

  Upload current Debian packaging bzr.

  * debian/local/cups.upstart: Lower the timeout for waiting for the socket
    to 3 seconds. Tweak the respawn count and limit to be longer than the
    socket poll time, so that a continuously failing cupsd process will not
    cause an eternal respawn loop. (LP: #854490)
  * debian/control: Have libcupsmime1 break pre-1.5 versions of cups as well,
    so that we avoid having a configured "cups" packae with an upgraded, but
    unconfigured libcupsmime1 package. This takes care of LP: #854490 during
    upgrades.
 -- Martin Pitt <email address hidden> Wed, 21 Sep 2011 08:04:56 +0200

Changed in cups (Ubuntu Oneiric):
status: Fix Committed → Fix Released
Revision history for this message
Steve Langasek (vorlon) wrote :

Reports are that the cups change (added Breaks: against cups) work around this problem, so I don't think any apt change is needed here. Next cycle we should get some good improvements to the apt resolver.

Changed in apt (Ubuntu Oneiric):
status: New → Won't Fix
Changed in apt (Ubuntu):
assignee: Michael Vogt (mvo) → nobody
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in apt (Ubuntu):
status: New → Confirmed
Changed in apt (Ubuntu Precise):
status: New → Confirmed
importance: Undecided → Medium
Changed in cups (Ubuntu Precise):
status: New → Invalid
tags: removed: rls-mgr-o-tracking
Revision history for this message
dino99 (9d9) wrote :
Changed in apt (Ubuntu):
status: Confirmed → Invalid
Changed in apt (Ubuntu Precise):
status: Confirmed → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.