circular dependency causes zookeeperd to not be running after installation of zookeeperd

Bug #1007433 reported by Scott Moser on 2012-06-01
This bug affects 3 people
Affects             Status  Importance  Assigned to  Milestone
apt (Ubuntu)                Medium      Unassigned
dpkg (Ubuntu)               Medium      Unassigned
zookeeper (Ubuntu)          Medium      Unassigned

Bug Description

In a fresh quantal cloud image (I used ubuntu-quantal-daily-amd64-server-20120531), 'apt-get install zookeeperd' does not result in a running zookeeperd.

To reproduce:
 * launch a fresh instance
 * sudo apt-get install --assume-yes zookeeperd
 * status zookeeper

Note: if you have previously installed zookeeper (not zookeeperd), everything works fine.

ProblemType: Bug
DistroRelease: Ubuntu 12.10
Package: zookeeperd 3.3.5+dfsg1-2
ProcVersionSignature: Ubuntu 3.4.0-3.8-generic 3.4.0
Uname: Linux 3.4.0-3-generic x86_64
ApportVersion: 2.1.1-0ubuntu1
Architecture: amd64
Date: Fri Jun 1 13:44:36 2012
Ec2AMI: ami-000000ed
Ec2AMIManifest: FIXME
Ec2AvailabilityZone: nova
Ec2InstanceType: m1.small
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
PackageArchitecture: all
ProcEnviron:
 TERM=screen
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: zookeeper
UpgradeStatus: No upgrade log present (probably fresh install)

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in zookeeper (Ubuntu):
status: New → Confirmed
Scott Moser (smoser) wrote :

This is reproducible if you purge everything that was installed, and then try again.
$ addl="ca-certificates-java default-jre-headless fontconfig-config icedtea-7-jre-cacao icedtea-7-jre-jamvm java-common libavahi-client3 libavahi-common-data libavahi-common3 libcups2 libfontconfig1 libjline-java libjpeg-turbo8 libjpeg8 liblcms2-2 liblog4j1.2-java libnetty-java libnspr4 libnss3 libnss3-1d libservlet2.5-java libxerces2-java libxml-commons-external-java libxml-commons-resolver1.1-java libzookeeper-java openjdk-7-jre-headless openjdk-7-jre-lib ttf-dejavu-core tzdata-java zookeeper zookeeperd"
$ sudo apt-get --purge remove $addl
$ sudo apt-get install zookeeperd

It looks like the issue is that zookeeperd uses java (/usr/bin/java), but that is not set up until update-alternatives runs, which happens after zookeeperd is installed.

Vincent Ladeuil (vila) wrote :

I encounter the same issue while trying to use the same image for juju.

If I use a precise image, cloud-init-output.log ends with:

Installed /usr/lib/juju/juju
Processing dependencies for juju==0.5
Finished processing dependencies for juju==0.5
  % Total % Received % Xferd Average Speed Time Time Time Current
                                 Dload Upload Total Spent Left Speed

  0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 10 100 10 0 0 80 0 --:--:-- --:--:-- --:--:-- 81
2012-06-01 09:42:15,150 INFO Initializing zookeeper hierarchy
2012-06-01 09:42:15,203 INFO 'initialize' command finished successfully
juju-machine-agent start/running, process 4776
juju-provision-agent start/running, process 4779

With the quantal image, it ends with:

Installed /usr/lib/juju/juju
Processing dependencies for juju==0.5
Finished processing dependencies for juju==0.5
  % Total % Received % Xferd Average Speed Time Time Time Current
                                 Dload Upload Total Spent Left Speed

  0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 10 100 10 0 0 118 0 --:--:-- --:--:-- --:--:-- 120
could not connect before timeout
2012-06-01 13:43:21,156 ERROR could not connect before timeout

Accordingly (fuzzily matching the /var/log/syslog files), in precise:

Jun 1 09:41:16 server-11237 acpid: 1 rule loaded
Jun 1 09:41:16 server-11237 acpid: waiting for events: event logging is off
Jun 1 09:41:16 server-11237 cron[874]: (CRON) INFO (Running @reboot jobs)
Jun 1 09:41:16 server-11237 kernel: [ 10.366463] init: plymouth-upstart-bridge main process (781) killed by TERM signal
Jun 1 09:41:21 server-11237 ntpdate[597]: step time server 91.189.94.4 offset 1.815666 sec
Jun 1 09:42:03 server-11237 dhclient: DHCPREQUEST of 10.55.60.51 on eth0 to 10.55.60.1 port 67

In quantal:

Jun 1 13:42:04 server-11254 acpid: 1 rule loaded
Jun 1 13:42:04 server-11254 acpid: waiting for events: event logging is off
Jun 1 13:42:04 server-11254 cron[915]: (CRON) INFO (Running @reboot jobs)
Jun 1 13:42:05 server-11254 kernel: [ 10.863797] init: plymouth-stop pre-start process (1010) terminated with status 1
Jun 1 13:42:09 server-11254 ntpdate[669]: step time server 91.189.94.4 offset 1.476571 sec
Jun 1 13:42:17 server-11254 ntpdate[1043]: adjust time server 91.189.94.4 offset -0.000002 sec
Jun 1 13:42:39 server-11254 ntpdate[1565]: adjust time server 91.189.94.4 offset 0.000011 sec
Jun 1 13:42:49 server-11254 kernel: [ 53.716385] init: zookeeper main process (7600) terminated with status 2
Jun 1 13:42:49 server-11254 kernel: [ 53.716407] init: zookeeper main process ended, respawning
Jun 1 13:42:49 server-11254 kernel: [ 53.721173] init: zookeeper main process (7603) terminated with status 2
Jun 1 13:42:49 server-11254 kernel: [ 53.721194] init: zookeeper main process ended, respawning
Jun 1 13:42:49 server-11254 kernel: [ 53.726910] init: zookeeper main process (7606) terminated with status 2
Jun 1 13:42:49 server-11254 kernel: [ 53.726933] init: zookeeper main process ended, respawning
...

Scott Moser (smoser) on 2012-06-01
Changed in zookeeper (Ubuntu):
importance: Undecided → Medium
Clint Byrum (clint-fewbar) wrote :

I think the problem lies here:

adduser: Warning: The home directory `/var/lib/zookeeper' does not belong to the user you are currently creating.

But in the packaging:

debian/zookeeper.dirs:/var/lib/zookeeper

Since the package "owns" the dir, it will be owned by root, which is wrong.

So the fix is to remove that line from zookeeper.dirs. I don't believe this will have any ill effects, since the adduser will create it.

I am testing that change right now.

Changed in zookeeper (Ubuntu):
status: Confirmed → In Progress
assignee: nobody → Clint Byrum (clint-fewbar)
Scott Moser (smoser) wrote :

@Clint,
I'm pretty sure the issue is that java is not ready for use when zookeeperd attempts to start. The evidence is that if you install 'default-jre-headless' before installing zookeeperd, you get a functional zookeeper, and you still get the warning.

I think the issue is that the alternatives (which provide /usr/bin/java) are just not getting set up before zookeeperd starts.

Clint Byrum (clint-fewbar) wrote :

Scott, good call. That was definitely a red herring.

So this appears to be a circular dependency problem with dpkg... because

zookeeperd -> zookeeper -> default-jre-headless -> openjdk-7-jre-headless

This should always guarantee that openjdk-7-jre-headless is configured before zookeeperd. Instead it is somehow being configured *after* default-jre-headless and thus, the alternatives are not set up.
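
For illustration, the configuration order implied by the chain above can be derived with a topological sort. This is a minimal sketch using coreutils `tsort`. It models only the four packages named in this comment (the real dependency graph, including Recommends, is larger), and the function name `expected_configure_order` is hypothetical.

```shell
#!/bin/sh
# Print the order in which the four packages would be configured if every
# dependency were configured before the package that depends on it.
# Each pair is "dependency dependent"; tsort emits the left-hand item of
# each pair before the right-hand one.
expected_configure_order() {
    printf '%s\n' \
        'openjdk-7-jre-headless default-jre-headless' \
        'default-jre-headless zookeeper' \
        'zookeeper zookeeperd' | tsort
}

expected_configure_order
```

Because the chain is linear, there is only one valid order, with openjdk-7-jre-headless first and zookeeperd last.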

There is a workaround: if you install with --no-install-recommends, the order is correct.

This also exposes two other bugs to me:

1) /var/lib/zookeeper should belong to zookeeper
2) the upstart job should be reporting the failure, not 'start/running'

Changed in zookeeper (Ubuntu):
status: In Progress → Triaged
assignee: Clint Byrum (clint-fewbar) → nobody
summary: - zookeeperd not running after installation of zookeeperd
+ circular dependency causes zookeeperd to not be running after
+ installation of zookeeperd
Raphaël Hertzog (hertzog) wrote :

Clint, where do you see a *circular* dependency? The dependencies that you show are not circular.

Also the ordering of package installation is decided by apt and not dpkg, so this is an apt bug if any.

Excerpts from Raphaël Hertzog's message of 2012-06-02 06:25:02 UTC:
> Clint, where do you see a *circular* dependency? The dependencies that
> you show are not circular.
>
> Also the ordering of package installation is decided by apt and not
> dpkg, so this is an apt bug if any.
>

I don't necessarily see a circular dependency, but it doesn't make
sense that in this case we see default-jre-headless configured *before*
its dependency, openjdk-7-jre-headless. This leads to zookeeper being
configured before openjdk-7-jre-headless as well, which then leads to
a missing /usr/bin/java. I thought the order of configuring was a dpkg
operation, not apt.
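
One way to confirm the ordering described here is to read the `configure` entries from dpkg's log. The following is a minimal sketch, assuming the default log location `/var/log/dpkg.log` and the usual `date time action package version ...` line layout; the function name `configure_order` is hypothetical.

```shell
#!/bin/sh
# List packages in the order dpkg logged a "configure" action for them.
# Assumes log lines of the form:
#   YYYY-MM-DD HH:MM:SS configure <package> <version> ...
# Reads the file given as $1, defaulting to /var/log/dpkg.log.
configure_order() {
    awk '$3 == "configure" { print $4 }' "${1:-/var/log/dpkg.log}"
}
```

Running `configure_order | grep -n -E 'jre-headless|zookeeper'` after an install would then show the relative positions of the Java and zookeeper packages.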

Changed in apt (Ubuntu):
importance: Undecided → Medium
Changed in dpkg (Ubuntu):
importance: Undecided → Medium
James Page (james-page) wrote :

This bug is certainly 'revealed' by the switch of default Java from 6 to 7.

I did a bit of testing around the last few packages needed to support removal of openjdk-6 from main, and if I tweak the dependencies for ca-certificates-java to runtime-depend on openjdk-7-jre-headless, everything starts working OK again.

Although I need to make that change anyway, it does not sound like the correct fix - I should be able to use default-jre-headless, which depends on openjdk-7-jre-headless, in the same way that zookeeper does!

James Page (james-page) wrote :

@Clint

With regards to the two other potential bugs in zookeeper:

1) /var/lib/zookeeper should belong to zookeeper

A --no-create-home might help here (or I can make it quiet) - the permissions get set post user creation so they should be correct at runtime.

2) the upstart job should be reporting the failure, not 'start/running'

I guess a pre-start check for 'java' might be a good idea; however, I don't think it's unreasonable to expect java to be configured prior to zookeeperd during installation, so I'm not sold on the idea.
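
A pre-start check of the kind mentioned here could look roughly like the following. This is a sketch only, not the packaged upstart job: the function name `require_java` is hypothetical, and it assumes the job would call it from a `pre-start script` stanza, where a non-zero exit prevents the job from starting.

```shell
#!/bin/sh
# Succeed only if a "java" command can be found on PATH.
# On Ubuntu, /usr/bin/java is a symlink managed by update-alternatives,
# so it is missing until the JRE package has been configured.
require_java() {
    if ! command -v java >/dev/null 2>&1; then
        echo "java not found; is the JRE configured?" >&2
        return 1
    fi
}
```

With a guard like this, a missing java would surface as a failed start rather than a respawn loop reported as 'start/running'.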
