[regression] euca_rootwrap fixes affected eucalyptus power management (powerwake)

Bug #458163 reported by Dustin Kirkland on 2009-10-22
This bug affects 1 person
Affects                     Importance   Assigned to
Release Notes for Ubuntu    Undecided    Unassigned
eucalyptus (Ubuntu)         Wishlist     Dustin Kirkland
  Karmic                    Wishlist     Dustin Kirkland
powernap (Ubuntu)           High         Dustin Kirkland
  Karmic                    High         Dustin Kirkland

Bug Description

cluster/handlers.c: euca_rootwrap rework did not whitelist powerwake; however, powerwake does *not* need root privs, drop euca_rootwrap wrapper.

===========
SRU Verification

This fix is bundled with a set of unrelated SRU fixes for Eucalyptus.

To test this:
 1) From a working UEC cluster, ensure that you have availability on your nodes:
  $ euca-describe-availability-zones verbose
 2) Ensure that your nodes support:
  a) one of [ pm-suspend | pm-hibernate | poweroff ]
  b) and wake-on-lan (you might need to check the BIOS settings and the output of sudo ethtool eth0)
 3) Adjust your eucalyptus scheduling algorithm to POWERSAVE on the front end
  $ sudo sed -i "s/^SCHEDPOLICY=.*/SCHEDPOLICY=POWERSAVE/" /etc/eucalyptus/eucalyptus.conf
 4) Restart eucalyptus
  $ sudo restart eucalyptus
 5) Make sure there are no running instances in your cloud
 6) Watch the output of the CC, which will note the status of each node, waiting for 300 seconds of inactivity before putting the node to sleep
  $ tail -f /var/log/eucalyptus/cc.log
  ...
  [Tue Nov 3 14:39:09 2009][023511][EUCADEBUG ] node 192.168.1.126 idle since 1257280677: (72/300) seconds
  ...
  [Tue Nov 3 16:09:57 2009][017368][EUCADEBUG ] sending powerdown to node: 192.168.1.126, http://192.168.1.126:8775/axis2/services/EucalyptusNC
  ...
 7) After 300 seconds, you should see the node either suspend, hibernate, or poweroff, based on its capability
 8) Once the node(s) are sleeping, run an instance. Eucalyptus should dynamically resume (powerwake) the node and deploy the new instance there.
  $ euca-run-instances $EMI -k mykey -t c1.medium
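
For step 6, the idle counter can be pulled out of cc.log programmatically rather than read by eye. The following is only a sketch: the pattern is inferred from the two sample log lines quoted in this report, and the function names are made up for illustration.

```python
import re

# Pattern inferred from the sample cc.log lines in this report (an assumption,
# not a documented log format).
IDLE_RE = re.compile(
    r"node (?P<node>\S+) idle since (?P<since>\d+): "
    r"\((?P<idle>\d+)/(?P<threshold>\d+)\) seconds"
)


def parse_idle_line(line):
    """Return (node, idle_seconds, threshold) or None if the line does not match."""
    match = IDLE_RE.search(line)
    if match is None:
        return None
    return match.group("node"), int(match.group("idle")), int(match.group("threshold"))


def seconds_until_powerdown(line):
    """Return how many idle seconds remain before the threshold, or None."""
    parsed = parse_idle_line(line)
    if parsed is None:
        return None
    _, idle, threshold = parsed
    return max(threshold - idle, 0)


sample = (
    "[Tue Nov 3 14:39:09 2009][023511][EUCADEBUG ] "
    "node 192.168.1.126 idle since 1257280677: (72/300) seconds"
)
print(parse_idle_line(sample))
print(seconds_until_powerdown(sample))
```

Feeding it the lines from tail -f /var/log/eucalyptus/cc.log gives a per-node countdown to the expected powerdown.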

Regression potential should be minimal, and restricted to Eucalyptus SCHEDPOLICY=POWERSAVE users, which is not the default scheduling policy.
===========

Dustin Kirkland  (kirkland) wrote :

The fix is trivial.

Remove the euca_rootwrap wrapper around powerwake.

:-Dustin

Changed in eucalyptus (Ubuntu):
status: New → In Progress
assignee: nobody → Dustin Kirkland (kirkland)
milestone: none → ubuntu-9.10
importance: Undecided → High
tags: added: regression-potential
Dustin Kirkland  (kirkland) wrote :

I think this is pretty critical, as this functionality has been working for almost all of Karmic.

And it was featured rather prominently in Mark's Karmic announcement:
 * https://lists.ubuntu.com/archives/ubuntu-devel-announce/2009-February/000536.html

  "A savvy Koala knows that the best way to conserve energy is to go
  to sleep, and these days even servers can suspend and resume, so imagine
  if we could make it possible to build a cloud computing facility that
  drops its energy use virtually to zero by napping in the midday heat,
  and waking up when there's work to be done. No need to drink at the
  energy fountain when there's nothing going on. If we get all of this
  right, our Koala will help take the edge off the bear market."

Changed in powernap (Ubuntu Karmic):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Dustin Kirkland (kirkland)
milestone: none → ubuntu-9.10
Dustin Kirkland  (kirkland) wrote :

Minor change also required to powernap, to test for the writability of /var/cache/powerwake/ethers, before attempting to update the cache. Committed upstream.

Now that I've patched both of these, Cloud Power Management once again works as expected!

:-Dustin

Matt Zimmerman (mdz) on 2009-10-22
tags: removed: regression-potential
Changed in eucalyptus (Ubuntu Karmic):
milestone: ubuntu-9.10 → karmic-updates
Changed in powernap (Ubuntu Karmic):
milestone: ubuntu-9.10 → karmic-updates
Changed in eucalyptus (Ubuntu Karmic):
importance: High → Medium
Changed in powernap (Ubuntu Karmic):
status: In Progress → Fix Committed
Dustin Kirkland  (kirkland) wrote :

Okay, actually, the Eucalyptus fix is purely wishlist.

We can make this work entirely from powerwake, by simply testing the writability of the cache file. Absolutely trivial fix, to enable this major feature. Please consider this for GA...

:-Dustin

Changed in eucalyptus (Ubuntu Karmic):
importance: Medium → Wishlist
summary: - [regression] euca_rootwrap fixes broke eucalyptus power management
+ [regression] euca_rootwrap fixes affected eucalyptus power management
(powerwake)
Dustin Kirkland  (kirkland) wrote :

A bit more explanation...

Before euca_rootwrap was rewritten:
 * powerwake ran as root:root
 * thus, powerwake had write access to /var/cache/powerwake/ethers
 * and powerwake worked just fine

After euca_rootwrap was rewritten:
 * powerwake was whitelisted, but wasn't given root access
 * now, powerwake runs as eucalyptus:eucalyptus
 * which is fine to send the magic WoL packet
 * however, the cache file is not writable
 * and thus, powerwake now itself throws an exception
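
To illustrate why sending the wake-up packet does not need root: a standard Wake-on-LAN magic packet is six 0xFF bytes followed by the target MAC address repeated sixteen times, usually sent as an ordinary UDP broadcast. The sketch below shows that general technique only. It is not taken from the powerwake source, and the function names, broadcast address and port 9 are illustrative assumptions.

```python
import socket


def build_magic_packet(mac):
    """Build a Wake-on-LAN magic packet for a MAC such as '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("expected a 48-bit MAC address, got %r" % mac)
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # 6 bytes of 0xFF, then the MAC repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac, broadcast="255.255.255.255", port=9):
    """Send the packet as a UDP broadcast (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Broadcasting needs SO_BROADCAST on the socket, not root privileges.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
    finally:
        sock.close()


print(len(build_magic_packet("00:11:22:33:44:55")))
```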

As stated above, we can fix this entirely in powerwake with a trivial fix, testing W_OK on /var/cache/powerwake/ethers.
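
A minimal sketch of what such a writability test could look like, assuming a Python caller. The function name and the "MAC host" line format are hypothetical; only the cache path and the W_OK check come from this report, and the actual powerwake change may differ.

```python
import os

ETHERS_CACHE = "/var/cache/powerwake/ethers"  # path named in this report


def update_cache(mac, host, path=ETHERS_CACHE):
    """Append an entry only if the cache can be written; return True on success.

    The "MAC host" line format is an assumption made for this sketch.
    """
    # If the file does not exist yet, what matters is whether its directory
    # is writable.
    target = path if os.path.exists(path) else (os.path.dirname(path) or ".")
    if not os.access(target, os.W_OK):
        # Not writable (e.g. running as eucalyptus:eucalyptus): skip the cache
        # update instead of raising, so the wake-up itself can still proceed.
        return False
    with open(path, "a") as cache:
        cache.write("%s %s\n" % (mac, host))
    return True
```

With a check like this, a caller running as an unprivileged user simply gets False back from update_cache() and carries on to send the WoL packet.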

:-Dustin

Dustin Kirkland  (kirkland) wrote :

If we have to document around this, it can be fixed with a configuration change in /etc/eucalyptus/wrappers.conf.

Currently, there is an entry for:
powerwake /usr/bin/powerwake

However, it's not given root privilege.

To adjust this, change that line to:
powerwake /usr/bin/powerwake 0

:-Dustin

Dustin Kirkland  (kirkland) wrote :

For those suffering from this now, do:

sudo sed -i "s:^powerwake.*$:powerwake /usr/bin/powerwake 0:" /etc/eucalyptus/wrappers.conf
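
For anyone scripting this outside of sed, the same substitution can be sketched in Python. The function name is made up for illustration, and it works on the file contents as a string so it can be tried without touching /etc/eucalyptus/wrappers.conf.

```python
import re


def grant_powerwake_root(conf_text):
    """Rewrite any line starting with 'powerwake' to the entry ending in 0.

    Mirrors the sed expression s:^powerwake.*$:powerwake /usr/bin/powerwake 0:
    """
    return re.sub(
        r"^powerwake.*$",
        "powerwake /usr/bin/powerwake 0",
        conf_text,
        flags=re.MULTILINE,
    )


# Illustrative input: only the powerwake entry is taken from this report.
before = "powerwake /usr/bin/powerwake\n"
print(grant_powerwake_root(before), end="")
```

Like the sed command, this only changes a line that already starts with powerwake, and applying it twice gives the same result as applying it once.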

:-Dustin

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package powernap - 1.7-0ubuntu2

---------------
powernap (1.7-0ubuntu2) karmic; urgency=low

  * powerwake: test ethers file for writability, LP: #458163

 -- Dustin Kirkland <email address hidden> Thu, 22 Oct 2009 09:50:16 -0500

Changed in powernap (Ubuntu Karmic):
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2009-10-26
Changed in eucalyptus (Ubuntu Karmic):
status: In Progress → Triaged
Dustin Kirkland  (kirkland) wrote :

This was fixed in powernap. The release note is no longer needed.

Changed in ubuntu-release-notes:
status: New → Invalid
Matt Zimmerman (mdz) on 2009-10-29
Changed in eucalyptus (Ubuntu Karmic):
status: Triaged → Fix Committed
description: updated
description: updated
description: updated
Dustin Kirkland  (kirkland) wrote :

I've tested the package in Mathias' ppa. The functionality related to this bug seems to work well.

Mathias Gug (mathiaz) on 2009-11-05
tags: added: uec

Accepted eucalyptus into karmic-proposed; the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

tags: added: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eucalyptus - 1.6~bzr931-0ubuntu7.3

---------------
eucalyptus (1.6~bzr931-0ubuntu7.3) karmic-proposed; urgency=low

  * debian/eucalyptus-cc.postinst: restart avahi daemon so that it uses
    eucalyptus specific configuration file (LP: #458904).
  * debian/eucalyptus-cc.eucalyptus-cc-publication{,-ip}.upstart: Respawn
    avahi publication jobs if they die (LP: #480885).

eucalyptus (1.6~bzr931-0ubuntu7.2) karmic-proposed; urgency=low

  [ Dustin Kirkland ]
  * cluster/handlers.c: euca_rootwrap rework did not whitelist powerwake;
    however, powerwake does *not* need root privs, drop euca_rootwrap wrapper
    (LP: #458163)
  * debian/rules, debian/euclayptus-cc.install: install the avahi-daemon.conf
    in /etc/eucalyptus, (LP: #458904).

  [ Thierry Carrez ]
  * clc/modules/www/src/main/java/edu/ucsb/eucalyptus/admin/public/EucalyptusWebInterface.html:
    Fix HTML title in the web UI for more consistency in naming (LP: #455293)
  * debian/eucalyptus-common.eucalyptus.upstart: Add -l to eucalyptus-cloud
    options so that cloud-output.log is affected by LOGLEVEL (LP: #458001)

  [ Mathias Gug ]
  * cluster/handlers.c: Fix the networkIndex returned by describeInstances.
    (LP: #454405 - upstream revno 933).
  * debian/eucalyptus-cc.eucalyptus-cc-publication{,-ip}.upstart: add an
    upstart job to explicitly publish the IP/CC hostname mapping via avahi
    instead of publishing the CC IP address via the service name (LP: #458904).
  * debian/avahi-daemon.conf: ship a specific avahi-daemon configuration file
    that doesn't publish IP addresses by default. (LP: #458904).
  * debian/eucalyptus-cloud.postinst: Fix postfix configuration to accept
    confirmation emails sent by eucalyptus (LP: #459101)
  * debian/eucalyptus-cc.upstart: Don't clean the CC network state when the CC is
    stopped by default (LP: #460089).
 -- Mathias Gug <email address hidden> Wed, 11 Nov 2009 15:15:48 -0500

Changed in eucalyptus (Ubuntu Karmic):
status: Fix Committed → Fix Released
Martin Pitt (pitti) wrote :

I copied the karmic-proposed package to lucid. For karmic-updates it is still missing two verifications.

Changed in eucalyptus (Ubuntu):
status: Fix Committed → Fix Released
Changed in eucalyptus (Ubuntu Karmic):
status: Fix Released → Fix Committed
Thierry Carrez (ttx) wrote :

@Dustin:
I couldn't get POWERSAVE to work on my setup either.
[EUCADEBUG ] node 192.168.0.160 idle since 1259338161: (294/300) seconds
[EUCADEBUG ] node 192.168.0.160 idle since 1259338161: (300/300) seconds
[EUCADEBUG ] node 192.168.0.160 idle since 1259338161: (306/300) seconds
[EUCADEBUG ] node 192.168.0.160 idle since 0: (1259338472/300) seconds
[EUCADEBUG ] node 192.168.0.160 idle since 1259338472: (6/300) seconds

FWIW, cc.log says "scheduler using ROUNDROBIN policy to find next resource" when I run an instance while I have SCHEDPOLICY="POWERSAVE" in eucalyptus.conf. Maybe SCHEDPOLICY is ignored?
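
One quick thing to rule out is what the file actually contains. The sed command in the test instructions only rewrites a line that already starts with SCHEDPOLICY=, so a commented-out or missing entry would be left untouched. The sketch below reads the value back; the function name is hypothetical, and treating the last uncommented assignment as the effective one is an assumption about how the file is consumed.

```python
import re


def read_schedpolicy(conf_text):
    """Return the last uncommented SCHEDPOLICY value, without quotes, or None."""
    value = None
    for line in conf_text.splitlines():
        match = re.match(r"^SCHEDPOLICY=(.*)$", line)
        if match:
            # Accept both SCHEDPOLICY=POWERSAVE and SCHEDPOLICY="POWERSAVE".
            value = match.group(1).strip().strip("\"'")
    return value


print(read_schedpolicy('SCHEDPOLICY="POWERSAVE"\n'))
```

If this returns POWERSAVE for the contents of /etc/eucalyptus/eucalyptus.conf while cc.log still reports ROUNDROBIN, the setting is present in the file and the problem lies elsewhere.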

Dustin Kirkland  (kirkland) wrote :

Okay, I can now confirm that this works well in the karmic-proposed package.

My testing was affected by a different bug that I will file now and fix in another SRU.

Martin, you can mark this "verification-done" and push to -updates.

:-Dustin

Martin Pitt (pitti) on 2009-11-30
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eucalyptus - 1.6~bzr931-0ubuntu7.3


Changed in eucalyptus (Ubuntu Karmic):
status: Fix Committed → Fix Released