Broken with v4 isc-dhcp-server in Natty

Reported by Will Daniels on 2011-02-11
This bug affects 2 people
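Affects: Eucalyptus; Status: Invalid; Importance: Undecided; Assigned to: Daniel Nurmi
Affects: Release Notes for Ubuntu; Importance: Undecided; Assigned to: Unassigned
Affects: isc-dhcp (Ubuntu); Importance: Critical; Assigned to: Dave Walker
Affects: Natty; Importance: Critical; Assigned to: Dave Walker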

Bug Description

Using the new v4 DHCP server from Natty, Eucalyptus fails to start the server to assign IP addresses to instances on their private networks.

Steps to Reproduce
==================

Install and configure Eucalyptus in MANAGED mode on Natty (I expect the other modes have the same problem). Modify /etc/eucalyptus/wrappers.conf to allow calling the new DHCP server binary (dhcpd instead of dhcpd3):

- dhcpd3 /usr/sbin/dhcpd3 0 #cap_net_admin
+ dhcpd3 /usr/sbin/dhcpd 0 #cap_net_admin

Modify /etc/eucalyptus/eucalyptus.conf to use the new dhcpd bin:

- VNET_DHCPDAEMON="/usr/sbin/dhcpd3"
+ VNET_DHCPDAEMON="/usr/sbin/dhcpd"

Install an image and start it.

Result
======

In httpd-cc_error_log...

---
Internet Systems Consortium DHCP Server 4.1.1-P1
Copyright 2004-2010 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/
WARNING: Host declarations are global. They are not limited to the scope you declared them in.
Wrote 0 deleted host decls to leases file.
Wrote 0 new dynamic host decls to leases file.
Wrote 0 leases to leases file.

No subnet declaration for eucabr10 (no IPv4 addresses).
** Ignoring requests on eucabr10. If this is not what
   you want, please write a subnet declaration
   in your dhcpd.conf file for the network segment
   to which interface eucabr10 is attached. **

Not configured to listen on any interfaces!
---

Notes
=====

Confirmed that replacing the v4 DHCP server with the old v3 one solves the problem.

Possibly this is a bug in the DHCP server: the error suggests it is not finding the subnet on the bridge interface eucabr10 ("no IPv4 addresses"). The address was assigned correctly according to "ip addr show", which uses netlink calls, whereas the ioctl calls used by the DHCP server do not return any address for eucabr10. I looked briefly at discover.c in the DHCP server code, but nothing jumped out as substantially different between v3 and v4.

It seems that the DHCP config file is written out to /var/run/eucalyptus/net/euca-dhcp.conf and the command used to start the server is like this:

/usr/lib/eucalyptus/euca_rootwrap /usr/sbin/dhcpd -cf //var/run/eucalyptus/net/euca-dhcp.conf -lf //var/run/eucalyptus/net/euca-dhcp.leases -pf //var/run/eucalyptus/net/euca-dhcp.pid -tf //var/run/eucalyptus/net/euca-dhcp.trace eucabr10 eth0

As best I can tell, both the config file and command line look correct. Sample config:

---
# automatically generated config file for DHCP server
default-lease-time 1200;
max-lease-time 1200;
ddns-update-style none;

shared-network euca {
subnet 10.32.8.0 netmask 255.255.255.0 {
  option subnet-mask 255.255.255.0;
  option broadcast-address 10.32.8.255;
  option domain-name-servers 127.0.0.1, 10.32.8.1;
  option routers 10.32.8.1;
}

host node-10.32.8.2 {
  hardware ethernet D0:0D:4C:1A:08:C4;
  fixed-address 10.32.8.2;
}

host node-10.32.8.3 {
  hardware ethernet D0:0D:38:D8:06:CE;
  fixed-address 10.32.8.3;
}
}
---

I don't know what has changed in the v4 DHCP server, but this is not working and I don't really know enough about these things to help further. Probably someone with experience/knowledge of the ISC DHCP server could spot the problem or suggest a simple solution more easily.

Scott Lyons (scottalyons) wrote :

I can confirm this occurs with MANAGED-NOVLAN on Natty Alpha 2

Dave Walker (davewalker) on 2011-02-17
Changed in eucalyptus (Ubuntu):
status: New → Confirmed
James Page (james-page) on 2011-02-17
Changed in eucalyptus (Ubuntu):
importance: Undecided → High
status: Confirmed → Triaged
Daniel Nurmi (nurmi) on 2011-02-21
Changed in eucalyptus (Ubuntu Natty):
assignee: nobody → Daniel Nurmi (nurmi)
Changed in eucalyptus:
assignee: nobody → Daniel Nurmi (nurmi)
Dave Walker (davewalker) on 2011-02-25
tags: added: server-nrs
Will Daniels (wdaniels) on 2011-03-01
description: updated
tags: added: iso-testing
C de-Avillez (hggdh2) on 2011-03-08
Changed in eucalyptus (Ubuntu Natty):
milestone: none → ubuntu-11.04-beta-1
Daniel Nurmi (nurmi) wrote :

All,

Thanks for this bug report. It does indeed appear that there is a bug in the new DHCP server that prevents it from learning that an interface may have multiple IP addresses associated with it. Here are my findings after comparing the DHCP server code of dhcpd3 from lucid (which works) with that of the new isc-dhcp-server.

In discover.c, there is a routine for discovering configured interfaces, reading any assigned IPs on those interfaces, and matching those assigned IPs with subnets defined in the dhcpd configuration file. Fundamentally there is a big loop that goes over all interfaces and that reads the configured IPs. In v3, the buffer that was iterated over was populated by an ioctl with SIOCGIFCONF, which fills a buffer with interfaces and their assigned addresses; if an interface has more than one assigned address, then the buffer contained multiple entries for that interface each with an assigned IP. The new logic has been reworked (it looks like to solve the problem of supporting more UNIX variants cleanly), such that there are routines that populate a global buffer, grab the 'next' interface, and cleanup when done. For Linux, the routines go through /proc/net/dev to get the device names, and then use an ioctl with SIOCGIFADDR to get the address from each interface. Here lies the problem, which is twofold!

1.) the ioctl and later logic will only return a single address from each interface
2.) /proc/net/dev only contains one entry per interface

So, if an interface has more than one assigned address, the server will only ever see the 'first' one.
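
To make the effect concrete, here is a minimal Python sketch (not the dhcpd code itself) of the matching step described above. It assumes the server simply looks for a declared subnet that contains one of the interface's addresses; the function name and the address ordering are illustrative assumptions, and the sample values are borrowed from addresses that appear elsewhere in this thread.

```python
import ipaddress

# Subnet declared in a generated dhcpd config (values taken from this thread).
DECLARED_SUBNETS = [ipaddress.ip_network("172.19.1.0/255.255.255.224")]

# IPv4 addresses on a single interface, in an assumed listing order.
ETH_ADDRESSES = ["169.254.169.254", "192.168.6.148", "192.168.16.40", "172.19.1.1"]


def matching_subnet(addresses, subnets):
    """Return the first declared subnet containing any of the given addresses."""
    for text in addresses:
        addr = ipaddress.ip_address(text)
        for net in subnets:
            if addr in net:
                return net
    return None


# Discovery that sees only the first address (the v4 behaviour described above).
first_only = matching_subnet(ETH_ADDRESSES[:1], DECLARED_SUBNETS)
# Discovery that sees every address on the interface (the v3 behaviour).
all_addrs = matching_subnet(ETH_ADDRESSES, DECLARED_SUBNETS)

print("first address only:", first_only)
print("all addresses:     ", all_addrs)
```

With only the first address visible, no declared subnet matches, which corresponds to the "No subnet declaration" message in the report.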

Possible solutions:

1.) It looks like, during the re-design of the iteration logic, that a set of routines was written for BSD/OSX that use the getifaddrs() call to get a complete list of all configured interfaces with addresses. This looks really clean, and Linux does also support getifaddrs(). We could potentially suggest to the upstream that this same logic be used for linux as well as OSX/BSD.

2.) The iteration goes over each interface, and we could add some code for each interface to interrogate the interface for all configured addresses.

I've attached a patch that makes a proof of concept attempt at solution #2 and verified that it works for Eucalyptus (should be general), but I'm not necessarily comfortable suggesting this as the 'real' solution as I believe that I'm missing some context as to the UNIX variant agnostic nature of the DHCP server code. It may be worth using as a starting point, however.

In any case, I do believe that this issue should be brought to the attention of the upstream maintainer, as I believe that, even ignoring Eucalyptus altogether, this bug will cause problems for folks who have more than one IP assigned to a single interface.

-Dan

Will Daniels (wdaniels) wrote :

Thanks for the update Dan, great work!

I had a feeling this was going to result in a can of worms for upstream changes, which is why I didn't pursue the problem myself (I just don't have the FOSS experience or general kudos to get anything done there).

IMHO it is a clear regression upstream, but no doubt there are reasons for it that only those more intimate with the DHCP server development would know.

However, I will test your patch tomorrow and if there's anything else specific I might be able to help with, I'd be happy to.

tags: added: patch
C de-Avillez (hggdh2) wrote :

Dave and I ran it yesterday on the test rig. It still does not work, with the CC reporting that dhcpd startup failed. I will attach the logs.

Changed in eucalyptus (Ubuntu Natty):
importance: High → Medium
importance: Medium → Critical
Daniel Nurmi (nurmi) wrote :

It turns out that I had(ve) apparmor disabled while working on this problem, but the new dhcpd needs a slight change to its profile in order to work. Here is what I saw with apparmor enabled:

root@eucahost-4-243:/var/log/eucalyptus# dmesg
[ 800.347860] type=1400 audit(1300832242.358:24): apparmor="DENIED" operation=\
"capable" parent=10292 profile="/usr/sbin/dhcpd" pid=10293 comm="dhcpd" capabil\
ity=1 capname="dac_override"

Chris and Garrett on our side pointed at the solution of adding:

  capability dac_override,

to the /etc/apparmor.d/usr.sbin.dhcpd profile. Once I added this, rebooted, and tried again, eucalyptus was able to run the dhcpd process on instance start.
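
For anyone checking their own logs, the following Python sketch shows one way to pull the profile and capability name out of a denial line like the one above. The helper names are hypothetical, and the parsing assumes the simple key=value layout seen in that dmesg output (with the wrapped line rejoined).

```python
import re

# The audit line from the dmesg output above, rejoined onto one line.
AUDIT_LINE = (
    '[ 800.347860] type=1400 audit(1300832242.358:24): apparmor="DENIED" '
    'operation="capable" parent=10292 profile="/usr/sbin/dhcpd" pid=10293 '
    'comm="dhcpd" capability=1 capname="dac_override"'
)


def parse_audit_fields(line):
    """Collect key=value pairs from an audit line; quoted values lose their quotes."""
    fields = {}
    for key, quoted, bare in re.findall(r'(\w+)=(?:"([^"]*)"|(\S+))', line):
        fields[key] = quoted if quoted else bare
    return fields


def missing_capability_rule(line):
    """Return (profile, rule) for a capability denial, or None for other lines."""
    fields = parse_audit_fields(line)
    if fields.get("apparmor") != "DENIED" or "capname" not in fields:
        return None
    return fields["profile"], "capability {},".format(fields["capname"])


print(missing_capability_rule(AUDIT_LINE))
```

The second element of the returned tuple is the rule text that was added to the profile in this case.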

Regards,
-Dan

C de-Avillez (hggdh2) wrote :

Hi Dan,

I was looking at it... here are the messages logged in the cc.log for DHCPD:

[Tue Mar 22 20:19:59 2011][027210][EUCADEBUG ] refresh_instances(): node 10.55.55.3 idle since 1300839443: (156/300) seconds
[Tue Mar 22 20:19:59 2011][027210][EUCADEBUG ] refresh_instances(): done
[Tue Mar 22 20:19:59 2011][027210][EUCAWARN ] vnetKickDHCP(): failed to create/open euca-dhcp.leases
[Tue Mar 22 20:19:59 2011][027210][EUCADEBUG ] vnetKickDHCP(): executing: ///usr/lib/eucalyptus/euca_rootwrap chgrp -R dhcpd //var/run/eucalyptus/net
[Tue Mar 22 20:19:59 2011][027210][EUCADEBUG ] vnetKickDHCP(): executing: ///usr/lib/eucalyptus/euca_rootwrap chmod -R 0775 //var/run/eucalyptus/net
[Tue Mar 22 20:19:59 2011][027210][EUCAINFO ] vnetKickDHCP(): executing: ///usr/lib/eucalyptus/euca_rootwrap /usr/sbin/dhcpd -cf //var/run/eucalyptus/net/euca-dhcp.conf -lf //var/run/eucalyptus/net/euca-dhcp.leases -pf //var/run/eucalyptus/net/euca-dhcp.pid -tf //var/run/eucalyptus/net/euca-dhcp.trace eth0
[Tue Mar 22 20:19:59 2011][027210][EUCAINFO ] vnetKickDHCP(): RC from cmd: 256
[Tue Mar 22 20:19:59 2011][027210][EUCAERROR ] monitor_thread(): cannot start DHCP daemon
[Tue Mar 22 20:19:59 2011][027210][EUCADEBUG ] monitor_thread(): done

I then looked at the syslog, and what I see is this:

Mar 22 20:22:36 cempedak dhcpd: Internet Systems Consortium DHCP Server 4.1.1-P1
Mar 22 20:22:36 cempedak dhcpd: Copyright 2004-2010 Internet Systems Consortium.
Mar 22 20:22:36 cempedak dhcpd: All rights reserved.
Mar 22 20:22:36 cempedak dhcpd: For info, please visit https://www.isc.org/software/dhcp/
Mar 22 20:22:36 cempedak dhcpd: WARNING: Host declarations are global. They are not limited to the scope you declared them in.
Mar 22 20:22:36 cempedak dhcpd: Wrote 0 deleted host decls to leases file.
Mar 22 20:22:36 cempedak dhcpd: Wrote 0 new dynamic host decls to leases file.
Mar 22 20:22:36 cempedak dhcpd: Wrote 0 leases to leases file.
Mar 22 20:22:36 cempedak dhcpd:
Mar 22 20:22:36 cempedak dhcpd: No subnet declaration for eth0 (10.55.55.2).
Mar 22 20:22:36 cempedak dhcpd: ** Ignoring requests on eth0. If this is not what
Mar 22 20:22:36 cempedak dhcpd: you want, please write a subnet declaration
Mar 22 20:22:36 cempedak dhcpd: in your dhcpd.conf file for the network segment
Mar 22 20:22:36 cempedak dhcpd: to which interface eth0 is attached. **
Mar 22 20:22:36 cempedak dhcpd:
Mar 22 20:22:36 cempedak dhcpd:
Mar 22 20:22:36 cempedak dhcpd: Not configured to listen on any interfaces!

And, indeed, the /var/run/eucalyptus/net/euca-dhcp.conf does not contain any reference to the 10.55.55.0/24 range:

# automatically generated config file for DHCP server
default-lease-time 1200;
max-lease-time 1200;
ddns-update-style none;

shared-network euca {
subnet 172.19.1.0 netmask 255.255.255.224 {
  option subnet-mask 255.255.255.224;
  option broadcast-address 172.19.1.31;
  option domain-name-servers 10.55.55.1, 10.55.55.100;
  option routers 172.19.1.1;
}

host node-172.19.1.4 {
  hardware ethernet D0:0D:46:86:07:8D;
  fixed-address 172.19.1.4;
}
}

This is an all-in-one install, BTW.

Daniel Nurmi (nurmi) wrote :

Carlos,

This is the message that was being thrown when the original problem was cropping up; if you're still seeing this message with the patch installed, then it (the patch) is not working properly. The server should run even though there is no address in the config on the 10.55.55.0/24 subnet, since there should be another IP from the 172.19 subnet assigned to one of the eth0* interfaces. Can you please verify that the patch is applied, and perhaps show the output of 'ip addr show' when this error is occurring? I've tested again on my local setup with the patched dhcpd and it is launching successfully. It should look something like this, where

169.254.169.254/32 is the metadata service redirect IP
192.168.6.148/18 is the physical interface IP (analogous to your 10.55.55.0/24 subnet)
192.168.16.40/32 is one of the VNET_PUBLICIP addresses specified in eucalyptus.conf
172.19.1.1/27 is drawn from the VNET_SUBNET/VNET_NETMASK network specified in eucalyptus.conf

2: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether d0:aa:aa:bb:cc:dd brd ff:ff:ff:ff:ff:ff
    inet 169.254.169.254/32 scope link eth4
    inet 192.168.6.148/18 brd 192.168.63.255 scope global eth4
    inet 192.168.16.40/32 scope global eth4
    inet 172.19.1.1/27 brd 172.19.1.31 scope global eth4:priv
    inet6 fe80::d2aa:aaff:febb:ccdd/64 scope link
       valid_lft forever preferred_lft forever

root@eucahost-4-243:~# ps ax | grep dhcpd
 4519 ? Ss 0:00 /usr/sbin/dhcpd -cf //var/run/eucalyptus/net/euca-dhcp.conf -lf //var/run/eucalyptus/net/euca-dhcp.leases -pf //var/run/eucalyptus/net/euca-dhcp.pid -tf //var/run/eucalyptus/net/euca-dhcp.trace eth4

Daniel Nurmi (nurmi) wrote :

Carlos,

As a debugging aid, I've attached a stand-alone program that uses the same loop as the patch for dhcpd: running it on the CC right after a run-instances should result in output like this:

root@eucahost-4-243:~# gcc getifaddrs.c; ./a.out
INTERFACE=lo SCRUBBEDINTERFACE=lo ADDR=127.0.0.1
INTERFACE=eth4 SCRUBBEDINTERFACE=eth4 ADDR=169.254.169.254
INTERFACE=eth4 SCRUBBEDINTERFACE=eth4 ADDR=192.168.6.148
INTERFACE=eth4:priv SCRUBBEDINTERFACE=eth4 ADDR=172.19.1.1
INTERFACE=eth4:pub SCRUBBEDINTERFACE=eth4 ADDR=192.168.16.40
INTERFACE=virbr0 SCRUBBEDINTERFACE=virbr0 ADDR=192.168.122.1

It will help, I think, to see the output of this program on a failing system, if that is possible!
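
If compiling the attached C program is inconvenient, a rough equivalent can be sketched in Python by parsing saved 'ip addr show' output. This is only an approximation of what the program reports: it assumes the interface label is the last token on each 'inet' line, as in the sample output earlier in this thread, and the function name is made up for illustration.

```python
# A few lines of 'ip addr show' output, copied from the sample in this thread.
SAMPLE = """\
2: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether d0:aa:aa:bb:cc:dd brd ff:ff:ff:ff:ff:ff
    inet 169.254.169.254/32 scope link eth4
    inet 192.168.6.148/18 brd 192.168.63.255 scope global eth4
    inet 192.168.16.40/32 scope global eth4
    inet 172.19.1.1/27 brd 172.19.1.31 scope global eth4:priv
    inet6 fe80::d2aa:aaff:febb:ccdd/64 scope link
"""


def ipv4_entries(ip_addr_output):
    """Yield (label, base_interface, address) for every IPv4 'inet' line."""
    for line in ip_addr_output.splitlines():
        tokens = line.split()
        if len(tokens) < 2 or tokens[0] != "inet":
            continue  # skips headers, link lines, and inet6 lines
        address = tokens[1].split("/")[0]  # drop the prefix length
        label = tokens[-1]                 # e.g. 'eth4' or 'eth4:priv'
        base = label.split(":")[0]         # alias suffix removed
        yield label, base, address


for label, base, address in ipv4_entries(SAMPLE):
    print("INTERFACE={} SCRUBBEDINTERFACE={} ADDR={}".format(label, base, address))
```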

Regards,
-Dan

C de-Avillez (hggdh2) wrote :

Here you go:

ubuntu@cempedak:/var/log$ ~/getifaddrs
INTERFACE=lo SCRUBBEDINTERFACE=lo ADDR=127.0.0.1
INTERFACE=eth0:metadata SCRUBBEDINTERFACE=eth0 ADDR=169.254.169.254
INTERFACE=eth0 SCRUBBEDINTERFACE=eth0 ADDR=10.55.55.2
INTERFACE=eth0:priv SCRUBBEDINTERFACE=eth0 ADDR=172.19.1.1
INTERFACE=eth0:pub SCRUBBEDINTERFACE=eth0 ADDR=10.55.55.100

This specific test rig has been running for a few days now. Today the CLC/CC/SC/Walrus had a dist-upgrade done and was rebooted several times, due to a different test. After this other test, I went in to run getifaddrs as you asked. I got the output above, and -- to my surprise -- it seems we now do get an IP address on the instances.

BUT the log entries still show that Euca's DHCPD startup failed. Since I do not believe in magic, I am re-checking all of this.

Kate Stewart (kate.stewart) wrote :

Moving to beta-2. Prefer to see fix pushed out before then.

Changed in eucalyptus (Ubuntu Natty):
milestone: ubuntu-11.04-beta-1 → ubuntu-11.04-beta-2
Daniel Nurmi (nurmi) wrote :

All,

I think I've found the problem, all! Looking at:

bzr branch lp:ubuntu/isc-dhcp

1.) the patch I've supplied is being placed at the end of the '00list' file in debian/patches. Not a problem normally, I suspect; however:
2.) the patch is not present in the actual code that is being used when I try to build a package using 'bzr bd'. the reason for this is:
3.) in debian/rules file, there is a line:

        dpatch deapply-until dhcp-4.1.0-ldap-code

which removes the patch since it is listed after 'dhcp-4.1.0-ldap-code' in '00list'. The fix is:
4.) put the 'multi-ip-addr-per-if' line right after the 'dhclient-more-debug' line in '00list' and rebuild.
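
The ordering issue can be illustrated with a small Python sketch. It models only what is described above, namely that patches listed after the marker in '00list' end up removed; it is not a reimplementation of dpatch, and it deliberately does not model whether the marker patch itself is deapplied. The two lists are hypothetical, abbreviated versions of '00list' containing only the names mentioned in this comment.

```python
def removed_by_deapply_until(patch_order, marker):
    """Return the patches listed after `marker`, i.e. the ones described as removed.

    Whether the marker patch itself is also deapplied is not modeled here.
    """
    position = patch_order.index(marker)
    return patch_order[position + 1:]


MARKER = "dhcp-4.1.0-ldap-code"

# Hypothetical, abbreviated 00list contents (assumption: real lists are longer).
before_fix = ["dhclient-more-debug", "dhcp-4.1.0-ldap-code", "multi-ip-addr-per-if"]
after_fix = ["dhclient-more-debug", "multi-ip-addr-per-if", "dhcp-4.1.0-ldap-code"]

print("removed before fix:", removed_by_deapply_until(before_fix, MARKER))
print("removed after fix: ", removed_by_deapply_until(after_fix, MARKER))
```

Under these assumptions, moving the entry ahead of the marker keeps it out of the removed set.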

I just went through this process on my natty box, installed the resulting isc-dhcp-server deb, and ran an instance or two through eucalyptus without problem. I think that's probably it (!)

thanks all
-dan

Dave Walker (davewalker) on 2011-04-05
Changed in eucalyptus (Ubuntu Natty):
status: Triaged → Invalid
Dave Walker (davewalker) on 2011-04-11
Changed in eucalyptus (Ubuntu Natty):
milestone: ubuntu-11.04-beta-2 → none
Robbie Williamson (robbiew) wrote :

A little confused by the unmilestoning, so are we saying we won't be able to have this working correctly in Natty?

Will Daniels (wdaniels) wrote :

The patch went into the isc-dhcp source package at 4.1.1-P1-15ubuntu6 (16th March), so it should be there. I would think the unmilestoning might have to do with book-keeping (to have the bug under the package concerned), though I don't see any alternative bug logged for isc-dhcp, so maybe not.

Dave Walker (davewalker) wrote :

@Robbie, yes - as Will points out - it turned out not to require changes in Eucalyptus. I removed the milestone, as it was purely causing noise on the tracking lists.

Robbie Williamson (robbiew) wrote :

Got it thanks. For release tracking purposes, I'm going to tag this one against the right package and mark it fixed released.

affects: eucalyptus (Ubuntu Natty) → isc-dhcp (Ubuntu Natty)
Changed in isc-dhcp (Ubuntu Natty):
assignee: Daniel Nurmi (nurmi) → Dave Walker (davewalker)
status: Invalid → Fix Released
Changed in eucalyptus:
status: New → Invalid
Will Daniels (wdaniels) wrote :

I'm not too sure that this is fixed guys :( Finally got back to this project yesterday and experiencing the same problem, although the code snippet Dan provided (which is the same as I can see in the isc-dhcp patch) does pick up the correct IPs for each interface...when actually running dhcpd it doesn't appear to work. The error is the same as before.

Strangely, I'm also finding that now I have to add the "capability dac_override" to dhcpd apparmor profile otherwise it fails even earlier with "permission denied" trying to create the trace file. I can see that permissions for the trace file should be granted explicitly already in the profile, yet still have to add the more general dac_override for some reason :S

Has this been successfully tested by others?

C de-Avillez (hggdh2) wrote :

@Will: yes, I tested it, and it did work. We did hear we would need dac_override; when we started testing this, I manually aa-(disable|enable)d /usr/sbin/dhcpd to check on the requirement. I have just looked at the current isc-dhcp on natty, and I do not see the update to the apparmor profile...

This is weird, checking again.

Robbie Williamson (robbiew) wrote :

Changing state back to In Progress, b/c this is apparently not completely resolved.

Changed in isc-dhcp (Ubuntu Natty):
status: Fix Released → In Progress
C de-Avillez (hggdh2) wrote :

Hum. All my sessions, on my first try with a current ISO, fail. I cannot see the error yet -- but it is not the dac_override in DHCPD's apparmor configuration (we do not have it either, but I see no failures there -- and /usr/sbin/dhcpd is in enforce mode).

What I see, right now, is that all instances are seemingly failing to boot; the console.log for each of these instances has a single line:

multibooting (hd0,1)/boot/grub/core.img

I do not remember any other update that could cause that, from Beta2 to now. We will have to investigate it further; I am not even sure this would be the bug to follow this. Anyway, we are adding a note to the Release Notes on that.

C de-Avillez (hggdh2) wrote :

@Will: can you please add in here the logs showing the dac_override issue? Also, the versions for all of the euca\* packages and isc-dhcp\*. Thank you.

Robbie Williamson (robbiew) wrote :

@Carlos If we determine it's a separate issue, then definitely open a new bug and we can release note that....and restore this to Fix Released

Will Daniels (wdaniels) wrote :

The error message was like this:

    dhcpd.c(473): trace_begin: //var/run/eucalyptus/net/euca-dhcp.trace: Permission denied

I had that copied somewhere, but unfortunately I deleted all the logs yesterday and I don't even recall exactly which log that came from now (sorry).

In any case, I can no longer reproduce that particular problem, even after removing dac_override from the profile :S

But the other problem (and the original subject of this bug) still remains (for me, using MANAGED mode):

---
root@whirlpool:~/debs# /usr/sbin/dhcpd -cf //var/run/eucalyptus/net/euca-dhcp.conf -lf //var/run/eucalyptus/net/euca-dhcp.leases -pf //var/run/eucalyptus/net/euca-dhcp.pid -tf //var/run/eucalyptus/net/euca-dhcp.trace eucabr10 eth0.2187
Internet Systems Consortium DHCP Server 4.1.1-P1
Copyright 2004-2010 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/
WARNING: Overwriting trace file "//var/run/eucalyptus/net/euca-dhcp.trace"
WARNING: Host declarations are global. They are not limited to the scope you declared them in.
Wrote 0 deleted host decls to leases file.
Wrote 0 new dynamic host decls to leases file.
Wrote 0 leases to leases file.

No subnet declaration for eth0.2187 (169.254.169.254).
** Ignoring requests on eth0.2187. If this is not what
   you want, please write a subnet declaration
   in your dhcpd.conf file for the network segment
   to which interface eth0.2187 is attached. **

No subnet declaration for eucabr10 (no IPv4 addresses).
** Ignoring requests on eucabr10. If this is not what
   you want, please write a subnet declaration
   in your dhcpd.conf file for the network segment
   to which interface eucabr10 is attached. **

Not configured to listen on any interfaces!
---
root@whirlpool:~/debs# cat /var/run/eucalyptus/net/euca-dhcp.conf
# automatically generated config file for DHCP server
default-lease-time 1200;
max-lease-time 1200;
ddns-update-style none;

shared-network euca {
subnet 10.32.8.0 netmask 255.255.255.0 {
  option subnet-mask 255.255.255.0;
  option broadcast-address 10.32.8.255;
  option domain-name-servers 127.0.0.1, 10.32.8.1;
  option routers 10.32.8.1;
  option interface-mtu 1496;
}

host node-10.32.8.2 {
  hardware ethernet D0:0D:4E:02:07:90;
  fixed-address 10.32.8.2;
}
}
---
I have tried using isc-dhcp version 4.1.1-P1-15ubuntu9 (latest) as well as specifically 15ubuntu6 where the patch was first applied. I also tried building it myself to make extra sure that the patch is applied...I still get the same problem.

---
root@whirlpool:~/src/getifaddrs# dpkg -l 'isc-dhcp*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Description
+++-========================-========================-================================================================
rc isc-dhcp-client 4.1.1-P1-15ubuntu5 ISC DHCP client
ii isc-dhcp-common 4.1.1-P1-15ubuntu9 common files used by all the isc-dhcp* packages
ii isc-dhcp-server 4.1.1-P1-15ubuntu9 ...


Will Daniels (wdaniels) wrote :

Additional info:

--
root@whirlpool:~# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:25:90:13:44:a4 brd ff:ff:ff:ff:ff:ff
    inet a.b.c.211/24 brd 188.165.231.255 scope global eth0
    inet x.y.z.220/32 scope global eth0
    inet x.y.z.221/32 scope global eth0
    inet x.y.z.223/32 scope global eth0
    inet instance.public.ip.address/32 scope global eth0:pub
    inet6 fe80::225:90ff:fe13:44a4/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:25:90:13:44:a5 brd ff:ff:ff:ff:ff:ff
4: eth0.2187@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:25:90:13:44:a4 brd ff:ff:ff:ff:ff:ff
    inet 169.254.169.254/32 scope link eth0.2187
    inet 10.0.0.1/8 scope global eth0.2187
    inet6 fe80::225:90ff:fe13:44a4/64 scope link
       valid_lft forever preferred_lft forever
7: eucabr10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 00:25:90:13:44:a4 brd ff:ff:ff:ff:ff:ff
    inet 10.32.8.1/24 brd 10.32.8.255 scope global eucabr10:priv
8: <email address hidden>: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:25:90:13:44:a4 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::225:90ff:fe13:44a4/64 scope link
       valid_lft forever preferred_lft forever
---
root@whirlpool:~# cat /etc/eucalyptus/eucalyptus.conf
# /etc/eucalyptus/eucalyptus.conf
#
# These are the Ubuntu Enterprise Cloud's default Eucalyptus parameters.

# Affects: All
# See: **NOTE** below
EUCALYPTUS="/"
EUCA_USER="eucalyptus"

# Affects: CLC, Walrus, SC
DISABLE_DNS="N"
CLOUD_OPTS="-Xmx512m"

# Affects: SC
DISABLE_EBS="N"
DISABLE_ISCSI="N"

# Affects: CC, NC
# See: **NOTE** below
ENABLE_WS_SECURITY="Y"
LOGLEVEL="DEBUG"
VNET_PUBINTERFACE="eth0"
VNET_CLOUDIP="x.y.z.220"
VNET_PRIVINTERFACE="eth0.2187"
VNET_MODE="MANAGED"
VNET_DHCPOPTS="option interface-mtu 1496;"

# Affects: CC
# See: **NOTE** below
CC_PORT="8774"
SCHEDPOLICY="ROUNDROBIN"
POWER_IDLETHRESH="300"
POWER_WAKETHRESH="300"
NC_SERVICE="axis2/services/EucalyptusNC"
VNET_DHCPDAEMON="/usr/sbin/dhcpd"
VNET_DHCPUSER="dhcpd"
DISABLE_TUNNELLING="N"
NODES=""
VNET_ADDRSPERNET="256"
VNET_SUBNET="10.32.0.0"
VNET_NETMASK="255.240.0.0"
#VNET_DNS=""
VNET_PUBLICIPS="instance.public.ips.208-instance.public.ips.223"
--

NB: You may notice that VNET_PRIVINTERFACE already is a VLAN, but this seems to work OK so long as I reduce the MTU on instances by 4 bytes for the VLAN tag. And it still works fine with dhcpd3.

Perhaps dhcpd goes through a different code path for discovery on VLAN interfaces?

I'm just too tired to work any more on this today...I'll debug everything properly some time over the next week, but this system is far from "clean" and probably not entirely typical, so I wouldn't want to assert that things are still not fi...


C de-Avillez (hggdh2) wrote :

Well, one point at a time...

First off: my failure to run the instances was due to having installed an i386 server ISO, and trying to run 64-bit instances -- at least so far, I have not re-installed with an AMD64 ISO to verify it runs kosher on AMD64 images. But it does look like this one we can chalk off as a short-circuit between the chair and the keyboard.

When I originally tested the DHCPD change -- note that 4.1.1-P1-15ubuntu9 is indeed the correct one, it will *not* work with any previous versions -- I had a similar error: DHCPD still complained as if it had not been upgraded. I no longer see this error, and I could not find what caused it. So... I am wondering if something gets left in on upgrade. Just because of that, I think Robbie was correct when he reset the bug to 'in-progress'. Something is missing.

For the error " dhcpd.c(473): trace_begin: //var/run/eucalyptus/net/euca-dhcp.trace: Permission denied": Eucalyptus starts the DHCPD like this:

[Mon Apr 25 13:55:30 2011][004142][EUCAINFO ] vnetKickDHCP(): executing: ///usr/lib/eucalyptus/euca_rootwrap /usr/sbin/dhcpd -cf //var/run/eucalyptus/net/euca-dhcp.conf -lf //var/run/eucalyptus/net/euca-dhcp.leases -pf //var/run/eucalyptus/net/euca-dhcp.pid -tf /var/run/eucalyptus/net/euca-dhcp.trace eth0

The '-tf //var/run/eucalyptus/net/euca-dhcp.trace' creates a trace file that can be played back later, if DHCPD core-dumps. It is not critical to the cloud operation. Why this would have needed dac_override I do not know.

I will reinstall now with the AMD64 ISO; I will also try an upgrade from a previous server version. So:

QUESTION: from which Ubuntu version did you upgrade?

Will Daniels (wdaniels) wrote :

@Carlos

The upgrade was from maverick (10.10), but this is a server that I have been using for development and has been on natty devel branch since early days in the cycle. I have had to do a bit of jiggery-pokery with eucalyptus and other packages at points, as tends to be the way during development (instance IFACE problem in the upstart scripts IIRC).

"Why this would have needed dac_override I do not know" - me neither, and that seemed to be a temporary problem. I cannot make that happen now...I'm happy to forget that one ;)

Currently, the only problem I have is that dhcpd won't identify and listen on eucabr10. After that, everything works (or at least as well as it did on maverick...I have the issue with 32-bit images and also that restarting eucalyptus seems to lose the public IP, but that was the case before anyway).

Not sure what other info I can provide, other than what I expect to see instead; switching back to dhcp3, things immediately work as expected:

==> /var/log/eucalyptus/httpd-cc_error_log <==

No subnet declaration for eth0.2187 (169.254.169.254).
** Ignoring requests on eth0.2187. If this is not what
   you want, please write a subnet declaration
   in your dhcpd.conf file for the network segment
   to which interface eth0.2187 is attached. **

Listening on LPF/eucabr10/00:25:90:13:44:a4/euca
Sending on LPF/eucabr10/00:25:90:13:44:a4/euca
Sending on Socket/fallback/fallback-net

...I then have both public and private connectivity to the started instance!

A slight aside: I don't know why Eucalyptus passes the VNET_PRIVINTERFACE as the last parameter in its command line to dhcpd[3], as this causes (as it should, since there is no subnet in the config for that interface) the notice seen above in httpd-cc_error_log. The notice is not unexpected, so perhaps cluttering the "error" log could be avoided here?

I would feel better about this bug if somebody has tested it successfully using VLANs in MANAGED mode? There must be something that happens differently because I'm as certain as I can be that Dan's patch is compiled into the dhcpd bin that is still failing.

C de-Avillez (hggdh2) wrote :

I have just run tests on the 11.04 RC (image 20110425.3), in both MANAGED-NOVLAN and MANAGED. Both runs succeeded.

Robbie Williamson (robbiew) wrote :

Setting this back to "Fix Released".

@Will, can you please open a new bug for your specific issue and paste the new bug # into this one? We can then work on your issue there.

Changed in ubuntu-release-notes:
status: New → Invalid
Changed in isc-dhcp (Ubuntu Natty):
status: In Progress → Fix Released