[SRU] eucalyptus-cloud doesn't reply to requests (eucalyptus doesn't work after reboot or services restart issues due to upstart networking behavior)
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
eucalyptus (Ubuntu) |
Fix Released
|
High
|
Dustin Kirkland | ||
Karmic |
Fix Released
|
High
|
Dustin Kirkland | ||
Lucid |
Fix Released
|
High
|
Dustin Kirkland |
Bug Description
Using 1.6.2~bzr1120-
ubuntu@uec-cc:~$ sudo euca_conf --get-credentials mycreds.zip
ERROR: you need to be on the CLC host and the CLC needs to be running.
A wget on the register url also times out:
ubuntu@uec-cc:~$ wget -T 10 -t 1 -O - --no-check-
--2010-01-04 20:39:34-- https:/
Connecting to 127.0.0.1:8443... connected.
WARNING: cannot verify 127.0.0.1's certificate, issued by `/C=US/
Self-signed certificate encountered.
WARNING: certificate common name `localhost' doesn't match requested host name `127.0.0.1'.
HTTP request sent, awaiting response... Read error (Connection timed out) in headers.
Giving up.
The eucalyptus-cloud process is running a listening on port 8443.
I can see the following errors in /var/log/
0:36:55 [log:653891498@
java.lang.
at com.eucalyptus.
at com.eucalyptus.
at com.eucalyptus.
at edu.ucsb.
at edu.ucsb.
at edu.ucsb.
at javax.servlet.
at javax.servlet.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
at org.mortbay.
Caused by: javax.persisten
at org.hibernate.
at org.hibernate.
at com.eucalyptus.
... 24 more
Caused by: org.hibernate.
at org.hibernate.
at org.hibernate.
at org.hibernate.
at org.hibernate.
at org.hibernate.
at org.hibernate.
at org.hibernate.
at org.hibernate.
at org.hibernate.
... 25 more
Caused by: java.sql.
at org.hsqldb.
at org.hsqldb.
at sun.reflect.
at sun.reflect.
at java.lang.
at org.logicalcobw
at $Proxy27.
at org.hibernate.
at org.hibernate.
... 30 more
== SRU ==
IMPACT: Users of 9.10 UEC will often experience long, non-deterministic delays (10-20 minutes in many cases) when restarting eucalyptus services (which includes reboots, package upgrades, service restarts). This is highly inconvenient, yielding UEC unusable until database network connections reset.
HOW FIXED: The fix consists of two trivial iptables calls being added to the eucalyptus upstart script. Upstream had these calls in there init scripts, but were inadvertently dropped when porting Eucalyptus to upstart. These iptables commands will ensure that the iptables kernel module (and most importantly, the ip connection tracker) is loaded and active before Eucalyptus comes up. WIthout said ip connection tracker, Eucalyptus will often establish a connection to the database, then iptables is loaded and connections are mangled, breaking the connection to the database. The user will see the problem in any one of a number of disguising ways (front end not working, api tools not responding, etc). All of these problems are due to an inaccessible database. After a while (10-20 minutes), Eucalyptus will reset the database connection. With this fix, the above problems should never happen. Eucalyptus should be back up and running within 1-2 minutes of boot (if not immediately).
MINIMAL PATCH:
diff -u eucalyptus-
--- eucalyptus-
+++ eucalyptus-
@@ -11,6 +11,10 @@
# Check if installed
[ -f /usr/sbin/euca_conf ] || { stop; exit 0; }
+ # Ensure that the iptables module gets loaded here
+ iptables -t nat -L -n >/dev/null
+ iptables -L -n > /dev/null
+
mkdir -p /var/run/
chown eucalyptus:
REPRODUCING THE BUG:
Reboot your UEC (or sudo restart eucalyptus). If restarting eucalyptus takes a *long* time, you are experiencing one symptom of this bug. Once upstart thinks that eucalyptus is up, try: $(sudo wget --no-check-
REGRESSION POTENTIAL:
I cannot see any possible regression potential. The iptables modules will be loaded eventually. This patch just ensures that they get loaded before Eucalyptus tries to start services.
Changed in eucalyptus (Ubuntu): | |
importance: | Undecided → High |
tags: | added: iso-testing |
Changed in eucalyptus (Ubuntu): | |
assignee: | nobody → Dustin Kirkland (kirkland) |
milestone: | none → lucid-alpha-3 |
summary: |
- eucalyptus-cloud doesn't reply to requests (due to upstart networking - issue) + eucalyptus-cloud doesn't reply to requests (eucalyptus doesn't work + after reboot or services restart issues due to upstart networking + behavior) |
Changed in eucalyptus (Ubuntu Karmic): | |
status: | New → Triaged |
importance: | Undecided → High |
assignee: | nobody → Dustin Kirkland (kirkland) |
milestone: | none → karmic-updates |
tags: |
added: verification-done removed: verification-needed |
I ran into this one as well. I had to stop eucalyptus / start eucalyptus to get a running system.