OCFS2 is not ready on boot

Bug #481795 reported by cray23kl on 2009-11-13
This bug affects 1 person
Affects: ocfs2-tools (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

Binary package hint: ocfs2-tools

Hello,

# lsb_release -rd
Description: Ubuntu 9.10
Release: 9.10

# apt-cache policy ocfs2-tools
ocfs2-tools:
  Installed: 1.4.2-1
  Candidate: 1.4.2-1
  Version table:
 *** 1.4.2-1 0
        500 http://de.archive.ubuntu.com karmic/main Packages
        100 /var/lib/dpkg/status

After a reboot, the OCFS2 cluster is not ready. dmesg says:

[ 12.969167] OCFS2 Node Manager 1.5.0
[ 13.021414] OCFS2 DLM 1.5.0
[ 13.035066] ocfs2: Registered cluster interface o2cb
[ 13.046026] OCFS2 DLMFS 1.5.0
[ 13.046435] OCFS2 User DLM kernel interface loaded
[ 13.195795] (1996,2):o2net_open_listening_sock:1924 ERROR: unable to bind socket at 141.52.167.65:7777, ret=-99
[ 14.703635] device-mapper: table: 252:1: multipath: error getting device
[ 14.703717] device-mapper: ioctl: error adding target to table
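
A note on the o2net line above: on Linux, errno 99 is EADDRNOTAVAIL, so ret=-99 means the address 141.52.167.65 was not assigned to any interface at the moment o2net tried to bind to it. The following is a minimal sketch for checking whether an address is configured, assuming the iproute2 `ip` tool is installed; the helper name `has_addr` is made up for illustration and is not part of ocfs2-tools.

```shell
#!/bin/sh
# has_addr ADDR: succeed if ADDR appears as an IPv4 address in
# `ip -o -4 addr show` style text read from stdin.
# (Hypothetical helper, for illustration only.)
has_addr() {
    # In `ip -o` output the address field looks like "inet 141.52.167.65/23".
    # -F matches the string literally, so the dots are not regex wildcards.
    grep -qF "inet $1/"
}

# Usage on a live system (requires iproute2):
#   ip -o -4 addr show | has_addr 141.52.167.65 && echo "address is configured"
```

Running the check early in the boot sequence, before the o2cb init script, would show whether the address is already present at that point.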

After a reboot I have to run the following manually:

# /etc/init.d/o2cb load
# /etc/init.d/o2cb online ocfs2
Setting cluster stack "o2cb": OK
Starting O2CB cluster ocfs2: OK
# mount /dev/mapper/3600601608f0b1600128e232c60cfde11 /mnt/ocfs2test/

When I do so, I see in dmesg:

[ 1515.544963] o2net: accepted connection from node scchpblade02b (num 3) at 141.52.167.68:7777
[ 1515.546658] o2net: accepted connection from node scchpblade01b (num 1) at 141.52.167.66:7777
[ 1517.284154] o2net: accepted connection from node scchpblade02a (num 2) at 141.52.167.67:7777
[ 1519.579744] OCFS2 1.5.0
[ 1519.585590] ocfs2_dlm: Nodes in domain ("373BFB8718FB4F20B6497063C7BF3BBC"): 0 1 2 3

I tried the following:

# dpkg-reconfigure -f readline ocfs2-tools

I answered yes to the question about starting ocfs2 at boot time.

At the end the output is:

update-rc.d: warning: o2cb start runlevel arguments (S) do not match LSB Default-Start values (2 3 5)
update-rc.d: warning: o2cb stop runlevel arguments (0 6) do not match LSB Default-Stop values (none)
Cluster ocfs2 already online
update-rc.d: warning: ocfs2 start runlevel arguments (S) do not match LSB Default-Start values (2 3 5)
update-rc.d: warning: ocfs2 stop runlevel arguments (0 6) do not match LSB Default-Stop values (none)
Starting Oracle Cluster File System (OCFS2) OK

These warnings look like a bug to me.

That OCFS2 is not started at boot time may also be a bug.

Best Regards,
   Christian

js1 (sujiannming) wrote :

Yep, I'm having the same problem. After starting ocfs2 and o2cb, nothing seems to be listening on port 7777.

 (3156,0):o2net_open_listening_sock:1924 ERROR: unable to bind socket at 192.168.244.238:7777, ret=-99

Are you perhaps using dynamic IP addresses?

cray23kl (cray-unix-ag) wrote :

No, I don't use dynamic IP addresses.

cray23kl (cray-unix-ag) wrote :

The error still exists. I was not able to solve it. Does anybody here have an idea?

Ante Karamatić (ivoks) wrote :

I'm looking into the problem...

Ante Karamatić (ivoks) wrote :

I managed to hit this only when my IP is dynamic (assigned by a DHCP server). Could you paste your /etc/network/interfaces? Is your IP available before the ocfs2 services are started?

Changed in ocfs2-tools (Ubuntu):
status: New → Incomplete
importance: Undecided → Medium

cray23kl (cray-unix-ag) wrote :

# cat /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet manual

auto br0
iface br0 inet static
 address 141.52.167.65
 netmask 255.255.254.0
 network 141.52.166.0
 broadcast 141.52.167.255
 gateway 141.52.166.1
 # dns-* options are implemented by the resolvconf package, if installed
 dns-nameservers 141.52.3.3
 dns-search fzk.de
 bridge_ports eth0
 bridge_fd 9
 bridge_hello 2
 bridge_maxage 12
 bridge_stp off

I don't know whether the IP is available before ocfs2 is started.

Ante Karamatić (ivoks) wrote :

Ah... a bridge. It is possible that the IP isn't available yet when the ocfs2 tools start. You could add post-up commands to restart the ocfs2 init scripts, or do the same through /etc/network/if-up.d/.
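
The hook-directory route could look roughly like the sketch below. ifupdown's hook directory is /etc/network/if-up.d/, and ifupdown passes the interface name to hook scripts in $IFACE. The file name, the function name `o2cb_hook` and the DRY_RUN switch are hypothetical; br0 and the cluster name ocfs2 are taken from this report.

```shell
#!/bin/sh
# Hypothetical hook, e.g. /etc/network/if-up.d/o2cb-online (must be executable).
# ifupdown runs it after each interface comes up and sets $IFACE.

o2cb_hook() {
    # Only act for the bridge that carries the cluster address.
    [ "$1" = "br0" ] || return 0
    # With DRY_RUN set, print the command instead of running it (illustration only).
    run=${DRY_RUN:+echo}
    $run /etc/init.d/o2cb online ocfs2
}

o2cb_hook "${IFACE:-}"
```

Filtering on the interface name keeps the hook from firing for lo or eth0, which come up before the bridge has its address.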

Ante Karamatić (ivoks) on 2009-11-18
Changed in ocfs2-tools (Ubuntu):
status: Incomplete → Invalid

cray23kl (cray-unix-ag) wrote :

The bridge is created by Eucalyptus (Ubuntu packages)

cray23kl (cray-unix-ag) wrote :

I have now added this line to /etc/network/interfaces:

post-up /etc/init.d/o2cb online ocfs2

It is working now, but the solution is not elegant.

cray23kl wrote:

> It is working now, but the solution is not elegant.

Well, that's how it should be done. Once ocfs2-tools are 'upstartized',
upstart should take care of when the ocfs2 services are started.
