With all those changes, which I'll provide either by suggesting upstream patches and backporting them to Debian Unstable -> Ubuntu Devel, or by carrying them as regular Debian package patches, I was able to make CTDB NFS HA work flawlessly:
(k)inaddy@ctdbclient01:~$ sudo mount -t nfs -o vers=3 ctdbserver01.public:/mnt/glusterfs/data /mnt/ctdbserver01
(k)inaddy@ctdbclient01:~$ sudo mount -t nfs -o vers=3 ctdbserver02.public:/mnt/glusterfs/data /mnt/ctdbserver02
(k)inaddy@ctdbclient01:~$ sudo mount -t nfs -o vers=3 ctdbserver03.public:/mnt/glusterfs/data /mnt/ctdbserver03
(k)inaddy@ctdbclient01:~$ while true; do sleep 2; dd if=/dev/random of=/mnt/ctdbserver01/file bs=1k count=2; dd if=/dev/random of=/mnt/ctdbserver02/file bs=1k count=2; dd if=/dev/random of=/mnt/ctdbserver03/file bs=1k count=2; done
0+2 records in
0+2 records out
160 bytes copied, 0.0127338 s, 12.6 kB/s
0+2 records in
0+2 records out
158 bytes copied, 0.00846823 s, 18.7 kB/s
0+2 records in
0+2 records out
158 bytes copied, 0.0096586 s, 16.4 kB/s
0+2 records in
0+2 records out
158 bytes copied, 0.01485 s, 10.6 kB/s
0+2 records in
0+2 records out
158 bytes copied, 0.0134006 s, 11.8 kB/s
0+2 records in
0+2 records out
158 bytes copied, 0.00700728 s, 22.5 kB/s
0+2 records in
0+2 records out
158 bytes copied, 0.0109971 s, 14.4 kB/s
0+2 records in
0+2 records out
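To make the three client mounts survive a reboot, the same options could go into /etc/fstab. A sketch only, reusing the export paths and the vers=3 option from the mount commands above:

```
# Hedged sketch of /etc/fstab entries matching the manual mounts above.
ctdbserver01.public:/mnt/glusterfs/data  /mnt/ctdbserver01  nfs  vers=3  0  0
ctdbserver02.public:/mnt/glusterfs/data  /mnt/ctdbserver02  nfs  vers=3  0  0
ctdbserver03.public:/mnt/glusterfs/data  /mnt/ctdbserver03  nfs  vers=3  0  0
```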
During a failure of one of the CTDB servers (ctdbserver03):
(k)inaddy@ctdbserver02:~$ ctdb status
Number of nodes:3
pnn:0 172.16.9.1 OK
pnn:1 172.16.9.2 OK (THIS NODE)
pnn:2 172.16.9.3 DISCONNECTED|UNHEALTHY|STOPPED|INACTIVE
Generation:92168728
Size:2
hash:0 lmaster:0
hash:1 lmaster:1
Recovery mode:NORMAL (0)
Recovery master:0
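The per-node state lines above lend themselves to a quick scripted health check. A minimal sketch, assuming the "pnn:" line format shown; it runs against a pasted sample of the output so it works without ctdb installed — in a real check you would capture `ctdb status` instead:

```shell
# Sample of the `ctdb status` output above; replace with: status=$(ctdb status)
status='pnn:0 172.16.9.1 OK
pnn:1 172.16.9.2 OK (THIS NODE)
pnn:2 172.16.9.3 DISCONNECTED|UNHEALTHY|STOPPED|INACTIVE'

# Count pnn lines whose third field is anything other than OK.
bad=$(printf '%s\n' "$status" | awk '/^pnn:/ && $3 != "OK" {n++} END {print n+0}')
echo "unhealthy nodes: $bad"
```

With the failed ctdbserver03 in the sample, this reports one unhealthy node.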
And the public addresses were correctly set on interface "eth1", as they were supposed to be:
(k)inaddy@ctdbserver01:~$ ip addr show eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:1c:31:c3 brd ff:ff:ff:ff:ff:ff
inet 172.16.17.1/24 brd 172.16.17.255 scope global eth1
valid_lft forever preferred_lft forever
inet 172.16.17.3/24 brd 172.16.17.255 scope global secondary eth1
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe1c:31c3/64 scope link
valid_lft forever preferred_lft forever
(k)inaddy@ctdbserver02:~$ ip addr show eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:af:f8:53 brd ff:ff:ff:ff:ff:ff
inet 172.16.17.2/24 brd 172.16.17.255 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:feaf:f853/64 scope link
valid_lft forever preferred_lft forever
according to the NFS_HOSTNAME variable. Note that ctdbserver01 holds its own public IP address plus the IP address of ctdbserver03, the node I failed on purpose. The NFS client kept its access throughout, just as it should.
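The floating addresses seen in the `ip addr` output come from CTDB's public addresses file on each node. A hedged sketch of what /etc/ctdb/public_addresses would look like for this setup, using the 172.16.17.0/24 addresses and the eth1 interface shown above:

```
# Hedged sketch: one line per floating IP, format "<address/mask> <interface>".
# CTDB moves these between healthy nodes, producing the takeover shown above.
172.16.17.1/24 eth1
172.16.17.2/24 eth1
172.16.17.3/24 eth1
```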