Comment 11 for bug 722201

Rafael David Tinoco (rafaeldtinoco) wrote:

I have reviewed this bug and read all the CTDB documentation, and my initial thoughts for a proper CTDB Ubuntu enablement are:

1)

(not addressed here: should we have a default package dependency here? There is none nowadays.)

There is a mandatory dependency on a clustered file-system, since all file atomicity is guaranteed through the filesystem layer, and not by CTDB. Existing options are:

a) GFS2 - Ubuntu has gfs2-utils and it is part of the Debian HA imports. It depends on having CLVM (clustered LVM2) running, with the LVM2 locking type changed to 3 (see the sketch after this list). clvm in turn depends on having the distributed lock manager (dlm) running, and we had issues with that in the past (because of the Red Hat -> Debian packaging conversion). Check: https://bugs.launchpad.net/ubuntu/+source/dlm/+bug/1248054 . TODO: make sure clvmd + dlm + LVM2 locking + gfs2 are good for Eoan (initially), Disco, Cosmic and Bionic (LTS) at least. There is no specific change needed in the SMB or NFS config files.

b) Gluster - It is a straightforward installation/configuration with already existing packages. It supports other interconnects (like InfiniBand). There is no specific change needed in the SMB or NFS config files.

c) GPFS - It is proprietary (IBM) and could/should be enabled by IBM (possibly together with us, if ever intended).

d) OCFS2 - An open-source and mature project, supported by the Ubuntu kernel and the ocfs2-tools package. The following Samba config file changes are needed:

vfs objects = fileid
fileid:algorithm = fsid

NFS does not appear to need any config file change.

- It is better if the clustered filesystem provides uniform device and inode numbering across the nodes! (https://wiki.samba.org/index.php/Setting_up_a_cluster_filesystem#Checking_uniformity_of_device_and_inode_numbering)

e) LustreFS - Open source, but it is almost entirely dependent on the Mellanox OFED packages (for InfiniBand support) and/or CentOS DKMS packages (usually built by DDN). I know DDN/Whamcloud is working on upstreaming a kernel tree supporting LustreFS, which could eventually lead to LustreFS support in Debian, but that is not the case here.
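
For item (a), a minimal sketch of the LVM2/GFS2 side, assuming dlm and clvm are already installed and the cluster is formed; the device, VG and cluster names below are placeholders:

# /etc/lvm/lvm.conf (excerpt) - switch LVM2 to clustered locking via clvmd
global {
    locking_type = 3    # 3 = clustered locking through clvmd + dlm
}

# Placeholder commands, run once dlm and clvmd are up:
# pvcreate /dev/sdb
# vgcreate --clustered y vg_cluster /dev/sdb
# lvcreate -L 100G -n lv_ctdb vg_cluster
# mkfs.gfs2 -p lock_dlm -t mycluster:ctdbfs -j 3 /dev/vg_cluster/lv_ctdb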

2)

(partially addressed by comment #10)

The NFS RPC services need to be bound statically to the same ports on all nodes (which is not the default). There is no proper decision yet on which ports those should be (reusing some old service ports, like smalltalk?). Would this be changed during package installation? That might cause problems on already existing NFS servers, for example.
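
Just as an illustration of what the static binding could look like on Debian/Ubuntu (the port numbers are placeholders, and this assumes /etc/default is still honored by whatever starts the daemons):

# /etc/default/nfs-common
STATDOPTS="--port 595 --outgoing-port 596"

# /etc/default/nfs-kernel-server
RPCMOUNTDOPTS="--manage-gids --port 597"

# /etc/modprobe.d/lockd.conf (lockd is a kernel module, so module options are used)
options lockd nlm_tcpport=599 nlm_udpport=599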

- NFSv4 is not recommended and should be disabled whenever CTDB is enabled (would the install scripts recommend or enforce this?). The official documentation says:

"Unfortunately, RPCNFSDOPTS isn't used by Debian Sys-V init, so there is no way to disable NFSv4 via the configuration file"

We would have to re-verify the nfs-kernel-server systemd dependencies and (possibly?) restrict it to NFSv3 only (something like the work done in https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/1590799). CTDB + NFS also depends on CTDB-specific environment variables (like the NFS hostname).
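
As an illustration only (not a decided approach), NFSv4 could be switched off with a systemd drop-in, assuming nfs-server.service is the unit actually starting rpc.nfsd on these releases:

# /etc/systemd/system/nfs-server.service.d/ctdb-nfsv3.conf (hypothetical drop-in)
[Service]
# clear the packaged ExecStart and start nfsd with NFSv4 disabled (8 threads)
ExecStart=
ExecStart=/usr/sbin/rpc.nfsd --no-nfs-version 4 8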

3)

https://ctdb.samba.org/manpages/ctdb.7.html:

PRIVATE and PUBLIC addresses + LVS + NATGW + POLICY ROUTING

CTDB configures network interfaces automatically, based on decisions made by its internal recovery algorithm (something similar to a master election during recoveries, taking online nodes' votes into consideration).
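
For reference, the private/public split itself is driven by two plain-text files that must be consistent across the cluster (addresses and interface names below are placeholders):

# /etc/ctdb/nodes - one private (internal) address per node, identical on all nodes
10.0.0.1
10.0.0.2
10.0.0.3

# /etc/ctdb/public_addresses - addresses CTDB floats between healthy nodes
192.168.100.10/24 eth1
192.168.100.11/24 eth1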

LVS:

- Client sends request packet to LVSMASTER.
- LVSMASTER passes the request on to one node across the internal network.
- Selected node processes the request.
- Node responds back to client.

and NATGW:

When the NATGW functionality is used, one of the nodes is selected to act as a NAT gateway for all the other nodes in the group when they need to communicate with the external services. The NATGW master is selected to be a node that is most likely to have usable networks.
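
Both features are configured through CTDB variables; a sketch in the pre-4.9 /etc/default/ctdb style (values are placeholders, and newer CTDB reads these from /etc/ctdb/script.options):

# LVS: the LVS group members plus the single public IP they answer on
CTDB_LVS_NODES=/etc/ctdb/lvs_nodes
CTDB_LVS_PUBLIC_IP=192.168.100.50
CTDB_LVS_PUBLIC_IFACE=eth1

# NATGW: private network that is NATed out through the elected NATGW master
CTDB_NATGW_NODES=/etc/ctdb/natgw_nodes
CTDB_NATGW_PRIVATE_NETWORK=10.0.0.0/24
CTDB_NATGW_PUBLIC_IP=192.168.100.60/24
CTDB_NATGW_PUBLIC_IFACE=eth1
CTDB_NATGW_DEFAULT_GATEWAY=192.168.100.1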

All those features would have to be checked for full enablement and compatibility with systemd-networkd / ifupdown / netplan / initialization scripts. They might not need anything, but testing public and private interfaces configured in different ways would be a plus here.
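
On the interface side, a minimal netplan sketch (names and addresses are placeholders): only the private interconnect gets a static address, while the public interface is left unaddressed because CTDB adds and removes the public IPs itself:

# /etc/netplan/50-ctdb.yaml (hypothetical example)
network:
  version: 2
  ethernets:
    eth0:                 # private CTDB interconnect
      addresses: [10.0.0.1/24]
    eth1:                 # public interface, addresses managed by CTDB
      dhcp4: false
      dhcp6: false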

4) Debian-specific patches to be carried over on upstream merges:

From patch ctdb_ubuntu_nfs.patch:

+nfs_service="nfs-kernel-server"
+nfslock_service=""
+nfs_config="/etc/default/nfs-kernel-server"

If we are dealing with native systemd units (for starting/stopping services), we would have to make sure the environment files (from /etc/default) are being read. If we are dealing with systemd plus the SysV generator, compatibility would be higher and integration easier (the existing scripts already handle service start/stop).
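
Something like the drop-in below would be one way to do that, assuming nfs-server.service is the unit the CTDB event scripts end up managing; note that loading the file is not enough by itself, the ExecStart lines still have to reference the variables (e.g. $RPCNFSDOPTS):

# /etc/systemd/system/nfs-server.service.d/local-env.conf (hypothetical drop-in)
[Service]
# the '-' makes the file optional, so the unit still starts if it is absent
EnvironmentFile=-/etc/default/nfs-kernel-server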

During package installation, would the NFS and Samba services be disabled and stopped? This is a requirement, since CTDB controls the start/stop of those services as if they were cluster resources (the same decision taken for the Debian HA packages might be needed here, as for corosync + pacemaker resources).
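
A sketch of what the maintainer scripts (or the admin) might end up doing, assuming the CTDB 4.9+ event script layout; the exact commands are illustrative, not a decided packaging behaviour:

# stop and disable the services CTDB will manage as cluster resources
systemctl disable --now smbd nmbd winbind nfs-server

# let CTDB start/stop/monitor them through its legacy event scripts
# (needs ctdbd running; symlinks under /etc/ctdb/events/legacy/ are the offline equivalent)
ctdb event script enable legacy 50.samba
ctdb event script enable legacy 49.winbind
ctdb event script enable legacy 60.nfs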

5) Tests that could be integrated at the end:

https://wiki.samba.org/index.php/Setting_up_CTDB_for_Clustered_NFS (# File Handle Consistency)

https://wiki.samba.org/index.php/Setting_up_a_cluster_filesystem (# Checking uniformity of device and inode numbering) - quick checks are sketched after these links

https://wiki.samba.org/index.php/Configuring_clustered_Samba (# Using Samba4 smbtorture)
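
For the cluster filesystem checks, two quick examples, assuming /clusterfs is the shared mount point (file names are placeholders): the stat output must be identical on every node, and the ping_pong tool that comes with CTDB is run concurrently on all nodes with "number of nodes + 1" as its argument to exercise fcntl lock coherence:

# uniform device/inode numbering - run on each node and compare the output
touch /clusterfs/uniformity-test
stat --format "dev=%d ino=%i" /clusterfs/uniformity-test

# lock coherence - on a 3-node cluster, run this on all nodes at the same time
ping_pong /clusterfs/ping_pong.dat 4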

OBS: Samba ≤ 4.8 needs different configuration options (but I'm assuming this would be an Ubuntu Dev Enablement for future releases).

----

IMO

if we don't address at least those concerns, accepting those patches would not make much of a difference. We would have to guarantee that all the items described above are addressed in Ubuntu. For example, accepting a patch that works when the nfs-kernel-server systemd units call the legacy SysV scripts, but does not work with systemd-only units, would be no good, right? That is just one example I can think of.

TL;DR

I vote for enabling this together with the "Ubuntu HA" effort (covering clustered file-systems, distributed lock managers and clustering software), guaranteeing that we are properly enabling/supporting all HA software in Ubuntu (possibly documenting the different ways of providing HA for daemons, using one stack or another, with examples that we will have to use to validate this).