pcs cluster auth does not generate "tokens" file

Bug #1640923 reported by Jonathan Meisel
This bug affects 2 people
Affects        Status         Importance   Assigned to
pcs (Debian)   Fix Released   Unknown
pcs (Ubuntu)   Opinion        Medium       Rafael David Tinoco
Nominated for Xenial by Rafael David Tinoco
Nominated for Yakkety by Rafael David Tinoco
Nominated for Zesty by Rafael David Tinoco

Bug Description

"pcs cluster auth" does not generate a tokens file when /etc/corosync/corosync.conf is present, but still reports that authorization completed successfully.

----------------->%-----------------
lsb_release -rd
 Description: Ubuntu 16.04.1 LTS
 Release: 16.04

-----------------%<-----------------

apt-cache policy pcs
 pcs:
   Installed: 0.9.149-1
   Candidate: 0.9.149-1
   Version table:
  *** 0.9.149-1 500
 ----------------->%-----------------

sudo pcs cluster auth uby2 uby3 -u hacluster
Password:
uby2: Authorized
uby3: Authorized

ls -l /var/lib/pcsd/tokens
ls: cannot access '/var/lib/pcsd/tokens': No such file or directory

sudo pcs cluster setup uby2 uby3 --name jmclus2
Error: uby2: error checking node availability: Unable to authenticate to uby2 - (HTTP error: 401), try running 'pcs cluster auth'
Error: uby3: error checking node availability: Unable to authenticate to uby3 - (HTTP error: 401), try running 'pcs cluster auth'
Error: nodes availability check failed, use --force to override. WARNING: This will destroy existing cluster on the nodes.

 -------------------------
If corosync and pacemaker are stopped and /etc/corosync/corosync.conf is removed, then /var/lib/pcsd/tokens is created successfully.

systemctl stop pacemaker
systemctl stop corosync
rm /etc/corosync/corosync.conf

sudo pcs cluster auth uby2 uby3 -u hacluster
Password:
uby2: Authorized
uby3: Authorized

ls -l /var/lib/pcsd/tokens
-rw------- 1 root root 168 Nov 10 11:54 /var/lib/pcsd/tokens

 ----------------->%-----------------

It seems like "pcs cluster auth" should exit with an error message if it can't generate a tokens file. Furthermore, should corosync and pacemaker be started by default when installed as dependencies of "pcs?" pcs cluster start will start these services. Right now the only way to set up a cluster after running "apt-get install pcs" is to manually stop corosync and pacemaker, delete corosync.conf, run "pcs cluster auth", "pcs cluster setup", and then "pcs cluster start."
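Until pcs reports this itself, a small check after "pcs cluster auth" can catch the silent failure. This is only a sketch: tokens_file_ok is a hypothetical helper name, and the default path is the one shown in this report. It has to run as root, since the tokens file is only readable by root.

```shell
# Hypothetical helper: succeed only if the tokens file exists and is non-empty.
# The default path is the one from this report; pass another path as $1 if needed.
tokens_file_ok() {
    tokens_file="${1:-/var/lib/pcsd/tokens}"
    if [ -s "$tokens_file" ]; then
        return 0
    fi
    echo "Error: $tokens_file is missing or empty; 'pcs cluster auth' stored no tokens" >&2
    return 1
}

# Example usage (as root), stopping before "pcs cluster setup" if auth left no tokens:
#   pcs cluster auth uby2 uby3 -u hacluster && tokens_file_ok || exit 1
```

The check uses "-s" rather than "-e" so that an empty tokens file is also treated as a failure.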

I uploaded the following sosreports:
sosreport-J.Meisel.1640923-20161115104001.tar.xz -- this is from before installing pcs/pacemaker/corosync

sosreport-J.Meisel.1640923-20161115105603.tar.xz -- this is after "pcs cluster auth" did not generate a tokens file

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in pcs (Ubuntu):
status: New → Confirmed
Changed in pcs (Ubuntu):
assignee: nobody → Rafael David Tinoco (inaddy)
importance: Undecided → Medium
description: updated
Rafael David Tinoco (rafaeldtinoco) wrote :

Worse than that, if you don't have corosync installed:

http://pastebin.ubuntu.com/23548460/

So corosync should probably be made a dependency of pcs.

Rafael David Tinoco (rafaeldtinoco) wrote :

At least the errors are being put into /var/log/pcsd:

~$ sudo pcs cluster auth cluster01 cluster02
Username: hacluster
Password:
I, [2016-11-28T11:52:35.006533 #7817] INFO -- : Running: /usr/sbin/corosync-cmapctl totem.cluster_name
I, [2016-11-28T11:52:35.006602 #7817] INFO -- : CIB USER: hacluster, groups:
I, [2016-11-28T11:52:35.022368 #7817] INFO -- : Return Value: 0
I, [2016-11-28T11:52:35.022619 #7817] INFO -- : Attempting login by 'hacluster'
I, [2016-11-28T11:52:35.034284 #7817] INFO -- : Running: id -Gn hacluster
I, [2016-11-28T11:52:35.034367 #7817] INFO -- : CIB USER: hacluster, groups:
I, [2016-11-28T11:52:35.040102 #7817] INFO -- : Return Value: 0
I, [2016-11-28T11:52:35.040184 #7817] INFO -- : Successful login by 'hacluster'
192.168.65.53 - - [28/Nov/2016:11:52:35 -0200] "POST /remote/auth HTTP/1.1" 200 36 0.0344
192.168.65.53 - - [28/Nov/2016:11:52:35 -0200] "POST /remote/auth HTTP/1.1" 200 36 0.0344
192.168.65.53 - - [28/Nov/2016:11:52:35 BRST] "POST /remote/auth HTTP/1.1" 200 36
- -> /remote/auth
I, [2016-11-28T11:52:35.173157 #7817] INFO -- : Running: /usr/sbin/corosync-cmapctl totem.cluster_name
I, [2016-11-28T11:52:35.173221 #7817] INFO -- : CIB USER: hacluster, groups:
I, [2016-11-28T11:52:35.196961 #7817] INFO -- : Return Value: 0
I, [2016-11-28T11:52:35.197219 #7817] INFO -- : Attempting login by 'hacluster'
I, [2016-11-28T11:52:35.207612 #7817] INFO -- : Running: id -Gn hacluster
I, [2016-11-28T11:52:35.207690 #7817] INFO -- : CIB USER: hacluster, groups:
I, [2016-11-28T11:52:35.213160 #7817] INFO -- : Return Value: 0
I, [2016-11-28T11:52:35.213236 #7817] INFO -- : Successful login by 'hacluster'
W, [2016-11-28T11:52:35.213704 #7817] WARN -- : Cannot read config 'tokens' from '/var/lib/pcsd/tokens': No such file or directory @ rb_sysopen - /var/lib/pcsd/tokens
E, [2016-11-28T11:52:35.213786 #7817] ERROR -- : Unable to parse tokens file: A JSON text must at least contain two octets!
I, [2016-11-28T11:52:35.213820 #7817] INFO -- : SRWT Node: cluster01 Request: check_auth
E, [2016-11-28T11:52:35.213841 #7817] ERROR -- : Unable to connect to node cluster01, no token available
W, [2016-11-28T11:52:35.219341 #7817] WARN -- : Cannot read config 'tokens' from '/var/lib/pcsd/tokens': No such file or directory @ rb_sysopen - /var/lib/pcsd/tokens
E, [2016-11-28T11:52:35.219381 #7817] ERROR -- : Unable to parse tokens file: A JSON text must at least contain two octets!
I, [2016-11-28T11:52:35.221016 #7817] INFO -- : SRWT Node: cluster02 Request: check_auth
E, [2016-11-28T11:52:35.223004 #7817] ERROR -- : Unable to connect to node cluster02, no token available
I, [2016-11-28T11:52:35.225237 #7817] INFO -- : No response from: cluster01 request: /auth, exception: undefined method `[]' for nil:NilClass
I, [2016-11-28T11:52:35.227582 #7817] INFO -- : No response from: cluster02 request: /auth, exception: undefined method `[]' for nil:NilClass
192.168.65.53 - - [28/Nov/2016:11:52:35 -0200] "POST /remote/auth HTTP/1.1" 200 36 0.0549
192.168.65.53 - - [28/Nov/2016:11:52:35 -0200] "POST /remote/auth HTTP/1.1" 200 36 0.0550
192.168.65.53 - - [28/Nov/2016:11:52:35 BRST] "POST /remote/auth HTTP/1.1" 200 36
- -> /rem...
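Since the client only prints "Authorized", the pcsd log is the place to look for the real error. The following is a rough sketch for pulling out the relevant lines. token_errors is a hypothetical helper name, the patterns are copied from the log excerpt above, and the default log file path is an assumption (pass your own path as the first argument).

```shell
# Hypothetical helper: print pcsd log lines that point at a missing or
# unreadable tokens file. Patterns are taken from the log excerpt in this report.
# The default log path is an assumption; pass the real one as $1.
token_errors() {
    log_file="${1:-/var/log/pcsd/pcsd.log}"
    grep -E "Cannot read config 'tokens'|Unable to parse tokens file|no token available" "$log_file"
}

# Example usage (as root):
#   token_errors /path/to/pcsd.log
```

Like grep itself, the function returns non-zero when no matching line is found, so it can be used directly in an "if".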


Rafael David Tinoco (rafaeldtinoco) wrote :

Regarding this statement:

> It seems like "pcs cluster auth" should exit with an error message if it can't generate a tokens file.

I agree with this statement. I tested the latest upstream version and it doesn't generate an error message either. We can open a bug upstream and monitor it. This is probably due to the client <-> server nature of pcs <-> pcsd communication (the logs are generated by pcsd itself). I'm not sure there is a mechanism to pass specific error messages back to the client.

> Furthermore, should corosync and pacemaker be started by default when installed as dependencies of "pcs?" pcs cluster start will start these services. Right now the only way to set up a cluster after running "apt-get install pcs" is to manually stop corosync and pacemaker, delete corosync.conf, run "pcs cluster auth", "pcs cluster setup", and then "pcs cluster start."

I'm changing the pcs package itself (the debian/ structure).

~$ sudo dpkg -i ./pcs_0.9.153-2_all.deb
Selecting previously unselected package pcs.
(Reading database ... 152254 files and directories currently installed.)
Preparing to unpack ./pcs_0.9.153-2_all.deb ...
Unpacking pcs (0.9.153-2) ...
Setting up pcs (0.9.153-2) ...
Processing triggers for systemd (229-4ubuntu13) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for man-db (2.7.5-1) ...
inaddy@(cluster01):~$ ps -ef | grep coros
root 3247 1 0 17:20 ? 00:00:00 /usr/sbin/corosync -f
inaddy 3701 1097 0 17:22 ttyS0 00:00:00 grep --color=auto coros

PCS doesn't even work without the corosync binary. I'm changing corosync from Recommends to Depends.

I have also added debconf questions allowing the package to move the corosync.conf file AND to stop the corosync daemon:

inaddy@(cluster01):~$ sudo dpkg-reconfigure --priority=medium pcs

http://pastebin.ubuntu.com/23549910/

Answering yes:

inaddy@(cluster01):~$ ps -ef | grep corosync

inaddy@(cluster01):~$ ls /etc/corosync/corosync.conf*
/etc/corosync/corosync.conf.pcs

The only problem is that this SRU will take a while, since I have to propose this fix to Debian first. I'm going to propose several fixes (for the pcs bugs) at once.

Rafael David Tinoco (rafaeldtinoco) wrote :

Showing fixes for:

https://bugs.launchpad.net/ubuntu/+source/pcs/+bug/1580035
https://bugs.launchpad.net/ubuntu/+source/pcs/+bug/1580045
https://bugs.launchpad.net/ubuntu/+source/pcs/+bug/1640923

Okay, working from the upstream version (since I'll propose major changes to the Debian package itself), this is how it is going to work:

----

inaddy@(cluster01):~$ ps -ef | grep [c]orosync
root 15614 1 0 22:07 ? 00:00:00 /usr/sbin/corosync -f

inaddy@(cluster01):~$ sudo dpkg -i ./*.deb
Selecting previously unselected package pcs.
(Reading database ... 152286 files and directories currently installed.)
Preparing to unpack ./pcs_0.9.153-2_all.deb ...
Unpacking pcs (0.9.153-2) ...
Setting up pcs (0.9.153-2) ...
Processing triggers for systemd (229-4ubuntu13) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for man-db (2.7.5-1) ...

and you are prompted with:

http://pastebin.ubuntu.com/23585840/

inaddy@(cluster01):~$ sudo dpkg-reconfigure pcs

and you are prompted with:

http://pastebin.ubuntu.com/23585841/

----

After installation:

inaddy@(cluster01):~$ ps -ef | grep [c]orosync

inaddy@(cluster01):~$ ps -ef | grep [p]csd
root 16148 1 0 22:09 ? 00:00:00 /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &

inaddy@(cluster01):~$ netstat -ltp
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 *:2224 *:* LISTEN -
tcp6 0 0 [::]:2224 [::]:* LISTEN -

----

Configuration:

inaddy@(cluster01):~$ sudo pcs cluster auth cluster01 cluster02 -u hacluster
Password:
cluster02: Authorized
cluster01: Authorized

inaddy@(cluster01):~$ sudo ls -l /var/lib/pcsd/tokens
-rw------- 1 root root 178 Dec 5 22:14 /var/lib/pcsd/tokens

inaddy@(cluster01):~$ sudo pcs cluster setup --force cluster01 cluster02 --name cluster
Destroying cluster on nodes: cluster01, cluster02...
cluster01: Stopping Cluster (pacemaker)...
cluster02: Stopping Cluster (pacemaker)...
cluster02: Successfully destroyed cluster
cluster01: Successfully destroyed cluster

Sending cluster config files to the nodes...
cluster01: Succeeded
cluster02: Succeeded

Synchronizing pcsd certificates on nodes cluster01, cluster02...
cluster02: Success
cluster01: Success

Restarting pcsd on the nodes in order to reload the certificates...
cluster02: Success
cluster01: Success

----

Running:

inaddy@(cluster01):~$ sudo pcs cluster start --all
cluster02: Starting Cluster...
cluster01: Starting Cluster...

inaddy@(cluster01):~$ ps -ef | egrep -E "[c]orosync|[p]acemaker"
root 16961 1 1 22:15 ? 00:00:00 /usr/sbin/corosync -f
root 16965 1 0 22:15 ? 00:00:00 /usr/sbin/pacemakerd -f
haclust+ 16967 16965 0 22:15 ? 00:00:00 /usr/lib/pacemaker/cib
root 16968 16965 0 22:15 ? 00:00:00 /usr/lib/pacemaker/stonithd
root 16969 16965 0 22:15 ? 00:00:00 /usr/lib/pacemaker/lrmd
haclust+ 16970 16965 0 22:15 ? 00:00:00 /usr/lib/pacemaker/attrd
haclust+ 16971 16965 0 22:15 ? 00:00:00 /usr/lib/pacemaker/peng...


Rafael David Tinoco (rafaeldtinoco) wrote :
Changed in pcs (Debian):
status: Unknown → New
Rafael David Tinoco (rafaeldtinoco) wrote :

Upstream discussion followed here:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=847294
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=847295

And the last statement is this (from the maintainer):

> On Thu, Dec 08, 2016 at 06:27:48PM -0200, Rafael David Tinoco wrote:
> Definitely. Since PCS is a package to *configure* corosync. I do agree
> with your statement (above and below).

True, but there is another use-case where you install it on an existing
cluster to get a web interface. In this case you don't want it to touch
existing setup by default.

> Just trying to make PCS functional "by default" since now it will be
> used as the clustering configuration tool for MSSQL Linux HA and they
> need it configured by default (or capable of).

I would argue that pcs is functional as all three daemons are running:

 * corosync
 * pacemaker
 * pcsd

What is not functional is the cluster as a whole, but this will always
require some manual configuration as it spans more than one host and
we cannot handle that in packaging.

In your case you would need to run something like:

 pcs cluster destroy
 pcs cluster auth node1 node2
 pcs cluster setup --start --name cluster node1 node2
 pcs resource create ...
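For anyone who wants to script the maintainer's sequence, here is a minimal sketch. run and setup_cluster are hypothetical names, and the DRY_RUN switch is an addition of this sketch. It defaults to a dry run that only prints the commands, because "pcs cluster destroy" removes the existing cluster configuration on the node. Resource creation is left out since the arguments are elided above.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1 (the default),
# execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: the maintainer's sequence for a cluster name and node list.
# Stops at the first failing step. WARNING: with DRY_RUN=0 the first step
# destroys the existing cluster configuration on this node.
setup_cluster() {
    cluster_name="$1"
    shift
    run pcs cluster destroy &&
    run pcs cluster auth "$@" &&
    run pcs cluster setup --start --name "$cluster_name" "$@"
}

# Example usage:
#   setup_cluster cluster node1 node2              # prints the three commands
#   DRY_RUN=0 setup_cluster cluster node1 node2    # runs them (as root)
```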

Rafael David Tinoco (rafaeldtinoco) wrote :

I agree that, since PCS can manage an already installed cluster (something I hadn't considered in my previous posts/comments/patches), the PCS installation should NOT try to take care of corosync/pacemaker files.

Whoever is dealing with PCS has to handle this properly AFTER installation. The maintainer's argument for that is this:

"this will always require some manual configuration"

And this is also true for corosync/pacemaker.

I'm abandoning the changes for this specific bug and marking it as Opinion. I will continue fixing the other bugs in LP: #1640919.

Changed in pcs (Ubuntu):
status: Confirmed → Opinion
Changed in pcs (Debian):
status: New → Fix Released
Oleksii (oleksii99) wrote :

Just want to say "Thank you!" to the OP, Jonathan Meisel.
This is still not fixed in the latest Debian release. I was banging my head against a wall until I found this thread.
The initial bug report still holds on my Debian system:
root@ivr02:~# lsb_release -rd
Description: Debian GNU/Linux 9.1 (stretch)
Release: 9.1
root@ivr02:~# apt-cache policy pcs
pcs:
  Installed: 0.9.155+dfsg-2
  Candidate: 0.9.155+dfsg-2
  Version table:
 *** 0.9.155+dfsg-2 500
        500 http://ftp.us.debian.org/debian stretch/main i386 Packages
        100 /var/lib/dpkg/status
At least some pcs output like "cannot generate tokens file, b/c of existing Corosync config" would be really helpful.
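In the meantime, a pre-flight check before running "pcs cluster auth" can give that kind of hint. This is only a sketch based on the behaviour described in this report: existing_corosync_conf is a hypothetical helper name, and the default path is the one from the original description.

```shell
# Hypothetical helper: return 0 and print a warning if a corosync config exists.
# Per this report, "pcs cluster auth" may then report success without
# writing /var/lib/pcsd/tokens.
existing_corosync_conf() {
    conf="${1:-/etc/corosync/corosync.conf}"
    if [ -e "$conf" ]; then
        echo "Warning: $conf exists; 'pcs cluster auth' may not write the tokens file" >&2
        return 0
    fi
    return 1
}

# Example usage:
#   if existing_corosync_conf; then echo "stop the cluster and move the config aside first"; fi
```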
