installer hostname change causes sqstart to fail

Bug #1450223 reported by Chris Tjepkema
Affects     Status   Importance   Assigned to    Milestone
Trafodion   New      Medium       Amanda Moran

Bug Description

Test: use the hostname command to change a node's hostname from its short name to an FQDN.
This causes sqstart to fail after install.

The hostname was changed from
n003
to the FQDN with
hostname n003.cm.cluster

[root@n003 installer]# hostname
n003.cm.cluster
-
NOTE
       Note that hostname doesn't change anything permanently. After reboot original
       names from /etc/hosts are used again.
--
The installer detects no error until sqstart fails, at which point sqmon.log reads:
[SHELL] %startup
[SHELL] Cannot start monitor from node 'n003.cm.cluster' since it is not member of the cluster configuration or 'hostname' string does not match configuration string.
[SHELL] Configuration node names:
[SHELL] 'n003'
[SHELL] 'n004'
[SHELL] Failed to start environment!
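
The sqmon.log message suggests the monitor compares the `hostname` string against the configured node names as exact strings. A pre-flight check along the following lines could flag the mismatch before sqstart runs. This is only a sketch under that assumption: `short_name` and `check_hostname` are hypothetical helpers, not part of the installer, and the monitor's real comparison may differ.

```python
import socket


def short_name(host):
    """Return the part of a hostname before the first dot."""
    return host.split(".", 1)[0]


def check_hostname(configured, current=None):
    """Classify how `current` relates to the configured node names.

    Returns "match" if the exact string is configured, "fqdn-mismatch" if
    only the short names agree, and "unknown" otherwise.
    (Hypothetical helper; assumes an exact string comparison in the monitor.)
    """
    if current is None:
        # Same value the `hostname` command prints on this node.
        current = socket.gethostname()
    if current in configured:
        return "match"
    if short_name(current) in {short_name(n) for n in configured}:
        return "fqdn-mismatch"
    return "unknown"


# Values taken from the sqmon.log excerpt above.
print(check_hostname(["n003", "n004"], "n003.cm.cluster"))  # fqdn-mismatch
```

A result of "fqdn-mismatch" would let the installer stop early with a message that names both the current hostname and the configured short name.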

--
Output from trafodion_install:
Enter list of nodes (blank separated), default '': n003.cm.cluster n004.cm.cluster
***INFO: Testing sudo access on node n003.cm.cluster
Warning: Permanently added 'n004.cm.cluster' (RSA) to the list of known hosts.
***INFO: Testing sudo access on node n004.cm.cluster
Specify location of Java 1.7.0_65 or higher (JDK), default is /usr/java/jdk1.7.0_67-cloudera:
Enter full path (including .tar or .tar.gz) of trafodion tar file (): ../daily.tar.gz

:
:
******* SUMMARY *******
Completed Execution on 2 nodes: n003.cm.cluster, n004
Results:
  n003.cm.cluster, n004 - HyperthreadingEnabled check FAILED [warning]

Additional details in log file: /root/Downloads/amanda/installer/2015-04-29-18-51-41.log

***INFO: Trafodion scanner ran without error. Install will continue...
:
:
TRAFODION START
******************************

/usr/lib/trafodion/installer/..
***INFO: Log file location /var/log/trafodion/trafodion_install_2015-04-29-18-59-31.log
***INFO: Starting Trafodion installer (2015-04-29-18-59-31)
/opt/trafodion/daily
***INFO: untarring build file /usr/lib/trafodion/daily/trafodion_server-1.2.0.tgz to /opt/trafodion/daily
***INFO: modifying .bashrc to set Trafodion environment variables
***INFO: copying .bashrc file to all nodes
***INFO: copying sqconfig file (/opt/trafodion/sqconfig) to /opt/trafodion/daily/sql/scripts/sqconfig
***INFO: Creating /opt/trafodion/daily directory on all nodes
***INFO: starting sqgen
n003,n004

Creating directories on cluster nodes
/usr/bin/pdsh -w n003,n004 -x n003.cm.cluster mkdir -p /opt/trafodion/daily/etc
/usr/bin/pdsh -w n003,n004 -x n003.cm.cluster mkdir -p /opt/trafodion/daily/logs
/usr/bin/pdsh -w n003,n004 -x n003.cm.cluster mkdir -p /opt/trafodion/daily/tmp
/usr/bin/pdsh -w n003,n004 -x n003.cm.cluster mkdir -p /opt/trafodion/daily/sql/scripts

The SQ environment variable file /opt/trafodion/daily/etc/ms.env exists.
The file will not be re-generated.

Note: Using cluster.conf format type 2.

The SeaMonster environment variable file /opt/trafodion/daily/etc/seamonster.env exists.
The file will not be re-generated.

Copying the generated files to all the nodes in the cluster

Copying /opt/trafodion/daily/tmp/cluster.conf to /opt/trafodion/daily/tmp of all the nodes
/usr/bin/pdcp -p -w n003,n004 -x n003.cm.cluster /opt/trafodion/daily/tmp/cluster.conf /opt/trafodion/daily/tmp

Copying /opt/trafodion/daily/etc/ms.env to /opt/trafodion/daily/etc of all the nodes
/usr/bin/pdcp -p -w n003,n004 -x n003.cm.cluster /opt/trafodion/daily/etc/ms.env /opt/trafodion/daily/etc

Copying /opt/trafodion/daily/etc/seamonster.env to /opt/trafodion/daily/etc of all the nodes
/usr/bin/pdcp -p -w n003,n004 -x n003.cm.cluster /opt/trafodion/daily/etc/seamonster.env /opt/trafodion/daily/etc

Copying rest of the generated files to /opt/trafodion/daily/sql/scripts
/usr/bin/pdcp -p -w n003,n004 -x n003.cm.cluster sqconfig sqshell gomon.cold gomon.warm shell.env mon.env /opt/trafodion/daily/sql/scripts
pdcp@n003: can't stat shell.env
pdcp@n003: can't stat mon.env
/usr/bin/pdcp -p -w n003,n004 -x n003.cm.cluster sqconfig sqconfig.db /opt/trafodion/daily/sql/scripts

******* Generate public/private certificates *******

Cluster Name : ATCSQ02
Generating Self Signed Certificate....
***********************************************************
Certificate file :server.crt
Private key file :server.key
Certificate/Private key created in directory :/opt/trafodion/sqcert
***********************************************************

***********************************************************
Updating Authentication Configuration
***********************************************************
Creating folders for storing certificates

***INFO: copying /opt/trafodion/sqcert directory to all nodes
***INFO: Start of DCS install
***INFO: untarring build file /usr/lib/trafodion/daily/dcs-1.2.0.tgz
***INFO: modifying /opt/trafodion/daily/dcs-1.2.0/conf/dcs-env.sh
***INFO: modifying /opt/trafodion/daily/dcs-1.2.0/conf/dcs-site.xml
***INFO: creating /opt/trafodion/daily/dcs-1.2.0/conf/servers file
***INFO: End of DCS install.
***INFO: Start of REST Server install
***INFO: untarring build file /usr/lib/trafodion/daily/rest-1.2.0.tgz
***INFO: modifying /opt/trafodion/daily/rest-1.2.0/conf/rest-site.xml
***INFO: End of REST Server install.
***INFO: copying install to all nodes
***INFO: starting Trafodion instance
Checking orphan processes.
Removing old mpijob* files from /opt/trafodion/daily/tmp

Removing old monitor.port* files from /opt/trafodion/daily/tmp

Executing sqipcrm (output to sqipcrm.out)
Starting the SQ Environment (Executing /opt/trafodion/daily/sql/scripts/gomon.cold)
Background SQ Startup job (pid: 20898)

# of SQ processes: 5 ....
Error while executing the startup script!!!
Checking if processes are up.
# of SQ processes: 5 user specified max: 1. Execution time in seconds: 4.

The SQ environment is not up all, or partially up and not operational. Check the logs.

Process Configured Actual Down
------- ---------- ------ ----
DTM 2 0 \$TM0 \$TM1
RMS 4 0 \$ZSC000 \$ZSC001 \$ZSM000 \$ZSM001
MXOSRVR 2 0 2

The SQ environment is down.]

Please check the SQ shell log file : /opt/trafodion/daily/logs/sqmon.log

SQ Startup (from /opt/trafodion/daily/sql/scripts) Failed

/opt/trafodion/daily/dcs-1.2.0/bin/start-dcs.sh found.
Starting the DCS environment now
master 32068. Stop it first.
n003.cm.cluster: server 32213. Stop it first.
n004: server 6816. Stop it first.
Checking if processes are up.
Checking attempt: 1; user specified max: 2. Execution time in seconds: 4.

The SQ environment is not up all, or partially up and not operational. Check the logs.

Process Configured Actual Down
------- ---------- ------ ----
DTM 2 0 \$TM0 \$TM1
RMS 4 0 \$ZSC000 \$ZSC001 \$ZSM000 \$ZSM001
MXOSRVR 2 0 2

The SQ environment is down.]
Starting lob server processes
/opt/trafodion/daily/rest-1.2.0/bin/start-rest.sh found.
Starting the REST environment now
rest 32689. Stop it first.

You can monitor the SQ shell log file : /opt/trafodion/daily/logs/sqmon.log

Startup time 0 hour(s) 2 minute(s) 30 second(s)
***ERROR: sqstart failed with RC=1. Check /opt/trafodion/daily/sqmon.log file for details.
***ERROR: Consider running trafodion_scanner, to assist in debugging.
***ERROR: Error while running traf_start.
***ERROR: Setup not complete, review logs.
***ERROR: Exiting....
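
The installer output above only reports RC=1 and points at sqmon.log. As a debugging aid, the rejected node name and the configured node names could be pulled out of that log with something like the sketch below. It assumes the message format shown in the sqmon.log excerpt in this report; `parse_sqmon` is a hypothetical helper and other log layouts are not handled.

```python
import re

# Patterns based on the sqmon.log excerpt in this report (assumed format).
REJECTED_RE = re.compile(r"Cannot start monitor from node '([^']+)'")
CONFIGURED_RE = re.compile(r"\[SHELL\] '([^']+)'")


def parse_sqmon(text):
    """Return (rejected_node, configured_names) found in sqmon.log text."""
    rejected = None
    configured = []
    for raw in text.splitlines():
        line = raw.strip()
        found = REJECTED_RE.search(line)
        if found:
            rejected = found.group(1)
            continue
        # Configured names appear alone on a line: [SHELL] 'n003'
        found = CONFIGURED_RE.fullmatch(line)
        if found:
            configured.append(found.group(1))
    return rejected, configured


SAMPLE = """\
[SHELL] %startup
[SHELL] Cannot start monitor from node 'n003.cm.cluster' since it is not member of the cluster configuration or 'hostname' string does not match configuration string.
[SHELL] Configuration node names:
[SHELL] 'n003'
[SHELL] 'n004'
[SHELL] Failed to start environment!
"""

rejected, configured = parse_sqmon(SAMPLE)
print(rejected, configured)
```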

Tags: installer