DCS: Tries to start even if trafodion is not running.

Bug #1442685 reported by Guy Groulx
This bug affects 1 person
Affects: Trafodion
Status: Confirmed
Importance: Medium
Assigned to: Anuradha
Milestone: r1.2

Bug Description

Ran into a situation where sqstart failed to start Trafodion.

However, it ran dcsstart anyway, and dcsstart started the DCS master and server processes and tried to start the mxosrvr processes.

It does not make sense for dcsstart to execute if trafodion is not up and running.

Matt Brown (mattbrown-2) wrote :

There are developer scenarios where we want to start DCS even though Trafodion is not running. DcsServer already checks and warns if Trafodion is not running before attempting to launch any mxosrvrs. See below for an example of a DCS start when Trafodion is stopped. It's possible that the sqstart script could be changed to check for a successful start of Trafodion and not start DCS in case of failure.

2015-04-10 15:56:46,596 INFO org.trafodion.dcs.server.ServerManager: User program enabled
2015-04-10 15:56:46,608 INFO org.trafodion.dcs.server.ServerManager: Created znode [/user/dcs/servers/running/g4t3016.houston.hp.com:1:50031:1428681406596]
2015-04-10 15:56:46,611 INFO org.trafodion.dcs.server.ServerManager: Server handler [1:1] is not running
2015-04-10 15:56:46,612 INFO org.trafodion.dcs.server.ServerManager: Server handler [1:2] is not running
2015-04-10 15:56:46,613 INFO org.trafodion.dcs.server.ServerManager: Server handler [1:3] is not running
2015-04-10 15:56:46,614 INFO org.trafodion.dcs.server.ServerManager: Server handler [1:4] is not running
2015-04-10 15:56:54,795 ERROR org.trafodion.dcs.server.ServerManager: Trafodion is not running
2015-04-10 15:56:54,795 INFO org.trafodion.dcs.util.RetryCounter: Sleeping 10000ms before retry #1...
2015-04-10 15:56:54,795 ERROR org.trafodion.dcs.server.ServerManager: Trafodion is not running
2015-04-10 15:56:54,795 INFO org.trafodion.dcs.util.RetryCounter: Sleeping 10000ms before retry #1...
2015-04-10 15:56:54,796 ERROR org.trafodion.dcs.server.ServerManager: Trafodion is not running
2015-04-10 15:56:54,796 INFO org.trafodion.dcs.util.RetryCounter: Sleeping 10000ms before retry #1...
2015-04-10 15:56:54,796 ERROR org.trafodion.dcs.server.ServerManager: Trafodion is not running
2015-04-10 15:56:54,796 INFO org.trafodion.dcs.util.RetryCounter: Sleeping 10000ms before retry #1...

Changed in trafodion:
assignee: nobody → Matt Brown (mattbrown-2)
Matt Brown (mattbrown-2) wrote :

The sqstart script should check the return code of sqcheck before starting DCS.
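
A minimal sketch of what that check might look like, not the actual sqstart code. It assumes sqcheck exits with 0 only when Trafodion is up, which this report does not confirm. The function name start_dcs_if_up and its two parameters are hypothetical and exist only so the guard can be tried with stand-in commands.

```shell
# Hypothetical guard: run the DCS start command only if the health check succeeds.
# Assumption: the check command (sqcheck by default) exits 0 when Trafodion is up.
start_dcs_if_up() {
    check_cmd=${1:-sqcheck}
    start_cmd=${2:-dcsstart}

    if "$check_cmd"; then
        # Check passed, so starting DCS makes sense.
        "$start_cmd"
    else
        # In the else branch, $? still holds the exit status of the check.
        rc=$?
        echo "Trafodion is not running (check exited with $rc); skipping DCS start" >&2
        return 1
    fi
}
```

Calling start_dcs_if_up with no arguments would use sqcheck and dcsstart. A developer who wants DCS without Trafodion could still run dcsstart directly, so the scenario described in the earlier comment stays possible.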

Changed in trafodion:
assignee: Matt Brown (mattbrown-2) → Anuradha (anuradha-hegde)
milestone: none → r1.2
status: New → Confirmed