Hello,
I cannot share an sos report because it contains sensitive information. The symptoms are as follows. Updating pacemaker on Ubuntu 16 to pacemaker 1.9 causes the VMs to crash. In addition, the pacemaker error log and syslog are flooded with thousands of messages per second, so the disk fills up quite quickly.
Running reboot -f helps (the VMs also boot afterwards). An ordinary reboot (without -f) hangs for about 30 minutes, then the machine boots and the problem is gone.
Snippet from syslog:
194551 Jan 8 09:14:31 te-primary crmd[3948]: notice: Transition aborted by lrm_rsc_op.te-res_last_failure_0: Event failed (cib=0.0.0, source=match_graph_event:381, path=/create_request_adv/crm_xml/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='rte_re s']/lrm_rsc_op[@id='te-res_last_failure_0'], 0)
194552 Jan 8 09:14:31 te-primary crmd[3948]: warning: FSA: Input I_FAIL from get_lrm_resource() received in state S_TRANSITION_ENGINE
194553 Jan 8 09:14:31 te-primary crmd[3948]: notice: Transition aborted: Peer Cancelled (source=do_te_invoke:161, 0)
194554 Jan 8 09:14:31 te-primary crmd[3948]: notice: Transition 6451494 (Complete=3, Pending=0, Fired=0, Skipped=4, Incomplete=28, Source=/var/lib/pacemaker/pengine/pe-input-19.bz2): Stopped
194555 Jan 8 09:14:31 te-primary pengine[3947]: notice: On loss of DCM Quorum: Ignore
194556 Jan 8 09:14:31 te-primary pengine[3947]: notice: Start te-res#011(te-primary)
194557 Jan 8 09:14:31 te-primary pengine[3947]: notice: Start fs_res#011(te-primary)
194558 Jan 8 09:14:31 te-primary pengine[3947]: notice: Start VM_DCM_res#011(te-primary)
194559 Jan 8 09:14:31 te-primary pengine[3947]: notice: Start VM_DAC_1_res#011(te-primary)
194560 Jan 8 09:14:31 te-primary pengine[3947]: notice: Start drbd_res:0#011(te-primary)
194561 Jan 8 09:14:31 te-primary pengine[3947]: notice: Promote drbd_res:0#011(Stopped -> Master te-primary)
194562 Jan 8 09:14:31 te-primary pengine[3947]: notice: Calculated Transition 6451495: /var/lib/pacemaker/pengine/pe-input-19.bz2
194563 Jan 8 09:14:31 te-primary crmd[3948]: warning: bad input <create_request_adv origin="te_rsc_command" t="crmd" version="3.0.10" subt="request" reference="lrm_invoke-tengine-1610093671-19354500" crm_task="lrm_invoke" crm_sys_to="lrmd" crm_sys_from="tengine" crm_host_to="te-primary" acl_target="hacluster" crm_user="hacluster">
194564 Jan 8 09:14:31 te-primary crmd[3948]: warning: bad input <crm_xml>
194565 Jan 8 09:14:31 te-primary crmd[3948]: warning: bad input <rsc_op id="2" operation="monitor" operation_key="te-res_monitor_0" on_node="te-primary" on_node_uuid="1" transition-key="2:6451495:7:ea06a2cc-9174-42a2-93ee-121424209102">
194566 Jan 8 09:14:31 te-primary crmd[3948]: warning: bad input <primitive id="te-res" class="ocf" provider="copmany_name" type="hdsrte_ra"/>
194567 Jan 8 09:14:31 te-primary crmd[3948]: warning: bad input <attributes CRM_meta_op_target_rc="7" CRM_meta_timeout="20000" crm_feature_set="3.0.10"/>
194568 Jan 8 09:14:31 te-primary crmd[3948]: warning: bad input </rsc_op>
194569 Jan 8 09:14:31 te-primary crmd[3948]: warning: bad input </crm_xml>
194570 Jan 8 09:14:31 te-primary crmd[3948]: warning: bad input </create_request_adv>
194571 Jan 8 09:14:31 te-primary crmd[3948]: notice: Transition aborted by lrm_rsc_op.te-res_last_failure_0: Event failed (cib=0.0.0, source=match_graph_event:381, path=/create_request_adv/crm_xml/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='rte_re s']/lrm_rsc_op[@id='te-res_last_failure_0'], 0)
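To put a number on the flood rate without sharing the full logs, something like the following minimal sketch could be used. It assumes the syslog layout shown in the snippet above (an optional leading line number, then month, day, time, host and daemon[pid]); the function name count_per_second is only illustrative and not part of pacemaker.

```python
import re
from collections import Counter

# Assumed line layout, taken from the snippet above:
#   [line number] Mon D HH:MM:SS host daemon[pid]: message
LINE_RE = re.compile(
    r"^(?:\d+\s+)?"                                        # optional line number
    r"(?P<ts>[A-Z][a-z]{2}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"  # e.g. "Jan 8 09:14:31"
    r"\S+\s+"                                              # host name
    r"(?P<daemon>[\w-]+)\[\d+\]:"                          # e.g. "crmd[3948]:"
)


def count_per_second(lines):
    """Count log messages per (timestamp, daemon) pair.

    Lines that do not match the assumed layout are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            # Collapse padding such as "Jan  8" to a single space.
            timestamp = " ".join(match.group("ts").split())
            counts[(timestamp, match.group("daemon"))] += 1
    return counts
```

Passing an open file object for the syslog file to count_per_second and then calling most_common(10) on the result would list the busiest seconds per daemon.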
As this seems to be a different error, I will open a new bug report.
Thanks a lot.
Sincerely,
Martin