intermittent fab setup_storage abort with disk crc error

Bug #1582025 reported by wenqing liang
This bug affects 1 person
Affects            Status     Importance  Assigned to         Milestone
Juniper Openstack  Won't Fix  Medium      Jeya ganesh babu J
R2.20              Won't Fix  Medium      Jeya ganesh babu J
R2.22.x            Won't Fix  Medium      Jeya ganesh babu J
R3.0               Won't Fix  Medium      Jeya ganesh babu J
Trunk              Won't Fix  Medium      Jeya ganesh babu J

Bug Description

fab setup_storage intermittently aborts with a disk CRC error reported from one or more cluster nodes.

Seen on R2.22.x-14 (Juno):

2016-05-15 11:40:43:613362: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;37mINFO^[[0m ] Running command: sgdisk --zap-all --clear --mbrtogpt -- /dev/sdc
2016-05-15 11:40:43:613457: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;33mWARNIN^[[0m] ^GCaution: invalid main GPT header, but valid backup; regenerating main header
2016-05-15 11:40:43:714140: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;33mWARNIN^[[0m] from backup!
2016-05-15 11:40:43:714284: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;33mWARNIN^[[0m]
2016-05-15 11:40:43:714373: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;33mWARNIN^[[0m] ^GWarning! Main partition table CRC mismatch! Loaded backup partition table
2016-05-15 11:40:43:714456: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;33mWARNIN^[[0m] instead of main partition table!
2016-05-15 11:40:43:714540: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;33mWARNIN^[[0m]
2016-05-15 11:40:43:714622: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;33mWARNIN^[[0m] Warning! One or more CRCs don't match. You should repair the disk!
2016-05-15 11:40:43:714703: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;33mWARNIN^[[0m]
2016-05-15 11:40:43:714786: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;33mWARNIN^[[0m] Invalid partition data!
2016-05-15 11:40:43:714866: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;34mDEBUG^[[0m ] Caution! After loading partitions, the CRC doesn't check out!
2016-05-15 11:40:45:017284: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;34mDEBUG^[[0m ] GPT data structures destroyed! You may now partition the disk using fdisk or
2016-05-15 11:40:45:017553: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;34mDEBUG^[[0m ] other utilities.
2016-05-15 11:40:45:017703: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;34mDEBUG^[[0m ] Information: Creating fresh partition table; will override earlier problems!
2016-05-15 11:40:45:017846: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;34mDEBUG^[[0m ] Non-GPT disk; not saving changes. Use -g to override.
2016-05-15 11:40:45:018006: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mcmbu-ceph-perf3^[[0m][^[[1;31mERROR^[[0m ] RuntimeError: command returned non-zero exit status: 3
2016-05-15 11:40:45:018151: [root@10.87.140.197] out: [5.0.0.1] out: [^[[1mceph_deploy^[[0m][^[[1;31mERROR^[[0m ] RuntimeError: Failed to execute command: sgdisk --zap-all --clear --mbrtogpt -- /dev/sdc
2016-05-15 11:40:45:018291: [root@10.87.140.197] out: [5.0.0.1] out:
2016-05-15 11:40:45:018451: [root@10.87.140.197] out: [5.0.0.1] out:
2016-05-15 11:40:45:026961: [root@10.87.140.197] out: [5.0.0.1] out: Fatal error: local() encountered an error (return code 1) while executing 'sudo ceph-deploy disk zap cmbu-ceph-perf3:/dev/sdc:/dev/sdb'
2016-05-15 11:40:45:027107: [root@10.87.140.197] out: [5.0.0.1] out:
2016-05-15 11:40:45:027197: [root@10.87.140.197] out: [5.0.0.1] out: Aborting.
2016-05-15 11:40:45:027280: [root@10.87.140.197] out: [5.0.0.1] out:
2016-05-15 11:40:45:191442: [root@10.87.140.197] out:
2016-05-15 11:40:45:223084: [root@10.87.140.197] out:
2016-05-15 11:40:45:223221: [root@10.87.140.197] out: Fatal error: run() received nonzero return code 1 while executing!
2016-05-15 11:40:45:223306: [root@10.87.140.197] out:
2016-05-15 11:40:45:223389: [root@10.87.140.197] out: Requested: sudo storage-fs-setup --storage-master 5.0.0.1 --storage-setup-mode setup --storage-hostnames cmbu-ceph-perf1 cmbu-ceph-perf2 cmbu-ceph-perf3 cmbu-ceph-perf4 --storage-compute-hostnames cmbu-ceph-perf2 cmbu-ceph-perf3 cmbu-ceph-perf4 --storage-hosts 5.0.0.1 5.0.0.2 5.0.0.3 5.0.0.4 --storage-host-tokens n1keenA n1keenA n1keenA n1keenA --storage-disk-config cmbu-ceph-perf3:/dev/sdc cmbu-ceph-perf3:/dev/sdd cmbu-ceph-perf2:/dev/sdc cmbu-ceph-perf2:/dev/sdd cmbu-ceph-perf4:/dev/sdc cmbu-ceph-perf4:/dev/sdd --storage-ssd-disk-config cmbu-ceph-perf3:/dev/sde cmbu-ceph-perf3:/dev/sdf cmbu-ceph-perf2:/dev/sde cmbu-ceph-perf2:/dev/sdf cmbu-ceph-perf4:/dev/sde cmbu-ceph-perf4:/dev/sdf --storage-journal-config cmbu-ceph-perf3:/dev/sdb cmbu-ceph-perf2:/dev/sdb cmbu-ceph-perf4:/dev/sdb --storage-local-disk-config none --storage-local-ssd-disk-config none --storage-nfs-disk-config none --storage-directory-config none --storage-chassis-config cmbu-ceph-perf3:0 cmbu-ceph-perf2:0 cmbu-ceph-perf4:1 --storage-mon-hosts none --collector-hosts 5.0.0.1 --collector-host-tokens n1keenA --cfg-host 5.0.0.1 --cinder-vip none --config-hosts none --storage-os-hosts none --storage-os-host-tokens none --cfg-vip none --storage-replica-size None
2016-05-15 11:40:45:223483: [root@10.87.140.197] out: Executed: /bin/bash -l -c "sudo storage-fs-setup --storage-master 5.0.0.1 --storage-setup-mode setup --storage-hostnames cmbu-ceph-perf1 cmbu-ceph-perf2 cmbu-ceph-perf3 cmbu-ceph-perf4 --storage-compute-hostnames cmbu-ceph-perf2 cmbu-ceph-perf3 cmbu-ceph-perf4 --storage-hosts 5.0.0.1 5.0.0.2 5.0.0.3 5.0.0.4 --storage-host-tokens n1keenA n1keenA n1keenA n1keenA --storage-disk-config cmbu-ceph-perf3:/dev/sdc cmbu-ceph-perf3:/dev/sdd cmbu-ceph-perf2:/dev/sdc cmbu-ceph-perf2:/dev/sdd cmbu-ceph-perf4:/dev/sdc cmbu-ceph-perf4:/dev/sdd --storage-ssd-disk-config cmbu-ceph-perf3:/dev/sde cmbu-ceph-perf3:/dev/sdf cmbu-ceph-perf2:/dev/sde cmbu-ceph-perf2:/dev/sdf cmbu-ceph-perf4:/dev/sde cmbu-ceph-perf4:/dev/sdf --storage-journal-config cmbu-ceph-perf3:/dev/sdb cmbu-ceph-perf2:/dev/sdb cmbu-ceph-perf4:/dev/sdb --storage-local-disk-config none --storage-local-ssd-disk-config none --storage-nfs-disk-config none --storage-directory-config none --storage-chassis-config cmbu-ceph-perf3:0 cmbu-ceph-perf2:0 cmbu-ceph-perf4:1 --storage-mon-hosts none --collector-hosts 5.0.0.1 --collector-host-tokens n1keenA --cfg-host 5.0.0.1 --cinder-vip none --config-hosts none --storage-os-hosts none --storage-os-host-tokens none --cfg-vip none --storage-replica-size None"
2016-05-15 11:40:45:223610: [root@10.87.140.197] out:
2016-05-15 11:40:45:223610: [root@10.87.140.197] out:
2016-05-15 11:40:45:223722: [root@10.87.140.197] out: Aborting.
2016-05-15 11:40:45:223800: [root@10.87.140.197] out:
2016-05-15 11:40:45:356162:
2016-05-15 11:40:45:357777: Disconnecting from 10.87.140.197... done.
2016-05-15 11:40:45:472127:
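
When triaging output like the excerpt above, it can help to strip the colour codes and pull out just the commands that ceph-deploy reports as failed. The following is a minimal Python sketch for that, not part of the fab or ceph-deploy tooling. The names strip_ansi and failed_commands are hypothetical, and the patterns assume the caret-notation escapes (^[[...m and ^G) and the "Failed to execute command:" wording seen in this paste.

```python
import re

# Matches ANSI colour sequences, written either as a real ESC byte or in the
# caret notation ("^[") used in the pasted log, plus the bell ("^G" or 0x07).
ANSI_RE = re.compile(r"(?:\x1b|\^\[)\[[0-9;]*m|\^G|\x07")

# Assumption: the failing command is named on a line of this shape,
# as in the ceph_deploy ERROR line of the excerpt.
FAILED_RE = re.compile(r"Failed to execute command: (.+)$")


def strip_ansi(line):
    """Remove colour codes and bell characters from one log line."""
    return ANSI_RE.sub("", line)


def failed_commands(lines):
    """Return the commands named in 'Failed to execute command' lines."""
    found = []
    for raw in lines:
        match = FAILED_RE.search(strip_ansi(raw).rstrip())
        if match:
            found.append(match.group(1))
    return found
```

Passing the lines of a saved log file to failed_commands would list each failed command once per occurrence, which makes it easier to see which disks were affected across runs.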

Tags: storage
Changed in juniperopenstack:
status: New → Won't Fix