2017-12-11 14:42:00 |
Doug Szumski |
bug |
|
|
added bug |
2017-12-11 14:42:45 |
Doug Szumski |
description |
During node cleaning, the generic hardware manager can fail in the `erase_devices_metadata` step if the GPT is invalid. Specifically, this can happen when the hardware manager calls ```sgdisk -Z /dev/somedrive``` to destroy the GPT and MBR data structures.
It isn't clear why sgdisk validates the GPT when the -Z flag (zap all) instructs it to destroy the GPT. However, retrying sgdisk -Z succeeds.
Example failure:
```
maintenance_reason | Agent returned error for clean step {u'priority': 99, u'interface': |
| | u'deploy', u'reboot_requested': False, u'abortable': True, u'step': |
| | u'erase_devices_metadata'} on node 1b973868-9734-4ecf-9700-c0730e97e031 |
| | : {u'message': u'Clean step failed: Error performing clean_step |
| | erase_devices_metadata: Error erasing block device: Failed to erase the |
| | metadata on the device(s): "/dev/nvme3n1": Unexpected error while |
| | running command. |
| | Command: sgdisk -Z /dev/nvme3n1 |
| | Exit code: 2 |
| | Stdout: |
| | u"Caution! After loading partitions, the CRC doesn\'t check out!\ |
| | GPT |
| | data structures destroyed! You may now partition the disk using fdisk |
| | or\ |
| | other utilities.\ |
| | " |
| | Stderr: u"\\x07Caution: invalid main GPT |
| | header, but valid backup; regenerating main header\ |
| | from |
| | backup!\ |
| | \ |
| | \\x07Warning! Main partition table CRC mismatch! Loaded |
| | backup partition table\ |
| | instead of main partition table!\ |
| | \ |
| | Warning! |
| | One or more CRCs don\'t match. You should repair the disk!\ |
| | \ |
| | Invalid |
| | partition data!\ |
| | "', u'code': 500, u'type': u'CleaningError', |
| | u'details': u'Error performing clean_step erase_devices_metadata: Error |
| | erasing block device: Failed to erase the metadata on the device(s): |
| | "/dev/nvme3n1": Unexpected error while running command. |
| | Command: sgdisk |
| | -Z /dev/nvme3n1 |
| | Exit code: 2 |
| | Stdout: u"Caution! After loading |
| | partitions, the CRC doesn\'t check out!\ |
| | GPT data structures destroyed! |
| | You may now partition the disk using fdisk or\ |
| | other |
| | utilities.\ |
| | " |
| | Stderr: u"\\x07Caution: invalid main GPT header, but |
| | valid backup; regenerating main header\ |
| | from backup!\ |
| | \ |
| | \\x07Warning! |
| | Main partition table CRC mismatch! Loaded backup partition |
| | table\ |
| | instead of main partition table!\ |
| | \ |
| | Warning! One or more CRCs |
| | don\'t match. You should repair the disk!\ |
| | \ |
| | Invalid partition |
| | data!\ |
| | "'}.
``` |
During node cleaning, the generic hardware manager can fail in the `erase_devices_metadata` step if the GPT is invalid. Specifically, this can happen when the hardware manager calls ```sgdisk -Z /dev/somedrive``` to destroy the GPT and MBR data structures.
It isn't clear why sgdisk validates the GPT when the -Z flag (zap all) instructs it to destroy the GPT. However, retrying sgdisk -Z succeeds.
Example failure:
|maintenance_reason | Agent returned error for clean step {u'priority': 99, u'interface': |
| | u'deploy', u'reboot_requested': False, u'abortable': True, u'step': |
| | u'erase_devices_metadata'} on node 1b973868-9734-4ecf-9700-c0730e97e031 |
| | : {u'message': u'Clean step failed: Error performing clean_step |
| | erase_devices_metadata: Error erasing block device: Failed to erase the |
| | metadata on the device(s): "/dev/nvme3n1": Unexpected error while |
| | running command. |
| | Command: sgdisk -Z /dev/nvme3n1 |
| | Exit code: 2 |
| | Stdout: |
| | u"Caution! After loading partitions, the CRC doesn\'t check out!\ |
| | GPT |
| | data structures destroyed! You may now partition the disk using fdisk |
| | or\ |
| | other utilities.\ |
| | " |
| | Stderr: u"\\x07Caution: invalid main GPT |
| | header, but valid backup; regenerating main header\ |
| | from |
| | backup!\ |
| | \ |
| | \\x07Warning! Main partition table CRC mismatch! Loaded |
| | backup partition table\ |
| | instead of main partition table!\ |
| | \ |
| | Warning! |
| | One or more CRCs don\'t match. You should repair the disk!\ |
| | \ |
| | Invalid |
| | partition data!\ |
| | "', u'code': 500, u'type': u'CleaningError', |
| | u'details': u'Error performing clean_step erase_devices_metadata: Error |
| | erasing block device: Failed to erase the metadata on the device(s): |
| | "/dev/nvme3n1": Unexpected error while running command. |
| | Command: sgdisk |
| | -Z /dev/nvme3n1 |
| | Exit code: 2 |
| | Stdout: u"Caution! After loading |
| | partitions, the CRC doesn\'t check out!\ |
| | GPT data structures destroyed! |
| | You may now partition the disk using fdisk or\ |
| | other |
| | utilities.\ |
| | " |
| | Stderr: u"\\x07Caution: invalid main GPT header, but |
| | valid backup; regenerating main header\ |
| | from backup!\ |
| | \ |
| | \\x07Warning! |
| | Main partition table CRC mismatch! Loaded backup partition |
| | table\ |
| | instead of main partition table!\ |
| | \ |
| | Warning! One or more CRCs |
| | don\'t match. You should repair the disk!\ |
| | \ |
| | Invalid partition |
| | data!\ |
| | "'}. |
|
2017-12-11 14:44:26 |
Doug Szumski |
description |
During node cleaning, the generic hardware manager can fail in the `erase_devices_metadata` step if the GPT is invalid. Specifically, this can happen when the hardware manager calls ```sgdisk -Z /dev/somedrive``` to destroy the GPT and MBR data structures.
It isn't clear why sgdisk validates the GPT when the -Z flag (zap all) instructs it to destroy the GPT. However, retrying sgdisk -Z succeeds.
Example failure:
|maintenance_reason | Agent returned error for clean step {u'priority': 99, u'interface': |
| | u'deploy', u'reboot_requested': False, u'abortable': True, u'step': |
| | u'erase_devices_metadata'} on node 1b973868-9734-4ecf-9700-c0730e97e031 |
| | : {u'message': u'Clean step failed: Error performing clean_step |
| | erase_devices_metadata: Error erasing block device: Failed to erase the |
| | metadata on the device(s): "/dev/nvme3n1": Unexpected error while |
| | running command. |
| | Command: sgdisk -Z /dev/nvme3n1 |
| | Exit code: 2 |
| | Stdout: |
| | u"Caution! After loading partitions, the CRC doesn\'t check out!\ |
| | GPT |
| | data structures destroyed! You may now partition the disk using fdisk |
| | or\ |
| | other utilities.\ |
| | " |
| | Stderr: u"\\x07Caution: invalid main GPT |
| | header, but valid backup; regenerating main header\ |
| | from |
| | backup!\ |
| | \ |
| | \\x07Warning! Main partition table CRC mismatch! Loaded |
| | backup partition table\ |
| | instead of main partition table!\ |
| | \ |
| | Warning! |
| | One or more CRCs don\'t match. You should repair the disk!\ |
| | \ |
| | Invalid |
| | partition data!\ |
| | "', u'code': 500, u'type': u'CleaningError', |
| | u'details': u'Error performing clean_step erase_devices_metadata: Error |
| | erasing block device: Failed to erase the metadata on the device(s): |
| | "/dev/nvme3n1": Unexpected error while running command. |
| | Command: sgdisk |
| | -Z /dev/nvme3n1 |
| | Exit code: 2 |
| | Stdout: u"Caution! After loading |
| | partitions, the CRC doesn\'t check out!\ |
| | GPT data structures destroyed! |
| | You may now partition the disk using fdisk or\ |
| | other |
| | utilities.\ |
| | " |
| | Stderr: u"\\x07Caution: invalid main GPT header, but |
| | valid backup; regenerating main header\ |
| | from backup!\ |
| | \ |
| | \\x07Warning! |
| | Main partition table CRC mismatch! Loaded backup partition |
| | table\ |
| | instead of main partition table!\ |
| | \ |
| | Warning! One or more CRCs |
| | don\'t match. You should repair the disk!\ |
| | \ |
| | Invalid partition |
| | data!\ |
| | "'}. |
During node cleaning, the generic hardware manager can fail in the `erase_devices_metadata` step if the GPT is invalid. Specifically, this can happen when the hardware manager calls ```sgdisk -Z /dev/somedrive``` to destroy the GPT and MBR data structures.
It isn't clear why sgdisk validates the GPT when the -Z flag (zap all) instructs it to destroy the GPT. However, retrying sgdisk -Z succeeds.
Example failure:
2017-12-11 12:14:47.449 7 ERROR ironic.drivers.modules.agent_base_vendor [-] Agent returned error for clean step {u'priority': 99, u'interface': u'deploy', u'reboot_requested': False, u'abortable': True,
u'step': u'erase_devices_metadata'} on node 1b973868-9734-4ecf-9700-c0730e97e031 : {u'message': u'Clean step failed: Error performing clean_step erase_devices_metadata: Error erasing block device: Failed
to erase the metadata on the device(s): "/dev/nvme3n1": Unexpected error while running command.\nCommand: sgdisk -Z /dev/nvme3n1\nExit code: 2\nStdout: u"Caution! After loading partitions, the CRC doesn\'
t check out!\\nGPT data structures destroyed! You may now partition the disk using fdisk or\\nother utilities.\\n"\nStderr: u"\\x07Caution: invalid main GPT header, but valid backup; regenerating main hea
der\\nfrom backup!\\n\\n\\x07Warning! Main partition table CRC mismatch! Loaded backup partition table\\ninstead of main partition table!\\n\\nWarning! One or more CRCs don\'t match. You should repair the
disk!\\n\\nInvalid partition data!\\n"', u'code': 500, u'type': u'CleaningError', u'details': u'Error performing clean_step erase_devices_metadata: Error erasing block device: Failed to erase the metadat
a on the device(s): "/dev/nvme3n1": Unexpected error while running command.\nCommand: sgdisk -Z /dev/nvme3n1\nExit code: 2\nStdout: u"Caution! After loading partitions, the CRC doesn\'t check out!\\nGPT d
ata structures destroyed! You may now partition the disk using fdisk or\\nother utilities.\\n"\nStderr: u"\\x07Caution: invalid main GPT header, but valid backup; regenerating main header\\nfrom backup!\\
n\\n\\x07Warning! Main partition table CRC mismatch! Loaded backup partition table\\ninstead of main partition table!\\n\\nWarning! One or more CRCs don\'t match. You should repair the disk!\\n\\nInvalid
partition data!\\n"'}. |
|
2017-12-11 14:47:18 |
Doug Szumski |
description |
During node cleaning, the generic hardware manager can fail in the `erase_devices_metadata` step if the GPT is invalid. Specifically, this can happen when the hardware manager calls ```sgdisk -Z /dev/somedrive``` to destroy the GPT and MBR data structures.
It isn't clear why sgdisk validates the GPT when the -Z flag (zap all) instructs it to destroy the GPT. However, retrying sgdisk -Z succeeds.
Example failure:
2017-12-11 12:14:47.449 7 ERROR ironic.drivers.modules.agent_base_vendor [-] Agent returned error for clean step {u'priority': 99, u'interface': u'deploy', u'reboot_requested': False, u'abortable': True,
u'step': u'erase_devices_metadata'} on node 1b973868-9734-4ecf-9700-c0730e97e031 : {u'message': u'Clean step failed: Error performing clean_step erase_devices_metadata: Error erasing block device: Failed
to erase the metadata on the device(s): "/dev/nvme3n1": Unexpected error while running command.\nCommand: sgdisk -Z /dev/nvme3n1\nExit code: 2\nStdout: u"Caution! After loading partitions, the CRC doesn\'
t check out!\\nGPT data structures destroyed! You may now partition the disk using fdisk or\\nother utilities.\\n"\nStderr: u"\\x07Caution: invalid main GPT header, but valid backup; regenerating main hea
der\\nfrom backup!\\n\\n\\x07Warning! Main partition table CRC mismatch! Loaded backup partition table\\ninstead of main partition table!\\n\\nWarning! One or more CRCs don\'t match. You should repair the
disk!\\n\\nInvalid partition data!\\n"', u'code': 500, u'type': u'CleaningError', u'details': u'Error performing clean_step erase_devices_metadata: Error erasing block device: Failed to erase the metadat
a on the device(s): "/dev/nvme3n1": Unexpected error while running command.\nCommand: sgdisk -Z /dev/nvme3n1\nExit code: 2\nStdout: u"Caution! After loading partitions, the CRC doesn\'t check out!\\nGPT d
ata structures destroyed! You may now partition the disk using fdisk or\\nother utilities.\\n"\nStderr: u"\\x07Caution: invalid main GPT header, but valid backup; regenerating main header\\nfrom backup!\\
n\\n\\x07Warning! Main partition table CRC mismatch! Loaded backup partition table\\ninstead of main partition table!\\n\\nWarning! One or more CRCs don\'t match. You should repair the disk!\\n\\nInvalid
partition data!\\n"'}. |
During node cleaning, the generic hardware manager can fail in the `erase_devices_metadata` step if the GPT is invalid. Specifically, this can happen when the hardware manager calls ```sgdisk -Z /dev/somedrive``` to destroy the GPT and MBR data structures.
It isn't clear why sgdisk validates the GPT when the -Z flag (zap all) instructs it to destroy the GPT. However, retrying sgdisk -Z succeeds.
Example failure:
2017-12-11 12:14:47.449 7 ERROR ironic.drivers.modules.agent_base_vendor [-] Agent returned error for clean step {u'priority': 99, u'interface': u'deploy', u'reboot_requested': False, u'abortable': True,
u'step': u'erase_devices_metadata'} on node 1b973868-9734-4ecf-9700-c0730e97e031 : {u'message': u'Clean step failed: Error performing clean_step erase_devices_metadata: Error erasing block device: Failed
to erase the metadata on the device(s): "/dev/nvme3n1": Unexpected error while running command.\nCommand: sgdisk -Z /dev/nvme3n1\nExit code: 2\nStdout: u"Caution! After loading partitions, the CRC doesn\'
t check out!\\nGPT data structures destroyed! You may now partition the disk using fdisk or\\nother utilities.\\n"\nStderr: u"\\x07Caution: invalid main GPT header, but valid backup; regenerating main hea
der\\nfrom backup!\\n\\n\\x07Warning! Main partition table CRC mismatch! Loaded backup partition table\\ninstead of main partition table!\\n\\nWarning! One or more CRCs don\'t match. You should repair the
disk!\\n\\nInvalid partition data!\\n"', u'code': 500, u'type': u'CleaningError', u'details': u'Error performing clean_step erase_devices_metadata: Error erasing block device: Failed to erase the metadat
a on the device(s): "/dev/nvme3n1": Unexpected error while running command.\nCommand: sgdisk -Z /dev/nvme3n1\nExit code: 2\nStdout: u"Caution! After loading partitions, the CRC doesn\'t check out!\\nGPT d
ata structures destroyed! You may now partition the disk using fdisk or\\nother utilities.\\n"\nStderr: u"\\x07Caution: invalid main GPT header, but valid backup; regenerating main header\\nfrom backup!\\
n\\n\\x07Warning! Main partition table CRC mismatch! Loaded backup partition table\\ninstead of main partition table!\\n\\nWarning! One or more CRCs don\'t match. You should repair the disk!\\n\\nInvalid
partition data!\\n"'}.
Workaround:
Retry the cleaning. For example, move the node to the `manage` state, and then to `provide`. |
|
2017-12-11 15:12:14 |
Dmitry Tantsur |
ironic-python-agent: status |
New |
Triaged |
|
2017-12-11 15:12:18 |
Dmitry Tantsur |
ironic-python-agent: importance |
Undecided |
High |
|
2018-08-07 14:09:25 |
John Fulton |
summary |
Ironic python agent cleaning fails with invalid GPT |
Ironic python agent cleaning fails from CRC mismatch |
|
2023-10-24 15:41:48 |
Jay Faulkner |
ironic-python-agent: status |
Triaged |
Fix Released |
|