Metadata corrupted
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | Invalid | Undecided | Unassigned |
Bug Description
On one of our openstack-swift deployments, we observed several metadata errors in the Swift error log:
Type1: TypeError: 'NoneType' object has no attribute '__getitem__'
2017-09-
Type2: TypeError: tuple indices must be integers, not str
2017-09-
Type3: TypeError: 'bool' object has no attribute '__getitem__'
2017-09-
Type4: TypeError: string indices must be integers, not str
2017-09-
When this error is reported, the data is quarantined. Unfortunately, we do not have any replicas (just one disk).
As can be observed, all these errors are stemming from file: /usr/lib/
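The four messages share one pattern: each is what Python raises when a string key is used to index a value that is not a dict. The following is a minimal sketch, assuming the deserialized metadata came back as None, a tuple, a bool, or a str. The sample values and the `lookup_name` helper are hypothetical, and the exact message wording differs between Python 2 and Python 3.

```python
# Hypothetical stand-ins for what a corrupted metadata blob might
# deserialize to; none of these values come from the real deployment.
candidates = [None, ('a', 'b'), True, 'some string']


def lookup_name(metadata):
    """Mimic a metadata['name'] lookup and report which exception it raises."""
    try:
        return metadata['name']
    except KeyError:
        # Raised only when metadata is a dict that lacks the key.
        return 'KeyError'
    except TypeError:
        # Raised when metadata is not a mapping at all.
        return 'TypeError'


results = [lookup_name(value) for value in candidates]
```

Every entry in `results` is a TypeError rather than a KeyError, so a handler that only catches KeyError would let these lookups propagate as unhandled exceptions.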
Looking at source code in our repo (which has slight unrelated(in this context) modifications from opensource version):
    def _construct_
        """
        Open the `.data` file to fetch its metadata, and fetch the metadata
        from the fast-POST `.meta` file as well if it exists, merging them
        properly.

        :param data_file: on-disk `.data` file being considered
        :param meta_file: on-disk fast-POST `.meta` file being considered
        :returns: an opened data file pointer
        :raises DiskFileError: various exceptions from
                    :func:`
        """
        fp = open(data_file, 'rb')
        datafile_metadata = self._failsafe_
        if meta_file:
            self._metadata = self._failsafe_
            sys_metadata = dict(
                [(key, val) for key, val in datafile_
                 if key.lower() in DATAFILE_
                 or is_sys_
            self._metadata.
        else:
            self._metadata = datafile_metadata
        if self._name is None:
            # If we don't know our name, we were just given a hash dir at
            # instantiation, so we'd better validate that the name hashes back
            # to us
            self._name = self._metadata[
            self._verify_
        self._verify_
        return fp
....
....
....
    def _verify_
        """
        Verify the metadata's name value matches what we think the object is
        named.

        :param data_file: data file name being consider, used when quarantines
                          occur
        :param fp: open file pointer so that we can `fstat()` the file to
                   verify the on-disk size with Content-Length metadata value
        :raises DiskFileCollision: if the metadata stored name does not match
                                   the referenced name of the file
        :raises DiskFileExpired: if the object has expired
        :raises DiskFileQuarant
                                      between the metadata and the file-system
                                      metadata
        """
        try:
            mname = self._metadata[
        except KeyError:
            raise self._quarantin
        else:
            if mname != self._name:
                self._logger.error(
                    _('Client path %(client)s does not match '
                      'path stored in object metadata %(meta)s'),
                    {'client': self._name, 'meta': mname})
                raise DiskFileCollisi
                                        'stored in object metadata')
        try:
            x_delete_at = int(self.
        except KeyError:
            pass
        except ValueError:
            # Quarantine, the x-delete-at key is present but not an
            # integer.
            raise self._quarantine(
                data_file, "bad metadata x-delete-at value %s" % (
                    self._metadata[
        else:
            if x_delete_at <= time.time():
                raise DiskFileExpired
        try:
            metadata_size = int(self.
        except KeyError:
            raise self._quarantine(
                data_file, "missing content-length in metadata")
        except ValueError:
            # Quarantine, the content-length key is present but not an
            # integer.
            raise self._quarantine(
                data_file, "bad metadata content-length value %s" % (
                    self._metadata[
        fd = fp.fileno()
        try:
            statbuf = os.fstat(fd)
        except OSError as err:
            # Quarantine, we can't successfully stat the file.
            raise self._quarantin
        else:
            obj_size = statbuf.st_size - METADATA_FOOTER_LEN
            if obj_size != metadata_size:
                raise self._quarantine(
                    data_file, "metadata content-length %s does"
                    " not match actual object size %s" % (
                        metadata_size, statbuf.st_size))
            self._content_
            return obj_size
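The excerpt only guards its key lookups with KeyError and ValueError handlers, so metadata that is not a dict escapes as a TypeError. One possible way to make that case explicit is to check the type before any lookup. This is a minimal sketch under stated assumptions, not Swift's actual code: `InvalidMetadata`, `ensure_metadata_dict`, and the default key names are hypothetical and introduced only for illustration.

```python
class InvalidMetadata(Exception):
    """Hypothetical error type for this sketch; not a Swift exception class."""


def ensure_metadata_dict(metadata, required_keys=('name', 'Content-Length')):
    """Return metadata unchanged if it is a dict holding the required keys.

    The default key names are assumptions made for illustration.
    """
    if not isinstance(metadata, dict):
        # Covers the None, tuple, bool and str cases seen in the error log.
        raise InvalidMetadata(
            'metadata is %s, expected dict' % type(metadata).__name__)
    missing = [key for key in required_keys if key not in metadata]
    if missing:
        raise InvalidMetadata(
            'metadata is missing keys: %s' % ', '.join(missing))
    return metadata
```

A caller could catch the single exception type and decide whether to quarantine, instead of relying on whichever TypeError the first failing lookup happens to raise.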
It definitely looks like the metadata area is corrupted. I don't see any I/O errors from the local disk.
Can someone please help us understand what exactly went wrong?
Is it a software or a hardware bug?
Is it possible to recover the data in the quarantined region?
Changed in swift:
status: New → Incomplete
Based on the diskfile code posted above, this looks like a relatively old version of Swift. Can you update the bug report with the Swift version?
You say that "source code in our repo (which has slight unrelated(in this context) modifications from opensource version)". Could you post the _failsafe_read_metadata and read_metadata functions from diskfile.py? Better still, post a diff of your diskfile.py against the open-source version you forked.
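If running `diff` on the host is inconvenient, a unified diff can also be produced with Python's standard `difflib` module. This is a minimal sketch: the helper names and the default file labels are placeholders chosen for illustration, not real locations on your system.

```python
import difflib


def unified_diff_text(upstream_lines, local_lines,
                      fromfile='upstream/diskfile.py',
                      tofile='local/diskfile.py'):
    """Return a unified diff of two lists of lines as a single string."""
    return ''.join(difflib.unified_diff(
        upstream_lines, local_lines, fromfile=fromfile, tofile=tofile))


def diff_files(upstream_path, local_path):
    """Read both files and return their unified diff (paths are placeholders)."""
    with open(upstream_path) as upstream, open(local_path) as local:
        return unified_diff_text(
            upstream.readlines(), local.readlines(),
            fromfile=upstream_path, tofile=local_path)
```

Calling `diff_files` with the path of the unmodified upstream file and the path of the forked file returns text that can be pasted directly into the bug report.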