Comment 5 for bug 1666320

Revision history for this message
Kevin Greenan (kmgreen2) wrote : Re: [Bug 1666320] Re: crc32 implementation is schizophrenic

Hey Dude,

I'll have a little ok this afternoon.

-kevin

Sent from my iPhone

> On Jun 27, 2017, at 8:09 PM, clayg <email address hidden> wrote:
>
> I wonder how "chksum_type" is determined?
>
> root@swift-test-02:~# cat audit_frag.py
> #!/usr/bin/env python
> import sys
> from swift.common.storage_policy import POLICIES
>
> policy = POLICIES[2]
>
> with open(sys.argv[1]) as f:
> for frag_data in iter(lambda: f.read(policy.fragment_size), ''):
> print '%r' % policy.pyeclib_driver.get_metadata(frag_data, formatted=True)
> root@swift-test-02:~# LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libz.so python audit_frag.py /srv/node/d11/objects-2/398/465/6381800fa3e800cd36d6ea5bf4fd6465/1458349982.69822#3.data
> {'index': 3, 'chksum_mismatch': 0, 'backend_id': 'isa_l_rs_vand', 'orig_data_size': 6110L, 'chksum_type': 'none', 'backend_version': 140376711236864, 'chksum': '', 'size': 2037}
> root@swift-test-02:~# LD_PRELOAD=/usr/lib/liberasurecode.so.1.4.0 python audit_frag.py /srv/node/d11/objects-2/398/465/6381800fa3e800cd36d6ea5bf4fd6465/1458349982.69822#3.data
> {'index': 3, 'chksum_mismatch': 0, 'backend_id': 'isa_l_rs_vand', 'orig_data_size': 6110L, 'chksum_type': 'none', 'backend_version': 139775415815424, 'chksum': '', 'size': 2037}
>
>
> ubuntu@saio:/vagrant/.scratch$ LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libz.so python audit_frag.py /srv/node3/sdb3/objects-1/589/c2b/93464008f434853ad9b74c8c5381ac2b/1498598728.24878#2#d.data
> {'index': 2, 'chksum_mismatch': 0, 'backend_id': 'isa_l_rs_vand', 'orig_data_size': 1048576L, 'chksum_type': 'none', 'backend_version': 134400, 'chksum': '', 'size': 524288}
> {'index': 2, 'chksum_mismatch': 0, 'backend_id': 'isa_l_rs_vand', 'orig_data_size': 1048576L, 'chksum_type': 'none', 'backend_version': 134400, 'chksum': '', 'size': 524288}
> {'index': 2, 'chksum_mismatch': 0, 'backend_id': 'isa_l_rs_vand', 'orig_data_size': 5473L, 'chksum_type': 'none', 'backend_version': 134400, 'chksum': '', 'size': 2737}
> ubuntu@saio:/vagrant/.scratch$ LD_PRELOAD=/usr/local/lib/liberasurecode.so.1.4.0 python audit_frag.py /srv/node3/sdb3/objects-1/589/c2b/93464008f434853ad9b74c8c5381ac2b/1498598728.24878#2#d.data
> Traceback (most recent call last):
> File "audit_frag.py", line 9, in <module>
> print policy.pyeclib_driver.get_metadata(frag_data, formatted=True)
> File "/vagrant/pyeclib/pyeclib/ec_iface.py", line 335, in get_metadata
> return self.ec_lib_reference.get_metadata(fragment, formatted)
> File "/vagrant/pyeclib/pyeclib/core.py", line 127, in get_metadata
> formatted)
> pyeclib.ec_iface.ECInvalidFragmentMetadata: pyeclib_c_get_metadata ERROR: Fragment integrity check failed. Please inspect syslog for liberasurecode error report.
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1666320
>
> Title:
> crc32 implementation is schizophrenic
>
> Status in liberasurecode:
> In Progress
>
> Bug description:
> There are two implementations of the crc used for fragment checksums;
> one [1] seems optimized for SSE4, with the other [2] as a fallback
> implementation.
>
> Bad news: Despite comments declaring that these are using the same
> polynomials [3], they're not! The first implements CRC32C [4], while
> the table-based implementation seems to be based off of zlib [5].
>
> Good news: That first implementation doesn't seem to actually get
> used. While we may set INTEL_SSE41 or INTEL_SSE42 if CPU seems to
> support it [6], we never seem to set INTEL_SSE4.
>
> Bad news: While the second implementation is close to zlib's, it isn't
> actually the same -- zlib uses an unsigned long to hold the crc [7],
> while we use an int. There are two problems:
>
> 1. An int is only guaranteed to have at least 16 bits. This probably isn't a problem; most modern CPUs use at least 32 bits for an int.
> 2. A signed int behaves differently when we left-shift; instead of always filling with zeros on the left, it will fill with zeros when positive (or zero) and ones when negative.
>
> Good (?) news: we don't (necessarily) use that second implementation,
> either. If libz.so is loaded before liberasurecode.so, we use libz's
> implementation.
>
> [1] https://github.com/openstack/liberasurecode/blob/1.4.0/src/utils/chksum/crc32.c#L46-L66
> [2] https://github.com/openstack/liberasurecode/blob/1.4.0/src/utils/chksum/crc32.c#L69-L130
> [3] https://github.com/openstack/liberasurecode/blob/1.4.0/src/utils/chksum/crc32.c#L115-L116
> [4] https://en.wikipedia.org/wiki/SSE4#SSE4.2
> [5] https://github.com/madler/zlib/blob/v1.2.11/crc32.h#L8-L59
> [6] https://github.com/openstack/liberasurecode/blob/1.4.0/get_flags_from_cpuid.c#L87-L92
> [7] https://github.com/madler/zlib/blob/master/crc32.c#L238
> [8] https://github.com/openstack/liberasurecode/blob/1.4.0/src/utils/chksum/crc32.c#L119
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/liberasurecode/+bug/1666320/+subscriptions