2017-02-21 15:19:48 |
Tim Burke |
description |
There are two implementations of the crc used for fragment checksums; one [1] seems optimized for SSE4, with the other [2] as a fallback implementation.
Bad news: Despite comments declaring that these are using the same polynomials [3], they're not! The first implements CRC32C [4], while the table-based implementation seems to be based off of zlib [5].
Good news: That first implementation doesn't seem to actually get used. While we may set INTEL_SSE41 or INTEL_SSE42 if CPU seems to support it [6], we never seem to set INTEL_SSE4.
Bad news: While the second implementation is close to zlib's, it isn't actually the same -- zlib uses an unsigned long to hold the crc [7], while we use an int. There are two problems:
1. An int is only guaranteed to have at least 16 bits. This probably isn't a problem; most modern CPUs use at least 32 bits for an int.
2. A signed int behaves differently when we left-shift; instead of always filling with zeros on the left, it will fill with ones when positive (or zero) and ones when negative.
Good (?) news: we don't (necessarily) use that second implementation, either. If libz.so is loaded before liberasurecode.so, we use libz's implementation.
[1] https://github.com/openstack/liberasurecode/blob/1.4.0/src/utils/chksum/crc32.c#L46-L66
[2] https://github.com/openstack/liberasurecode/blob/1.4.0/src/utils/chksum/crc32.c#L69-L130
[3] https://github.com/openstack/liberasurecode/blob/1.4.0/src/utils/chksum/crc32.c#L115-L116
[4] https://en.wikipedia.org/wiki/SSE4#SSE4.2
[5] https://github.com/madler/zlib/blob/v1.2.11/crc32.h#L8-L59
[6] https://github.com/openstack/liberasurecode/blob/1.4.0/get_flags_from_cpuid.c#L87-L92
[7] https://github.com/madler/zlib/blob/master/crc32.c#L238
[8] https://github.com/openstack/liberasurecode/blob/1.4.0/src/utils/chksum/crc32.c#L119 |
There are two implementations of the crc used for fragment checksums; one [1] seems optimized for SSE4, with the other [2] as a fallback implementation.
Bad news: Despite comments declaring that these are using the same polynomials [3], they're not! The first implements CRC32C [4], while the table-based implementation seems to be based off of zlib [5].
Good news: That first implementation doesn't seem to actually get used. While we may set INTEL_SSE41 or INTEL_SSE42 if CPU seems to support it [6], we never seem to set INTEL_SSE4.
Bad news: While the second implementation is close to zlib's, it isn't actually the same -- zlib uses an unsigned long to hold the crc [7], while we use an int. There are two problems:
1. An int is only guaranteed to have at least 16 bits. This probably isn't a problem; most modern CPUs use at least 32 bits for an int.
2. A signed int behaves differently when we left-shift; instead of always filling with zeros on the left, it will fill with zeros when positive (or zero) and ones when negative.
Good (?) news: we don't (necessarily) use that second implementation, either. If libz.so is loaded before liberasurecode.so, we use libz's implementation.
[1] https://github.com/openstack/liberasurecode/blob/1.4.0/src/utils/chksum/crc32.c#L46-L66
[2] https://github.com/openstack/liberasurecode/blob/1.4.0/src/utils/chksum/crc32.c#L69-L130
[3] https://github.com/openstack/liberasurecode/blob/1.4.0/src/utils/chksum/crc32.c#L115-L116
[4] https://en.wikipedia.org/wiki/SSE4#SSE4.2
[5] https://github.com/madler/zlib/blob/v1.2.11/crc32.h#L8-L59
[6] https://github.com/openstack/liberasurecode/blob/1.4.0/get_flags_from_cpuid.c#L87-L92
[7] https://github.com/madler/zlib/blob/master/crc32.c#L238
[8] https://github.com/openstack/liberasurecode/blob/1.4.0/src/utils/chksum/crc32.c#L119 |
|