crc32 implementation is schizophrenic
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
liberasurecode | Fix Released | Undecided | Tim Burke |
Bug Description
There are two implementations of the crc used for fragment checksums; one [1] seems optimized for SSE4, with the other [2] as a fallback implementation.
Bad news: Despite comments declaring that these are using the same polynomials [3], they're not! The first implements CRC32C [4], while the table-based implementation seems to be based off of zlib [5].
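To make the mismatch concrete, here is a minimal sketch (not the liberasurecode code) of a bit-at-a-time CRC run with each polynomial. The helper name `crc_reflected` is made up for illustration. The constants are the standard reflected forms of the zlib/IEEE 802.3 polynomial and the Castagnoli (CRC32C) polynomial, and the sketch assumes the usual initial value and final XOR of 0xFFFFFFFF for both.

```python
import zlib

# Standard reflected polynomial constants (assumed, not copied from liberasurecode).
CRC32_POLY = 0xEDB88320   # zlib / IEEE 802.3
CRC32C_POLY = 0x82F63B78  # Castagnoli, i.e. CRC32C


def crc_reflected(data, poly):
    """Bit-at-a-time reflected CRC with init and final XOR of 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial if that bit was set.
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


check = b"123456789"
print(hex(crc_reflected(check, CRC32_POLY)))   # 0xcbf43926, same as zlib.crc32
print(hex(crc_reflected(check, CRC32C_POLY)))  # 0xe3069283
```

The same input gives two unrelated checksums, so a fragment written with one implementation would fail verification under the other.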
Good news: That first implementation doesn't seem to actually get used. While we may set INTEL_SSE41 or INTEL_SSE42 if the CPU seems to support it [6], we never seem to set INTEL_SSE4.
Bad news: While the second implementation is close to zlib's, it isn't actually the same -- zlib uses an unsigned long to hold the crc [7], while we use an int. There are two problems:
1. An int is only guaranteed to have at least 16 bits. This probably isn't a problem; most modern platforms use at least 32 bits for an int.
2. A signed int behaves differently when we right-shift; instead of always filling with zeros on the left, it will (with the usual arithmetic shift) fill with zeros when positive (or zero) and ones when negative.
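The effect of the second problem can be sketched by emulating a 32-bit signed int in Python. This is an illustration under stated assumptions, not the library's code: the table is generated from the standard zlib polynomial rather than copied from liberasurecode, the helper names are made up, and the right shift of a negative value is assumed to be arithmetic (what common compilers do, although C leaves it implementation-defined).

```python
import zlib


def make_table():
    """Build the 256-entry table for the reflected zlib polynomial."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


TABLE = make_table()


def to_signed32(x):
    """Wrap an integer to the range of a 32-bit two's-complement int."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def crc32_unsigned(data):
    """Table-driven CRC with an unsigned accumulator, as in zlib."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def crc32_signed(data):
    """Same loop, but the accumulator is emulated as a signed 32-bit int.

    Python's >> on a negative int is an arithmetic shift, so the top
    eight bits are filled with ones whenever the accumulator is negative.
    """
    crc = to_signed32(0xFFFFFFFF)
    for b in data:
        crc = to_signed32(TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8))
    return (crc ^ 0xFFFFFFFF) & 0xFFFFFFFF


sample = b"123456789"
print(crc32_unsigned(sample) == zlib.crc32(sample))           # True
print(hex(crc32_unsigned(b"a") ^ crc32_signed(b"a")))         # 0xff000000
```

For a one-byte input the accumulator starts out negative, so the signed variant ends up with the top eight bits flipped relative to zlib's result; for longer inputs the difference keeps feeding back through the table lookups.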
Good (?) news: we don't (necessarily) use that second implementation, either. If libz.so is loaded before liberasurecode.so, we use libz's implementation.
[1] https:/
[2] https:/
[3] https:/
[4] https:/
[5] https:/
[6] https:/
[7] https:/
[8] https:/
This script may be useful to check whether fragment headers are using zlib CRCs or not.
Usage: python check_crc.py fragment_file [fragment_file2 [...]]