Signed error in command_log_reader.cc

Bug #437896 reported by Brian Aker
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Drizzle
Fix Released
Critical
Jay Pipes

Bug Description

Hi!

From plugin/command_log/command_log_reader.cc
        uint32_t recalc_checksum= crc32(0, (const unsigned char *) checksum_buffer.c_str(), length);

crc32 is defined in zlib.h as:
        ZEXTERN uLong ZEXPORT crc32 OF((uLong crc, const Bytef *buf, uInt len));

There are several issues. length can be a larger value then what crc32 can take. Also, crc32 is returns a long, so on a 32bit platform it may be ok to receive the value but on 64bit...

I'd recommend pulling the crc implementation from libmemcached and using that (or possibly a different algo).

Cheers,
   -Brian

Related branches

Brian Aker (brianaker)
Changed in drizzle:
assignee: nobody → Jay Pipes (jaypipes)
importance: Undecided → Critical
milestone: none → bell
Jay Pipes (jaypipes)
Changed in drizzle:
status: New → Confirmed
Revision history for this message
Jay Pipes (jaypipes) wrote :

> There are several issues. length can be a larger value then what crc32 can take.

Yes, agreed.

> Also, crc32 is returns a long, so on a 32bit platform it may be ok to receive the value but on 64bit...
> I'd recommend pulling the crc implementation from libmemcached and using that (or possibly a different algo).

Well, the one in libmemcached also returns a 32-bit unsigned integer...

uint32_t hash_crc32(const char *key, size_t key_length)
{
  uint32_t x;
  uint32_t crc= UINT32_MAX;

  for (x= 0; x < key_length; x++)
     crc= (crc >> 8) ^ crc32tab[(crc ^ (uint64_t)key[x]) & 0xff];

  return ~crc;
}

But, as you can see, it handles lengths greater than 32 bits, though in a way that I am not sure is bitwise correct. It is cast-converting a char into a uint64_t and XORing that against an unsigned 32-bit integer, then ANDing it against the value of 255.

If this is what you want me to use, no problem. Please verify for me, though, that the above algorithm will indeed solve the bug. I *think* it does, but would like you to verify. Thanks!

jay

Changed in drizzle:
status: Confirmed → In Progress
Revision history for this message
Jay Pipes (jaypipes) wrote :

OK, just making this publicly accessible and logged...Brian asked Austin about the cast operation in the above function. Turns out the cast is fine, but should really be a static_cast<uint8_t>, and that there are no endian issues with the code.

TODO:

Where to put contributed code like this? A contrib/ folder like PostgreSQL does?

Revision history for this message
Brian Aker (brianaker) wrote : Re: [Bug 437896] Re: Signed error in command_log_reader.cc

Hi!

On Sep 28, 2009, at 3:39 PM, Jay Pipes wrote:

> Where to put contributed code like this? A contrib/ folder like
> PostgreSQL does?

My suggestion, keep the code in a separate file with the original
license/etc on it. For now... mysys? I'm open to other suggestions
about location.

Cheers,
 -Brian

Revision history for this message
Arjen Lentz (arjen-lentz) wrote :
Download full text (4.2 KiB)

Hi Jay

On 29/09/2009, at 4:08 AM, Jay Pipes wrote:
>> There are several issues. length can be a larger value then what
>> crc32 can take.
>
> Yes, agreed.
>
>> Also, crc32 is returns a long, so on a 32bit platform it may be ok
>> to receive the value but on 64bit...
>> I'd recommend pulling the crc implementation from libmemcached and
>> using that (or possibly a different algo).

CRC-32 is a uint32, not a long (make sure it never messes with sign
bits, just to be safe).

> Well, the one in libmemcached also returns a 32-bit unsigned
> integer...
>
> uint32_t hash_crc32(const char *key, size_t key_length)
> {
> uint32_t x;
> uint32_t crc= UINT32_MAX;
>
> for (x= 0; x < key_length; x++)
> crc= (crc >> 8) ^ crc32tab[(crc ^ (uint64_t)key[x]) & 0xff];
>
> return ~crc;
> }
>
> But, as you can see, it handles lengths greater than 32 bits, though
> in
> a way that I am not sure is bitwise correct. It is cast-converting a
> char into a uint64_t and XORing that against an unsigned 32-bit
> integer,
> then ANDing it against the value of 255.

The above code looks a tad dodgy, given that the crc is uint32, not
uint64.
It'll work safely since it's beyond the relevant 8 bits, but just
looks confusing and possibly cause extra trickery on 32bit builds?

Another error in the above is that the key_length is size_t but the x
counter is uint32. That may or may not be the same.

When transmitted next to a block, it should a) follow the data and b)
be stored/transmitted low byte first. So depending on the arch, endian-
swapping is required.

> If this is what you want me to use, no problem. Please verify for me,
> though, that the above algorithm will indeed solve the bug. I
> *think* it
> does, but would like you to verify. Thanks!

I forget what the exact threshold rule is, but ancient memory recalls
that CRC-16 becomes unreliable when the block is a few K long, and
thus we moved to CRC-32 at that time. So I'll take the educated guess
that a CRC-32 is *definitely* out of its depth when you get to >32bit
size data.

Some numbers for your enjoyment (see that it's designed for serial
transmissions lines, really ;-)
CRC-16 (CCITT) ERROR DETECTION: All single and double bit errors, all
errors with an odd number of bits, all burst errors of length 16 or
less, 99.997% of 17-bit error bursts, and 99.998% of 18-bit and longer.
CRC-32 ERROR DETECTION: When correctly applied, undetected errors are
reduced by a factor of 10^-5 over the CCITT CRC-16.

I presume that when dealing with blobs, you could potentially have
large data blocks to deal with - so CRC-64 might be the way to go?
I'd be happy to verify/validate a proper implementation. It's not
complicated, except to explain. I can just see it when it's right ;-)

I haven't got code handy beyond CRC-32, but if you want a proper
CRC-64 implementation I'd be happy to provide proper code and a table,
and optionally the code to generate the lookup table (so people can
verify it's correct and see how it's done, at least). There's also an
alternative using 2 16-item tables rather than 1 256-item lookup...
slower but much smaller... probably not relevant for ...

Read more...

Jay Pipes (jaypipes)
Changed in drizzle:
status: In Progress → Fix Committed
Changed in drizzle:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.