Unsafe reentrant call to io_flush

Bug #490520 reported by JC Redoutey
20
This bug affects 4 people
Affects Status Importance Assigned to Milestone
libmemcached
Fix Released
Medium
Brian Aker

Bug Description

While trying to use buffered mode to set a big number of pretty big values (3333 bytes exactly in my test!), it ends up pretty quickly with an error "CLIENT_ERROR bad data chunk".

By looking more closely to the message that triggered that error, it seems one message is ill-formed, containing one part of its normal content but the end of the previous one.

The problem seems to come from:
1) memcached_io_write calls io_flush
2) io_flush calls io_wait
3) io_wait calls memcached_purge
4) memcached_purge calls memcached_io_write
5) and finally memcached_io_write calls io_flush again ...

the problem being write_buffer and write_offset_buffer are copied at the beginning of io_flush and only reset at the end. So if 2) has already written some data, it is not reflected into the buffer of the offset when entering io_flush for the second time, which will resend the beginning of the buffer that was already sent.
Basically, if we have in the buffer [abcdefghijkl], we might send something like [abcdabcdefghijkl]...

One solution is flag io_wait call as "ptr->root->purging= 1;", to prevent entering later on the write call into the purge.
One other solution would be to have a start_offset being kept up 2 date at every loop of the send calls.

Related branches

Revision history for this message
Colin Pitrat (colin-pitrat) wrote :

Hi Jean-Charles,

when using your patch, I have a problem with ptr->write_buffer_start_offset having some crazy values from time to time.
It looks like it's never initialized. I'm not sure that I understand your patch correctly, but I have the impression that it is safe to put it to 0 at the beginning of _io_flush, isn't it ?

Regards,
Colin

Revision history for this message
Brian Aker (brianaker) wrote :

I can't reproduce this currently. I am going to add the test case for it though (in case it shows up on other platforms).

Changed in libmemcached:
assignee: nobody → Brian Aker (brianaker)
importance: Undecided → Medium
status: New → Fix Committed
Brian Aker (brianaker)
Changed in libmemcached:
status: Fix Committed → Fix Released
Revision history for this message
Piotr Sikora (piotr-sikora) wrote :

I've got regression for this on OpenBSD/amd64.

    Testing lp:490520
    Assertion failed at tests/mem_functions.cc:5952: rc == MEMCACHED_SUCCESS or rc == MEMCACHED_BUFFERED

but when compiled --with-debug, it fails sooner on assertion:

    assertion "(ptr->fd != -1)" failed: file "libmemcached/io.cc", line 701, function "io_flush"

Backtrace attached.

Revision history for this message
Piotr Sikora (piotr-sikora) wrote :

This still happens with 0.52 on OpenBSD/amd64.

Assertion "int(ptr->state) <= int(MEMCACHED_SERVER_STATE_ADDRINFO)" failed for function "memcached_io_read" likely for "Programmer error, invalid socket state", at libmemcached/io.cc:452
Abort trap (core dumped)

Could we please re-open this issue?

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.