innodb_memcache OOM when reading >8k values

Bug #1655403 reported by Andrew N
Affects          Status   Importance  Assigned to  Milestone
MySQL Server     Unknown  Unknown
Percona Server   (moved to https://jira.percona.com/projects/PS; status tracked in 5.7)
  5.5            Invalid  Undecided   Unassigned
  5.6            Invalid  Undecided   Unassigned
  5.7            Triaged  High        Unassigned

Bug Description

Running Percona Server 5.7 HEAD, with VARCHAR(65000) in place of
VARCHAR(1024) for demo_test.c2 in the innodb_memcached_config.sql shipped
with the distribution, percona-server uses up all system memory and
triggers the Linux OOM killer when a single memcache client retrieves many
values larger than ~8k. The memory for the values isn't freed after the
GET command returns, but it's not a true memory leak: all of the
additional memory is released when the memcache TCP connection closes.

The issue seems to be that values over a certain threshold trigger the
call stack below, allocating memory in a heap that belongs to the read_tpl
in innodb_engine.c, and that memory isn't freed until the read_tpl is
destroyed when the connection closes. (Presumably the ~8k threshold is the
point at which InnoDB stores the column off-page, since the stack goes
through btr_rec_copy_externally_stored_field.)

    mem_heap_alloc
    btr_rec_copy_externally_stored_field
    ib_read_tuple
    ib_cursor_read_row
    innodb_api_search
    innodb_get
    process_get_command
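
To make the lifecycle concrete, here is a small self-contained sketch of
the allocation pattern as I understand it. The arena below is a stand-in
for InnoDB's mem_heap_t, not the real implementation: every allocation is
chained onto the heap, nothing is released per request, and the memory
only comes back when the whole heap is destroyed, which is what
ib_cb_tuple_delete ultimately does for the read tuple's heap.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Toy arena allocator: allocations survive until the whole arena
       is freed, like mem_heap_alloc() against the read tuple's heap. */
    typedef struct block {
        struct block *next;
        size_t        size;
    } block_t;

    typedef struct {
        block_t *head;
        size_t   total;
    } arena_t;

    static void *arena_alloc(arena_t *a, size_t n)
    {
        block_t *b = malloc(sizeof(block_t) + n);
        b->next = a->head;
        b->size = n;
        a->head = b;
        a->total += n;
        return b + 1; /* freed only by arena_free(), never individually */
    }

    static void arena_free(arena_t *a) /* analogue of ib_cb_tuple_delete() */
    {
        while (a->head) {
            block_t *b = a->head;
            a->head = b->next;
            free(b);
        }
        a->total = 0;
    }

    int main(void)
    {
        arena_t heap = { NULL, 0 }; /* stand-in for read_tpl's heap */

        /* Each GET of a >8k value copies the off-page column into the
           heap; nothing is returned when the request finishes. */
        for (int get = 1; get <= 5; get++) {
            memset(arena_alloc(&heap, 64000), 'a', 64000);
            printf("after GET %d: heap holds %zu bytes\n", get, heap.total);
        }

        arena_free(&heap); /* only now does the memory come back */
        printf("after free: heap holds %zu bytes\n", heap.total);
        return 0;
    }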

The patch below is working for us, and I'm happy to update it as needed to
get it merged. I'm hoping that would go a lot faster if you could point me
to detailed documentation, or put me in touch with someone who understands
the relationship between InnoDB cursors, tuples, and heaps and where and
how it would be safe to release the block allocated when
btr_rec_copy_externally_stored_field calls mem_heap_alloc.

diff --git a/plugin/innodb_memcached/innodb_memcache/src/innodb_engine.c b/plugin/innodb_memcached/innodb_memcache/src/innodb_engine.c
index 60fc372..db88525 100644
--- a/plugin/innodb_memcached/innodb_memcache/src/innodb_engine.c
+++ b/plugin/innodb_memcached/innodb_memcache/src/innodb_engine.c
@@ -1637,6 +1637,10 @@ innodb_release(
 		item_release(def_eng, (hash_item *) item);
 		conn_data->use_default_mem = false;
 	}
+	if (conn_data->read_tpl) {
+		ib_cb_tuple_delete(conn_data->read_tpl);
+		conn_data->read_tpl = NULL;
+	}
 
 	return;
 }
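
For reference, my reading of the mem heap API in
storage/innobase/include/mem0mem.h is that allocations can only be
released at heap granularity via mem_heap_empty/mem_heap_free (there is
also mem_heap_free_top for the most recent allocation, but nothing
addressable by pointer), which is why the patch frees the whole tuple
rather than the single block. An illustrative snippet, only compilable
inside the server tree and not code taken from the plugin:

    #include "mem0mem.h" /* InnoDB mem heap API */

    /* Illustrative only: there is no per-pointer free for heap
       allocations, so a block copied into the tuple's heap can only be
       reclaimed by emptying or destroying that heap. */
    static void heap_lifecycle_sketch(void)
    {
        mem_heap_t *heap = mem_heap_create(1024);
        byte       *big  = (byte *) mem_heap_alloc(heap, 64000);

        (void) big;           /* no way to free just this block */

        mem_heap_empty(heap); /* drop every allocation, keep the heap */
        mem_heap_free(heap);  /* destroy the heap itself */
    }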

Here is the script we're using to reproduce the error. Run it and watch
mysqld's memory usage grow.

    #!/usr/bin/env ruby

    require 'socket'

    # Hex-encode a binary string, zero-padding each byte to two digits.
    def bin_to_hex(s)
      s.each_byte.map { |b| format('%02x', b) }.join
    end

    class MemcacheClient
      def initialize
        @s = TCPSocket.new 'localhost', 11211
      end

      def get(key)
        @s.write("get #{key}\r\n")
        info = @s.gets
        if info.start_with?('VALUE')
          count = info.split[-1].to_i
          value = @s.read(count)
          value += @s.gets # \r\n
          value += @s.gets # END
        end
        info + (value || "")
      end

      def set(key, value)
        value = value.to_s
        @s.write("set #{key} 0 0 #{value.bytesize}\r\n#{value}\r\n")
        @s.gets
      end
    end

    def test_write_and_read(size=64000)
      memcache_client = MemcacheClient.new
      random = Random.new

      1.upto(1_000_000) do |n|
        k = bin_to_hex(random.bytes(20))
        v = 'a' * size

        puts n
        $stdout.flush

        memcache_client.set(k, v)
        ret = memcache_client.get(k)
        if !ret.include?(v)
          raise "Error: mismatch: #{ret.inspect} != #{v.inspect}"
        end
      end
    end
    test_write_and_read

Tags: upstream
Shahriyar Rzayev (rzayev-sehriyar) wrote:

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-1046
