drizzle_state_packet_read:bad packet number

Bug #528410 reported by Denis Defreyne
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
Drizzle | Fix Released | High | Andrew Hutchings |
Drizzle Client & Protocol Library | Fix Released | High | Unassigned |

Bug Description

I am getting the following error when using libdrizzle 0.7 together with MySQL Ver 14.14 Distrib 5.1.38. The error occurs when using “drizzle_result_buffer”:

drizzle_state_packet_read:bad packet number:213:223

The error is reproducible; it always occurs on the same query. This query works just fine when run with the “mysql” command-line client.
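
For readers unfamiliar with the message: in the MySQL wire protocol every packet starts with a 4-byte header, a 3-byte little-endian payload length followed by a 1-byte sequence number, and a client normally rejects a packet whose sequence number is not the one it expects. The sketch below shows that kind of check. All names are hypothetical and this is not libdrizzle's code; the order of the two numbers in the message (expected, then received) is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical types and names; not taken from libdrizzle. */
typedef struct {
    uint32_t payload_len; /* 3-byte little-endian payload length */
    uint8_t  seq;         /* 1-byte sequence number */
} packet_header;

/* Decode the 4-byte header that precedes every MySQL protocol packet. */
packet_header parse_header(const uint8_t *buf)
{
    assert(buf != NULL);
    packet_header h;
    h.payload_len = (uint32_t)buf[0]
                  | ((uint32_t)buf[1] << 8)
                  | ((uint32_t)buf[2] << 16);
    h.seq = buf[3];
    return h;
}

/* Return 0 when the header carries the expected sequence number.
 * Otherwise write a message into err and return -1. The "expected:received"
 * ordering is an assumption of this sketch. */
int check_sequence(uint8_t expected, packet_header h, char *err, size_t err_len)
{
    assert(err != NULL && err_len > 0);
    if (h.seq == expected)
        return 0;
    snprintf(err, err_len, "bad packet number:%u:%u",
             (unsigned)expected, (unsigned)h.seq);
    return -1;
}
```

If a reader loses track of packet boundaries and parses a header at the wrong offset, the fourth byte is arbitrary payload data, which is one plausible reason the two numbers in the reported errors look unrelated.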

Eric Day (eday) wrote :

Could you provide a little more information on the nature of the query? If possible, perhaps even provide the dataset in an attachment and the query to reproduce? Thanks!

Changed in libdrizzle:
status: New → Incomplete
importance: Undecided → High
assignee: nobody → Eric Day (eday)
Denis Defreyne (ddfreyne) wrote :

I’ve been trying to reproduce this bug for a while, but I cannot seem to reproduce it anymore. If I encounter it again, I’ll be certain to report it. Apologies.

Eric Day (eday)
Changed in libdrizzle:
status: Incomplete → Invalid
Denis Defreyne (ddfreyne) wrote :

I have managed to reproduce this bug with the following query:

SELECT IFNULL(r1.value, 0.5), IFNULL(r2.value, 0.5), r1.user_id FROM sr_ratings_jester r1 INNER JOIN sr_ratings_jester r2 ON r1.user_id = r2.user_id WHERE r1.item_id = '7' AND r2.item_id = '15' LIMIT 10000

The error occurred in drizzle_result_buffer: “drizzle_state_packet_read:bad packet number:171:50”.

The dataset is the Jester dataset (<http://eigentaste.berkeley.edu/dataset/>); I have created a SQL file that imports this data and uploaded it to <http://stoneship.org/tmp/jester_dataset.sql> (35 MB).

Eric Day (eday)
Changed in libdrizzle:
status: Invalid → New
Changed in drizzle:
assignee: nobody → Andrew Hutchings (linuxjedi)
Andrew Hutchings (linuxjedi) wrote :

I can no longer reproduce this; I suspect it was fixed as part of the fixes to row.c.

Changed in drizzle:
status: New → Fix Released
Changed in libdrizzle:
status: New → Fix Released
Andrew Hutchings (linuxjedi) wrote :

Re-opening in Drizzle. Brian has seen it again.

Changed in drizzle:
status: Fix Released → Confirmed
Changed in drizzle:
importance: Undecided → High
Brian Aker (brianaker) wrote :

Just another note, while looking at this I found that the client seemed to go into a loop of some sort at:

#0 drizzle_result_free (result=0x62f868) at libdrizzle/result.c:130
#1 0x00007ffff7ddaeb5 in drizzle_result_free_all (con=0x627248) at libdrizzle/result.c:139
#2 0x00007ffff7dd5ec0 in drizzle_con_free (con=0x627248) at libdrizzle/drizzle.c:392
#3 0x0000000000406bfa in gearman_server_queue_libdrizzle_deinit (server=0x625030) at libgearman-server/queue_libdrizzle.c:307
#4 0x0000000000406998 in gearman_server_queue_libdrizzle_init (server=0x625030, conf=0x7fffffffd980) at libgearman-server/queue_libdrizzle.c:244
#5 0x0000000000406c67 in gearmand_queue_libdrizzle_init (gearmand=0x624fc0, conf=0x7fffffffd980) at libgearman-server/queue_libdrizzle.c:319

Andrew Hutchings (linuxjedi) wrote :

Split off the loop Brian reported into bug #728990

Andrew Hutchings (linuxjedi) wrote :

Brian: let me know if the fix for bug #728990 also fixes this (I think it probably does).

summary: - “bad packet”
+ drizzle_state_packet_read:bad packet number
Andrew Hutchings (linuxjedi) wrote :

Setting to Opinion because I believe this should have gone away now.

Changed in drizzle:
status: Confirmed → Opinion
Mike (thatman) wrote :

I have this issue while trying to use gearmand. What information can I provide to help resolve this issue?

Andrew Hutchings (linuxjedi) wrote :

Hi Mike,

What version of libdrizzle are you using?

Mike (thatman) wrote :

Hi Andrew -

I've tried it via the drizzle-dev repo (which is my current installation, see below) as well as from source (drizzle7-2011.08.25). Same error.

If you think it would be helpful to the drizzle project, I can share my screen (via Teamviewer) and help you try to find out what's causing this to ensure it gets properly resolved.

Name : libdrizzle
Arch : x86_64
Version : 2011.08.23
Release : 1.el5
Size : 84 k
Repo : installed
Summary : Drizzle Client & Protocol Library
URL : http://launchpad.net/drizzle
License : BSD and Public Domain
Description: libdrizzle is the the client and protocol library for the Drizzle project. The
           : server, drizzled, will use this as for the protocol library, as well as the
           : client utilities and any new projects that require low-level protocol
           : communication (like proxies). Other language interfaces (PHP extensions, SWIG,
           : ...) should be built off of this interface.

Name : libdrizzle-devel
Arch : x86_64
Version : 2011.08.23
Release : 1.el5
Size : 137 k
Repo : installed
Summary : Drizzle Client & Protocol Library - Header Files
URL : http://launchpad.net/drizzle
License : GPLv2 and BSD
Description: Development files for the Drizzle Client & Protocol Library.

Andrew Hutchings (linuxjedi) wrote :

OK, this version should have all the fixes in, so the fact that you are getting this is not good. Can you provide a test case that reproduces this?

Can you also provide the exact error message you are seeing? The numbers at the end will help to diagnose this.

Mike (thatman) wrote : Re: [Bug 528410] Re: drizzle_state_packet_read:bad packet number

I'm using it with gearman. Here's the full error. Happy to help debug
and give you access to the box (it's isolated from production stuff)
if you think it'll help.

ERROR [ main ] drizzle_row_buffer:drizzle_state_packet_read:bad packet number:4:110 -> libgearman-server/plugins/queue/drizzle/queue.cc:552

Here's an strace:
fcntl(19, F_GETFL) = 0 (flags O_RDONLY)
fcntl(19, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
epoll_ctl(16, EPOLL_CTL_ADD, 19, {EPOLLIN, {u32=19, u64=19}}) = 0
mmap(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_32BIT, -1, 0) = 0x41ab3000
mprotect(0x41ab3000, 4096, PROT_NONE) = 0
clone(child_stack=0x424b3240, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x424b39d0, tls=0x424b3940, child_tidptr=0x424b39d0) = 19398
clock_gettime(CLOCK_MONOTONIC, {739459, 702088264}) = 0
clock_gettime(CLOCK_MONOTONIC, {739459, 702158264}) = 0
getuid() = 0
geteuid() = 0
getgid() = 0
getegid() = 0
epoll_create(32000) = 21
fcntl(21, F_SETFD, FD_CLOEXEC) = 0
socketpair(PF_FILE, SOCK_STREAM, 0, [22, 23]) = 0
fcntl(22, F_SETFD, FD_CLOEXEC) = 0
fcntl(23, F_SETFD, FD_CLOEXEC) = 0
fcntl(22, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
getuid() = 0
geteuid() = 0
getgid() = 0
getegid() = 0
pipe([24, 25]) = 0
fcntl(24, F_GETFL) = 0 (flags O_RDONLY)
fcntl(24, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
epoll_ctl(21, EPOLL_CTL_ADD, 24, {EPOLLIN, {u32=24, u64=24}}) = 0
mmap(NULL, 10489856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_32BIT, -1, 0) = 0x424b4000
mprotect(0x424b4000, 4096, PROT_NONE) = 0
clone(child_stack=0x42eb4240, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x42eb49d0, tls=0x42eb4940, child_tidptr=0x42eb49d0) = 19399
clock_gettime(CLOCK_MONOTONIC, {739459, 703731264}) = 0
clock_gettime(CLOCK_MONOTONIC, {739459, 703805264}) = 0
getuid() = 0
geteuid() = 0
getgid() = 0
getegid() = 0
epoll_create(32000) = 26
fcntl(26, F_SETFD, FD_CLOEXEC) = 0
socketpair(PF_FILE, SOCK_STREAM, 0, [27, 28]) = 0
fcntl(27, F_SETFD, FD_CLOEXEC) = 0
fcntl(28, F_SETFD, FD_CLOEXEC) = 0
fcntl(27, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
getuid() = 0
geteuid() = 0
getgid() = 0
getegid() = 0
pipe([29, 30]) = 0
fcntl(29, F_GETFL) = 0 (flags O_RDONLY)
fcntl(29, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
epoll_ctl(26, EPOLL_CTL_ADD, 29, {EPOLLIN, {u32=29, u64=29}}) = 0
mmap(NULL, 10489856, PROT_READ|PROT_WRI...

Andrew Hutchings (linuxjedi) wrote :

Hi Mike,

Many thanks for the data.

Looking at the gearman code, I believe the problem is that gearman executes a query with 4-5 columns, skips one of them (which is itself broken in the GA release but should be fixed in the latest release you also tried), and then tries to receive the rest of the columns as rows, which makes a big mess of things.

I'll try to reproduce this locally and, if I am correct, create a patch that hardens libdrizzle against this situation. I will also suggest a patch to gearman.
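
The failure mode described in this comment can be illustrated with a toy model. The sketch below assumes a result set arrives as a run of column packets, an end-of-columns marker, a run of row packets, and an end-of-rows marker; a consumer that skips only some of the column packets and then starts reading "rows" ends up treating leftover column packets as row data. The packet kinds, the reader type, and the function names are all hypothetical and do not come from libdrizzle or gearman.

```c
#include <assert.h>
#include <stddef.h>

/* Toy packet kinds, in the order a result set is assumed to arrive. */
typedef enum { PKT_COLUMN, PKT_COLUMN_END, PKT_ROW, PKT_ROW_END } pkt_kind;

typedef struct {
    const pkt_kind *stream; /* packets in wire order */
    size_t len;             /* number of packets in the stream */
    size_t pos;             /* index of the next unread packet */
} toy_reader;

/* Consume the next packet, whatever its kind. Returns 1 on success and
 * 0 when the stream is exhausted. */
int toy_next(toy_reader *r, pkt_kind *out)
{
    assert(r != NULL && out != NULL);
    if (r->pos >= r->len)
        return 0;
    *out = r->stream[r->pos++];
    return 1;
}

/* Model of the buggy consumer: skip `skipped` packets, then treat every
 * packet up to PKT_ROW_END as a row. Returns how many of those "rows"
 * were not row packets at all. */
size_t count_misread_rows(const pkt_kind *stream, size_t len, size_t skipped)
{
    toy_reader r = { stream, len, 0 };
    pkt_kind k;
    size_t misread = 0;

    for (size_t i = 0; i < skipped && toy_next(&r, &k); i++) {
        /* discard column packets the caller asked to skip */
    }
    while (toy_next(&r, &k) && k != PKT_ROW_END) {
        if (k != PKT_ROW)
            misread++;
    }
    return misread;
}
```

With four column packets in the stream, skipping one leaves three column packets plus the end-of-columns marker to be misread as rows; skipping all five leading packets leaves none. In a real connection the misread bytes would then be interpreted as packet headers and row fields, which fits the "big mess" described above.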

Andrew Hutchings (linuxjedi) wrote :

Confirmed my above analysis

Changed in drizzle:
status: Opinion → Triaged
Changed in drizzle:
status: Triaged → In Progress
Changed in drizzle:
status: In Progress → Fix Committed
Andrew Hutchings (linuxjedi) wrote :

I have provided a fix for libdrizzle for this and proposed it for merging. This adds the function drizzle_column_skip_all and makes it so that you cannot retrieve rows until all columns have been pulled out of the network buffer.

In Mike's case this also requires bug #843782 to be fixed for gearman.
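
The behavior described in this comment, refusing row reads until every column has been consumed and offering a way to skip all remaining columns at once, could look roughly like the following. This is only a sketch of the idea using a counter-based toy result object; the type, the return codes, and the toy_* functions are hypothetical and are not the signatures or implementation of drizzle_column_skip_all or any other libdrizzle function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes for the toy API. */
enum { TOY_OK = 0, TOY_COLUMNS_PENDING = -1 };

/* Toy result state: only the counters needed to show the guard. */
typedef struct {
    size_t column_count; /* columns announced for this result */
    size_t columns_read; /* column packets consumed so far */
    size_t rows_read;    /* rows handed to the caller so far */
} toy_result;

/* Consume one column packet if any remain. */
int toy_column_skip(toy_result *res)
{
    assert(res != NULL);
    if (res->columns_read < res->column_count)
        res->columns_read++;
    return TOY_OK;
}

/* Consume every remaining column packet. */
int toy_column_skip_all(toy_result *res)
{
    assert(res != NULL);
    while (res->columns_read < res->column_count)
        toy_column_skip(res);
    return TOY_OK;
}

/* Hand out a row only once all column packets are out of the way;
 * otherwise report an error and leave the state untouched. */
int toy_row_read(toy_result *res)
{
    assert(res != NULL);
    if (res->columns_read < res->column_count)
        return TOY_COLUMNS_PENDING;
    res->rows_read++;
    return TOY_OK;
}
```

The design choice is to fail early with a clear error rather than let the caller misparse the stream: a caller that skips one column of four gets TOY_COLUMNS_PENDING from toy_row_read, and succeeds only after toy_column_skip_all.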

Mike (thatman) wrote :

Andrew - Wow. Thank you for the fast fix. Very cool! Fingers crossed the gearman folks can do the same =)

Eric Day (eday)
Changed in libdrizzle:
assignee: Eric Day (eday) → nobody
Andy Nemzek (andyn-j) wrote :

Hello,

I just installed Drizzle 7 so I could use it with gearman and I get the same error. When will this fix be available?

If I'm reading this correctly, it sounds like gearman still won't work until they resolve their half of the bug...is this correct?

Thanks!

Andy Nemzek (andyn-j) wrote :

FYI - I was trying to set up Gearman to use a MySQL persistent queue when I was getting this error. I installed Drizzle 2011.08.25 in order to get libdrizzle because getting libdrizzle as a standalone component no longer appears possible.

However, I uninstalled Drizzle 2011.08.25 and installed Drizzle7 2011.03.13 instead and now everything appears to be working fine. So something between those two builds seems to have broken the interface to Gearman. I mention this because it was implied above that the fix implemented in Drizzle needs a corresponding fix in Gearman. Perhaps that is not the case if Gearman is working fine with 2011.03.13?

Andrew Hutchings (linuxjedi) wrote :

This particular bug should also affect 2011.03.13. It mostly depends on the number of columns in a result set and remaining packet data. Sometimes it will be fine, other times you will get a packet sequence error or a hang.

The reason for this was a function Gearman uses was not implemented correctly or as designed. There are ways of avoiding this function completely to achieve the same goal.

Vijay Samuel (vjsamuel)
Changed in drizzle:
status: Fix Committed → Fix Released