CLIENT_STATISTICS are broken if hostname is > 16 chars

Bug #338012 reported by Neil Katin
20
Affects Status Importance Assigned to Milestone
OurDelta
Fix Released
Medium
Unassigned
Percona patches
Fix Released
Medium
Unassigned

Bug Description

I noticed today that the CLIENT_STATISTICS tables were not working properly if the hostnames are longer
than 16 characters. I am running these packages from the ourdelta repository (centos5.2):

MySQL-OurDelta-client.x86_64 5.0.67.d7-44.el5_2 installed
MySQL-OurDelta-server.x86_64 5.0.67.d7-44.el5_2 installed
MySQL-OurDelta-shared.x86_64 5.0.67.d7-44.el5_2 installed

mysql> select count(*) from client_statistics;
+----------+
| count(*) |
+----------+
| 1485697 |
+----------+
1 row in set (1.62 sec)

mysql> select client from client_statistics limit 20;
+------------------+
| client |
+------------------+
| gonzo17i.bunchba |
| |
| gonzo23i.bunchba |
| gonzo20i.bunchba |
| gonzo20i.bunchba |
| gonzo23i.bunchba |
| gonzo20i.bunchba |
| gonzo20i.bunchba |
| gonzo17i.bunchba |
| gonzo17i.bunchba |
| gonzo20i.bunchba |
| gonzo23i.bunchba |
| gonzo20i.bunchba |
| gonzo17i.bunchba |
| gonzo23i.bunchba |
| gonzo20i.bunchba |
| gonzo17i.bunchba |
| gonzo20i.bunchba |
| gonzo15i.bunchba |
| gonzo17i.bunchba |
+------------------+
20 rows in set (1.36 sec)

As you can see, there are lots of duplicate rows. The heart of the problem seems to be this
code from the google patch:

ourdelta-mybranch/mysql/5.0/google/userstatsv2/userstatsv2-main.patch.

Client name lengths are set to LIST_PROCESS_HOST_LEN (64), but the CLIENT column is set to 16 characters.
This means that there will be rows in client stats table that will be distinct, but indistinguishable in the show client_statistics output. The CLIENT column should probably have the same length (its set in this line:

+ {"CLIENT", 16, MYSQL_TYPE_STRING, 0, 0, "Client"},

But the underlying problem seems to be in a different place.

There is a routine to initialize the statistics:

+void init_global_client_stats(void)
+{
+ if (hash_init(&global_client_stats, system_charset_info, max_connections,
+ 0, 0, (hash_get_key)get_key_user_stats,
+ (hash_free_key)free_user_stats, 0)) {
+ sql_print_error("Initializing global_client_stats failed.");
+ exit(1);
+ }
+}

The get_key_user_stats is shared between the global_user_stats and global_client_stats hash tables.
Here is the definition of that function:

+byte *get_key_user_stats(USER_STATS *user_stats, uint *length,
+ my_bool not_used __attribute__((unused)))
+{
+ *length = strlen(user_stats->user);
+ return (byte*)user_stats->user;
+}

The "user_name" field is being shared between user and client statistics. This field is limited to 16 characters. But when a hash match is being looked for, we are comparing against a 64 char host name, and we are always inserting a new row.

Revision history for this message
Mark Callaghan (mdcallag) wrote : Re: [Ourdelta-developers] [Bug 338012] [NEW] CLIENT_STATISTICS are broken if hostname is > 16 chars
Download full text (6.7 KiB)

On Wed, Mar 4, 2009 at 3:35 PM, Neil Katin <email address hidden> wrote:

> Public bug reported:
>
>
> I noticed today that the CLIENT_STATISTICS tables were not working properly
> if the hostnames are longer
> than 16 characters. I am running these packages from the ourdelta
> repository (centos5.2):
>
> MySQL-OurDelta-client.x86_64 5.0.67.d7-44.el5_2 installed
> MySQL-OurDelta-server.x86_64 5.0.67.d7-44.el5_2 installed
> MySQL-OurDelta-shared.x86_64 5.0.67.d7-44.el5_2 installed
>
> mysql> select count(*) from client_statistics;
> +----------+
> | count(*) |
> +----------+
> | 1485697 |
> +----------+
> 1 row in set (1.62 sec)
>
> mysql> select client from client_statistics limit 20;
> +------------------+
> | client |
> +------------------+
> | gonzo17i.bunchba |
> | |
> | gonzo23i.bunchba |
> | gonzo20i.bunchba |
> | gonzo20i.bunchba |
> | gonzo23i.bunchba |
> | gonzo20i.bunchba |
> | gonzo20i.bunchba |
> | gonzo17i.bunchba |
> | gonzo17i.bunchba |
> | gonzo20i.bunchba |
> | gonzo23i.bunchba |
> | gonzo20i.bunchba |
> | gonzo17i.bunchba |
> | gonzo23i.bunchba |
> | gonzo20i.bunchba |
> | gonzo17i.bunchba |
> | gonzo20i.bunchba |
> | gonzo15i.bunchba |
> | gonzo17i.bunchba |
> +------------------+
> 20 rows in set (1.36 sec)
>
>
> As you can see, there are lots of duplicate rows. The heart of the problem
> seems to be this
> code from the google patch:
>
> ourdelta-mybranch/mysql/5.0/google/userstatsv2/userstatsv2-main.patch.
>
> Client name lengths are set to LIST_PROCESS_HOST_LEN (64), but the CLIENT
> column is set to 16 characters.
> This means that there will be rows in client stats table that will be
> distinct, but indistinguishable in the show client_statistics output. The
> CLIENT column should probably have the same length (its set in this line:
>
> + {"CLIENT", 16, MYSQL_TYPE_STRING, 0, 0, "Client"},
>
> But the underlying problem seems to be in a different place.
>
> There is a routine to initialize the statistics:
>
> +void init_global_client_stats(void)
> +{
> + if (hash_init(&global_client_stats, system_charset_info,
> max_connections,
> + 0, 0, (hash_get_key)get_key_user_stats,
> + (hash_free_key)free_user_stats, 0)) {
> + sql_print_error("Initializing global_client_stats failed.");
> + exit(1);
> + }
> +}
>
> The get_key_user_stats is shared between the global_user_stats and
> global_client_stats hash tables.
> Here is the definition of that function:
>
> +byte *get_key_user_stats(USER_STATS *user_stats, uint *length,
> + my_bool not_used __attribute__((unused)))
> +{
> + *length = strlen(user_stats->user);
> + return (byte*)user_stats->user;
> +}
>
> The "user_name" field is being shared between user and client
> statistics. This field is limited to 16 characters. But when a hash
> match is being looked for, we are comparing against a 64 char host name,
> and we are always inserting a new row.
>
> ** Affects: ourdelta
> Importance: Undecided
> Status: New
>
> --
> CLIENT_STATISTICS are broken if hostname is > 16 chars
> https://bugs.launchpad.net/bugs/338012
> You re...

Read more...

Revision history for this message
Mark Callaghan (mdcallag) wrote :

On Wed, Mar 4, 2009 at 3:51 PM, MARK CALLAGHAN <email address hidden> wrote:

>
>
> That is one of several bugs I fixed in the Google patch to be published
> real soon now. I think the way to get this is for me to publish the patch
> and then make it easy for Percona to find the fixes.
>
> --
> Mark Callaghan
> <email address hidden>
>

Well, I fixed this and then I removed all code for SHOW CLIENT_STATISTICS
from the next patch. We were not using it enough to justify the cost of
having that code there. SHOW INDEX_STATISTICS has been removed as well. SHOW
TABLE_STATS and SHOW USER_STATS remain with changes to fix the mutex
contention that they used to create. I don't know what Percona will do with
that. They may be bigger fans of SHOW INDEX_STATS than I was.

--
Mark Callaghan
<email address hidden>

Revision history for this message
Weldon Whipple (weldon) wrote :

On Thu, Mar 5, 2009 at 8:15 AM, MARK CALLAGHAN <email address hidden> wrote:
<snip/>
>
> Well, I fixed this and then I removed all code for SHOW CLIENT_STATISTICS
> from the next patch.

We don't use SHOW CLIENT_STATISTICS at our place.

> We were not using it enough to justify the cost of
> having that code there. SHOW INDEX_STATISTICS has been removed as well.

We don't use SHOW INDEX_STATISTICS either.

> SHOW TABLE_STATS and SHOW USER_STATS remain with changes to fix the mutex
> contention that they used to create.

We *do* use SHOW USER_STATISTICS (but haven't used SHOW
TABLE_STATISTICS yet ...)
We modified our patch so that SHOW USER_STATISTICS displays Cpu-time
as a DOUBLE (since it is already a double internally). It also shows
Rows_examined (in THD as examined_row_count and
diff_total_examined_rows) for each user ...

I'm not exactly certain about the origins of our current userstat
patch. (My boss gave it to me back in November--when he hired me). I
*think* it is an original Google patch, although some of our boxen are
running a Percona-derived patch ...)

Weldon Whipple
<email address hidden>

Revision history for this message
Arjen Lentz (arjen-lentz) wrote : Re: [Ourdelta-developers] [Bug 338012] [NEW] CLIENT_STATISTICS are broken if hostname is > 16 chars

Hi Mark,

On 06/03/2009, at 1:15 AM, MARK CALLAGHAN wrote:
> Well, I fixed this and then I removed all code for SHOW
> CLIENT_STATISTICS from the next patch. We were not using it enough
> to justify the cost of having that code there. SHOW INDEX_STATISTICS
> has been removed as well. SHOW TABLE_STATS and SHOW USER_STATS
> remain with changes to fix the mutex contention that they used to
> create. I don't know what Percona will do with that. They may be
> bigger fans of SHOW INDEX_STATS than I was.

INDEX_STATISTICS allows us to find unused indexes, and that's very
valuable.
Don't care for the SHOW commands as such, in INFORMATION_SCHEMA
they're much more useful as they can be joined onto other tables to do
the unused-index detection.

Don't think OurDelta will be removing that, it's specifically why we
included it.

Cheers,
Arjen.
--
Arjen Lentz, Director @ Open Query (http://openquery.com)
Affordable Training and Pro-Active Support for MySQL & related
technologies

My blog is at http://arjen-lentz.livejournal.com
OurDelta: free enhanced builds for MySQL @ http://ourdelta.org

Revision history for this message
Mark Callaghan (mdcallag) wrote : Re: [Ourdelta-developers] [Bug 338012] [NEW] CLIENT_STATISTICS are broken if hostname is > 16 chars
Download full text (4.8 KiB)

On Thu, Mar 5, 2009 at 1:06 PM, Arjen Lentz <email address hidden> wrote:
> Hi Mark,
>
> On 06/03/2009, at 1:15 AM, MARK CALLAGHAN wrote:
>> Well, I fixed this and then I removed all code for SHOW
>> CLIENT_STATISTICS from the next patch. We were not using it enough
>> to justify the cost of having that code there. SHOW INDEX_STATISTICS
>> has been removed as well. SHOW TABLE_STATS and SHOW USER_STATS
>> remain with changes to fix the mutex contention that they used to
>> create. I don't know what Percona will do with that. They may be
>> bigger fans of SHOW INDEX_STATS than I was.
>
>
> INDEX_STATISTICS allows us to find unused indexes, and that's very
> valuable.
> Don't care for the SHOW commands as such, in INFORMATION_SCHEMA
> they're much more useful as they can be joined onto other tables to do
> the unused-index detection.
>
> Don't think OurDelta will be removing that, it's specifically why we
> included it.
>

Then you have inherited a mutex contention performance bug. I fixed it
by removing the feature. There are other ways to fix it that require
more code changes. But it needs to be fixed.

>
> Cheers,
> Arjen.
> --
> Arjen Lentz, Director @ Open Query (http://openquery.com)
> Affordable Training and Pro-Active Support for MySQL & related
> technologies
>
> My blog is at http://arjen-lentz.livejournal.com
> OurDelta: free enhanced builds for MySQL @ http://ourdelta.org
>
> --
> CLIENT_STATISTICS are broken if hostname is > 16 chars
> https://bugs.launchpad.net/bugs/338012
> You received this bug notification because you are a member of OurDelta-
> developers, which is the registrant for OurDelta.
>
> Status in OurDelta - Builds for MySQL: New
>
> Bug description:
>
> I noticed today that the CLIENT_STATISTICS tables were not working properly if the hostnames are longer
> than 16 characters.  I am running these packages from the ourdelta repository (centos5.2):
>
> MySQL-OurDelta-client.x86_64             5.0.67.d7-44.el5_2     installed
> MySQL-OurDelta-server.x86_64             5.0.67.d7-44.el5_2     installed
> MySQL-OurDelta-shared.x86_64             5.0.67.d7-44.el5_2     installed
>
> mysql> select count(*) from client_statistics;
> +----------+
> | count(*) |
> +----------+
> |  1485697 |
> +----------+
> 1 row in set (1.62 sec)
>
> mysql> select client from client_statistics limit 20;
> +------------------+
> | client           |
> +------------------+
> | gonzo17i.bunchba |
> |                  |
> | gonzo23i.bunchba |
> | gonzo20i.bunchba |
> | gonzo20i.bunchba |
> | gonzo23i.bunchba |
> | gonzo20i.bunchba |
> | gonzo20i.bunchba |
> | gonzo17i.bunchba |
> | gonzo17i.bunchba |
> | gonzo20i.bunchba |
> | gonzo23i.bunchba |
> | gonzo20i.bunchba |
> | gonzo17i.bunchba |
> | gonzo23i.bunchba |
> | gonzo20i.bunchba |
> | gonzo17i.bunchba |
> | gonzo20i.bunchba |
> | gonzo15i.bunchba |
> | gonzo17i.bunchba |
> +------------------+
> 20 rows in set (1.36 sec)
>
>
> As you can see, there are lots of duplicate rows.  The heart of the problem seems to be this
> code from the google patch:
>
> ourdelta-mybranch/mysql/5.0/google/userstatsv2/userstatsv2-main.patch.
>
> Client name lengths are set to LIST_PROCESS_HOST_LEN (64), but th...

Read more...

Revision history for this message
Arjen Lentz (arjen-lentz) wrote : Re: [Ourdelta-developers] [Bug 338012] [NEW] CLIENT_STATISTICS are broken if hostname is > 16 chars

Hi Mark,

On 06/03/2009, at 7:18 AM, MARK CALLAGHAN wrote:
>> INDEX_STATISTICS allows us to find unused indexes, and that's very
>> valuable.
>> Don't care for the SHOW commands as such, in INFORMATION_SCHEMA
>> they're much more useful as they can be joined onto other tables to
>> do
>> the unused-index detection.
>>
>> Don't think OurDelta will be removing that, it's specifically why
>> we included it.
>
> Then you have inherited a mutex contention performance bug. I fixed it
> by removing the feature. There are other ways to fix it that require
> more code changes. But it needs to be fixed.

Well in the Percona/OurDelta version, there's a server param to turn
it all on/off.
So that removes the performance/contention issue most of the time, but
allows us to get the stats when wanted.
Of course it'd be even nicer to improve the code itself also, as we
won't want too much contention when it's turned on for a bit.

Cheers,
Arjen.
--
Arjen Lentz, Director @ Open Query (http://openquery.com)
Affordable Training and Pro-Active Support for MySQL & related
technologies

My blog is at http://arjen-lentz.livejournal.com
OurDelta: free enhanced builds for MySQL @ http://ourdelta.org

Changed in ourdelta:
importance: Undecided → Medium
status: New → Confirmed
status: Confirmed → Fix Committed
Revision history for this message
Vadim Tkachenko (vadim-tk) wrote :

I removed CLIENT_STATISTICS from 5.0.84 patches

Revision history for this message
Arjen Lentz (arjen-lentz) wrote : Re: [Ourdelta-developers] [Bug 338012] Re: CLIENT_STATISTICS are broken if hostname is > 16 chars

Hi Vadim

On 29/07/2009, at 5:45 PM, Vadim Tkachenko wrote:
> I removed CLIENT_STATISTICS from 5.0.84 patches

Why? There's a fix for this one.
See https://code.launchpad.net/~d-launchpad-askneil-com/ourdelta/
ourdelta-fix-bug338012
Was attached to the bugreport.

There are actually specific requests to port the *_STATISTICS features
to 5.1
Let's process the bugreports and contributed fixes, rather than
ripping it out?
Thanks

Cheers,
Arjen.
--
Arjen Lentz, Director @ Open Query (http://openquery.com)
Exceptional Services for MySQL at a fixed budget.

Follow our blog at http://openquery.com/blog/
OurDelta: free enhanced builds for MySQL @ http://ourdelta.org

Changed in percona-patches:
importance: Undecided → Medium
status: New → Fix Committed
Revision history for this message
Arjen Lentz (arjen-lentz) wrote :

Hi Yasufumi

On 11/08/2009, at 11:51 AM, Yasufumi Kinoshita wrote:
> ** Changed in: percona-patches Importance: Undecided => Medium
> ** Changed in: percona-patches Status: New => Fix Committed

Great, thanks for that.

Will make users much happier than removing the code ;-)

Cheers,
Arjen.
--
Arjen Lentz, Director @ Open Query (http://openquery.com)
Exceptional Services for MySQL at a fixed budget.

Follow our blog at http://openquery.com/blog/
OurDelta: free enhanced builds for MySQL @ http://ourdelta.org

Changed in percona-patches:
status: Fix Committed → Fix Released
Changed in ourdelta:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.