"Exact user search" doesn't work intuitively with names containing spaces

Bug #1612481 reported by Dmitrii Metelkin
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
| Mahara | Confirmed | Medium | Unassigned | |

Bug Description

Steps to replicate:

Make sure that the setting "Exact user searches" is set to Yes on admin/extensions/pluginconfig.php?plugintype=search&pluginname=internal

Create a user with a multi-word last name, e.g. Bla Bla Bla
Navigate to /admin/users/search.php
Type this into the search form:

 Bla Bla Bla

Expected result: User is in the search results, because you've typed their exact last name.
Actual result: User is not in the search results

Workaround: Enclose the firstname and/or lastname *by itself* in double quotes. For example, for the user Aaron von Wells, search for:

  Aaron "von Wells"

Robert Lyon (robertl-9) wrote :

The problem is that the Mahara code expects firstname and lastname to be one word each.

See searchlib.php

line 137:
      if (count($phraselist) == 2) {

and
line 198:
            foreach ($fullnames as $n) {
                $constraints[] = array(
                    'field' => 'firstname',
                    'type' => 'contains',
                    'string' => $n[0]
                );
                $constraints[] = array(
                    'field' => 'lastname',
                    'type' => 'contains',
                    'string' => $n[1]
                );
            }

So if you were to search for something like 'Schnitzel von Krumm', it would not work.

To fix this, we could check all combinations of the words as full names when there are more than 2 words, e.g.:

| Schnitzel von | Krumm |
| Schnitzel | von Krumm |
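
The table above can be generated mechanically. The following is a minimal Python sketch of the idea, not Mahara's PHP code; the function name `name_splits` is hypothetical and only illustrates how each split point in the query yields one firstname/lastname candidate.

```python
def name_splits(query):
    """Return every (firstname, lastname) split of a space-separated query."""
    words = query.split()
    # A split point after word i puts words[:i] in the first name and
    # words[i:] in the last name; both halves must be non-empty.
    return [
        (" ".join(words[:i]), " ".join(words[i:]))
        for i in range(1, len(words))
    ]


for first, last in name_splits("Schnitzel von Krumm"):
    print(f"| {first} | {last} |")
```

A query of n words has n - 1 such splits, so this variant stays cheap even for long names.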

Dmitrii Metelkin (dmitriim) wrote :

I think this code is executed only if we use a search plugin other than the internal one. I believe we should look at the admin_search_user function in search/internal/lib.php.

Kristina Hoeppner (kris-hoeppner) wrote :

If you have a multi-word last name (or first name), the exact search expects this to be within quotation marks. Otherwise, turn off the exact search, but then it usually finds all people starting with the first name.

The exact search is to prevent finding all "John" from "John Adams" to "John Thompson" when all you care about is "John Smith".

Dmitrii Metelkin (dmitriim) wrote :

The quotation marks don't work on admin search page :(

Robert Lyon (robertl-9) wrote :

I believe what Kristina was saying is correct: you just have to 'know' how the data is saved.

E.g. if you have a user called 'Schnitzel von Krumm' where 'von Krumm' is the lastname, then you need to search for:

Schnitzel "von Krumm"

so that the search query splits the three words into two strings to search. It is all about knowing where to put the quotation marks.

Changed in mahara:
importance: Undecided → Medium
status: New → Confirmed
Aaron Wells (u-aaronw)
description: updated
summary: - In some cases users are out of the search result on admin page
+ "Exact user search" doesn't work intuitively with names containing
+ spaces
description: updated
Aaron Wells (u-aaronw) wrote :

Hi Dmitrii,

It worked for me on the admin page. I'll use [square brackets] to enclose strings below, to try to avoid confusion when I'm talking about strings that contain double quotes.

1. Clean install of Mahara
2. Log in as admin
3. Go to Content -> Profile and change my first name to: [Ad min]
4. Go to the Administration -> Extensions -> Search/Internal config page and verify that exact search is on.
5. Go to Administration -> Users -> User search (admin/users/search.php) and search for ["Ad min"]

Result: The admin user shows up in the search results.

I agree that it's still bad usability, though. I think many users would expect [ad min user] or ["ad min user"] to work, but in fact you have to enclose the first name and/or last name *by itself* in quote marks, which I would probably not have thought of doing, and there isn't even a help bubble on the page to explain that we support the quote mark search operator.

Currently the behavior is this:

1. PluginSearchInternal::split_query_string() splits the search query into search terms. Any quote-enclosed strings become a search term. The remaining non-quoted text is split up by space characters into separate search terms.

2. PluginSearchInternal::match_user_field_expression() writes a SQL expression to compare each applicable database field to each search term.

2a. If exact search is on, it does an exact match: lower(<field>) = lower(<search term>)

2b. If exact search is off, it does a wildcard match: lower(<field>) like lower('%<search term>%')
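
To make the two steps above concrete, here is a rough Python sketch of the described behavior. It is not the actual PluginSearchInternal code: `split_query` and `match_expression` are hypothetical stand-ins, and the regular expression is only one assumed way to implement "quoted strings become one term, everything else is split on spaces".

```python
import re

# Either a double-quoted phrase (group 1) or a run of non-space characters (group 2).
TERM_PATTERN = re.compile(r'"([^"]*)"|(\S+)')


def split_query(query):
    """Split a search query into terms, keeping quoted phrases together."""
    terms = []
    for quoted, bare in TERM_PATTERN.findall(query):
        term = quoted if quoted else bare
        if term:  # skip empty quotes such as ""
            terms.append(term)
    return terms


def match_expression(field, term, exact):
    """Return an (sql, parameter) pair comparing one field to one term."""
    if exact:
        return (f"lower({field}) = lower(?)", term)
    # Wildcard match: the term may appear anywhere inside the field.
    return (f"lower({field}) LIKE lower(?)", f"%{term}%")


print(split_query('Aaron "von Wells"'))
print(split_query("Bla Bla Bla"))
print(match_expression("lastname", "von Wells", exact=True))
```

With this tokenization, [Bla Bla Bla] becomes three separate one-word terms, which is why an exact comparison against a three-word lastname cannot succeed without quotes.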

So what Kristina said is not exactly true. An "exact search" for [Jon Smith] will actually pull up Jon Paul and Mike Smith. It just excludes Jonathan Smithson.
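
The Jon Smith example can be reproduced with a small in-memory Python sketch. It assumes, based only on the results described above, that a user matches when any search term matches either name field; the helper `user_matches` and the sample users are hypothetical, and the real plugin may compare further fields.

```python
def user_matches(user, terms, exact):
    """True if any term matches the user's firstname or lastname."""
    fields = [user["firstname"].lower(), user["lastname"].lower()]
    for term in terms:
        needle = term.lower()
        for value in fields:
            # Exact mode needs the whole field to equal the term;
            # otherwise a substring match is enough.
            hit = (value == needle) if exact else (needle in value)
            if hit:
                return True
    return False


users = [
    {"firstname": "Jon", "lastname": "Paul"},
    {"firstname": "Mike", "lastname": "Smith"},
    {"firstname": "Jonathan", "lastname": "Smithson"},
]

for user in users:
    print(user["firstname"], user["lastname"],
          user_matches(user, ["Jon", "Smith"], exact=True))
```

Under the same assumption, the terms [Bla], [Bla], [Bla] never equal the lastname "Bla Bla Bla", while the single quoted term [Bla Bla Bla] does.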

That said, it's tricky to come up with a way to support [ad min user] while still honoring the original exact search use case, which I guess is roughly "Each part of the search query must be an exact match to a part of the user's name". I can think of two ideas:

1. Robert's suggestion of adding search terms for all adjacent combinations of space-separated unquoted elements in the query. So [a b c] would yield these search terms: [a], [b], [c], [a b], [b c], and [a b c]. The downside is that the number of search terms grows quadratically with the number of words in the search query. Also, it may be tricky to handle partly-quoted search queries such as [a "b c" d], although that's a corner case.
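
As an illustration of idea 1, this Python sketch enumerates every run of adjacent words in an unquoted query. The name `adjacent_terms` is hypothetical, and the sketch deliberately ignores quoted parts of the query, which is the corner case mentioned above.

```python
def adjacent_terms(query):
    """Return every run of adjacent words, shortest runs first."""
    words = query.split()
    terms = []
    for length in range(1, len(words) + 1):
        # Slide a window of `length` words across the query.
        for start in range(len(words) - length + 1):
            terms.append(" ".join(words[start:start + length]))
    return terms


print(adjacent_terms("a b c"))
print(len(adjacent_terms("one two three four five six seven eight nine ten")))
```

For n words this produces n(n+1)/2 terms, so a ten-word query already expands to 55 search terms.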

2. Compare the entire search query against a concatenation of the firstname and lastname fields. So the SQL would be something like "u.firstname = <query> || u.lastname = <query> || u.firstname || u.lastname = <query> || u.lastname || u.firstname = <query>". (For quoted search terms, we could just remove the quote marks.) This would be hard to fit into the existing code, though, which handles each search field quite separately.

Aaron Wells (u-aaronw) wrote :

Hm, I just realized that I accidentally used "||" as both concatenation and OR in the above SQL snippet. What I meant to say was that if you wanted to compare the entire search query against the firstname, lastname, and a concatenation of them both (in both orders to account for internationalization), it'd be something like this:

u.firstname = <query>
OR u.lastname = <query>
OR u.lastname || ' ' || u.firstname = <query>
OR u.firstname || ' ' || u.lastname = <query>
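
The condition above can be tried out against an in-memory SQLite database. This is only a sketch under assumptions: the table name `usr`, its two columns, the sample rows, and the helper `find_users` are made up for illustration, and the `lower()` wrapping is borrowed from the exact-match expression described earlier rather than from the snippet itself.

```python
import sqlite3


def find_users(conn, query):
    """Match the whole query against firstname, lastname, or both joined."""
    sql = """
        SELECT u.firstname, u.lastname
        FROM usr u
        WHERE lower(u.firstname) = lower(:q)
           OR lower(u.lastname) = lower(:q)
           OR lower(u.lastname || ' ' || u.firstname) = lower(:q)
           OR lower(u.firstname || ' ' || u.lastname) = lower(:q)
        ORDER BY u.firstname, u.lastname
    """
    return conn.execute(sql, {"q": query}).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usr (firstname TEXT, lastname TEXT)")
conn.executemany(
    "INSERT INTO usr VALUES (?, ?)",
    [
        ("Schnitzel", "von Krumm"),
        ("Test", "Bla Bla Bla"),
        ("Jon", "Paul"),
        ("Mike", "Smith"),
        ("Jon", "Smith"),
    ],
)

print(find_users(conn, "Schnitzel von Krumm"))
print(find_users(conn, "Bla Bla Bla"))
print(find_users(conn, "Jon Smith"))
```

Because the whole query is compared at once, [Jon Smith] no longer pulls up Jon Paul or Mike Smith in this sketch, and multi-word names match without quotes.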
