ElasticSearch struggles with pagination on results over 10000

Bug #1783447 reported by Robert Lyon
This bug affects 1 person
Affects: Mahara
Status: Confirmed
Importance: Medium
Assigned to: Unassigned
Milestone: none

Bug Description

Elasticsearch throws errors like the following when a user paginates deep into a large result set:

[2018-07-25T11:53:45,582][DEBUG][o.e.a.s.TransportSearchAction] ***** Failed to execute [SearchRequest{searchType=QUERY_THEN_FETCH, indices=[websearch-mahara-prod], indicesOptions=IndicesOptions[id=38, ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=15, batchedReduceSize=512, preFilterShardSize=128, allowPartialSearchResults=true, source={"from":11460,"size":10,"query":{"bool":{"must":[{"match":{"catch_all":{"query":"mahara","operator":"OR","prefix_length":0,"max_expansions":50,"fuzzy_transpositions":true,"lenient":false,"zero_terms_query":"NONE","auto_generate_synonyms_phrase_query":true,"boost":1.0}}}],"filter":[{"bool":{"should":[{"term":{"access.general":{"value":"public","boost":1.0}}}],"adjust_pure_negative":true,"boost":1.0}},{"term":{"secfacetterm":{"value":"Forumpost","boost":1.0}}},{"term":{"mainfacetterm":{"value":"Text","boost":1.0}}}],"adjust_pure_negative":true,"boost":1.0}},"sort":[{"_score":{"order":"desc"}},{"_score":{"order":"desc"}}],"highlight":{"pre_tags":["<span class=\"search-highlight\">"],"post_tags":["</span>"],"fragment_size":100,"number_of_fragments":2,"require_field_match":false,"fields":{"description":{}}}}}]

Exception: Result window is too large, from + size must be less than or equal to: [10000] but was [11470]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.

This happens because a search like 'mahara' returns more than 10000 rows, so paginating to the last results requests a window (from + size) beyond the limit and Elasticsearch throws the error.
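The check behind the exception can be reproduced with simple arithmetic. The following is a minimal sketch, not Mahara's or Elasticsearch's actual code; the function name `exceeds_result_window` is hypothetical, and the 10000 limit is the value shown in the error message above.

```python
# Limit taken from the error message ("must be less than or equal to: [10000]").
MAX_RESULT_WINDOW = 10000


def exceeds_result_window(offset: int, size: int,
                          max_window: int = MAX_RESULT_WINDOW) -> bool:
    """Return True if a from/size pair asks for rows beyond the result window.

    `offset` corresponds to the "from" value in the search request.
    """
    return offset + size > max_window


# The failing request from the log used from=11460 and size=10,
# so from + size = 11470, which is above the limit.
print(exceeds_result_window(11460, 10))
# A request ending exactly at the limit is still allowed.
print(exceeds_result_window(9990, 10))
```

Running such a check before sending the query would let the application show a friendly message instead of surfacing the exception.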

It would be better to alert the user that there are more than 10000 records, ask them to refine their search, and only offer pagination for the first 10000 results.

Kristina Hoeppner (kris-hoeppner) wrote :

Idea: We show a set number of pages in the paginator and, if there are more results, display a message along the lines of "There are more search results. Please refine your search." That way nobody has to click through all the results just to reach the next set of paginator links. We would cap the displayed results at 1,000.
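One way the capped paginator could work is sketched below. This is only an illustration under stated assumptions: the names `Pagination` and `capped_pagination` are hypothetical, the page size of 10 matches the "size" in the logged request, and the total of 11470 hits in the usage lines is an invented figure, since the report does not state the real count.

```python
import math
from dataclasses import dataclass


@dataclass
class Pagination:
    last_page: int             # highest 1-based page number the paginator offers
    show_refine_message: bool  # True when results exist beyond the cap


def capped_pagination(total_hits: int, page_size: int = 10,
                      max_results: int = 1000) -> Pagination:
    """Limit the paginator to the first `max_results` hits.

    Assumes 1 <= page_size <= max_results. The default cap of 1000 follows
    the suggestion in the comment above; pass 10000 to match the
    Elasticsearch result window instead.
    """
    if page_size < 1 or page_size > max_results:
        raise ValueError("page_size must be between 1 and max_results")
    if total_hits > max_results:
        # Round down so the last page never reaches past the cap.
        last_page = max_results // page_size
    else:
        last_page = math.ceil(total_hits / page_size)
    # Always offer at least one page, even for an empty result set.
    return Pagination(max(1, last_page), total_hits > max_results)


# Hypothetical total of 11470 hits, 10 per page, capped at 1000 results.
print(capped_pagination(11470))
# A small result set needs no cap and no message.
print(capped_pagination(95))
```

When `show_refine_message` is true, the page template would render the "Please refine your search" text after the last offered page.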

Fun fact: We did a search on Google for "testing", saw over 1 bn results, but ended up with only 6 paginated pages and couldn't actually browse through all the results.

Changed in mahara:
milestone: 18.10.0 → 19.04.0
Changed in mahara:
milestone: 19.04.0 → 19.10.0
Changed in mahara:
milestone: 19.10.0 → 20.04.0
Changed in mahara:
assignee: nobody → Cecilia Vela Gurovic (ceciliavg)
status: Confirmed → In Progress
Robert Lyon (robertl-9)
Changed in mahara:
milestone: 20.04.0 → 20.10.0
Changed in mahara:
milestone: 20.10.0 → 21.04.0
Changed in mahara:
assignee: Cecilia Vela Gurovic (ceciliavg) → nobody
milestone: 21.04.0 → 21.10.0
status: In Progress → Confirmed
Robert Lyon (robertl-9)
Changed in mahara:
milestone: 21.10.0 → 22.04.0
Changed in mahara:
milestone: 22.04.0 → 22.10.0
tags: added: elasticsearch
Changed in mahara:
milestone: 22.10.0 → none