paginated query is not properly handling sort keys with None value

Bug #1615938 reported by Qiming Teng
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
oslo.db
Fix Released
Medium
Mehdi Abaakouk

Bug Description

The intended use case for pagination (not sure if documented somewhere) is that with 'limit' and 'marker', users can control the starting point of each query and the number of records returned. In real life, the identity of a record is one of the primary keys for the db table, i.e. they are used to sort records, and to support pagination. This is the baseline for pagination to work properly.

However, in real use cases, there are requirements that the records are sorted by other columns which are not necessarily a primary key, so such a column could carry a value of None. When these columns are used for sorting, we will get problems when doing paginated query because all sort keys currently participate in the comparison.

For example, when we are sorting a table using the following parameters:

   sort_keys = ['created_at', 'id'] # id is the primary key to ensure pagination always works
   sort_dirs = ['desc-nullslast', 'asc']

If the 'created_at' column has a value of None, the following lines will break. The model_attr here could be the 'created_at', and the marker_value can be None, but None cannot be compared with a value using '<' or '>'.

 167 if sort_dirs[i].startswith('desc'):
 168 crit_attrs.append((model_attr < marker_values[i]))
 169 else:
 170 crit_attrs.append((model_attr > marker_values[i]))

So, in these cases, if the model_attr is nullable and table does contain None values, we will get exceptions here.

The solution proposed is to skip these comparisons if marker_values[i] is None. If we have a marker_value that is None, the that is the only column used for sorting (thus pagination), it is the user's fault. We cannot do reliable pagination with a nullable column only. If the sorting keys have a primary key in its combination to ensure unique/deterministic ordering, we can safely skip the comparison on line 168 and 170.

Qiming Teng (tengqim)
Changed in oslo.db:
assignee: nobody → Qiming Teng (tengqim)
Changed in oslo.db:
status: New → Confirmed
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.db 4.17.0

This issue was fixed in the openstack/oslo.db 4.17.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.db 4.13.6

This issue was fixed in the openstack/oslo.db 4.13.6 release.

Revision history for this message
Mehdi Abaakouk (sileht) wrote :

https://github.com/openstack/oslo.db/commit/b3869d04cff7071c1226758eb8b58fde9eba5b8d

The solution implemented doesn't really work as expected.

it have replaced

(id > MARKER_ID) or (id == MARKER_ID && updated_at > None)

with

(id > MARKER_ID)

But in more complicated case we still have issue with Null:

(id > MARKER_ID) or (id == MARKER_ID && updated_at > None) or (id == MARKER_ID && updated_at == None && revision > MARKER_REVISION)

become

(id > MARKER_ID) or (id == MARKER_ID && updated_at == None && revision > MARKER_REVISION)

So if my row is

<uuid> ; null ; -1
<uuid> ; <date> ; 0
<uuid> ; <bigger-date> ; 1

I don't want to filter update_at != None.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.db (master)

Fix proposed to branch: master
Review: https://review.openstack.org/502942

Changed in oslo.db:
assignee: Qiming Teng (tengqim) → Mehdi Abaakouk (sileht)
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.db (master)

Reviewed: https://review.openstack.org/502942
Committed: https://git.openstack.org/cgit/openstack/oslo.db/commit/?id=3a40168056b85de0e53b69dc6ff52bddb4cdd7a7
Submitter: Jenkins
Branch: master

commit 3a40168056b85de0e53b69dc6ff52bddb4cdd7a7
Author: Mehdi Abaakouk <email address hidden>
Date: Tue Sep 12 11:52:35 2017 +0200

    Fix pagination when marker value is None

    The change b3869d04cff7071c1226758eb8b58fde9eba5b8d was fixing #1615938
    only for some cases, where the unicity of the row was done by one
    column.

    This change fixes it for other cases.

    Change-Id: I4acb382c770b168f29fa35f02707fb22d402b13b
    Closes-bug: #1615938

Changed in oslo.db:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.db 4.29.0

This issue was fixed in the openstack/oslo.db 4.29.0 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.