Use 'last_error' and 'maintenance_reason' as the sort key in node list will error with db2

Bug #1446508 reported by Kan
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Ironic
Won't Fix
Low
Unassigned

Bug Description

Use DB2 as the database service in openstack, find some errors when using specific sort key in ironic list command.
For example, use 'last_error' as the sort key in the node list command:

[root@kan-rhel7-osee ironic]# ironic node-list --detail --sort-key last_error
(ProgrammingError) ibm_db_dbi::ProgrammingError: SQLNumResultCols failed: [IBM][CLI Driver][DB2/LINUXX8664] SQL0134N Improper use of a string column, host variable, constant, or function "NODES_LAST_ERROR". SQLSTATE=42907 SQLCODE=-134 'SELECT nodes.created_at AS nodes_created_at, nodes.updated_at AS nodes_updated_at, nodes.id AS nodes_id, nodes.uuid AS nodes_uuid, nodes.instance_uuid AS nodes_instance_uuid, nodes.name AS nodes_name, nodes.chassis_id AS nodes_chassis_id, nodes.power_state AS nodes_power_state, nodes.target_power_state AS nodes_target_power_state, nodes.provision_state AS nodes_provision_state, nodes.target_provision_state AS nodes_target_provision_state, nodes.provision_updated_at AS nodes_provision_updated_at, nodes.last_error AS nodes_last_error, nodes.instance_info AS nodes_instance_info, nodes.properties AS nodes_properties, nodes.driver AS nodes_driver, nodes.driver_info AS nodes_driver_info, nodes.driver_internal_info AS nodes_driver_internal_info, nodes.clean_step AS nodes_clean_step, nodes.reservation AS nodes_reservation, nodes.conductor_affinity AS nodes_conductor_affinity, nodes.maintenance AS nodes_maintenance, nodes.maintenance_reason AS nodes_maintenance_reason, nodes.console_enabled AS nodes_console_enabled, nodes.inspection_finished_at AS nodes_inspection_finished_at, nodes.inspection_started_at AS nodes_inspection_started_at, nodes.extra AS nodes_extra \nFROM nodes ORDER BY nodes.last_error ASC, nodes.id ASC FETCH FIRST 1000 ROWS ONLY' () (HTTP 500)

The data type in mysql:
mysql> desc nodes;
+------------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------------+--------------+------+-----+---------+----------------+
| created_at | datetime | YES | | NULL | |
| updated_at | datetime | YES | | NULL | |
| id | int(11) | NO | PRI | NULL | auto_increment |
| uuid | varchar(36) | YES | UNI | NULL | |
| instance_uuid | varchar(36) | YES | UNI | NULL | |
| chassis_id | int(11) | YES | MUL | NULL | |
| power_state | varchar(15) | YES | | NULL | |
| target_power_state | varchar(15) | YES | | NULL | |
| provision_state | varchar(15) | YES | | NULL | |
| target_provision_state | varchar(15) | YES | | NULL | |
| last_error | text | YES | | NULL | |
| properties | text | YES | | NULL | |
| driver | varchar(15) | YES | | NULL | |
| driver_info | text | YES | | NULL | |
| reservation | varchar(255) | YES | | NULL | |
| maintenance | tinyint(1) | YES | | NULL | |
| extra | text | YES | | NULL | |
| provision_updated_at | datetime | YES | | NULL | |
| console_enabled | tinyint(1) | YES | | NULL | |
| instance_info | text | YES | | NULL | |
| conductor_affinity | int(11) | YES | MUL | NULL | |
| maintenance_reason | text | YES | | NULL | |
| driver_internal_info | text | YES | | NULL | |
| name | varchar(255) | YES | UNI | NULL | |
| inspection_started_at | datetime | YES | | NULL | |
| inspection_finished_at | datetime | YES | | NULL | |
| clean_step | text | YES | | NULL | |
+------------------------+--------------+------+-----+---------+----------------+
27 rows in set (0.00 sec)

The data type in db2:
db2 => describe table nodes
                                Data type Column
Column name schema Data type name Length Scale Nulls
------------------------------- --------- ------------------- ---------- ----- ------
CREATED_AT SYSIBM TIMESTAMP 10 6 Yes
UPDATED_AT SYSIBM TIMESTAMP 10 6 Yes
ID SYSIBM INTEGER 4 0 No
UUID SYSIBM VARCHAR 36 0 Yes
INSTANCE_UUID SYSIBM VARCHAR 36 0 Yes
CHASSIS_ID SYSIBM INTEGER 4 0 Yes
POWER_STATE SYSIBM VARCHAR 15 0 Yes
TARGET_POWER_STATE SYSIBM VARCHAR 15 0 Yes
PROVISION_STATE SYSIBM VARCHAR 15 0 Yes
TARGET_PROVISION_STATE SYSIBM VARCHAR 15 0 Yes
LAST_ERROR SYSIBM CLOB 1048576 0 Yes
PROPERTIES SYSIBM CLOB 1048576 0 Yes
DRIVER SYSIBM VARCHAR 15 0 Yes
DRIVER_INFO SYSIBM CLOB 1048576 0 Yes
RESERVATION SYSIBM VARCHAR 255 0 Yes
MAINTENANCE SYSIBM SMALLINT 2 0 Yes
EXTRA SYSIBM CLOB 1048576 0 Yes
PROVISION_UPDATED_AT SYSIBM TIMESTAMP 10 6 Yes
CONSOLE_ENABLED SYSIBM SMALLINT 2 0 Yes
INSTANCE_INFO SYSIBM CLOB 1048576 0 Yes
CONDUCTOR_AFFINITY SYSIBM INTEGER 4 0 Yes
MAINTENANCE_REASON SYSIBM CLOB 1048576 0 Yes
DRIVER_INTERNAL_INFO SYSIBM CLOB 1048576 0 Yes
NAME SYSIBM VARCHAR 63 0 Yes
INSPECTION_STARTED_AT SYSIBM TIMESTAMP 10 6 Yes
INSPECTION_FINISHED_AT SYSIBM TIMESTAMP 10 6 Yes
CLEAN_STEP SYSIBM CLOB 1048576 0 Yes

  27 record(s) selected.

The 'text' format is transferred to 'clob' in DB2, and it can not be used in 'order by' sql command.

There are some solutions that might solve this:
1. Change the 'text' format to 'varchar' format using the max length.
2. Change "ORDER BY nodes.last_error ASC" to "ORDER BY cast(nodes.last_error as varchar(3000)) ASC" for DB2, which can be achieve by editing the the SQL generator script.

Revision history for this message
Viktor Serhieiev (vsergeyev) wrote :

Does Ironic supports DB2 database?

Revision history for this message
Dmitry Tantsur (divius) wrote :

TBH none of the solutions sound reasonable to me. Text is by definition sortable. Using varchar in 21 century seems weird to me - one should never assume length of free form field. So either DB2 is broken or it's sqlalchemy support, I don't see what can be done on Ironic part without regressing to hacks.

Changed in ironic:
status: New → Opinion
Revision history for this message
Kan (kansks) wrote :

Thanks for your comment.

Revision history for this message
Mike Bayer (zzzeek) wrote :

> Text is by definition sortable.

If by "text" we mean "arbitrarily long text", this is not very feasible. For example. Would you wish to sort all of the works of William Shakespeare based on their full text? Would this be at all useful or intuitive? That's what it essentially means when you sort on a database TEXT field that is unlengthed.

> Using varchar in 21 century seems weird to me - one should never assume length of free form field.

unfortunately, even today, space and CPU is not free and especially within relational databases which are tasked with indexing such fields using a variety of techniques of varying performance characteristics - the B-tree is most common but is often not supported on TEXT/CLOB fields. If a field is not indexed, as is the case here, a sort means that every single TEXT field must be compared to others any number of times. Big iron databases such as Oracle and DB2 do not support unindexed ordering of arbitrarily long fields - these databases opt to not support the use case of sorting all the works of William Shakespeare by their full text, because even in the 21st century, people don't really need this :)

> I don't see what can be done on Ironic part without regressing to hacks.

The query that I first saw here was the "extra" field, I was shown this query:

 'SELECT chassis.created_at AS chassis_created_at, chassis.updated_at AS chassis_updated_at, chassis.id AS chassis_id, chassis.uuid AS chassis_uuid, chassis.extra AS chassis_extra, chassis.description AS chassis_description \nFROM chassis ORDER BY chassis.extra ASC, chassis.id ASC FETCH FIRST 1000 ROWS ONLY'

in this case, it appears we are referring to the last_error field.

As for the "extra" field, we've looked into the purpose of this column and it appears to be used to store JSON strings, and sorting JSON strings is about as useful as the abovementioned Shakespeare use case. I would propose making that field not sortable.

As for last_error, I would propose applying a substr() or CAST as needed to the field which will limit it to a reasonable number of characters to be sorted on considering that this is an unindexed sort.

this is definitely a WONTFIX for SQLAlchemy or the DB2 driver, where our agnosticism IMO is doing the right thing by preventing something bad from happening, e.g. sorting by an unlengthed TEXT field. The developer needs to decide how this should be handled.

Revision history for this message
Dmitry Tantsur (divius) wrote :

No need to resort to bad comparisons such as sorting Shakespeare. And if we do, let us imagine Python developer will make only strings of length up to 255 bytes sortable. Good idea? Let's stop trolling and think about use cases.

Applying limit 0-255 (varchar) to last_error is insane. We already have a problem of node state length being limited to some miserable value. One day we wanted node state a bit longer - no luck. last_error is much worse, because we're not in control of errors, and operators definitely don't want us to truncate them.

So we have these options:
* Prevent sorting by last_error. I'm not fond of this idea, because it might be useful to group similar errors. Maybe it's not a compelling use case - discussable.
* Apply substr or CAST hack - maybe. I'd like to see how it actually looks, and how bad the performance will be.
* If you know a portable database type that defines "a long enough string of sane length for sorting" that will work as well :)

Revision history for this message
Dmitry Tantsur (divius) wrote :

UPD: I see that varchars are bigger nowadays, but I'm still wondering what is the portable value of varchar length...

Revision history for this message
Mike Bayer (zzzeek) wrote :

> Let's stop trolling and think about use cases.

I'm sorry you consider that to be trolling. There's not a strong use case to sort textual fields that are arbitrarily long, and database engines quite often store TEXT/CLOB/etc. in a very different way than VARCHAR values such that it is technically infeasible to sort by them. It is analogous to how simple it is to provide a list of Shakespeare titles sorted by name (just grab any list and sort them) versus sorting them by content (requires locating and downloading the first page of each play, e.g. much more cumbersome per value).

> Applying limit 0-255 (varchar) to last_error is insane.

As I think you noted in your next comment, nobody suggested such a thing. VARCHAR types can be much longer or even unlengthed on most backends, plus you don't need to use VARCHAR on every backend.

> but I'm still wondering what is the portable value of varchar length...

No need to know this. The type here can be set up conditionally on backend. Please see http://docs.sqlalchemy.org/en/rel_1_0/core/type_api.html?highlight=with_variant#sqlalchemy.types.TypeEngine.with_variant.

Also, a runtime substr() of the column also doesn't require any typing logic at all.

> Apply substr or CAST hack - maybe. I'd like to see how it actually looks, and how bad the performance will be.

Performance isn't an issue, because you don't need to use it on backends that don't need it. Only do the specific substr() or CAST expression desired on those backends which require it. Please see http://docs.sqlalchemy.org/en/rel_1_0/core/compiler.html#further-examples for simple examples of how to build ad-hoc functions that render different SQL based on backend.

Revision history for this message
Dmitry Tantsur (divius) wrote :

Thanks Mike, it starts to be much clearer. I'm setting this bug to "Incomplete", as it might be actionable, but I'd like some more clarification on the solution:

> you don't need to use VARCHAR on every backend

> The type here can be set up conditionally on backend.

Awesome, I didn't know it. I'm not sure how I feel about using backend-specific type, but it can be a solution. I wonder how we make it not regress without tests on DB2...

Kan, could you please raise this issue on the next Ironic IRC meeting (https://wiki.openstack.org/wiki/Meetings/Ironic)? Depending on the outcome, we'll update the state of the bug.

Changed in ironic:
status: Opinion → Incomplete
Revision history for this message
Kan (kansks) wrote :

Thanks Dmitry. I think inviting Mike to the meeting is a good idea to make discuss more clearly.
Hi, Mike. As you are more expert than me in database related knowledge, you are warmly welcome to attend the ironic IRC meeting, for I have not much knowledge in database and in case that I misunderstand your point. The meeting time is here:
https://www.google.com/calendar/embed?showPrint=0&showCalendars=0&mode=WEEK&height=600&wkst=1&bgcolor=%23FFFFFF&src=sodarock.com_fepuqkqe2333htb41n2n9qmue8%40group.calendar.google.com&color=%235229A3&ctz=America%2FLos_Angeles
Please let me know whether it is convenient for you.

Revision history for this message
Mike Bayer (zzzeek) wrote :

Hi Kan -

Seems like the time is set incorrectly for 4/27 on that google calendar, can you confirm this meeting is next on Monday, Apr 27 10AM PDT / 1 PM EDT ? thanks

I will schedule it and if you want to ping me on irc nickname zzzeek I'll try to be there.

- mike

Revision history for this message
Kan (kansks) wrote :

Hi, Mike, sorry for misleading you. The official time is in this link Dmitry mentioned: https://wiki.openstack.org/wiki/Meetings/Ironic
The next meeting time is here:
The next meeting is scheduled for April 28th, 2015 at 0500 UTC (http://www.timeanddate.com/worldclock/fixedtime.html?iso=20150428T0500), you can check with your local time. Thanks.

Revision history for this message
Mike Bayer (zzzeek) wrote :

hi Kan -

I'm in EDT so I can't make a 1 AM meeting during the week.

Revision history for this message
Kan (kansks) wrote :

Hi, Mike. What a pity. Seems like the next next meeting is 1 PM EDT. How about move this discuss in that meeting?
Hi, Dmitry, is it fine with you?

Revision history for this message
Dmitry Tantsur (divius) wrote :

Well, I don't care when exactly it will happen :) That said, I don't understand why you insist on Mike being present. Deciding if the community is fine with one of the solutions here does not require a database expert.

Revision history for this message
Kan (kansks) wrote :

Mike can give us some new clues about this fix, right? Besides these two solutions, there might be some other solutions come out when discuss. I think get Mike involved can help to make the discuss more clearly. So I will move the discuss in that meeting. Thanks.

Revision history for this message
Mike Bayer (zzzeek) wrote :

there's two basic solutions, assuming you want to leave the behavior the same on non-DB2 databases: 1. apply a SQL expression to the ordering against DB2, or 2. change the column on DB2.

For 1, you would apply a function like this to the ORDER BY, in the case of columns that are known to be TEXT: http://paste.openstack.org/show/208470/ (docs: docs.sqlalchemy.org/en/rel_1_0/core/compiler.html)

For 2, you would change the column type to be VARCHAR(5000) on DB2 and other DBs as needed: http://paste.openstack.org/show/208472/ (docs: http://docs.sqlalchemy.org/en/rel_1_0/core/type_api.html?highlight=with_variant#sqlalchemy.types.TypeEngine.with_variant)

the challenge with #1 is getting that expression to be used as needed in ORDER BY expressions. The challenge with #2 is migrating existing DB2 backend to use the new types, if there are in fact DB2 environments in production; the migration script would need to test the backend for being the 'db2' backend or other backend where this applies and apply an ALTER to change the type.

As migrating types can be awkward I'd probably go for #1 here.

Revision history for this message
Dmitry Tantsur (divius) wrote :

Thanks, Mike, solution #1 looks doable to me as well. Kan, I would still recommend to chat with other cores on whether they're ok with such a work around.

Changed in ironic:
status: Incomplete → Triaged
importance: Undecided → Low
Revision history for this message
Kan (kansks) wrote :

No problem. Will propose. Thank you two.

Revision history for this message
Kan (kansks) wrote :
Download full text (3.8 KiB)

I have seen the example mike commented in method #1(http://paste.openstack.org/show/208470/), the sql command is modified in statement as well as the content need to be cast is in string format, that ensures the element be replaced to the corresponding database.

But I came across a problem when debugging these in ironic. It is that the statement is a 'query' object, and the 'sort keys' is part of this object. So I don't know how to generate the 'sort keys' element cast according to different databases.

Here are the code flow:

1. In ironic.db.sqlalchemy.api.py:
https://github.com/openstack/ironic/blob/master/ironic/db/sqlalchemy/api.py#L136
https://github.com/openstack/ironic/blob/master/ironic/db/sqlalchemy/api.py#L205

...
from oslo_db.sqlalchemy import utils as db_utils
...
def model_query(model, *args, **kwargs):
    """Query helper for simpler session usage.
    :param session: if present, the session to use
    """

    session = kwargs.get('session') or get_session()
    query = session.query(model, *args)
    return query
...
def _paginate_query(model, limit=None, marker=None, sort_key=None,
                    sort_dir=None, query=None):
    if not query:
        query = model_query(model)
    sort_keys = ['id']
    if sort_key and sort_key not in sort_keys:
        sort_keys.insert(0, sort_key)
    query = db_utils.paginate_query(query, model, limit, sort_keys,
                                    marker=marker, sort_dir=sort_dir)
    return query.all()
...
class Connection(api.Connection):
    """SqlAlchemy connection."""
...
    def get_node_list(self, filters=None, limit=None, marker=None,
                      sort_key=None, sort_dir=None):
        query = model_query(models.Node)
        query = self._add_nodes_filters(query, filters)
        return _paginate_query(models.Node, limit, marker,
                               sort_key, sort_dir, query)
...

Here the query is init, the init function is in sqlalchemy.orm.session: https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/session.py#L1245
And the sort keys are input as list such as [u'last_error', 'id'] when using 'last_error' as the sort key.

2. The 'paginate_query' will call function in oslo.db:
https://github.com/openstack/oslo.db/blob/master/oslo_db/sqlalchemy/utils.py#L140

def paginate_query(query, model, limit, sort_keys, marker=None,
                   sort_dir=None, sort_dirs=None):
...
    # Add sorting
    for current_sort_key, current_sort_dir in zip(sort_keys, sort_dirs):
        try:
            sort_dir_func = {
                'asc': sqlalchemy.asc,
                'desc': sqlalchemy.desc,
            }[current_sort_dir]
        except KeyError:
            raise ValueError(_("Unknown sort direction, "
                               "must be 'desc' or 'asc'"))
        try:
            sort_key_attr = inspect(model).\
                all_orm_descriptors[current_sort_key]
        except KeyError:
            raise exception.InvalidSortKey()
        query = query.order_by(sort_dir_func(sort_key_attr))
...
    return query

The 'sort_keys' is handled here to 'sort_key_attr'.
The 'inspect(model)' returns a 'Mapper' object. The 'all_orm_descripto...

Read more...

Revision history for this message
Mike Bayer (zzzeek) wrote :

oslo.db will need to be modified so that the sort_keys list can accept SQL elements directly, in addition to strings (also it should be documented more clearly).

Revision history for this message
Mike Bayer (zzzeek) wrote :

oh there's another way, that wouldn't require an oslo.db change and might be more portable. Use a @hybrid_property on the ironic model, like this:

from sqlalchemy.ext.hybrid import hybrid_property

class SomeModel(Base):
    # ...

    text_field = Column(Text)

    @hybrid_property
    def sortable_text_field(self):
        return self.text_field

    @sortable_text_field.expression
    def sortable_text_field(cls):
        return OrderByText(self.text_field)

then you specify "sortable_text_field" to oslo's paginate_query instead of "text_field".

Revision history for this message
Mike Bayer (zzzeek) wrote :

that should be

  @sortable_text_field.expression
    def sortable_text_field(cls):
        return OrderByText(cls.text_field)

for that last method

Revision history for this message
Kan (kansks) wrote :

There is a related bug in oslo.db: https://bugs.launchpad.net/oslo.db/+bug/1451880
Will try the method Mike proposed with this oslo.db patch.

Kan (kansks)
Changed in ironic:
assignee: nobody → Kan (kansks)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to ironic (master)

Fix proposed to branch: master
Review: https://review.openstack.org/181809

Revision history for this message
Kan (kansks) wrote :

This patch also needs an oslo.db patch to pass the tests: https://review.openstack.org/#/c/180206/

For now, the oslo.db patch is merged but is not included in the latest tag of oslo.db. Mark WIP and will update the tag of oslo.db in requirements once the oslo.db code is included.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on ironic (master)

Change abandoned by Dmitry Tantsur (<email address hidden>) on branch: master
Review: https://review.openstack.org/181809
Reason: Hi! This patch has been work in progress for nearly a year. Please feel free to restore it if you still plan on working on it.

Dmitry Tantsur (divius)
Changed in ironic:
status: In Progress → Triaged
assignee: Kan (kansks) → nobody
Revision history for this message
Julia Kreger (juliaashleykreger) wrote :

Greetings, I'm going to mark this item as Wont-Fix.

Realistically, DB2 support is outside of Sqlalchemy's support as an "external dialect", and there is no way for the project to confirm or reject any fix to the modeling. For all we know, it could be fine with the extensive refactoring which has taken place. As a project, Ironic only "supports" MySQL and SQLite usage, and we provide some support for Postgres through our ability to test it. Beyond that, there is really no way to know and we should not hold this item open unless contributors are able to reproduce and verify this issue.

Changed in ironic:
status: Triaged → Won't Fix
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.