Ironic

Use 'last_error' and 'maintenance_reason' as the sort key in node list will error with db2

Bug #1446508 reported by Kan on 2015-04-21

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	Ironic	Won't Fix	Low	Unassigned

Bug Description

Use DB2 as the database service in openstack, find some errors when using specific sort key in ironic list command.
For example, use 'last_error' as the sort key in the node list command:

[root@kan-rhel7-osee ironic]# ironic node-list --detail --sort-key last_error
(ProgrammingError) ibm_db_dbi::ProgrammingError: SQLNumResultCols failed: [IBM][CLI Driver][DB2/LINUXX8664] SQL0134N Improper use of a string column, host variable, constant, or function "NODES_LAST_ERROR". SQLSTATE=42907 SQLCODE=-134 'SELECT nodes.created_at AS nodes_created_at, nodes.updated_at AS nodes_updated_at, nodes.id AS nodes_id, nodes.uuid AS nodes_uuid, nodes.instance_uuid AS nodes_instance_uuid, nodes.name AS nodes_name, nodes.chassis_id AS nodes_chassis_id, nodes.power_state AS nodes_power_state, nodes.target_power_state AS nodes_target_power_state, nodes.provision_state AS nodes_provision_state, nodes.target_provision_state AS nodes_target_provision_state, nodes.provision_updated_at AS nodes_provision_updated_at, nodes.last_error AS nodes_last_error, nodes.instance_info AS nodes_instance_info, nodes.properties AS nodes_properties, nodes.driver AS nodes_driver, nodes.driver_info AS nodes_driver_info, nodes.driver_internal_info AS nodes_driver_internal_info, nodes.clean_step AS nodes_clean_step, nodes.reservation AS nodes_reservation, nodes.conductor_affinity AS nodes_conductor_affinity, nodes.maintenance AS nodes_maintenance, nodes.maintenance_reason AS nodes_maintenance_reason, nodes.console_enabled AS nodes_console_enabled, nodes.inspection_finished_at AS nodes_inspection_finished_at, nodes.inspection_started_at AS nodes_inspection_started_at, nodes.extra AS nodes_extra \nFROM nodes ORDER BY nodes.last_error ASC, nodes.id ASC FETCH FIRST 1000 ROWS ONLY' () (HTTP 500)

The data type in db2:
db2 => describe table nodes
--------------PROVISION_STATE LAST_ERROR SYSIBM CLOB PROPERTIES SYSIBM CLOB DRIVER DRIVER_INFO SYSIBM CLOB RESERVATION MAINTENANCE EXTRA SYSIBM CLOB PROVISION_UPDATED_AT CONSOLE_ENABLED INSTANCE_INFO SYSIBM CLOB CONDUCTOR_AFFINITY MAINTENANCE_REASON SYSIBM CLOB DRIVER_INTERNAL_INFO NAME INSPECTION_STARTED_AT INSPECTION_FINISHED_AT CLEAN_STEP SYSIBM CLOB /> Data type Column
schema Data type name Length Scale Nulls
/>---------- --------- ------------------- ---------- ----- ------
SYSIBM TIMESTAMP 10 6 Yes
SYSIBM TIMESTAMP 10 6 Yes
SYSIBM INTEGER 4 0 No
SYSIBM VARCHAR 36 0 Yes
SYSIBM VARCHAR 36 0 Yes
SYSIBM INTEGER 4 0 Yes
SYSIBM VARCHAR 15 0 Yes
SYSIBM VARCHAR 15 0 Yes
SYSIBM VARCHAR 15 0 Yes
SYSIBM VARCHAR 15 0 Yes
1048576 0 Yes
1048576 0 Yes
SYSIBM VARCHAR 15 0 Yes
1048576 0 Yes
SYSIBM VARCHAR 255 0 Yes
SYSIBM SMALLINT 2 0 Yes
1048576 0 Yes
SYSIBM TIMESTAMP 10 6 Yes
SYSIBM SMALLINT 2 0 Yes
1048576 0 Yes
SYSIBM INTEGER 4 0 Yes
1048576 0 Yes
SYSIBM CLOB 1048576 0 Yes
SYSIBM VARCHAR 63 0 Yes
SYSIBM TIMESTAMP 10 6 Yes
SYSIBM TIMESTAMP 10 6 Yes
1048576 0 Yes

27 record(s) selected.

The 'text' format is transferred to 'clob' in DB2, and it can not be used in 'order by' sql command.

There are some solutions that might solve this:
1. Change the 'text' format to 'varchar' format using the max length.
2. Change "ORDER BY nodes.last_error ASC" to "ORDER BY cast(nodes.last_error as varchar(3000)) ASC" for DB2, which can be achieve by editing the the SQL generator script.

Revision history for this message

Viktor Serhieiev (vsergeyev) wrote on 2015-04-21:

Does Ironic supports DB2 database?

Revision history for this message

Dmitry Tantsur (divius) wrote on 2015-04-21:

TBH none of the solutions sound reasonable to me. Text is by definition sortable. Using varchar in 21 century seems weird to me - one should never assume length of free form field. So either DB2 is broken or it's sqlalchemy support, I don't see what can be done on Ironic part without regressing to hacks.

Changed in ironic:
status:	New → Opinion

Revision history for this message

Kan (kansks) wrote on 2015-04-22:

Thanks for your comment.

Revision history for this message

Mike Bayer (zzzeek) wrote on 2015-04-22:

> Text is by definition sortable.

If by "text" we mean "arbitrarily long text", this is not very feasible. For example. Would you wish to sort all of the works of William Shakespeare based on their full text? Would this be at all useful or intuitive? That's what it essentially means when you sort on a database TEXT field that is unlengthed.

> Using varchar in 21 century seems weird to me - one should never assume length of free form field.

unfortunately, even today, space and CPU is not free and especially within relational databases which are tasked with indexing such fields using a variety of techniques of varying performance characteristics - the B-tree is most common but is often not supported on TEXT/CLOB fields. If a field is not indexed, as is the case here, a sort means that every single TEXT field must be compared to others any number of times. Big iron databases such as Oracle and DB2 do not support unindexed ordering of arbitrarily long fields - these databases opt to not support the use case of sorting all the works of William Shakespeare by their full text, because even in the 21st century, people don't really need this :)

> I don't see what can be done on Ironic part without regressing to hacks.

The query that I first saw here was the "extra" field, I was shown this query:

'SELECT chassis.created_at AS chassis_created_at, chassis.updated_at AS chassis_updated_at, chassis.id AS chassis_id, chassis.uuid AS chassis_uuid, chassis.extra AS chassis_extra, chassis.description AS chassis_description \nFROM chassis ORDER BY chassis.extra ASC, chassis.id ASC FETCH FIRST 1000 ROWS ONLY'

in this case, it appears we are referring to the last_error field.

As for the "extra" field, we've looked into the purpose of this column and it appears to be used to store JSON strings, and sorting JSON strings is about as useful as the abovementioned Shakespeare use case. I would propose making that field not sortable.

As for last_error, I would propose applying a substr() or CAST as needed to the field which will limit it to a reasonable number of characters to be sorted on considering that this is an unindexed sort.

this is definitely a WONTFIX for SQLAlchemy or the DB2 driver, where our agnosticism IMO is doing the right thing by preventing something bad from happening, e.g. sorting by an unlengthed TEXT field. The developer needs to decide how this should be handled.

> Text is by definition sortable.

If by "text" we mean "arbitrarily long text", this is not very feasible.    For example.  Would you wish to sort all of the works of William Shakespeare based on their full text?   Would this be at all useful or intuitive?  That's what it essentially means when you sort on a database TEXT field that is unlengthed.

> Using varchar in 21 century seems weird to me - one should never assume length of free form field.

unfortunately, even today, space and CPU is not free and especially within relational databases which are tasked with indexing such fields using a variety of techniques of varying performance characteristics - the B-tree is most common but is often not supported on TEXT/CLOB fields.          If a field is not indexed, as is the case here, a sort means that every single TEXT field must be compared to others any number of times.     Big iron databases such as Oracle and DB2 do not support unindexed ordering of arbitrarily long fields - these databases opt to not support the use case of sorting all the works of William Shakespeare by their full text, because even in the 21st century, people don't really need this :)

> I don't see what can be done on Ironic part without regressing to hacks.

The query that I first saw here was the "extra" field, I was shown this query:

in this case, it appears we are referring to the last_error field.

this is definitely a WONTFIX for SQLAlchemy or the DB2 driver, where our agnosticism IMO is doing the right thing by preventing something bad from happening, e.g. sorting by an unlengthed TEXT field.   The developer needs to decide how this should be handled.

Revision history for this message

Dmitry Tantsur (divius) wrote on 2015-04-22:

No need to resort to bad comparisons such as sorting Shakespeare. And if we do, let us imagine Python developer will make only strings of length up to 255 bytes sortable. Good idea? Let's stop trolling and think about use cases.

Applying limit 0-255 (varchar) to last_error is insane. We already have a problem of node state length being limited to some miserable value. One day we wanted node state a bit longer - no luck. last_error is much worse, because we're not in control of errors, and operators definitely don't want us to truncate them.

So we have these options:
* Prevent sorting by last_error. I'm not fond of this idea, because it might be useful to group similar errors. Maybe it's not a compelling use case - discussable.
* Apply substr or CAST hack - maybe. I'd like to see how it actually looks, and how bad the performance will be.
* If you know a portable database type that defines "a long enough string of sane length for sorting" that will work as well :)

Revision history for this message

Dmitry Tantsur (divius) wrote on 2015-04-22:

UPD: I see that varchars are bigger nowadays, but I'm still wondering what is the portable value of varchar length...

Revision history for this message

Mike Bayer (zzzeek) wrote on 2015-04-22:

> Let's stop trolling and think about use cases.

I'm sorry you consider that to be trolling. There's not a strong use case to sort textual fields that are arbitrarily long, and database engines quite often store TEXT/CLOB/etc. in a very different way than VARCHAR values such that it is technically infeasible to sort by them. It is analogous to how simple it is to provide a list of Shakespeare titles sorted by name (just grab any list and sort them) versus sorting them by content (requires locating and downloading the first page of each play, e.g. much more cumbersome per value).

> Applying limit 0-255 (varchar) to last_error is insane.

As I think you noted in your next comment, nobody suggested such a thing. VARCHAR types can be much longer or even unlengthed on most backends, plus you don't need to use VARCHAR on every backend.

> but I'm still wondering what is the portable value of varchar length...

No need to know this. The type here can be set up conditionally on backend. Please see http://docs.sqlalchemy.org/en/rel_1_0/core/type_api.html?highlight=with_variant#sqlalchemy.types.TypeEngine.with_variant.

Also, a runtime substr() of the column also doesn't require any typing logic at all.

> Apply substr or CAST hack - maybe. I'd like to see how it actually looks, and how bad the performance will be.

Performance isn't an issue, because you don't need to use it on backends that don't need it. Only do the specific substr() or CAST expression desired on those backends which require it. Please see http://docs.sqlalchemy.org/en/rel_1_0/core/compiler.html#further-examples for simple examples of how to build ad-hoc functions that render different SQL based on backend.

Revision history for this message

Dmitry Tantsur (divius) wrote on 2015-04-23:

Thanks Mike, it starts to be much clearer. I'm setting this bug to "Incomplete", as it might be actionable, but I'd like some more clarification on the solution:

> you don't need to use VARCHAR on every backend

> The type here can be set up conditionally on backend.

Awesome, I didn't know it. I'm not sure how I feel about using backend-specific type, but it can be a solution. I wonder how we make it not regress without tests on DB2...

Kan, could you please raise this issue on the next Ironic IRC meeting (https://wiki.openstack.org/wiki/Meetings/Ironic)? Depending on the outcome, we'll update the state of the bug.

Changed in ironic:
status:	Opinion → Incomplete

Revision history for this message

Kan (kansks) wrote on 2015-04-23:

Thanks Dmitry. I think inviting Mike to the meeting is a good idea to make discuss more clearly.
Hi, Mike. As you are more expert than me in database related knowledge, you are warmly welcome to attend the ironic IRC meeting, for I have not much knowledge in database and in case that I misunderstand your point. The meeting time is here:
https://www.google.com/calendar/embed?showPrint=0&showCalendars=0&mode=WEEK&height=600&wkst=1&bgcolor=%23FFFFFF&src=sodarock.com_fepuqkqe2333htb41n2n9qmue8%40group.calendar.google.com&color=%235229A3&ctz=America%2FLos_Angeles
Please let me know whether it is convenient for you.

Revision history for this message

Mike Bayer (zzzeek) wrote on 2015-04-23:

#10

Hi Kan -

Seems like the time is set incorrectly for 4/27 on that google calendar, can you confirm this meeting is next on Monday, Apr 27 10AM PDT / 1 PM EDT ? thanks

I will schedule it and if you want to ping me on irc nickname zzzeek I'll try to be there.

- mike

Revision history for this message

Kan (kansks) wrote on 2015-04-24:

#11

Hi, Mike, sorry for misleading you. The official time is in this link Dmitry mentioned: https://wiki.openstack.org/wiki/Meetings/Ironic
The next meeting time is here:
The next meeting is scheduled for April 28th, 2015 at 0500 UTC (http://www.timeanddate.com/worldclock/fixedtime.html?iso=20150428T0500), you can check with your local time. Thanks.

Revision history for this message

Mike Bayer (zzzeek) wrote on 2015-04-24:

#12

hi Kan -

I'm in EDT so I can't make a 1 AM meeting during the week.

Revision history for this message

Kan (kansks) wrote on 2015-04-26:

#13

Hi, Mike. What a pity. Seems like the next next meeting is 1 PM EDT. How about move this discuss in that meeting?
Hi, Dmitry, is it fine with you?

Revision history for this message

Dmitry Tantsur (divius) wrote on 2015-04-26:

#14

Well, I don't care when exactly it will happen :) That said, I don't understand why you insist on Mike being present. Deciding if the community is fine with one of the solutions here does not require a database expert.

Revision history for this message

Kan (kansks) wrote on 2015-04-27:

#15

Mike can give us some new clues about this fix, right? Besides these two solutions, there might be some other solutions come out when discuss. I think get Mike involved can help to make the discuss more clearly. So I will move the discuss in that meeting. Thanks.

Revision history for this message

Mike Bayer (zzzeek) wrote on 2015-04-27:

#16

there's two basic solutions, assuming you want to leave the behavior the same on non-DB2 databases: 1. apply a SQL expression to the ordering against DB2, or 2. change the column on DB2.

For 1, you would apply a function like this to the ORDER BY, in the case of columns that are known to be TEXT: http://paste.openstack.org/show/208470/ (docs: docs.sqlalchemy.org/en/rel_1_0/core/compiler.html)

For 2, you would change the column type to be VARCHAR(5000) on DB2 and other DBs as needed: http://paste.openstack.org/show/208472/ (docs: http://docs.sqlalchemy.org/en/rel_1_0/core/type_api.html?highlight=with_variant#sqlalchemy.types.TypeEngine.with_variant)

the challenge with #1 is getting that expression to be used as needed in ORDER BY expressions. The challenge with #2 is migrating existing DB2 backend to use the new types, if there are in fact DB2 environments in production; the migration script would need to test the backend for being the 'db2' backend or other backend where this applies and apply an ALTER to change the type.

As migrating types can be awkward I'd probably go for #1 here.

Revision history for this message

Dmitry Tantsur (divius) wrote on 2015-04-27:

#17

Thanks, Mike, solution #1 looks doable to me as well. Kan, I would still recommend to chat with other cores on whether they're ok with such a work around.

Changed in ironic:
status:	Incomplete → Triaged
importance:	Undecided → Low

Revision history for this message

Kan (kansks) wrote on 2015-04-28:

#18

No problem. Will propose. Thank you two.

Revision history for this message

Kan (kansks) wrote on 2015-04-30:

#19

Download full text (3.8 KiB)

I have seen the example mike commented in method #1(http://paste.openstack.org/show/208470/), the sql command is modified in statement as well as the content need to be cast is in string format, that ensures the element be replaced to the corresponding database.

But I came across a problem when debugging these in ironic. It is that the statement is a 'query' object, and the 'sort keys' is part of this object. So I don't know how to generate the 'sort keys' element cast according to different databases.

Here are the code flow:

1. In ironic.db.sqlalchemy.api.py:
https://github.com/openstack/ironic/blob/master/ironic/db/sqlalchemy/api.py#L136
https://github.com/openstack/ironic/blob/master/ironic/db/sqlalchemy/api.py#L205

...
from oslo_db.sqlalchemy import utils as db_utils
...
def model_query(model, *args, **kwargs):
    """Query helper for simpler session usage.
    :param session: if present, the session to use
    """

    session = kwargs.get('session') or get_session()
    query = session.query(model, *args)
    return query
...
def _paginate_query(model, limit=None, marker=None, sort_key=None,
                    sort_dir=None, query=None):
    if not query:
        query = model_query(model)
    sort_keys = ['id']
    if sort_key and sort_key not in sort_keys:
        sort_keys.insert(0, sort_key)
    query = db_utils.paginate_query(query, model, limit, sort_keys,
                                    marker=marker, sort_dir=sort_dir)
    return query.all()
...
class Connection(api.Connection):
    """SqlAlchemy connection."""
...
    def get_node_list(self, filters=None, limit=None, marker=None,
                      sort_key=None, sort_dir=None):
        query = model_query(models.Node)
        query = self._add_nodes_filters(query, filters)
        return _paginate_query(models.Node, limit, marker,
                               sort_key, sort_dir, query)
...

Here the query is init, the init function is in sqlalchemy.orm.session: https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/session.py#L1245
And the sort keys are input as list such as [u'last_error', 'id'] when using 'last_error' as the sort key.

2. The 'paginate_query' will call function in oslo.db:
https://github.com/openstack/oslo.db/blob/master/oslo_db/sqlalchemy/utils.py#L140

def paginate_query(query, model, limit, sort_keys, marker=None,
                   sort_dir=None, sort_dirs=None):
...
    # Add sorting
    for current_sort_key, current_sort_dir in zip(sort_keys, sort_dirs):
        try:
            sort_dir_func = {
                'asc': sqlalchemy.asc,
                'desc': sqlalchemy.desc,
            }[current_sort_dir]
        except KeyError:
            raise ValueError(_("Unknown sort direction, "
                               "must be 'desc' or 'asc'"))
        try:
            sort_key_attr = inspect(model).\
                all_orm_descriptors[current_sort_key]
        except KeyError:
            raise exception.InvalidSortKey()
        query = query.order_by(sort_dir_func(sort_key_attr))
...
    return query

The 'sort_keys' is handled here to 'sort_key_attr'.
The 'inspect(model)' returns a 'Mapper' object. The 'all_orm_descripto...

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.

Changed in ironic:
assignee:	nobody → Kan (kansks)
status:	Triaged → In Progress

Changed in ironic:
status:	In Progress → Triaged
assignee:	Kan (kansks) → nobody