Lacking support for searching nested variables

Bug #1671116 reported by Thomas Maddox
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
craton
Fix Released
High
Thomas Maddox

Bug Description

We are currently unable to effectively query on nested variables for a resources. Not being able to do so severely hinders the power of this API and harms our value-add.

For example, let's say we want to query a series of hosts for which have a disk manufactured by Seagate and the data for this, expressed via variables on hosts, looks like:

{
    "hardware_profiles": {
        "disks": [
            {
                "manufacturer": "seagate",
                "capacity": "2TB"
            },
            {
                "manufacturer": "western",
                "capacity": "5TB"
            }
        ]
    }
}

We wouldn't be able to effectively query for this condition with our existing minimal variable querying capabilities (simply matches for "key:value"). After some discussion, we've agreed that it'd be more expressive and easier to work with if we were to provide something like JSONPath queries to achieve this. So, the end-result would look something like:

craton host-list --vars 'hardware_profiles.disks[*].manufacturer:seagate'

There are several problems presented:
1. How do we leverage our database's JSON capabilities to achieve querying variables this way?
2. How do we handle the existing data model where a variable can be either a String datatype or a JSON document?
3. How can this be applied to querying for resolved variables?
4. What limitations exist in SQLAlchemy that may prevent or hinder our ability to implement this, and how do we overcome them?
5. In order to support this somewhat cleanly, we'll need to pin our DB requirement at MySQL 5.7, so this imposes a deployer burden.

Workaround:

The interim solution for this lies in flattening the storage of the data to diminish the need for querying nested variables, but it's still not solved entirely, and certainly not in a sustainable way.

Changed in craton:
milestone: none → v0.1.0
importance: Undecided → Critical
importance: Critical → High
description: updated
Changed in craton:
assignee: nobody → Thomas Maddox (thomas-maddox)
description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to craton (master)

Fix proposed to branch: master
Review: https://review.openstack.org/443186

Changed in craton:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/443941

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to craton (master)

Reviewed: https://review.openstack.org/443186
Committed: https://git.openstack.org/cgit/openstack/craton/commit/?id=6884f8ff8ccd9d1ab4c83751ff8adc60d0c44cee
Submitter: Jenkins
Branch: master

commit 6884f8ff8ccd9d1ab4c83751ff8adc60d0c44cee
Author: Thomas Maddox <email address hidden>
Date: Wed Mar 8 15:34:14 2017 +0000

    Move to MySQL 5.7 and SQLAlchemy>=1.1.0

    In an effort to support JSON Path variable
    queries, we need to:
    * Pin to MySQL 5.7
    * Use JSON column type
    * Use a newer SQLAlchemy library.

    Change-Id: I98ae021c9cd9755f51e67e56135db164e8ae1f50
    Partial-Bug: 1671116
    Depends-On: I36b7ae9cf68aeef0e5dbcb1f17db52b81efd6ffe

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/443941
Committed: https://git.openstack.org/cgit/openstack/craton/commit/?id=606926b89d2106fac4680052dbf984083f6a1e83
Submitter: Jenkins
Branch: master

commit 606926b89d2106fac4680052dbf984083f6a1e83
Author: Thomas Maddox <email address hidden>
Date: Thu Mar 9 17:11:45 2017 +0000

    JSON Path-like querying for variables

    This patch implements the ability to query for values
    in nested variable documents. For instance, if we had
    hosts with a variables that look like:

    {
        "hardware_profiles": {
            "disks": [
                {
                    "manufacturer": "seagate",
                    "capacity_quantity": 2,
                    "capacity_unit": "TB"
                },
                {
                    "manufacturer": "western",
                    "capacity_quantity": 3,
                    "capacity_unit": "TB"
                }
            ]
        }
    }

    and we wanted to grab all of them with a disk that was
    manufactured by Seagate, we would like to be able to
    query like so:

    GET /v1/hosts?vars=hardware_profiles.disks[*].manufacturer:"seagate"

    This does modify the variables query to expect valid JSON
    values. So, strings need to be quoted, for instance.

    Change-Id: Id597d3e57d2e28766fecd1b314f53176543e1b9d
    Closes-Bug: 1671116

Changed in craton:
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.