mongodb stores traits inefficiently

Bug #1425584 reported by gordon chung
This bug affects 2 people
Affects     Status     Importance  Assigned to  Milestone
Ceilometer  Won't Fix  Wishlist    Ankita Wagh
Panko       Confirmed  Wishlist    Unassigned

Bug Description

currently mongodb stores each trait as:

{'trait_name': trait.name, 'trait_type': trait.dtype, 'trait_value': trait.value}

https://github.com/openstack/ceilometer/blob/master/ceilometer/event/storage/pymongo_base.py#L60-L64

this is really inefficient, as a trait is a key-value pair and could be stored as {trait.name: trait.value}, but currently mongodb stores the data in exactly the same way as the data model. this can make updates to the model difficult to fix in mongodb, and it can dramatically increase the size of each event (i.e. instead of a trait being k:v, it's now 3 times the size).
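a rough sketch of the difference, for illustration only. the trait names, type codes, and values below are made up, the helper names (to_verbose, to_compact, encoded_size) are hypothetical and not part of ceilometer, and the size is measured on compact JSON rather than on BSON, so the numbers only approximate what mongodb would actually store:

```python
import json

# Hypothetical traits as (name, dtype, value) tuples. The dtype integers are
# placeholders standing in for the data model's type codes.
TRAITS = [
    ("user_id", 1, "user-abc"),
    ("project_id", 1, "project-xyz"),
    ("memory_mb", 2, 2048),
]


def to_verbose(traits):
    """Mirror the current layout: one three-field sub-document per trait."""
    return [
        {"trait_name": name, "trait_type": dtype, "trait_value": value}
        for name, dtype, value in traits
    ]


def to_compact(traits):
    """Proposed layout: a single mapping of trait name to trait value.

    Note that the dtype is dropped here; it would have to be inferred
    from the stored value's own type.
    """
    return {name: value for name, _dtype, value in traits}


def encoded_size(doc):
    """Size in bytes of the compact JSON encoding (a stand-in for BSON)."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


verbose = to_verbose(TRAITS)
compact = to_compact(TRAITS)

print("verbose bytes:", encoded_size(verbose))
print("compact bytes:", encoded_size(compact))
```

the repeated 'trait_name', 'trait_type' and 'trait_value' keys account for most of the extra bytes in the verbose form, and the overhead grows with the number of traits per event.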

gordon chung (chungg)
description: updated
Changed in ceilometer:
assignee: nobody → Ankita (ankita-wagh7)
gordon chung (chungg) wrote :

thanks for looking at this. before you start, you might want to ask on irc, the mailing list, or in a meeting whether others agree. the only scenario where the current model makes sense is if we want to expand on what a trait is and add certain flags/tags to a trait... but i don't have a real use case, to be honest, which is why i personally think it can be fixed.

gordon chung (chungg)
Changed in ceilometer:
status: Triaged → Won't Fix
Changed in panko:
importance: Undecided → Wishlist
status: New → Triaged
Pooja Ghumre (pooja-9) wrote :

We either need this to be fixed in MongoDB or need to start fixing scale issues with SQLAlchemy-supported DBs. We are currently on Pike with Panko using MySQL as the database backend (we switched from MongoDB because of this bug) and are now running into issues with event queries, where the mysql command hangs for hours once the number of events grows beyond 60k rows or so.

This was the mysql query that got hung.
mysql> SELECT event.id AS event_id,
              event.`generated` AS event_generated,
              event.message_id AS event_message_id,
              event_type.`desc` AS event_type_desc,
              event.raw AS event_raw
       FROM event
       INNER JOIN event_type
         ON event_type.id = event.event_type_id
        AND event_type.`desc` = 'compute.instance.create.start'
       INNER JOIN (
         SELECT trait_text.event_id AS ev_id
         FROM trait_text,
              (SELECT trait_text.event_id AS ev_id
               FROM trait_text
               WHERE trait_text.`key` = 'user_id'
                 AND trait_text.value = '5bd56f9907594f8f8234d84bf0898a74') AS anon_2,
              (SELECT trait_text.event_id AS ev_id
               FROM trait_text
               WHERE trait_text.`key` = 'project_id'
                 AND trait_text.value = '43acb105b32a4193a63774a9eaa362b6') AS anon_3
         WHERE trait_text.`key` = 'user_id'
           AND trait_text.value = '5bd56f9907594f8f8234d84bf0898a74'
           AND anon_2.ev_id = anon_3.ev_id
       ) AS anon_1
         ON anon_1.ev_id = event.id
       WHERE event.`generated` >= 1542072223
         AND event.`generated` <= 1543369291
       ORDER BY event.`generated` ASC, event.message_id ASC
       LIMIT 100;

Changed in panko:
status: Triaged → Confirmed
Pooja Ghumre (pooja-9) wrote :

Is this something that will be addressed for mongodb or mysql backend in Panko?
