raw_template files duplication wastes DB space and memory
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Heat | Fix Released | High | Crag Wolfe |
Bug Description
I've been trying to better understand where our memory is used when creating large trees of stacks (such as a TripleO deployment which creates a tree of over 100 nested stacks).
I noticed one thing when profiling via heapy:
heap byvia=Partition of a set of 19392 objects. Total size = 19857584 bytes.
Index Count % Size % Cumulative % Referred Via:
0 9 0 2556360 13 2556360 13 "[u'file:
1 9 0 2107872 11 4664232 23 "[u'file:
2 1 0 1779176 9 6443408 32 '.body'
3 1 0 1530760 8 7974168 40 "[u'oslo.message']"
4 5 0 1430408 7 9404576 47 '[27]'
5 9 0 1098432 6 10503008 53 "[u'file:
6 9 0 937008 5 11440016 58 "[u'file:
7 9 0 563256 3 12003272 60 "[u'file:
8 9 0 527904 3 12531176 63 "[u'file:
9 178 1 493640 2 13024816 66 '.func_doc', '[0]'
We're passing the files map to every single one of those 100+ nested stacks via RPC, which means we waste memory in the messages, and in the template.Template objects.
It also turns out to be a pretty big waste of DB space:
MariaDB [heat]> select sum(char_
+------
| sum(char_
+------
| 461.9550 |
+------
So that's 461MB of data, almost all of it duplicated (the files map is just over 300K and we're storing it over 100 times).
I think we probably need to create a new table which contains a record per file, then have a relationship between that table and the raw_template such that we can lazy-load the files on demand (each nested stack probably only requires access to a tiny subset of the global files map).
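A minimal sketch of that idea, using Python's stdlib sqlite3 so it is self-contained. The table names (template_file, raw_template_file_ref), the sha256-based de-duplication and the helper functions are all assumptions made for illustration; this is not Heat's actual schema or the fix that was eventually merged.

```python
import hashlib
import sqlite3

# Hypothetical schema, for illustration only (not Heat's real tables).
SCHEMA = """
CREATE TABLE template_file (
    id      INTEGER PRIMARY KEY,
    sha256  TEXT NOT NULL UNIQUE,
    content TEXT NOT NULL
);
CREATE TABLE raw_template (
    id       INTEGER PRIMARY KEY,
    template TEXT NOT NULL
);
CREATE TABLE raw_template_file_ref (
    raw_template_id INTEGER NOT NULL REFERENCES raw_template(id),
    path            TEXT NOT NULL,
    file_id         INTEGER NOT NULL REFERENCES template_file(id),
    PRIMARY KEY (raw_template_id, path)
);
"""


def store_template(conn, template, files):
    """Store a template; each distinct file body is written only once."""
    cur = conn.execute(
        "INSERT INTO raw_template (template) VALUES (?)", (template,)
    )
    template_id = cur.lastrowid
    for path, content in files.items():
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        # The UNIQUE constraint on sha256 makes a repeated body a no-op.
        conn.execute(
            "INSERT OR IGNORE INTO template_file (sha256, content) "
            "VALUES (?, ?)",
            (digest, content),
        )
        file_id = conn.execute(
            "SELECT id FROM template_file WHERE sha256 = ?", (digest,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO raw_template_file_ref VALUES (?, ?, ?)",
            (template_id, path, file_id),
        )
    return template_id


def load_file(conn, template_id, path):
    """Lazy-load a single file body on demand, or None if not referenced."""
    row = conn.execute(
        "SELECT f.content FROM raw_template_file_ref r "
        "JOIN template_file f ON f.id = r.file_id "
        "WHERE r.raw_template_id = ? AND r.path = ?",
        (template_id, path),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# The same files map is attached to 100 nested stacks, as in the report.
files = {
    "file:///a.yaml": "heat_template_version: 2015-04-30\n",
    "file:///b.yaml": "resources: {}\n",
}
ids = [store_template(conn, "nested stack %d" % i, files) for i in range(100)]

stored = conn.execute("SELECT COUNT(*) FROM template_file").fetchone()[0]
refs = conn.execute("SELECT COUNT(*) FROM raw_template_file_ref").fetchone()[0]
print(stored, refs)
```

With this layout the file bodies are stored once regardless of how many nested stacks reference them, and only the small (raw_template_id, path, file_id) rows grow with the number of stacks. A nested stack that needs one file calls load_file for that path instead of receiving the whole files map.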
tags: added: tripleo
Changed in heat:
status: Triaged → In Progress
I think that's meant to be addressed by https://review.openstack.org/#/c/303692/ ?