meter table contains redundant/duplicate data
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Ceilometer |
Fix Released
|
Wishlist
|
gordon chung |
Bug Description
The meter table is one of the base "fact" tables in the Ceilometer schema. As such, the number of rows in this table will be large. We should remove redundant and duplicate fields from this schema to allow many more records per data and index page.
The current (as of 2013-08-13) schema for the meter table looks like this:
CREATE TABLE meter (
id INT PRIMARY KEY,
counter_
user_id VARCHAR(255) NOT NULL,
project_id VARCHAR(255) NOT NULL,
resource_id VARCHAR(255) NOT NULL,
resource_metadata TEXT NOT NULL,
counter_type VARCHAR(255) NOT NULL,
counter_unit VARCHAR(255) NOT NULL,
counter_volume FLOAT NOT NULL,
timestamp DATETIME NOT NULL DEFAULT NOW(),
message_signature VARCHAR(1000) NOT NULL,
message_id VARCHAR(1000) NOT NULL,
FOREIGN KEY user_id REFERENCES user(id),
FOREIGN KEY project_id REFERENCES project(id),
FOREIGN KEY resource_id REFERENCES resource(id),
INDEX ix_meter_timestamp (timestamp),
INDEX ix_meter_user_id (user_id),
INDEX ix_meter_project_id (project_id),
INDEX 'idx_meter_
)
I propose normalizing the database schema in the following ways:
1) Replace counter_type with an integer column counter_type_id that references a lookup table (counter_type)
2) Replace counter_unit with an integer column counter_unit_id that references a lookup table (counter_unit)
3) Replace counter_name with an integer column counter_id that references a lookup table (counter)
The above 3 modifications should dramatically reduce the average row size for records in the meter table and be a good first step in improving query performance...
Changed in ceilometer: | |
status: | New → Triaged |
importance: | Undecided → Wishlist |
Changed in ceilometer: | |
milestone: | none → icehouse-1 |
Changed in ceilometer: | |
assignee: | nobody → Vladimir Vechkanov (vvechkanov) |
Changed in ceilometer: | |
milestone: | icehouse-1 → none |
Changed in ceilometer: | |
milestone: | none → icehouse-3 |
Changed in ceilometer: | |
status: | Fix Committed → Fix Released |
Changed in ceilometer: | |
milestone: | icehouse-3 → 2014.1 |
You propose is good but I think we can make fewer duplication rows if will create databases in such way:
1) Replace counter_type, counter_unit, counter_name with an integer column counter_id that references a lookup table (counter).
2) Create table (counter) with rows:
* id
* name
* type_id (reference to type)
* unit_name
3) Create table (type) with rows:
* id
* name
Reasons:
* We have several counter_types, so they will be meet in a meters very often, so create different table for them.
* We have some same values in counters for different timestamp, user_id, project_id, so create different table for them.
What do you think about this idea?
Anyways I can implement TS proposal if community will decide that it's better.