release package counter's ttl does not work

Bug #1154356 reported by Brian Murray on 2013-03-12
This bug affects 1 person
Affects: Daisy | Importance: Medium | Assigned to: Unassigned

Bug Description

We have some data from 20130117 and 20130118 in the Counters column family for a release and source package:

[default@crashdb] get Counters[utf8('Ubuntu 12.10:software-center')];
=> (counter=20130117, value=983)
=> (counter=20130118, value=123)
=> (counter=20130226, value=1670)

The two January entries should not be there, though, as we set a time to live (TTL) of 4 weeks for the data. From submit.py:

def update_release_pkg_counter(counters_fam, release, src_package, date):
    # only store four weeks worth of data
    time_to_live = 60*60*24*28
    counters_fam.insert('%s:%s' % (release, src_package), {date: 1},
        ttl=time_to_live)

It turns out that TTL is not supported for counters. See: http://www.datastax.com/dev/blog/whats-new-in-cassandra-0-8-part-2-counters and the related issue.

We should definitely remove the code in submit.py that sets the ttl. Additionally, since we only need 2 weeks of data (we keep 4 just in case), perhaps we should have a job that deletes anything older than that.
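
A minimal sketch of what the function from submit.py might look like with the ttl argument dropped. It assumes the rest of the call stays exactly as quoted above; nothing else about submit.py is known here, so treat it as an illustration rather than the actual fix.

```python
def update_release_pkg_counter(counters_fam, release, src_package, date):
    # Counter columns cannot expire on their own, so no ttl is passed here.
    # Old columns have to be deleted by a separate cleanup job instead.
    counters_fam.insert('%s:%s' % (release, src_package), {date: 1})
```

With this change the row keeps growing by one column per day until something deletes the old columns explicitly.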

Changed in daisy:
importance: Undecided → Medium
Evan (ev) wrote :

Yeah, I think a daily cron job would work fine here. I think the safest approach would be a range scan (counters.get(column_finish=fourweeksago)) that then iterates over the columns and calls counter_remove on them one by one. That way, if the cron job times out or otherwise fails partway through, the next run will clean up whatever was left.
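
The approach described in this comment could be sketched roughly as follows. This is only an illustration under several assumptions: column names are YYYYMMDD strings as in the listing above, the column family object exposes a get(key, column_finish=...) that returns a dict of columns up to and including column_finish, and single counter columns are deleted with a method called remove_counter (the comment calls it counter_remove; the exact name depends on the client library). The helper names cutoff_column and prune_counters are made up for this example. A real client may also raise an exception when no columns match, which is not handled here.

```python
import datetime


def cutoff_column(today, keep_days=28):
    """Name (YYYYMMDD) of the newest column that is old enough to delete."""
    return (today - datetime.timedelta(days=keep_days)).strftime('%Y%m%d')


def prune_counters(counters_fam, key, today, keep_days=28):
    """Delete counter columns dated on or before the cutoff, one by one.

    Returns the list of column names that were removed.
    """
    finish = cutoff_column(today, keep_days)
    # Assumed interface: get() returns a dict of column name -> value for
    # every column up to and including column_finish.
    old = counters_fam.get(key, column_finish=finish)
    removed = []
    for column in sorted(old):
        # Assumed interface: remove_counter() deletes a single counter column.
        counters_fam.remove_counter(key, column)
        removed.append(column)
    return removed
```

Because each column is removed individually and the scan always starts from the oldest data, a run that fails halfway leaves the remaining old columns in place, and the next run picks them up again.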

Changed in daisy:
status: New → Triaged