FileStorageCache is necessary

Bug #617536 reported by Gustavo Niemeyer
This bug affects 1 person
Affects  Status     Importance  Assigned to  Milestone
pyjuju   Confirmed  Wishlist    Unassigned

Bug Description

It'd be great to have an interface to the remote FileStorage implementations that deals with downloading files and caching them locally. Here is a suggested interface:

class FileStorageCache:

    def __init__(self, file_storage, local_cache_path):
        """
        @param file_storage: FileStorage instance to wrap.
        @param local_cache_path: Local directory to use as the cache.
        """

    def get(self, name):
        """See FileStorage.get()'s documentation.

        This method behaves the same way, but will avoid downloading the file
        a second time if it was already available locally.
        """

Among other things, this will be used to enable the Machine Agent to download formulas just once, as defined in bug #617535.

Tags: spike
Changed in ensemble:
status: New → Confirmed
importance: Undecided → Low
Changed in ensemble:
assignee: nobody → Jim Baker (jimbaker)
Jim Baker (jimbaker) wrote :

What is the desired caching policy? Some standard questions:

1. File consistency. Are these files effectively immutable (ignoring dev, of course), changed very infrequently (so a global expiry would do, like the reload that programs perform on SIGHUP), or something else? If the latter, do we need to consider that file A and file B may depend on each other, such that loading A' requires a consistently loaded B'?

2. Size of cache values. Are the cached objects individually small and few in number? Or do we need some variant of LRU?

Kapil Thangavelu (hazmat) wrote :

1. For the purposes of formula files, they're effectively immutable (we store the hash as part of the storage key).

2. It's a disk cache. Sizes are variable and not bounded. No LRU is needed, as per 1.

The general goal is to have a wrapper cache object around a provider file storage implementation. Given a request for a file key, check local storage; if the file exists, return an open file handle to it. If it doesn't, fetch it from provider storage, store it locally, and return a file handle to the local copy.
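
A minimal sketch of that lookup, to make the flow concrete. It is not the project's implementation: InMemoryStorage is a hypothetical stand-in for a provider storage, and it assumes get(name) synchronously returns a readable binary file object, which may not match the real FileStorage contract. Hashing the key into a filename and writing through a temporary file are illustrative choices, not something decided in this bug.

```python
import hashlib
import io
import os
import shutil
import tempfile


class InMemoryStorage:
    """Hypothetical stand-in for a provider FileStorage (not the real API)."""

    def __init__(self, files):
        self.files = dict(files)
        self.fetch_count = 0

    def get(self, name):
        # Assumed contract: return a readable binary file object.
        self.fetch_count += 1
        return io.BytesIO(self.files[name])


class FileStorageCache:

    def __init__(self, file_storage, local_cache_path):
        self._storage = file_storage
        self._cache_path = local_cache_path

    def _local_path(self, name):
        # Hash the key so any storage name maps to a flat, safe filename.
        digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
        return os.path.join(self._cache_path, digest)

    def get(self, name):
        path = self._local_path(name)
        if not os.path.exists(path):
            os.makedirs(self._cache_path, exist_ok=True)
            # Download to a temporary file first, then rename it, so a
            # partial download is never mistaken for a cached file.
            fd, tmp_path = tempfile.mkstemp(dir=self._cache_path)
            with os.fdopen(fd, "wb") as tmp, self._storage.get(name) as remote:
                shutil.copyfileobj(remote, tmp)
            os.replace(tmp_path, path)
        # Always hand back a handle to the local copy.
        return open(path, "rb")


cache_dir = tempfile.mkdtemp()
remote = InMemoryStorage({"formulas/mysql-abc123": b"formula bundle"})
cache = FileStorageCache(remote, cache_dir)

with cache.get("formulas/mysql-abc123") as handle:
    first = handle.read()
with cache.get("formulas/mysql-abc123") as handle:
    second = handle.read()

print(first == second, remote.fetch_count)  # True 1
```

Because the files are treated as immutable (the hash is part of the storage key), the sketch never invalidates or evicts anything.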

hth

Changed in ensemble:
status: Confirmed → In Progress
milestone: none → 0.2
Changed in ensemble:
milestone: 0.2 → 0.3
Jim Baker (jimbaker) wrote :

Per discussion with Gustavo:
Add a persistent cache directory at, say, /var/ensemble/cache (this setting and similar ones should eventually be refactored), where we will store these immutable files over time and across multiple runs.

FileStorageCache then simply forwards requests to FileStorage, but get checks whether the file is already cached and, if so, returns the cached copy.
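
One way the forwarding and the persistent directory might fit together, as a rough sketch under stated assumptions. RecordingStorage and ForwardingCache are hypothetical names, the put(name, file_object) signature is assumed for illustration, and a temporary directory stands in for /var/ensemble/cache. Only get is intercepted; every other attribute falls through to the wrapped storage.

```python
import io
import os
import tempfile


class RecordingStorage:
    """Hypothetical stand-in for a provider FileStorage (not the real API)."""

    def __init__(self):
        self.files = {}
        self.calls = []

    def get(self, name):
        self.calls.append(("get", name))
        return io.BytesIO(self.files[name])

    def put(self, name, file_object):
        self.calls.append(("put", name))
        self.files[name] = file_object.read()


class ForwardingCache:
    """Intercepts get(); everything else goes to the wrapped storage."""

    def __init__(self, file_storage, local_cache_path):
        self._storage = file_storage
        self._cache_path = local_cache_path
        os.makedirs(local_cache_path, exist_ok=True)

    def __getattr__(self, attr):
        # Only called for attributes not found on the cache itself,
        # so put() and any other method reach the storage unchanged.
        return getattr(self._storage, attr)

    def get(self, name):
        # Assumes names are plain filenames; real keys would need to be
        # sanitised before being used as paths.
        path = os.path.join(self._cache_path, name)
        if not os.path.exists(path):
            with open(path, "wb") as local:
                local.write(self._storage.get(name).read())
        return open(path, "rb")


cache_dir = tempfile.mkdtemp()  # stands in for /var/ensemble/cache
storage = RecordingStorage()

first_run = ForwardingCache(storage, cache_dir)
first_run.put("formula-abc123", io.BytesIO(b"bundle"))  # forwarded
first_run.get("formula-abc123").close()                 # fetched, then cached

# A second instance on the same directory simulates a later run.
second_run = ForwardingCache(storage, cache_dir)
with second_run.get("formula-abc123") as handle:
    data = handle.read()

print(data, storage.calls)
```

The storage records a single get even though two cache instances served the file, which is the "over multiple runs" behaviour the persistent directory is meant to provide.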

Jim Baker (jimbaker)
Changed in ensemble:
assignee: Jim Baker (jimbaker) → nobody
Jim Baker (jimbaker)
Changed in ensemble:
status: In Progress → Confirmed
Changed in ensemble:
milestone: 0.3 → none
Changed in ensemble:
importance: Low → Wishlist