Activity log for bug #1685049

Date Who What changed Old value New value Message
2017-04-21 01:53:40 Ilya Tregubov bug added bug
2017-04-21 02:10:55 Ilya Tregubov attachment added patch with remote storage changes
2017-04-21 03:34:18 Kristina Hoeppner mahara: status New In Progress
2017-04-21 03:34:20 Kristina Hoeppner mahara: importance Undecided Wishlist
2017-04-21 03:34:24 Kristina Hoeppner mahara: milestone 17.10.0
2017-04-21 03:34:51 Kristina Hoeppner mahara: assignee Ilya Tregubov (ilya-u)
2017-05-05 05:11:46 Ilya Tregubov description
Old value: To use object storage (like S3), several modifications are needed (for example, get_path should return either the local path to a file or the remote path if the file is stored remotely). Certain edge cases should be considered (for example, when downloading folder content as a zip archive, all files in that folder should be available locally).
New value: To use object storage (like S3), several modifications are needed (for example, get_path should return either the local path to a file or the remote path if the file is stored remotely).

Use cases:

Offloading large and old files to save money
Disk can be expensive, so a simple use case is moving some of the largest and oldest files off local disk to somewhere cheaper. We still want the convenience and performance of having the majority of files local, especially when hosting on-prem, where the latency or bandwidth to the remote filesystem may not be great.

Sharing files across Mahara instances to save disk
Clients can have multiple Mahara instances, and there is much duplicated content across instances. By pointing multiple instances at the same remote filesystem and not allowing deletes, large amounts of content can be de-duplicated.

Sharing files across environments to save time
We can have multiple environments for various types of testing, and often have ad hoc environments created on demand. Not only do we not want to store duplicated files, but we also want refreshing data to new environments to be as fast as possible.

Using this plugin we can configure production to have full read-write access to the remote filesystem and store the vast bulk of content remotely. In this setup latency and bandwidth are not an issue, as production and the remote filesystem are colocated. The local filedir on disk would only consist of small or fast-churning files.

A refresh of the production data back to a staging environment can now be much quicker, as we skip the sitedir clone completely and staging is simply configured with read-only access to the production filesystem. Any files it creates would only be written to its local filesystem, which can then be discarded at the next refresh.

Currently it only works with AWS S3 storage. Support for more object stores is planned, in particular to enable OpenStack deployments.

To use the plugin you will need to create an Amazon S3 bucket for your Mahara instance. You will also need the Amazon SDK to make this plugin work.

Currently the following directory is saved to S3: /path_to_sitedata/artefact/file/originals

There is a cron task that checks whether there are new files that are not yet in S3 (depending on the settings, it only pushes files of a certain age and size). Once files are duplicated, there is an option to delete the local copy to save space (the cron task checks duplicated files and, again depending on the settings, only deletes some of them, such as large files).

Certain operations can only deal with local files, for example downloading content as a zip archive on the content page. So when the download button is pressed, all files are first pulled from S3 to local storage.
2017-09-22 01:56:35 Robert Lyon mahara: milestone 17.10.0 18.04.0
2017-10-11 04:22:32 Dmitrii Metelkin mahara: assignee Ilya Tregubov (ilya-u) Dmitrii Metelkin (dmitriim)
2017-11-20 22:18:13 Robert Lyon mahara: status In Progress Fix Committed
2017-11-21 17:36:09 Kristina Hoeppner tags nominatedfeature
2018-04-05 23:00:35 Robert Lyon mahara: status Fix Committed Fix Released
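
The cron behaviour outlined in the 2017-05-05 description update (push files of a certain age and size, optionally delete local copies once duplicated, and have get_path return a local or remote location) could be sketched roughly as follows. This is only an illustration in Python, not the plugin's actual code: StoredFile, files_to_push, files_to_delete_locally, resolve_path and all parameter names are hypothetical, and treating the age and size settings as minimum thresholds is an assumption based on the "largest and oldest files" use case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredFile:
    """Hypothetical record for one file under artefact/file/originals."""
    path: str              # path relative to the storage root
    size_bytes: int
    age_seconds: int
    in_remote: bool = False  # a copy already exists in the object store
    is_local: bool = True    # a copy exists on local disk


def _meets_thresholds(f, min_size_bytes, min_age_seconds):
    # Assumption: settings are minimum size and minimum age.
    return f.size_bytes >= min_size_bytes and f.age_seconds >= min_age_seconds


def files_to_push(files, min_size_bytes, min_age_seconds):
    """Local files not yet in the object store that meet the thresholds."""
    return [
        f for f in files
        if f.is_local and not f.in_remote
        and _meets_thresholds(f, min_size_bytes, min_age_seconds)
    ]


def files_to_delete_locally(files, min_size_bytes, min_age_seconds,
                            delete_local=True):
    """Duplicated files whose local copy may be removed to save space."""
    if not delete_local:
        return []
    return [
        f for f in files
        if f.is_local and f.in_remote
        and _meets_thresholds(f, min_size_bytes, min_age_seconds)
    ]


def resolve_path(f, local_root, remote_prefix):
    """Return the local path if a local copy exists, else the remote one."""
    root = local_root if f.is_local else remote_prefix
    return f"{root}/{f.path}"
```

An operation that needs local files, such as the zip download, would first have to pull every file for which resolve_path returns a remote location.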