Modifications to filesystem to allow object storage

Bug #1685049 reported by Ilya Tregubov
This bug affects 1 person
Affects: Mahara
Status: Fix Released
Importance: Wishlist
Assigned to: Dmitrii Metelkin

Bug Description

To use object storage (such as S3), several modifications to the filesystem code are needed. For example, get_path should return either the local path to a file or, if the file is stored remotely, its remote location.
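
To illustrate the get_path behaviour described above, here is a minimal sketch in Python (Mahara itself is PHP). The function name `resolve_path`, its parameters, and the `s3://` location layout are assumptions made for illustration; they are not taken from the patch.

```python
from pathlib import Path
from typing import AbstractSet


def resolve_path(
    relative_name: str,
    local_root: Path,
    remote_prefix: str,
    remote_keys: AbstractSet[str],
) -> str:
    """Return the local path if the file is on disk, else its remote location.

    remote_keys is a hypothetical stand-in for "files known to be stored
    remotely"; a real implementation would look this up elsewhere.
    """
    local_path = local_root / relative_name
    if local_path.is_file():
        # Prefer the local copy whenever one exists.
        return str(local_path)
    if relative_name in remote_keys:
        # Fall back to the remote location (hypothetical URL layout).
        return f"{remote_prefix.rstrip('/')}/{relative_name}"
    raise FileNotFoundError(relative_name)
```

Callers would then need to handle both kinds of return value, which is why the change touches more than one function.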

Use cases:

Offloading large and old files to save money

Disk can be expensive, so a simple use case is moving some of the largest and oldest files off local disk to somewhere cheaper. We still want the convenience and performance of keeping the majority of files local, especially when hosting on-premises, where the latency or bandwidth to the remote filesystem may not be great.

Sharing files across Mahara instances to save disk

Clients can have multiple Mahara instances, and there is a lot of duplicated content across them. By pointing multiple Mahara instances at the same remote filesystem and not allowing deletes, large amounts of content can be de-duplicated.

Sharing files across environments to save time

We can have multiple environments for various types of testing, and often create ad hoc environments on demand. We not only want to avoid storing duplicated files, but also want refreshing data into new environments to be as fast as possible.

Using this plugin we can configure production to have full read-write access to the remote filesystem and store the vast bulk of content remotely. In this setup latency and bandwidth aren't an issue, as the two are colocated. The local filedir on disk would only consist of small or fast-churning files. A refresh of the production data back to a staging environment can now be much quicker, as we skip the sitedir clone completely and staging is simply configured with read-only access to the production filesystem. Any files it creates would only be written to its local filesystem, which can then be discarded at the next refresh.
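
The production/staging arrangement above can be sketched roughly as follows, in Python rather than Mahara's PHP. Everything here is hypothetical: the class `OverlayStore`, the `remote_writable` flag, and the use of a plain dict as a stand-in for the shared bucket are illustrative assumptions, not the plugin's actual design.

```python
import shutil
from pathlib import Path
from typing import Dict


class OverlayStore:
    """A local directory layered over a shared remote store.

    The dict passed as `remote` is a stand-in for the bucket (assumption).
    """

    def __init__(self, local_dir: Path, remote: Dict[str, bytes], remote_writable: bool) -> None:
        self.local_dir = local_dir
        self.remote = remote
        self.remote_writable = remote_writable
        self.local_dir.mkdir(parents=True, exist_ok=True)

    def write(self, name: str, data: bytes) -> None:
        # New files always land on the local filesystem first.
        (self.local_dir / name).write_bytes(data)

    def push(self, name: str) -> None:
        # Only an environment with write access may copy files to the remote.
        if not self.remote_writable:
            raise PermissionError("remote store is read-only for this environment")
        self.remote[name] = (self.local_dir / name).read_bytes()

    def read(self, name: str) -> bytes:
        # Prefer the local copy, fall back to the shared remote store.
        local_path = self.local_dir / name
        if local_path.is_file():
            return local_path.read_bytes()
        return self.remote[name]

    def discard_local(self) -> None:
        # What a staging refresh would do: drop locally created files.
        shutil.rmtree(self.local_dir, ignore_errors=True)
        self.local_dir.mkdir(parents=True, exist_ok=True)
```

In this sketch production would be created with `remote_writable=True` and staging with `remote_writable=False`, both pointing at the same remote store.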

Currently it only works with AWS S3 storage. Support for more object stores is planned, in particular to enable OpenStack deployments.

To use the plugin you will need to create an Amazon S3 bucket for your Mahara instance. You will also need the Amazon SDK to make the plugin work.

Currently the following directory is saved to S3: /path_to_sitedata/artefact/file/originals
A cron task checks for new files that are not yet in S3 and pushes them (depending on the settings, it only pushes files of a certain age and size). Once files are duplicated, there is an option to delete the local copy to save space (a cron task checks the duplicated files and, again depending on the settings, deletes only those that qualify, such as large files). Certain operations can only deal with local files, for example "download content as zip archive" on the content page. So when the download button is pressed, all files are first pulled from S3 to local disk.
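
The selection logic of the two cron steps described above could look roughly like this Python sketch (Mahara itself is PHP). The names `FileRecord`, `files_to_push`, and `files_to_delete_locally` are hypothetical, and treating the age and size settings as minimum thresholds is an assumption based on the "largest and oldest files" use case.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FileRecord:
    name: str
    size_bytes: int
    age_seconds: int
    is_local: bool = True    # a copy exists on local disk
    in_remote: bool = False  # a copy exists in the object store


def _meets_thresholds(f: FileRecord, min_size_bytes: int, min_age_seconds: int) -> bool:
    # Assumption: settings are minimums, so only large, old files qualify.
    return f.size_bytes >= min_size_bytes and f.age_seconds >= min_age_seconds


def files_to_push(files: Iterable[FileRecord], min_size_bytes: int, min_age_seconds: int) -> List[FileRecord]:
    """Local files not yet in the remote store that meet the thresholds."""
    return [
        f for f in files
        if f.is_local and not f.in_remote
        and _meets_thresholds(f, min_size_bytes, min_age_seconds)
    ]


def files_to_delete_locally(files: Iterable[FileRecord], min_size_bytes: int, min_age_seconds: int) -> List[FileRecord]:
    """Duplicated files (local and remote) whose local copy may be removed."""
    return [
        f for f in files
        if f.is_local and f.in_remote
        and _meets_thresholds(f, min_size_bytes, min_age_seconds)
    ]
```

Keeping the push step and the delete step separate means a local copy is only ever a candidate for removal after the file is already recorded as present remotely.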

Ilya Tregubov (ilya-u) wrote :

The patch contains modifications for get_path etc.

Ilya Tregubov (ilya-u) wrote :

Please note: currently only original images are stored in S3.

Changed in mahara:
status: New → In Progress
importance: Undecided → Wishlist
milestone: none → 17.10.0
assignee: nobody → Ilya Tregubov (ilya-u)
Kristina Hoeppner (kris-hoeppner) wrote :

The functionality is based on the concept for https://github.com/catalyst/moodle-tool_objectfs

Kristina Hoeppner (kris-hoeppner) wrote :

This feature was discussed during the 60th developer meeting. The log is available at https://meetbot.mahara.org/mahara-dev/2017/mahara-dev.2017-05-04-07.03.log.html#l-125

Ilya Tregubov (ilya-u)
description: updated
Robert Lyon (robertl-9)
Changed in mahara:
milestone: 17.10.0 → 18.04.0
Changed in mahara:
assignee: Ilya Tregubov (ilya-u) → Dmitrii Metelkin (dmitriim)
Mahara Bot (dev-mahara) wrote : A patch has been submitted for review

Patch for "master" branch: https://reviews.mahara.org/8132

Ghada El-Zoghbi (ghada-z) wrote :
Mahara Bot (dev-mahara) wrote : A change has been merged

Reviewed: https://reviews.mahara.org/8132
Committed: https://git.mahara.org/mahara/mahara/commit/78c87713a4fafb06419843b7cac2e887e1ac0c82
Submitter: Robert Lyon (<email address hidden>)
Branch: master

commit 78c87713a4fafb06419843b7cac2e887e1ac0c82
Author: Ilya Tregubov <email address hidden>
Date: Thu Apr 6 10:27:13 2017 +1000

Bug 1685049: Remote file system modification

behatnotneeded

Enables Mahara to save files to an external file system
- object storage (such as AWS's S3) -
which can reduce the cost of storage

Change-Id: I76822612f2922ba0ef2a0b7a4efb9cd2b96979a6

Robert Lyon (robertl-9)
Changed in mahara:
status: In Progress → Fix Committed
tags: added: nominatedfeature
Kristina Hoeppner (kris-hoeppner) wrote :
Robert Lyon (robertl-9)
Changed in mahara:
status: Fix Committed → Fix Released