Record page editing history

Bug #1157543 reported by Aaron Wells
This bug affects 2 people
Affects: Mahara
Status: Confirmed
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

I'd like to "dogfood" Mahara's developer documentation by moving it from MediaWiki to the Mahara platform itself, probably as group pages so that it could still be edited by many people. One feature Mahara lacks that would make this feasible, though, is an edit history. The wiki lets us keep track of who made which edits, and revert to older versions if necessary.

So, I'd like to see Mahara gain the ability to keep track of a page's edit history. At a minimum it needs:

1. Who made changes, at what time
2. What the page looked like at each revision
3. Ability to revert the page to a previous version

Also good would be:

4. Ability to "diff" versions and highlight what changed
5. Comments that go in the history log
6. "Publish" ability (including publishing at a later time/date)
7. Versioning file attachments, see also https://bugs.launchpad.net/mahara/+bug/548195

Tags: versioning
Changed in mahara:
status: New → Confirmed
Revision history for this message
Aaron Wells (u-aaronw) wrote :

(The other main feature that's lacking is full-text search, but we're working on that with a new Elasticsearch search plugin.)

Revision history for this message
Nigel Cunningham (nigelc-g) wrote :

As discussed with Kristina and Ghada last week after the ePortfolio forum, this is a good idea, and could also be integrated with the soft-copies of templates idea. I'm going to start work on it.

Changed in mahara:
assignee: nobody → Nigel Cunningham (nigelc-g)
Changed in mahara:
status: Confirmed → In Progress
Revision history for this message
Aaron Wells (u-aaronw) wrote :

Hi Nigel,

Robert and I were just discussing this one a week or two back as well. We were thinking it would be pretty difficult to achieve page versioning as I described it here, because it requires you to version every artifact and block on the page, as well as the relationships between them.

A more lightweight alternative that would meet some of the page-versioning use cases would be to add on-demand "snapshot" functionality. Snapshot copies would not be listed in the normal Portfolio -> Pages area, but would instead be visible from, say, a "history" tab of the page they were snapshotted from. Since we already have full page copy functionality, and the ability to lock pages, it should be relatively easy to implement. This would be useful for dogfooding documentation if we could roll back to a snapshot, and maybe make it possible to "publish" a snapshot version of a page. It would also satisfy our need to archive pages that are submitted for grading.

On the other hand, if you think you can tackle full-on page versioning, that would be even better. :)

Cheers,
Aaron

Revision history for this message
Nigel Cunningham (nigelc-g) wrote :

Sorry for my slow response.

I'm working on full-on page versioning, and am trying to make it as simple as possible.

My implementation plan is as follows:
 X Make tables to store incremental diffs between versions of the database row that stores the artefact/block/page (simply item ID, revision ID for item, mtime, diff; item ID + rev ID = Primary key).
 X Add lib/revision.php to do the work of calculating the diff and storing it (hooks into commit fns and called between generating the new version of a DB row and actually saving it).
- Modify existing database structures where possible so that all of the data related to a revision is in the row. At the moment, I'm reworking the code for storing block locations within a layout, putting them into a serialised array in a field in view, and adding a view_block table to implement what will now be a many-to-many relationship between views and blocks.
_ Design and add a UI for viewing past versions of a page and its components. Kristina was thinking of a slider at the bottom of the page.
_ Also provide support for deleting/merging old revisions, and cascade delete unneeded blocks/artefacts as required
_ Provide a UI for en/disabling revisions (sitewide and more granular)
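
A minimal sketch of the first two plan items (the diff table and the hook that computes and stores a diff before a row is saved). It is written in Python with SQLite for brevity rather than in Mahara's PHP, and every name in it (item_revision, row_diff, record_revision) is hypothetical, not taken from the actual patch. It also assumes a row's field set is fixed, so a missing field and a NULL field are treated alike.

```python
import json
import sqlite3
import time


def row_diff(old_row, new_row):
    """Return {field: [old_value, new_value]} for every field that changed."""
    changed = {}
    for key in sorted(set(old_row) | set(new_row)):
        if old_row.get(key) != new_row.get(key):
            changed[key] = [old_row.get(key), new_row.get(key)]
    return changed


def open_store():
    """Create an in-memory table keyed on (item_id, rev_id), as in the plan."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE item_revision (
               item_id INTEGER NOT NULL,
               rev_id  INTEGER NOT NULL,
               mtime   REAL    NOT NULL,
               diff    TEXT    NOT NULL,
               PRIMARY KEY (item_id, rev_id)
           )"""
    )
    return conn


def record_revision(conn, item_id, old_row, new_row):
    """Call between building the new row and saving it.

    Returns the new revision ID, or None when nothing changed.
    """
    diff = row_diff(old_row, new_row)
    if not diff:
        return None
    (last_rev,) = conn.execute(
        "SELECT COALESCE(MAX(rev_id), 0) FROM item_revision WHERE item_id = ?",
        (item_id,),
    ).fetchone()
    rev_id = last_rev + 1
    conn.execute(
        "INSERT INTO item_revision (item_id, rev_id, mtime, diff) VALUES (?, ?, ?, ?)",
        (item_id, rev_id, time.time(), json.dumps(diff)),
    )
    return rev_id
```

Storing both the old and the new value per field lets the same record be replayed in either direction, at the cost of a slightly larger diff.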

Regards,

Nigel

Revision history for this message
Aaron Wells (u-aaronw) wrote :

Hi Nigel,

Cool beans, it sounds like you know what you're doing. ;)

My only concerns are:

1. Will this make it more difficult to make changes to the database structures in the future? For instance going from Mahara 1.7 to 1.8 we refactored all the page layout tables considerably in order to go from a columns-only layout to a rows-and-columns layout.

2. We're actually trying to reduce the number of serialized data tables in the database because they've caused us headaches in the past due to not being very accessible by normal database SELECT & UPDATE operations. And the PHP serialize() method in particular poses a security risk. (See https://www.owasp.org/index.php/PHP_Object_Injection ). So, it'd be preferable to store data in normal DB tables & columns; or if that's not feasible, to store them as JSON rather than serialize().

Cheers,
Aaron

Revision history for this message
Nigel Cunningham (nigelc-g) wrote :

Thanks for the reply.

Good questions - I've just been given some other work to do, but will give these both further consideration when I get the opportunity.

Nigel

Revision history for this message
Nigel Cunningham (nigelc-g) wrote :

Hi again.

I gave this some more thought over the weekend. For DB field changes, I think the best approach going forward would be as follows:

When viewing a past revision of an asset, we first apply the diffs to get back to the record as it actually was at the time.

We then apply the modifications that would have happened to that row as a result of changes to the database since that date (giving it default values for fields that have been added and removing fields that are no longer used). This would give us a version that should be able to be passed to the current API without causing errors.

Hopefully, most of this could be achieved by tracking the DB structure in a similar way to the record changes, but I think we'd need some sort of hook functionality to allow for transformations that are more complicated than the simple addition or removal of fields.

And lots of unit tests!
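
The two-step reconstruction described above could look roughly like this. It is a sketch under stated assumptions, in Python rather than Mahara's PHP: each stored diff is assumed to map a field to its previous value, and schema changes are assumed to be tracked as a versioned list of "add", "drop" and "hook" entries. All function names and field names (numrows, numcolumns) are hypothetical.

```python
def undo_diffs(current_row, diffs_newest_first):
    """Step 1: walk back through the diffs to the row as it was at the time."""
    row = dict(current_row)
    for diff in diffs_newest_first:
        for field, old_value in diff.items():
            row[field] = old_value
    return row


def upgrade_row(row, row_schema_version, schema_changes):
    """Step 2: apply schema changes made since the row's revision was stored.

    schema_changes is a list of (version, change) pairs in version order.
    change is ("add", field, default), ("drop", field) or ("hook", function).
    """
    row = dict(row)
    for version, change in schema_changes:
        if version <= row_schema_version:
            continue  # already reflected in the stored revision
        kind = change[0]
        if kind == "add":
            _, field, default = change
            row.setdefault(field, default)
        elif kind == "drop":
            row.pop(change[1], None)
        elif kind == "hook":
            # For transformations more complicated than add/remove.
            row = change[1](row)
        else:
            raise ValueError("unknown schema change: %r" % (kind,))
    return row


def row_at_revision(current_row, diffs_newest_first, row_schema_version, schema_changes):
    """Return a past revision in a shape the current API should accept."""
    past = undo_diffs(current_row, diffs_newest_first)
    return upgrade_row(past, row_schema_version, schema_changes)
```

For example, a row stored under schema version 1 with a since-removed numcolumns field comes back without that field and with a default numrows, so it can be handed to current code.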

Revision history for this message
Nigel Cunningham (nigelc-g) wrote :

PS: Will give more consideration to whether we can avoid serializing the data.

Re using JSON instead, fine - I wasn't aware of those issues.

Revision history for this message
Kristina Hoeppner (kris-hoeppner) wrote :

I still like the idea of snapshotting a page manually (besides the automatic versioning) because that could tell the author / viewer more clearly which version is important. If we version everything, it can be very tricky to find the version that you are looking for. MediaWiki, I think, tackles this by allowing you to leave a comment.

Etherpad makes it really easy by allowing you to save a page. That actually doesn't create a copy you can download, but a highlight on the timeline slider (a star in the case of Etherpad). Easy to get back to and move from that more important revision to another important revision rather than looking through every single revision.

Also: when are revisions made? When a block is changed and saved, or when the entire page is "saved" by clicking the "Done" button (which I would say not many people do, as it is not required)? Or would we set a revision-saving interval, as we do for watchlist notifications so that there is no overwhelming number of notifications?

Revision history for this message
Kristina Hoeppner (kris-hoeppner) wrote :

See also bug #1393011 for some ideas in regards to versioning of the resume.

tags: added: versioning
Revision history for this message
Nigel Cunningham (nigelc-g) wrote :

Hi guys.

Getting back to this for at least a little while.

Kristina, I agree that you might not want to keep versions of everything. Perhaps there could be a sitewide or finer-grained configuration option (per institution / group / user and/or per object type?) that controls whether revisions are automatically saved. If that's disabled, there would then be a button that lets the user manually choose to save a new revision.

I like the idea of being able to tag revisions and easily access them. That should be done no matter what else is decided.

Regarding when revisions are made, it's got to be when the data is saved (regardless of whether that's by AJAX or by clicking 'Done'). Thinking again about what was written above, I'd suggest that if revisions are being saved automatically, there should be an easily accessible way of controlling things (a radio button / drop-down for switching between Don't save revisions / Combine all the changes being made into 1 revision / New revision on each change).
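
The three-way switch plus the manual save button might behave as in this sketch. It is in Python for illustration only, and the names RevisionMode and RevisionRecorder are hypothetical. It assumes each save passes in a dict of changed fields.

```python
from enum import Enum


class RevisionMode(Enum):
    OFF = "off"                   # Don't save revisions automatically
    COMBINE = "combine"           # Combine all changes into 1 revision
    EACH_CHANGE = "each_change"   # New revision on each change


class RevisionRecorder:
    def __init__(self, mode):
        self.mode = mode
        self.revisions = []   # each entry is a dict of changed fields
        self._pending = {}    # changes not yet written as a revision

    def on_save(self, changes):
        """Called whenever data is saved, by AJAX or by clicking 'Done'."""
        if self.mode is RevisionMode.EACH_CHANGE:
            self.revisions.append(dict(changes))
        else:
            # OFF and COMBINE both accumulate; they differ in when they flush.
            self._pending.update(changes)

    def finish_editing(self):
        """End of an editing session: only COMBINE writes a revision here."""
        if self.mode is RevisionMode.COMBINE:
            self._flush()

    def save_manually(self):
        """The manual 'save a new revision' button; works in every mode."""
        self._flush()

    def _flush(self):
        if self._pending:
            self.revisions.append(self._pending)
            self._pending = {}
```

Keeping pending changes even when automatic saving is off is a design choice made here so that the manual button has something to record.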

Revision history for this message
Nigel Cunningham (nigelc-g) wrote :

Further to Aaron's comment...

"2. We're actually trying to reduce the number of serialized data tables in the database because they've caused us headaches in the past due to not being very accessible by normal database SELECT & UPDATE operations. And the PHP serialize() method in particular poses a security risk. (See https://www.owasp.org/index.php/PHP_Object_Injection ). So, it'd be preferable to store data in normal DB tables & columns; or if that's not feasible, to store them as JSON rather than serialize()."

I've switched the code to use JSON encoding. Since the diffs themselves aren't going to be the objects of selects or updates, I don't think there should be issues there. If we want to be able to find a change where we modified something, perhaps the revision record should have a field for a search description and/or tags?
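
One way the description and tags idea could work is sketched below, in Python for illustration; the Revision record and both helper functions are hypothetical, not part of the submitted patch. The diff stays an opaque JSON string, and lookups touch only the description and tags. A tag such as "important" would also cover the starred revisions suggested earlier in the thread.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    rev_id: int
    diff_json: str                 # JSON text; never queried directly
    description: str = ""          # free-text search description
    tags: frozenset = frozenset()  # e.g. {"important"}


def make_revision(rev_id, diff, description="", tags=()):
    """Encode the diff as JSON and attach the searchable metadata."""
    return Revision(rev_id, json.dumps(diff, sort_keys=True),
                    description, frozenset(tags))


def find_revisions(revisions, text=None, tag=None):
    """Filter on description and tags only, leaving the diffs undecoded."""
    matches = []
    for rev in revisions:
        if text is not None and text.lower() not in rev.description.lower():
            continue
        if tag is not None and tag not in rev.tags:
            continue
        matches.append(rev)
    return matches
```

Decoding with json.loads yields only plain dicts, lists, strings, numbers, booleans and None, which is the property that makes JSON a safer storage format here than a native object serializer.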

Revision history for this message
Mahara Bot (dev-mahara) wrote : A patch has been submitted for review

Patch for "master" branch: https://reviews.mahara.org/4415

Changed in mahara:
assignee: Nigel Cunningham (nigelc-g) → nobody
status: In Progress → Confirmed