Comment 4 for bug 547902

Nigel-catalyst (nigel-catalyst) wrote :

There are some really good points on both sides here.

On one side are the users, who want to be able to copy embed codes from their favourite sites, and have it all just work. These people, whether they know it or not, also don't want their sites to be hacked.

The other side has the system administrators, who don't want their users and servers being compromised. But they bear the brunt of complaints when users can't embed stuff from whatever the latest fashionable site is :)

The current Mahara solution prudently favours the system administrators - nobody wants their servers hacked. But it's clear that this is problematic for users.

Given that we can't control the method by which users want to embed content (pasting in embed codes from other sites), we have to try to find a solution that allows users to embed stuff from "trusted" sites. While the system administrators may have their own definitions of "trusted", the reality is that they shouldn't all be forced to keep up with the latest web 2.0 innovations just to give users this ability.

With all that in mind, I came up with a potential solution: how about if we solve this problem in the same way that antivirus products do?

The list of viruses is ever-changing, yet this doesn't cause system admins to keep up with all of them. Most sysadmins have never even _heard_ of 99.9% of viruses, because the antivirus companies distribute signature files that enable the system to recognise viruses. And we could use this same technique with the HTML filtering software, so that most sites will work on all Mahara installations without much effort from anyone.

What I am proposing is that mahara.org distributes, separately from Mahara, a download containing HTMLPurifier filters for embedded content. HTMLPurifier is the name of the filtering software we are using, and it supports filters that allow certain content through. If mahara.org provided this as a download, then people could just grab the new filters from time to time, which would keep them up to date and allow new sites to be added.

This download could be maintained by the community, with people suggesting new filters, others writing them, and mahara.org distributing them. Hell, I'm sure that the Moodle community would be interested too, and maybe even the HTMLPurifier community. If this idea is workable, we could even spin this off to the FOSS community at large to do (or at least the part that cares about HTMLPurifier :).

I think this solution has these benefits:

* Users will be able to embed content from all manner of interesting social networking sites, while malicious content will still be blocked in most cases.
* Sysadmins will just have to download the signature file from time to time. Assuming they trust mahara.org to do a good job, this is all they'll have to do to keep their users happy. We can provide an extra level of control for sysadmins to disable filters too, if they don't want to allow e.g. flickr for whatever reason.
* The community at large will have to make the filters, but as they say, many hands make light work :). After a while, the list will be quite comprehensive, with not much input required to keep it up to date.
* We (the Mahara core team) will have to validate filters for security, but this is a reasonably simple job. We'll have to provide the downloads, but we can script away a bunch of the work so that when someone writes a filter, we have little to do but vet it for security before it's available in the next signature update.
* Eventually, we could arrange for Mahara itself to grab the signature file on a weekly basis, using MNET or https for security :)
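
To make the idea a bit more concrete, here is a rough sketch of what a "signature file" and the check against it might look like. It's in Python purely for illustration; it is not HTMLPurifier's filter API, and the file format, the field names (`filters`, `name`, `host`, `path`), the URL patterns and both function names are assumptions I've made up for the example. It also shows the sysadmin override from the list above, where individual filters can be switched off locally.

```python
import json
import re
from urllib.parse import urlparse

# Hypothetical signature file as it might be downloaded from mahara.org.
# The format and the patterns are illustrative assumptions, not a real spec.
SIGNATURES_JSON = """
{
  "version": 1,
  "filters": [
    {"name": "youtube", "host": "www.youtube.com",
     "path": "/embed/[A-Za-z0-9_-]+"},
    {"name": "flickr", "host": "www.flickr.com",
     "path": "/photos/[A-Za-z0-9@_-]+/[0-9]+/player"}
  ]
}
"""


def load_filters(signature_text, disabled=()):
    """Parse the signature file, skipping filters the sysadmin has disabled."""
    data = json.loads(signature_text)
    return [
        (f["name"], f["host"], re.compile(f["path"]))
        for f in data["filters"]
        if f["name"] not in disabled
    ]


def is_trusted_embed(url, filters):
    """Return True only if the embed URL matches one of the active filters."""
    parts = urlparse(url)
    # Reject javascript:, data: and anything else that isn't plain web content.
    if parts.scheme not in ("http", "https"):
        return False
    return any(
        parts.hostname == host and pattern.fullmatch(parts.path) is not None
        for _name, host, pattern in filters
    )


if __name__ == "__main__":
    everything = load_filters(SIGNATURES_JSON)
    no_flickr = load_filters(SIGNATURES_JSON, disabled={"flickr"})
    for url in (
        "https://www.youtube.com/embed/abc123",
        "https://www.flickr.com/photos/someone/12345/player",
        "https://evil.example/embed/abc123",
    ):
        print(url, is_trusted_embed(url, everything), is_trusted_embed(url, no_flickr))
```

The point of the sketch is that the list of trusted sites is just data: updating it means swapping the file, and a sysadmin who doesn't want e.g. flickr only has to name it in `disabled` rather than edit the file itself.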

Does anyone have any thoughts about this? Good idea/bad idea?