Mime-type recognition fails for file uploads when Mahara behind reverse proxy

Bug #591373 reported by Damien Tougas
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Mahara
Fix Released
High
Richard Mansfield
1.2
Fix Released
High
Richard Mansfield

Bug Description

When Mahara is hosted behind a reverse proxy, files uploaded are not recognized as their proper types. For example: if I go to "My Portfolio" and "My Files" to upload an gzip file, the file will upload properly but will not be properly recognized as being an archive. The result is that the generic file icon is used and the "Unzip" link is not displayed.

After looking at the code, I can see that Mahara is depending on the browser to send the proper mime-type when the file is uploaded. In our case, this value is being set to a generic value (application/octet-stream) when passed through our reverse proxy. The code that is responsible for handling new archive files (from archive/file/lib.php) is here:

    public static function new_archive($path, $data) {
        if (!isset($data->filetype)) {
            return self::archive_from_file($path, $data);
        }
        $descriptions = self::archive_file_descriptions();
        $validtypes = self::archive_mime_types();
        if (isset($validtypes[$data->filetype])) {
            return self::archive_from_file($path, $data, $descriptions[$validtypes[$data->filetype]->description]);
        }
        return false;
    }

This code checks for a mime-type first (sent by the browser) and if it finds one, assumes that it is correct. Perhaps it would be better if the code tried to determine the mime-type using the file extension in the case where the browser sends "application/octet-stream". Like this:

    public static function new_archive($path, $data) {
        if (!isset($data->filetype)) {
            return self::archive_from_file($path, $data);
        }
        if ($data->filetype == "application/octet-stream" &&
            $result = self::archive_from_file($path, $data)) {
            return $result;
        }
        $descriptions = self::archive_file_descriptions();
        $validtypes = self::archive_mime_types();
        if (isset($validtypes[$data->filetype])) {
            return self::archive_from_file($path, $data, $descriptions[$validtypes[$data->filetype]->description]);
        }
        return false;
    }

When I make this change, everything works properly for me.

Changed in mahara:
assignee: nobody → Richard Mansfield (richard-mansfield)
milestone: none → 1.3.0
Changed in mahara:
importance: Undecided → High
status: New → In Progress
Changed in mahara:
status: In Progress → Fix Committed
Revision history for this message
Richard Mansfield (richard-mansfield) wrote :

Hi Damien,

Your patch looks fine, I haven't actually applied it though, because I needed to fix this in another way, and I thought it best to avoid messing around trying to open up every single file with zip_read, etc. to see if it's a zip/tar archive.

There are lots of complaints on the forums about the mime_content_type function not being available, and that's what we were using to guess mimetypes when the browser was giving us nothing. I've hopefully done a better job this time by testing every uploaded file first with PHP's fileinfo, then using mime_content_type, and then if that fails, making a guess based on the filename extension.

So I'm hoping now (current head of master and 1.2.x) that archives, images, and other kinds of files will be recognised more reliably as soon as they're uploaded, even when your reverse proxy is being unkind.

Richard

Revision history for this message
Damien Tougas (damien-tougas) wrote :

Hello Richard,

Excellent! I think your fix sounds like a good idea.

When using mime_content_type, I still think that you should attempt to determine the file type based on file extension if the mime-type is application/octet-stream. That will be the default mime-type used if a browser/proxy doesn't know what the file is.

Or perhaps I am misunderstanding your implementation...

Revision history for this message
Richard Mansfield (richard-mansfield) wrote :

Damien:

I think mime_content_type would still have had a go at it even after the browser says it's application/octet-stream.

But I've added your suggestion now here too: http://gitorious.org/mahara/mahara/commit/851d6d3a44dac9445a96867c3120d91977d95fa7

So after this we'll be trying three times: browser/proxy, followed by mime_content_type, followed by filename extension.

Changed in mahara:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.