Some file formats allow extracting when file is not an archive

Bug #803229 reported by Chris Wharton
This bug affects 1 person
Affects: Mahara
Status: Won't Fix
Importance: Low
Assigned to: Unassigned

Bug Description

Mahara Version affected:
v1.3, 1.4a

When uploading an archive of .docx and some other file types (e.g. Adobe Captivate), Mahara recognises the file type incorrectly. Mahara allows these files to be unzipped, which breaks them into their individual components (e.g. XML metadata).

When downloading the uploaded file, the browser recognises the file as a zip file, but the OS recognises the file correctly as a .docx.

Chris Wharton (y-chrisw) wrote :
François Marier (fmarier) wrote :

Interesting, I wonder if that's a bug in the PHP mimetype detection code...

The funny thing is that it's not entirely wrong, these files are zip files after all :)
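
To illustrate why the detection is "not entirely wrong": OOXML files (docx, pptx, xlsx) are zip containers that always include a `[Content_Types].xml` entry, so a zip check alone cannot tell them apart from ordinary archives. The following is only a minimal Python sketch of that idea, not Mahara's actual (PHP) detection code; the names `classify_upload` and `make_zip` are hypothetical and exist just for this example.

```python
import io
import zipfile

# OOXML packages (docx, pptx, xlsx) always contain this entry at the top level.
OOXML_MARKER = "[Content_Types].xml"


def classify_upload(data: bytes) -> str:
    """Return 'not-zip', 'ooxml', or 'plain-zip' for the given file bytes."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        return "not-zip"
    with zipfile.ZipFile(buffer) as archive:
        names = set(archive.namelist())
    # A zip container with the OOXML marker is a document, not a user archive.
    if OOXML_MARKER in names:
        return "ooxml"
    return "plain-zip"


def make_zip(entries: dict) -> bytes:
    """Build an in-memory zip from a {name: content} mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in entries.items():
            archive.writestr(name, content)
    return buffer.getvalue()


if __name__ == "__main__":
    docx_like = make_zip({OOXML_MARKER: "<Types/>", "word/document.xml": "<doc/>"})
    print(classify_upload(docx_like))
    print(classify_upload(make_zip({"notes.txt": "hello"})))
```

A check like this would only cover OOXML; other zip-based formats (Captivate, Keynote) would need their own markers, which this sketch does not attempt to list.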

Changed in mahara:
status: New → Triaged
importance: Undecided → Low
Kristina Hoeppner (kris-hoeppner) wrote :

There is a forum post at http://mahara.org/interaction/forum/topic.php?id=3791#post16704 that alludes to the same problem, though the poster writes about Adobe Captivate files (which I guess are also XML plus content files) and MS Office files.

François Marier (fmarier) wrote :

Also mentioned on this forum thread: http://mahara.org/interaction/forum/topic.php?id=4155

Ruslan Kabalin (rkabalin) wrote :

I can't reproduce it. Docx is treated as a normal document file. This might be related to the PHP version or OS. I am on Debian Squeeze, Apache 2.2.16-6+squeeze4, PHP 5.3.3-7+squeeze3.

Melissa Draper (melissa)
Changed in mahara:
assignee: nobody → Melissa Draper (melissa)
Melissa Draper (melissa) wrote :

I can't reproduce this, however I suspect this patch may fix the issue.

Can someone who can reproduce the issue please test it?

Kristina Hoeppner (kris-hoeppner) wrote :

Melissa, this only happens when you upload a zip file which includes docx etc. files and then extract it: the extracted docx is then seen as an archive. It does not happen when you upload a docx on its own.

Changed in mahara:
status: Triaged → In Progress
Melissa Draper (melissa)
Changed in mahara:
assignee: Melissa Draper (melissa) → nobody
status: In Progress → Confirmed
awillson (awillson) wrote :

If this bug is still open...

This problem seems to be related to multiple issues:

Mahara (as of 1.6.2) Issue:
1. The docx, pptx, xlsx mime types are not included in the 'artefact_file_mime_types' table in the database.

Web Server Issue:
2. The web server (Apache or IIS) may need to be configured to send appropriate mime types for docx, pptx, xlsx.

Web Browser Issue:
3. IE8 is known to have issues with docx, pptx, xlsx mime types.

awillson (awillson) wrote :

Sorry for the debian distribution addition-deletion. I am using debian, but I'm not using the debian packaged Mahara software. So I don't know if it affects the default debian package.

no longer affects: debian
awillson (awillson) wrote :

I think I found the file that needs updating...

htdocs/artefact/file/filetypes.xml

and add the following for MS OOXML compatibility...

    <filetype>
        <description>docx</description>
        <mimetypes>
            <mimetype>application/vnd.openxmlformats-officedocument.wordprocessingml.document</mimetype>
        </mimetypes>
    </filetype>
    <filetype>
        <description>pptx</description>
        <mimetypes>
            <mimetype>application/vnd.openxmlformats-officedocument.presentationml.presentation</mimetype>
        </mimetypes>
    </filetype>
    <filetype>
        <description>xlsx</description>
        <mimetypes>
            <mimetype>application/vnd.openxmlformats-officedocument.spreadsheetml.sheet</mimetype>
        </mimetypes>
    </filetype>
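
Before pasting entries like these into filetypes.xml, it may help to check that they are well-formed and map each description to the intended mimetype. The following is a rough Python sketch under stated assumptions: the synthetic `<root>` wrapper and the function name `mimetypes_by_description` are hypothetical, and the real file's enclosing element is not shown in this report.

```python
import xml.etree.ElementTree as ET

# The three entries proposed above, copied as a multi-element fragment.
FRAGMENT = """
<filetype>
    <description>docx</description>
    <mimetypes>
        <mimetype>application/vnd.openxmlformats-officedocument.wordprocessingml.document</mimetype>
    </mimetypes>
</filetype>
<filetype>
    <description>pptx</description>
    <mimetypes>
        <mimetype>application/vnd.openxmlformats-officedocument.presentationml.presentation</mimetype>
    </mimetypes>
</filetype>
<filetype>
    <description>xlsx</description>
    <mimetypes>
        <mimetype>application/vnd.openxmlformats-officedocument.spreadsheetml.sheet</mimetype>
    </mimetypes>
</filetype>
"""


def mimetypes_by_description(fragment: str) -> dict:
    """Parse <filetype> entries and return {description: [mimetype, ...]}."""
    # Wrap in a synthetic root: the fragment has several top-level elements,
    # which ElementTree cannot parse on its own.
    root = ET.fromstring("<root>" + fragment + "</root>")
    result = {}
    for filetype in root.iter("filetype"):
        description = filetype.findtext("description", default="").strip()
        result[description] = [
            m.text.strip() for m in filetype.iter("mimetype") if m.text
        ]
    return result


if __name__ == "__main__":
    for description, types in mimetypes_by_description(FRAGMENT).items():
        print(description, types)
```

A parse error here would point to a malformed entry before it reaches the Mahara install.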

Kristina Hoeppner (kris-hoeppner) wrote :

I can't replicate this on Mahara 1.6 on my local machine.

Kristina Hoeppner (kris-hoeppner) wrote :

related: bug #1193158

Robert Lyon (robertl-9) wrote :

This problem might also be related to:

https://bugs.launchpad.net/mahara/+bug/1249166
https://bugs.launchpad.net/mahara/+bug/1249858

In those cases the mimetype is incorrectly set in the browser's registry, and Mahara then does not correctly sniff out the true mimetype.

Rebecca Blundell (rjb-dev) wrote :

I can't replicate the problem. I tried uploading docx files as part of a .zip and .tar archive which also contained other file formats, and in both cases Mahara treated the docx correctly as a single file. (Using master branch of Mahara, Firefox Quantum 58.0, php 7.1.13)

Kristina Hoeppner (kris-hoeppner) wrote :

It's still a problem with .key files (Mac Keynote files), for example.

Kristina Hoeppner (kris-hoeppner) wrote :

Setting this to "Won't fix" because it's a minor issue and someone trying to extract the file in their own files area would see that it would alter the files.

Changed in mahara:
status: Confirmed → Won't Fix