.ts files always get recognized as application/x-linguist and never as video/mp2t (mpeg transport stream)

Bug #502642 reported by Oliver Joos on 2010-01-03
This bug affects 6 people
Affects: Status, Importance, Assigned to, Milestone
shared-mime-info: Fix Released, Medium
Baltix: Undecided, Unassigned
shared-mime-info (Ubuntu): Low, Unassigned

Bug Description

Binary package hint: shared-mime-info

I checked Hardy, Jaunty and Karmic: *.ts files are always mapped to application/x-linguist by /usr/share/mime/packages/freedesktop.org.xml. But there are several Linux-based settop boxes that record videos in *.ts-files. So if application/x-linguist is obsolete then it should simply be replaced by video/mpeg. Otherwise the mime framework has to distinguish such files by inspecting their headers. If someone knows more about x-linguist files, please add a comment!

The problem was already reported upstream two months ago (see freedesktop.org above). But for 10.04 LTS we should fix it with a patch if it is not fixed upstream by then.

As a workaround copy the attached file to ~/.local/share/mime/packages/mpeg-ts.xml and update your mime-database with:
$ update-mime-database ~/.local/share/mime

To trigger the regeneration of failed thumbnails just delete them:
$ rm ~/.thumbnails/fail/gnome-thumbnail-factory/*.png
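
The attached workaround file is not reproduced in this report. As a rough idea of what such a user-level package could contain, here is a minimal Python sketch that generates one. It is NOT the attachment: the magic rule is the one quoted further down in this thread, while the comment text, the priority value and the helper name build_mpeg_ts_package are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace used by shared-mime-info package files.
NS = "http://www.freedesktop.org/standards/shared-mime-info"


def build_mpeg_ts_package() -> str:
    """Return XML text for a user-level MIME package (illustrative only)."""
    ET.register_namespace("", NS)
    root = ET.Element(f"{{{NS}}}mime-info")
    mime = ET.SubElement(root, f"{{{NS}}}mime-type", {"type": "video/mp2t"})
    ET.SubElement(mime, f"{{{NS}}}comment").text = "MPEG-2 transport stream"
    # Assumption: priority 50 and the rule quoted later in this thread.
    magic = ET.SubElement(mime, f"{{{NS}}}magic", {"priority": "50"})
    ET.SubElement(
        magic,
        f"{{{NS}}}match",
        {"type": "big32", "offset": "0", "value": "0x47400010", "mask": "0xff4000df"},
    )
    ET.SubElement(mime, f"{{{NS}}}glob", {"pattern": "*.ts"})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the XML; redirect it to ~/.local/share/mime/packages/mpeg-ts.xml
    # and then run update-mime-database as described above.
    print(build_mpeg_ts_package())
```

The function only builds a string, so nothing is written until you redirect the output yourself.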

ProblemType: Bug
Architecture: i386
Date: Sun Jan 3 17:47:29 2010
DistroRelease: Ubuntu 9.10
InstallationMedia: Ubuntu 9.10 "Karmic Koala" - Release i386 (20091028.5)
Package: shared-mime-info 0.70-0ubuntu1
ProcEnviron:
 PATH=(custom, user)
 LANG=de_CH.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.31-16.53-generic
SourcePackage: shared-mime-info
Uname: Linux 2.6.31-16-generic i686

This bug probably covers the lack of support for AVCHD HD video file extensions - as generated by many HD camcorders.

.MTS and .M2TS file extensions should resolve to mime type video/MP2T

Oliver Joos (oliver-joos) wrote :
Sebastien Bacher (seb128) wrote :

Thank you for your bug report

Changed in shared-mime-info (Ubuntu):
importance: Undecided → Low
status: New → Triaged
Changed in shared-mime-info:
status: Unknown → Confirmed

*** Bug 24690 has been marked as a duplicate of this bug. ***

Patch for video/mp2t attached to Bug 24690.

I just tested the new patch attached to http://bugs.freedesktop.org/show_bug.cgi?id=24690#c1

My .ts-Files indeed all start with 0x47. But after the patch is applied, all files that start with 0x47 are "video/mp2t", even if their filenames do not end with ".ts". I have here .ts-files that come together with meta files with same name but different suffix. Some of these meta files incidentally start with 0x47, and hence are treated as videos too.

I'd prefer that files are "video/mp2t" only if they start with 0x47 *AND* their filename ends with '.ts' (or '.mp2t' or '.mpts' or '.mpg' or other suffix that might be common for mpeg transport stream files).

Can you test with this pattern please?

<match type="big32" mask="0xff5fff1f" value="0x47400010" offset="0"/>

Regards, Daniel

This does not match my .ts. They were recorded on a digital cable receiver and start like this:

00000000 47 40 1f 10 00 7f 80 24 00 00 01 00 00 00 00 27 |G@.....$.......'|
...

If I remove some more bits from your mask then it matches my .ts too:
(I did this without knowledge about mpeg stream standards!)

  <match type="big32" mask="0xff5fe01f" value="0x47400010" offset="0"/>
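
To see why the first mask rejects the header shown in the hex dump while the relaxed one accepts it, the comparison can be reproduced in a few lines of Python. This is only a sketch of how a big32 rule with a mask is evaluated (the first four bytes are read as a big-endian integer, ANDed with the mask, then compared to the value); the function name big32_matches is made up for this example.

```python
import struct


def big32_matches(header: bytes, value: int, mask: int) -> bool:
    """Apply a big32 value/mask rule to the first four bytes of a file."""
    if len(header) < 4:
        return False
    (word,) = struct.unpack(">I", header[:4])  # big-endian 32-bit integer
    return (word & mask) == value


# First four bytes of the hex dump quoted above: 47 40 1f 10
sample = bytes.fromhex("47401f10")

print(big32_matches(sample, 0x47400010, 0xFF5FFF1F))  # False
print(big32_matches(sample, 0x47400010, 0xFF5FE01F))  # True
```

The third byte (0x1f) survives the mask byte 0xff and breaks the comparison, but is cleared by the mask byte 0xe0.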

Changed in shared-mime-info:
status: Confirmed → Invalid

24690 has been marked as duplicate of 14276.

Changed in shared-mime-info:
status: Invalid → Unknown
Changed in shared-mime-info:
status: Unknown → Confirmed

Created an attachment (id=33748)
Adds support for video/mp2t (MPEG-2 transport streams)

I have here the patch to recognize video/mp2t (MPEG-2 transport streams). It's an enhanced version of https://bugs.freedesktop.org/show_bug.cgi?id=24690#c1. Naming comes from http://www.rfc-editor.org/rfc/rfc3555.txt (page 38). The glob patterns come from http://en.wikipedia.org/wiki/MPEG_transport_stream and are all not yet mentioned in the mime database, except *.ts and *.m2t.

To control matching order of magic patterns one may use "magic priority=", but glob patterns match in order of their appearance in the database file, so:

*.ts is currently recognized as application/x-linguist but without any magic pattern. If we position video/mp2t AFTER it, we can control with the magic pattern if a *.ts becomes video/mp2t or not. I carefully chose the magic pattern to be exactly as strict as necessary to match all valid video/mp2t.

*.m2t is currently recognized as video/mpeg which is wrong according to Wikipedia and e.g. https://bugs.launchpad.net/bugs/89543. So I chose to position video/mp2t BEFORE video/mpeg.

Maybe this would be better:

+ <_comment>MPEG-2 Transport Stream</_comment>
+ <acronym>MPEG-2 TS</acronym>
+ <expanded-acronym>Moving Picture Experts Group 2 Transport Stream</expanded-acronym>

It is not wise to use the same acronym for two different MIME types.

Created an attachment (id=33767)
Adds support for video/mp2t (MPEG-2 transport streams) + unique acronym

@Stanislav: Thanks! I don't know where these acronyms are used, but it is definitely better to keep them unique. So I replaced <acronym> and <expanded-acronym>.

The lower-case letters in <_comment> were chosen intentionally, because other mime types are like this. Probably it does not matter - this attribute is later mapped to a translated string anyway.

Changed in shared-mime-info (Ubuntu):
status: Triaged → In Progress
Changed in shared-mime-info:
status: Confirmed → In Progress

I submitted the final patch upstream a few weeks ago. No feedback yet.

Please help testing / committing it for Lucid LTS.

tags: added: patch
Sebastien Bacher (seb128) wrote :

thank you for your work there, how did you determine the magic to use? I've no files in that format to test that on so I'm a bit reluctant to upload the change in lucid, it seems it's also not something requested a lot seeing that the bug is the only opened on launchpad so far about the issue

Changed in shared-mime-info (Ubuntu):
status: In Progress → Triaged
Sebastien Bacher (seb128) wrote :

do you have any example of such video you could share there?

Oliver Joos (oliver-joos) wrote :

You can read upstream how magic was determined. Basically I took the proposal of Stanislav and adapted it after reading about mpeg headers on rfc3555 and Wikipedia (details on https://bugs.freedesktop.org/show_bug.cgi?id=14276#c7)

The attached sample file is a video/mp2t, recorded by a common settop box. Stanislav's files seem to have slightly different header bits. This file is ok for testing. But for the magic I still prefer to follow the standards for mpeg transport streams as closely as possible.

BUT...!! As a final test I applied the patch and then checked the mimetype of almost every file in my installed Karmic. This includes the sources and testfiles of shared-mime-info. (warning: this took hours!):

  $ sudo find /bin /boot /etc /home /lib /opt /root /sbin /tmp /usr /var \
     -xdev -type f -exec mimetype \{\} \; | grep video/mp2t

Unfortunately I found that files beginning with 'GIMP' also match the magic and gimp resource files are not yet in the mime-database (so tweaking priorities is not a solution). Therefore I will try to tighten the magic to exclude those files. Please stay tuned...

Created an attachment (id=34447)
Patch to add support for video/mp2t (MPEG-2 transport streams) + unique acronym + unscrambled only

Patch has been improved. I added another 2 bits to the magic mask to only match if a transport stream is not scrambled. This is ok because scrambled streams could not be handled by common desktop apps anyway. With this mask GIMP resources don't match anymore. (They start with the 4 bytes "GIMP" which is a valid MP2T header)
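
To illustrate why "GIMP" looks like a transport stream header and why checking the scrambling bits excludes it, here is a small Python sketch that splits four bytes into the fields of an MPEG-2 TS packet header. The field layout is taken from the MPEG-2 systems standard, not from this thread, and reading the "2 added bits" as the transport_scrambling_control field is an assumption based on the description above. The names TsHeader and parse_ts_header are hypothetical.

```python
from typing import NamedTuple


class TsHeader(NamedTuple):
    sync_byte: int
    payload_unit_start: bool
    pid: int
    scrambling_control: int
    adaptation_field_control: int
    continuity_counter: int


def parse_ts_header(data: bytes) -> TsHeader:
    """Split the first four bytes of a TS packet into header fields."""
    if len(data) < 4:
        raise ValueError("need at least 4 bytes")
    b0, b1, b2, b3 = data[:4]
    return TsHeader(
        sync_byte=b0,                            # 0x47 for a valid packet
        payload_unit_start=bool(b1 & 0x40),
        pid=((b1 & 0x1F) << 8) | b2,             # 13-bit packet identifier
        scrambling_control=(b3 >> 6) & 0x3,      # 0 means not scrambled
        adaptation_field_control=(b3 >> 4) & 0x3,
        continuity_counter=b3 & 0x0F,
    )


gimp = parse_ts_header(b"GIMP")
print(hex(gimp.sync_byte), gimp.scrambling_control)       # 0x47 1

stream = parse_ts_header(bytes.fromhex("47401f10"))
print(hex(stream.sync_byte), stream.scrambling_control)   # 0x47 0
```

Both inputs start with the sync byte 0x47, but only the real stream has scrambling bits equal to zero.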

If you own MP2T files (from a camcorder or a settop box), please test if they match and leave a comment! (Feel free to ask me how to apply the patch to your system)

For details about successful tests see comment 7+8 on https://bugs.launchpad.net/ubuntu/+source/shared-mime-info/+bug/502642

@Sebastien: thanks for asking about the magic. This made me run the hard test described in my last comment. Now I have fixed the last false positives.

To accomplish this I added another 2 bits to the magic mask to only match if a transport stream is not scrambled. This is no big problem because scrambled streams could not be handled by common apps anyway. With this mask GIMP resources don't match anymore. In an installed Karmic not a single file matches. So now I consider it to be safe to include it in Lucid (LTS).

Furthermore I removed the glob pattern *.m2t from video/mpeg. According to http://en.wikipedia.org/wiki/MPEG_transport_stream and bug #89543 it should match video/mp2t.

A sample transport stream file is attached to the last comment.
[Correction of last comment: I took the proposal of Daniel Leidert, not Stanislav - see upstream]

summary: .ts files always get recognized as application/x-linguist and never as
- video/mpeg (transport stream)
+ video/mp2t (mpeg transport stream)
Sebastien Bacher (seb128) wrote :

Thank you for your work on the bug

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package shared-mime-info - 0.71-1ubuntu1

---------------
shared-mime-info (0.71-1ubuntu1) lucid; urgency=low

  * debian/patches/151_video_mp2t_definition.patch:
    - change to add the mpeg2 transport streams definition, thanks Oliver Joos
      (lp: #502642)
  * debian/patches/rosetta_translations_update.patch:
    - updated translations using a rosetta export (lp: #542084)
 -- Sebastien Bacher <email address hidden> Fri, 26 Mar 2010 00:37:27 +0100

Changed in shared-mime-info (Ubuntu):
status: Triaged → Fix Released

Please add this patch to shared-mime-info. Otherwise other bugs are blocked:
https://bugzilla.gnome.org/show_bug.cgi?id=614422
http://trac.videolan.org/vlc/ticket/3485
http://bugs.xine-project.org/show_bug.cgi?id=341

The patch has been successfully applied for Ubuntu 10.04 (see URL in bug details)

Current git now has magic for linguist files, could you check if that causes any needs for changes to this patch?

A (small as sanely possible) test file would be welcome for the test suite, do you happen to have one available?

I checked the new magic for the linguist type in current git and saw no problem. The new magic alone does not prevent video/mp2t files from matching the linguist type. But it does not interfere with my patch either. Together with my patch, video/mp2t files are still recognized reliably.

I did not check if true application/x-linguist files are affected by my patch, because I have none. But I am pretty sure that this is not the case, because the magic in my patch is quite restrictive. And to verify this I pulled the source of shared-mime-info (including all its test-files) into an Ubuntu Live-CD and then I did:

  $ sudo find /bin /boot /etc /home /lib /opt /root /sbin /tmp /usr /var \
     -xdev -type f -exec mimetype \{\} \; | grep video/mp2t

This showed no false positives.
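
A much faster (but weaker) variant of this check is to test only the magic rule against the first four bytes of every file, without asking the mimetype tool at all. The following Python sketch does that. It assumes the final rule quoted later in this thread (value 0x47400010, mask 0xff4000df), ignores glob patterns and priorities, and does not stay on one filesystem like -xdev does. The function name find_mp2t_candidates is made up for this example.

```python
import os
import struct

# Assumption: the final rule quoted later in this thread.
VALUE = 0x47400010
MASK = 0xFF4000DF


def find_mp2t_candidates(root: str) -> list[str]:
    """Return regular files under root whose first 4 bytes match the rule."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, FIFOs, device nodes and similar.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    head = fh.read(4)
            except OSError:
                continue  # unreadable file, ignore
            if len(head) < 4:
                continue
            (word,) = struct.unpack(">I", head)
            if (word & MASK) == VALUE:
                hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    for hit in find_mp2t_candidates("/usr/share"):
        print(hit)
```

Any path printed by this scan that is not a transport stream would be a false-positive candidate for the magic rule alone.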

For a small video/mp2t test file see https://bugs.launchpad.net/ubuntu/+source/shared-mime-info/+bug/502642/comments/7.

There's a linguist test file in git. Adding this to tests/list (at end of the video files section) and the mp2t test file in place makes the test suite pass:

test-mp2t.ts video/mp2t xo

The test file seems to be taken from copyrighted material. It's just a couple of seconds but I suppose it'd be better to have a non-copyrighted one there if someone can produce one e.g. with a camcorder.

Created an attachment (id=34779)
non-copyrighted mpeg2 transport stream recorded by a settopbox

Thanks for testing. I agree that only non-copyrighted streams should be used as test files. I recorded a video/mp2t snippet with black picture and silent audio. This attachment should be ok to be included in shared-mime-info.

Created an attachment (id=35043)
non-copyrighted mpeg2 transport stream recorded by a settopbox

The previous attachment was sub-optimal. It shows a black image sequence but is in fact non-decrypted content. Now I found a test channel with a nice test pattern (color stripes without any letters or logos). Therefore I submit a new attachment which is IMHO well-suited for inclusion into the source tree of shared-mime-info.

commit e88b56dd9836c36124baf69d88ae22be8ef776b1
Author: Oliver Joos <email address hidden>
Date: Fri May 21 20:06:31 2010 +0300

    Add video/mp2t.

    http://bugs.freedesktop.org/show_bug.cgi?id=14276

On an up-to-date 10.04 LTS system (which has shared-mime-info 0.71-1ubuntu2) I can still see a strange phenomenon:

*.ts files on a local hard disk are recognized fine (MPEG-2 transport stream).
*.ts files on a network share (SMB on a XP host) are still recognized as "application/x-linguist".

What is going on here?

Oliver Joos (oliver-joos) wrote :

@stefan: I guess for files on gvfs-mounted volumes only the filename extension is checked. My remote volumes are all mounted through NFS, which is handled by the kernel (not gvfs) and is fully transparent to the mime-type framework. So the header bytes are checked remotely too.

Unfortunately video/mp2t and application/x-linguist both match the extension *.ts, only have different header bytes. My patch for video/mp2t places its rule next to video/mpeg, which is AFTER application/x-linguist. That's why application/x-linguist has priority as long as header bytes do not count. (I did not want to break existing associations with my patch ;-)
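
The ordering effect described above can be modelled with a toy resolver in Python. This is a deliberately simplified model of the behaviour reported in this thread, not the real algorithm of shared-mime-info, glib or gvfs: if header bytes are available, the first glob candidate whose magic matches wins; if only the filename is known, the first glob candidate in database order wins. The rule list and the name guess_type are assumptions for illustration.

```python
import fnmatch
from typing import Optional

# (mime type, glob pattern, optional (value, mask) big32 rule), in database order.
RULES = [
    ("application/x-linguist", "*.ts", None),
    ("video/mp2t", "*.ts", (0x47400010, 0xFF4000DF)),
]


def guess_type(filename: str, header: Optional[bytes] = None, rules=RULES) -> Optional[str]:
    """Toy model: magic decides among glob candidates, else the first glob wins."""
    candidates = [rule for rule in rules if fnmatch.fnmatch(filename, rule[1])]
    if header is not None and len(header) >= 4:
        word = int.from_bytes(header[:4], "big")
        for mime, _glob, magic in candidates:
            if magic is not None and (word & magic[1]) == magic[0]:
                return mime
    return candidates[0][0] if candidates else None


ts_header = bytes.fromhex("47401f10")

# Local or NFS file: header bytes are available.
print(guess_type("recording.ts", ts_header))                  # video/mp2t
# Filename only, as reported for gvfs-mounted shares.
print(guess_type("recording.ts"))                             # application/x-linguist
# Same lookup after moving the linguist entry to the end.
print(guess_type("recording.ts", rules=list(reversed(RULES))))  # video/mp2t
```

In this model, moving the linguist entry behind video/mp2t changes the filename-only result, which is the idea behind the manual edit suggested next.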

To help fixing your issue, please try to modify your /usr/share/mime/packages/freedesktop.org.xml. Move the following section of text/vnd.trolltech.linguist (alias application/x-linguist) down to the end, just before the </mime-info>:
  <mime-type type="text/vnd.trolltech.linguist">
    ...
    <glob pattern="*.ts"/>
    <alias type="application/x-linguist"/>
  </mime-type>
Then rebuild your mime-database (don't care if there are "unknown media types"):
$ sudo update-mime-database /usr/share/mime

Are your files on the network share now "MPEG-2 transport stream"?

@Oliver: That works, thanks.

I would like to note that my /usr/share/mime/packages/freedesktop.org.xml has no mime-type "text/vnd.trolltech.linguist" with an alias "application/x-linguist", just a plain entry for a mime-type "application/x-linguist". Moving that XML element down towards the end did the trick.

Linguist is an application to translate Qt applications' GUIs? I don't intend to use that ever, so this workaround won't break anything for me.

Anyway, I would rather prefer to force gvfs to analyze the file header. I'll try to find out if it can be done.

Just for the record, with gvfs-1.6.1 it's not possible to include the file content in the mime-type evaluation process.
Only the filename is passed to the relevant libglib function.

Kẏra (thekyriarchy) wrote :

.mts and .m2ts are also extensions for the mpeg transport stream

Oliver Joos (oliver-joos) wrote :

@Stefan: thank you for checking this! I will ask the guys who develop Qt Linguist and will then open a new bug to move its rule to the end of freedesktop.org.xml. Using remote linguist files might be far less common than video streams. Apropos "text/vnd.trolltech.linguist": I use shared-mime-info 0.70-0ubuntu1 of Ubuntu Karmic. Perhaps Linguist's mime-type got renamed.

@Danny: Right. Common extensions are .m2t .m2ts .ts .mts .cpi .clpi .mpl .mpls .bdm .bdmv. My settop-box generates .ts and I wouldn't want to rename each recording just to see it as thumbnail. ;-)

Sebastien Bacher (seb128) wrote :

not sure whether the new comments describe a new bug, but new issues should be discussed on a new bug and not a closed one

Changed in shared-mime-info:
importance: Unknown → Medium
status: In Progress → Fix Released
Changed in shared-mime-info:
importance: Medium → Unknown
Changed in shared-mime-info:
importance: Unknown → Medium
Michael (michaeljt) wrote :

I did a quick search and found this bug after I discovered that upgrading to Natty causes all my Linguist files to be opened by Banshee by default. Stefan, you mentioned that gvfs 1.6.1 can't handle looking at file contents to determine the mime type. Does that mean that later versions can?

Oliver Joos (oliver-joos) wrote :

@Michael: Are local files affected too, or only remote files (samba, ftp, ...)? The new rule for *.ts in freedesktop.org.xml has been tuned very carefully to only match legal mpeg transport streams, at least locally:

<match value="0x47400010" type="big32" offset="0" mask="0xff4000df"/>

Michael (michaeljt) wrote :

These are local files. I was also rather surprised, as I found that rule and though I think it could match a text file by chance, it didn't seem to match the first .ts file I checked. (After finding this bug I assumed that the rule was being ignored.) Any suggestions as to what I could try here?

Oliver Joos (oliver-joos) wrote :

Please open a new bug report, check and describe your problem as exactly as possible (OS version, local vs remote files, ...) and attach some typical Linguist files. Then I will try to reproduce it.

@Michael: I checked gvfs-1.6.4 (used in Maverick) and 1.8.0 (Natty). This part of the code has not changed since 1.6.1. However, this only affects files on remote locations.

Michael (michaeljt) wrote :

Oliver, sorry that I only just got round to this - created bug #782285.

Mantas Kriaučiūnas (mantas) wrote :

I still get Video .ts files recognised as text/vnd.trolltech.linguist in up to date Ubuntu 11.04 - see attached file (this file can be viewed with VLC video player)
These .ts video files are recorded with VLC video player from DVB-T broadcast (H264 AVC format)

Jochen Fahrner (jofa) wrote :

I'm on 12.04 with shared-mime-info 1.0-0ubuntu4.1 and still suffering from this bug. The attachment in comment #3 works for me. I don't understand why this is still an issue after 4 years.

Lewis Balentine (lewis-s) wrote :

"I don't understand why this is still an issue after 4 years."
I make it closer to five years.
Apparently freedesktop.org is not interested in updating their files.

Michael (michaeljt) wrote :

I hate to say it, but as part of a small paid open source development team with a very active community bug tracker for our own product, which is not supposed to prevent us from working on the features we need to implement (for the paying customers who make the open product possible) I can very well understand why this is still an issue after four, five or whatever years.

Oliver Joos (oliver-joos) wrote :

I reported this bug and it was fixed in March 2010. If you see new problems then please feel free to open a new bug report! Describe the exact situation: does it fail with local ts-files, or only on network shares (samba)? Are the files encrypted or playable with your videoplayer? Do you use Nautilus or another file browser? Add version numbers of everything, etc.
And leave a comment here with a link to your new bug report to help affected people follow you.

To solve it yourself try my final patch from comment #20.

I just checked my Linux Mint 17.1 which is based on Ubuntu 14.04 LTS: its file browser Nemo (= Nautilus clone) does recognize my ts-files correctly as "MPEG-2 transport stream", locally and remote through NFS. I have no samba network shares. My ts-files were recorded by my settop-box and are not encrypted.

simon place (psiplace) wrote :

for the record; ubuntu 16.04 LTS shows ".ts2" binary files as "text/vnd.trolltech.linguist"
