Cannot save .jpg from web page

Bug #1585774 reported by Bob Harvey
This bug affects 1 person
Affects                           Status   Importance  Assigned to  Milestone
ubuntu-download-manager (Ubuntu)  New      Undecided   Unassigned
webbrowser-app (Ubuntu)           Invalid  Undecided   Unassigned

Bug Description

Bear with me on this, it's a bit of a voyage. There is this /thing/ I do when creating publicity for a web site that works fine in Firefox, Chrome, Opera, and Vivaldi on a desktop; works fine in Firefox, Chrome, Safari, and Dolphin on IOS; and works with Internet or Chrome on Android. It doesn't work in our browser on the M10 tablet under OTA10.1.

1. Go to e.g.
http://www.geograph.org.uk/stamp.php?id=2612549
(there are nearly 5 million photos, I chose a nice one for you to play with)

2. Press 'get stamped image'
you get an instruction to right-click and save as, and a watermarked image to do that to.

3. Our version of that is long-press, and download.
so do that.

What gets downloaded is not the .jpg but appears to be a fragment of HTML, or the URL.

That is /not/ what happens on any other browser on any other platform. Elsewhere I get the .jpg.

Bob Harvey (bobharvey) wrote :

Now, my understanding of *nix is that file types are not identified by extension but by 'magic' from the first sector, and these don't seem to be .jpg files

Bob Harvey (bobharvey) wrote :

Belay that
The developer at Geograph, having read this, has made a change so the problem is bypassed.

I have asked him for details so that I can document what it is that other browsers could infer but we needed made explicit.

Olivier Tilloy (osomon) wrote :

I can confirm that this works as expected (tested on arale running rc-proposed).

@Bob: I’m going to mark the bug invalid, but I’m very interested in hearing what the developer at Geograph did on his end to make it work. This might be something that will affect other websites, and that we can ease in the browser.

Changed in webbrowser-app (Ubuntu):
status: New → Invalid
Bob Harvey (bobharvey) wrote :

I got this from the developer, the inestimable Barry Hunter:
"I added a
Content-Disposition: inline; filename="$filename.jpg"
header, which tells the browser a sensible filename for the file. Which does invalidate the reproduction steps in the bug report!

As you say, other browsers were able to deduce their own filename (i.e. realise it's a JPEG) and save as that, so in theory such a header is not needed.

(I did double check, it's producing valid JPEGs, they have the proper magic bytes :)"

Bob Harvey (bobharvey) wrote :

Thanks, Olivier, I hope that answers all!

Olivier Tilloy (osomon) wrote :

Thanks Bob, that answers it indeed.
I believe what happened is that the ubuntu download manager issued a request that looked like this:

  GET /stamp.php?id=2612549&font=&style=&weight=&gravity=South&pointsize= HTTP/1.1
  Host: t0.geograph.org.uk

Since it didn’t get any hint as to what the filename should be, it used the filename of the resource URL, i.e. "stamp.php". Now with the content-disposition header it knows what the suggested filename should be, and it uses it instead.
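The fallback described above can be sketched in a few lines of Python. This is only an illustration, not the actual ubuntu-download-manager code: the function name `suggested_filename`, the `"download"` default, and the `example.jpg` value are all made up for the example.

```python
from email.message import Message
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def suggested_filename(url, content_disposition=None):
    """Pick a save-as name: header hint first, then the URL's last path segment."""
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()  # reads the filename="..." parameter, if any
        if name:
            return PurePosixPath(name).name  # drop any directory part
    # No usable hint: fall back to the resource name in the URL path.
    return PurePosixPath(urlsplit(url).path).name or "download"


url = ("http://t0.geograph.org.uk/stamp.php"
       "?id=2612549&font=&style=&weight=&gravity=South&pointsize=")
print(suggested_filename(url))                                    # stamp.php
print(suggested_filename(url, 'inline; filename="example.jpg"'))  # example.jpg
```

Without the header, the only name available is the script name from the URL, which is how a valid JPEG ends up saved as "stamp.php".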

Do you still have the old copies of the files named "stamp.php*" in your Downloads folder? I’d be curious to compare them to the actual JPG file.

Bob Harvey (bobharvey) wrote :

Greetings!
> Do you still have the old copies of the files named "stamp.php*"
Almost certainly, but I am at work at the moment. I will forward a couple of them later.

Bob

Olivier Tilloy (osomon) wrote :

I can confirm that both files you just attached are valid JPG pictures, you can try renaming them with a jpg extension and opening them with any image viewer.

So the issue really was the missing filename hint, nothing more.

Bob Harvey (bobharvey) wrote :

>renaming them with a jpg extension
Yes, I did the same on windows

>So the issue really was the missing filename hint
Or, looking at it another way, the issue was really the browser needing a file name hint to determine the file type, rather than using the file "magic".

After all, every /other/ browser on every /other/ platform I have used had worked out the files were jpegs, and treated them as such, without needing a hint at all.

I've been using *nix since the days of BSD and even Xenix, and spent years on Solaris. None of those systems made assumptions about file types from the file extension. I rather thought that was one of Windows's weaknesses, rather than something to be adopted.

Anyroad, thanks for your time and I am sure you will think about all this...

I am just glad I can use the new tablet to save and re-use images.

Olivier Tilloy (osomon) wrote :

Yes, this is a known limitation of the current architecture: the content hub relies on the advertised mime type to determine which applications can receive the file. If no explicit mime type is provided by the server, it tries to infer it from the filename.
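As a rough illustration of that lookup order (advertised type first, filename second), here is a minimal Python sketch. It is not the content hub's code; `effective_mime_type` is a hypothetical name, and the standard `mimetypes` module stands in for whatever extension table the real component uses.

```python
import mimetypes


def effective_mime_type(advertised, filename):
    """Use the server's Content-Type if present, else guess from the filename."""
    if advertised:
        # Strip parameters such as "; charset=..." and normalise case.
        return advertised.split(";", 1)[0].strip().lower()
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed  # None when the extension is not recognised


print(effective_mime_type("image/jpeg", "stamp.php"))  # image/jpeg
print(effective_mime_type(None, "photo.jpg"))          # image/jpeg
```

With no advertised type and a name like "stamp.php", the guess comes from the ".php" extension, so the file is unlikely to be offered to image applications even though its contents are a JPEG.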

In the case of downloading inside the browser, the advertised mime type (or lack thereof) shouldn’t matter, but if no filename is provided, the browser still needs to assign one. Checking the first few bytes of the downloaded file would help in determining the real mime type, but inferring an extension from it is not necessarily trivial.

I’m adding an ubuntu-download-manager task to the bug report, in case we can improve the situation in that component.

Out of curiosity, before the developer at Geograph applied that fix, what filename did other browsers suggest for the image, in their "Save As" dialog?

Bob Harvey (bobharvey) wrote :

I just checked the Windows based browsers and in the current configuration (with the hint) I get things like "geograph-4967963-by-Richard-Hoare.jpg".

I have looked back at the older ones and found the same sort of thing, it looked very familiar

On IOS things don't have file names I can see, so I can't help you there - I will do the job on IOS and attach the result in a minute to find out.

Bob Harvey (bobharvey) wrote :

Can't determine if the last example was from before the change at Geograph, so I have gone back another week, and this one was /definitely/ downloaded before the change, on IOS.

Bob Harvey (bobharvey) wrote :

Hmm. Just called "image.jpg" when sent from IOS. I just mailed that second one to myself from the IOS cameraroll, and it came in an email with the name IMG_3747.jpg

I have just found a backup directory of images saved from Firefox on Windows from earlier in the year, and they have names like "IMG_0910.JPG" so I was wrong about the "geograph-4967963-by-Richard-Hoare.jpg" format - that seems to have belonged to another method of downloading, the non-stamped one.

Looks like Firefox was allocating sequential download names in the same way that IOS has been.

Olivier Tilloy (osomon) wrote :

Ok, that confirms that the only missing bit in our architecture is making UDM infer the mime type and the file extension from the actual contents of the downloaded file.
Thanks Bob!
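One way the missing piece could look, sketched in Python under stated assumptions: the signature table is deliberately tiny, the names `sniff` and `name_from_contents` are hypothetical, and nothing here reflects how UDM is actually implemented. It checks the leading "magic" bytes of the file and adjusts the extension to match.

```python
import os

# (magic prefix, MIME type, preferred extension); a deliberately tiny table.
SIGNATURES = [
    (b"\xff\xd8\xff", "image/jpeg", ".jpg"),
    (b"\x89PNG\r\n\x1a\n", "image/png", ".png"),
    (b"GIF87a", "image/gif", ".gif"),
    (b"GIF89a", "image/gif", ".gif"),
]

# Extensions already acceptable for each type, so "photo.jpeg" is left alone.
ACCEPTED = {
    "image/jpeg": {".jpg", ".jpeg"},
    "image/png": {".png"},
    "image/gif": {".gif"},
}


def sniff(head):
    """Return (mime, extension) for the leading bytes, or (None, None)."""
    for magic, mime, ext in SIGNATURES:
        if head.startswith(magic):
            return mime, ext
    return None, None


def name_from_contents(filename, head):
    """Swap in an extension that matches the sniffed contents, if known."""
    mime, ext = sniff(head)
    if mime is None:
        return filename  # unknown contents: keep the name as-is
    root, current = os.path.splitext(filename)
    if current.lower() in ACCEPTED[mime]:
        return filename  # extension already matches the contents
    return root + ext


print(name_from_contents("stamp.php", b"\xff\xd8\xff\xe0\x00\x10JFIF"))  # stamp.jpg
```

Replacing the extension (rather than appending, which would give "stamp.php.jpg") is a design choice made for this sketch; either would let the file be recognised as an image by name.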

Bob Harvey (bobharvey) wrote :

Is this residual bug related to bug #1500742 ?

Olivier Tilloy (osomon) wrote :

No, that’s a different issue.
