Locally saved webpages not displaying correctly

Bug #350407 reported by James Hurley
This bug affects 1 person
Affects            Status      Importance   Assigned to   Milestone
Mozilla Firefox    Confirmed   Unknown
firefox (Ubuntu)   Triaged     Low          Unassigned

Bug Description

WORKAROUND:
Firefox Scrapbook Addon: https://addons.mozilla.org/en-US/firefox/addon/427

----------------------------------------------------------------------------------

Firefox Information: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.8) Gecko/2009032711 Ubuntu/8.04 (hardy) Firefox/3.0.8

Web pages that have been saved locally are not displaying correctly. To see this problem run the following steps:

1). Go to the jwebunit website ( http://jwebunit.sourceforge.net/ ). I suggest saving a page from this website as a test, since it provides a very good example of the problem.

2). Right click on the main web page and select "Save As..."

3). Save the web page somewhere on the local drive. On Ubuntu there is an option in the save dialog specifying how to save the web page. Make sure that "Web page, complete" is selected.

4). Navigate to the location where the web page was stored and attempt to load the saved page back into Firefox.

At this point the web page that is loaded from the hard disk should display very differently than the real web page. Styles and elements of the original web page should be missing. I also tested this same process on Windows XP SP2 (using firefox 3.0.8) and I see the exact same behavior. This seems to eliminate the problem being caused by local browser settings (since I see the same thing under Windows and Linux on two separate machines with totally different preferences). Furthermore I was able to run steps 1 - 4 using Opera 9.64 and this produces a saved web page that displays identically to the web page navigated to on the internet.

...Additionally I just tried loading the webpage saved locally using opera and this loads into firefox correctly. So the problem appears to be with the actual web page save using firefox and not the fact that the page is saved locally.

ProblemType: Bug
Architecture: i386
Date: Sat Mar 28 11:36:15 2009
DistroRelease: Ubuntu 8.04
Package: firefox-3.0 3.0.8+nobinonly-0ubuntu0.8.04.2
PackageArchitecture: i386
ProcEnviron:
 PATH=/home/username/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: firefox-3.0
Uname: Linux 2.6.24-23-generic i686

Update - Looks like this bug report may have gone to the wrong place. I was expecting this to be submitted against Firefox instead of Ubuntu... Apparently selecting Help --> Report a Problem inside the Firefox web browser sends bug reports to Ubuntu.

-------------

Ok, please disregard this bug report; I've got this submitted to the correct place now.

Tags: apport-bug
Revision history for this message
In , Bzbarsky (bzbarsky) wrote :

To adam. This is not a duplicate, I think, but bug 115107 needs to get fixed
first.

Revision history for this message
In , Jmdesp (jmdesp) wrote :

Another page with the same problem :
http://webnouveau.net/

Revision history for this message
In , Adamlock (adamlock) wrote :

.

Revision history for this message
In , Alfonso (amla70) wrote :

*** Bug 162108 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Felix Miata (mrmazda) wrote :

Hasn't the future arrived by now? How hard would this be to fix? Can we please
have some attention to this soon?

Revision history for this message
In , Felix Miata (mrmazda) wrote :

*** Bug 202737 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Bugzilla-accessibleinter (bugzilla-accessibleinter) wrote :

*** Bug 223406 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Olivier Cahagne (wolruf) wrote :

*** Bug 224586 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Bugzilla-accessibleinter (bugzilla-accessibleinter) wrote :

*** Bug 225009 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Bugzilla-spray (bugzilla-spray) wrote :

*** Bug 235791 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Bugzilla-accessibleinter (bugzilla-accessibleinter) wrote :

*** Bug 236069 has been marked as a duplicate of this bug. ***

Revision history for this message
In , RickB (rick-777) wrote :

Also Save Page As does not flatten the url() items in the CSS file(s).

In the HTML, an image element is correctly flattened, e.g.
    <img src="http://www.somewhere.com/logo.png">
becomes
    <img src="saveasfolder/logo.png">

In CSS, images should also be flattened - but are not. E.g.
    body { background-image: url(http://www.somewhere.com/background.png); }
should become
    body { background-image: url(saveasfolder/background.png); }

The same applies for the other syntactic variants of the same thing (i.e. with
or without ' quote marks, with or without the url() lexical token).

Rick :-)
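
The flattening described above can be sketched in a few lines of Python. This is only an illustration of the expected rewrite, not Firefox code: the function name `flatten_css_urls` and the folder name are assumptions, and a regular expression is used for brevity even though it does not handle CSS comments, escapes, or every syntactic variant.

```python
import posixpath
import re
from urllib.parse import urlparse

# Matches url(...) with optional single or double quotes around the target.
URL_TOKEN = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)

def flatten_css_urls(css_text, folder):
    """Point every absolute http(s) url() at a same-named file in `folder`."""
    def rewrite(match):
        quote, target = match.group(1), match.group(2)
        parsed = urlparse(target)
        if parsed.scheme not in ("http", "https"):
            return match.group(0)  # leave relative and data: URLs untouched
        name = posixpath.basename(parsed.path)
        if not name:
            return match.group(0)  # no file name to map to, keep as-is
        return f"url({quote}{folder}/{name}{quote})"
    return URL_TOKEN.sub(rewrite, css_text)

css = "body { background-image: url(http://www.somewhere.com/background.png); }"
print(flatten_css_urls(css, "saveasfolder"))
# body { background-image: url(saveasfolder/background.png); }
```

A real fix would also have to download each referenced file and cope with name collisions, which this sketch ignores.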

Revision history for this message
In , Timeless-bemail (timeless-bemail) wrote :

rick: that's bug 115107, which oddly enough is listed in the dependencies. in
the future, please at least check dependencies before you comment about something
which you think might be related. ideally you'd do a normal bug search....

Revision history for this message
In , Bugzilla-spray (bugzilla-spray) wrote :

*** Bug 237106 has been marked as a duplicate of this bug. ***

Revision history for this message
In , phi1ipp (phi1ipp) wrote :

This page is an example of the problem:

http://weblogs.mozillazine.org/hyatt/

In that case, it is exacerbated by the fact that a JavaScript script selects the
css file to use (after user action). These alternate css files are not saved.

Revision history for this message
In , Tom-b52 (tom-b52) wrote :

similar to not saving CSS images:
CSS not fixed up by webbrowserpersist (background images not saved)
http://bugzilla.mozilla.org/show_bug.cgi?id=115107

there is a work-in-progress patch there, but it hasn't been rolled into the nightly
build as of last month (May 2004)

Revision history for this message
In , Bugzilla-spray (bugzilla-spray) wrote :

*** Bug 252392 has been marked as a duplicate of this bug. ***

Revision history for this message
In , A-geek (a-geek) wrote :

please see also http://bugzilla.mozilla.org/show_bug.cgi?id=115107#c67

voting for both bugs

Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040803
MultiZilla/1.6.4.0b Mnenhy/0.6.0.104

Revision history for this message
In , Bugzilla-accessibleinter (bugzilla-accessibleinter) wrote :

*** Bug 263600 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Anonym25712 (anonym25712) wrote :

Sorry for spamming, but would this be really hard to fix?

It's really annoying when you want to save a page to:
 - Open the file
 - Find where the css file is located
 - Open the css file and save it
 - Modify the page to use the local version of the css file

'Save page' does this perfectly when the css file is not @imported, so is it so
hard to include the support for the @import tag? It seems so, since this bug was
opened more than 2.5 years ago...

Revision history for this message
In , Bugzilla-accessibleinter (bugzilla-accessibleinter) wrote :

*** Bug 267662 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Bugzilla-accessibleinter (bugzilla-accessibleinter) wrote :

*** Bug 271626 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Bugzilla-accessibleinter (bugzilla-accessibleinter) wrote :

*** Bug 273091 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Gavin Sharp (gavin-sharp) wrote :

*** Bug 278895 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Ion Freeman (ionfreeman) wrote :

Sorry for rereporting this bug, but it didn't come up in response to my query. Anyway, it is nigh on
three years old, and is simply a matter of resolving the @import tags. Can you fix it?

Revision history for this message
In , Timeless-bemail (timeless-bemail) wrote :

you can fix it.

Revision history for this message
In , Ostgote (ostgote) wrote :
Revision history for this message
In , Jerfa (jerfa) wrote :

*** Bug 281478 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Gavin Sharp (gavin-sharp) wrote :

Please don't set blocking flags, other than to request them (?).

Revision history for this message
In , Simon-annear (simon-annear) wrote :

*** Bug 287525 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Joshbirnbaum-mozil (joshbirnbaum-mozil) wrote :

*** Bug 294724 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Matti-mversen (matti-mversen) wrote :

*** Bug 297180 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Philringnalda (philringnalda) wrote :

*** Bug 309632 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Martijn-martijn (martijn-martijn) wrote :

*** Bug 309737 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Bugzilla-spray (bugzilla-spray) wrote :

*** Bug 314665 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Brendy-gmail (brendy-gmail) wrote :

I am putting up a $25 bounty for this bug. This is very basic functionality from an end user's standpoint -- something that people should expect to just work. The average joe isn't going to be able to manually download and modify the css files.

Anybody else want to contribute to this?

PS: If anybody does fix it, email me your paypal information.

Revision history for this message
In , Christian Biesinger (cbiesinger) wrote :

*** Bug 321349 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Cignangulo (cignangulo) wrote :

*** Bug 326131 has been marked as a duplicate of this bug. ***

Revision history for this message
In , warhammerkid (perl-programmer) wrote :

Here's the problem that I've found, and correct me if I'm wrong because I'm not intimately familiar with the code:

At the moment, nsWebBrowserPersist.cpp uses a Tree Walker (line 1581) to go through each and every "tag set" (node) and find references to external objects. Since the CSS is not parsed into the DOM, the Tree Walker doesn't find the images and style sheets that the CSS refers to. Therefore, the style sheets also need to be parsed, or the targets of @import and url() would need to be collected as well.
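
The gap described here can be reproduced with a toy model in Python. The sketch below is not nsWebBrowserPersist; it is a hypothetical `AttributeWalker` built on the standard `html.parser` module that, like an element-only walk, looks at `src` and `href` attributes and never inspects the text of a style sheet.

```python
from html.parser import HTMLParser

class AttributeWalker(HTMLParser):
    """Collects only src/href attribute values, as an element-only walk would."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.found.append(value)

page = """
<html><head>
<style>@import url("extra.css"); body { background: url(bg.png); }</style>
</head><body><img src="logo.png"></body></html>
"""

walker = AttributeWalker()
walker.feed(page)
print(walker.found)  # ['logo.png']
```

Only `logo.png` is reported; `extra.css` and `bg.png` live inside CSS text, so a walk over elements and attributes alone never sees them.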

Revision history for this message
In , Uriber (uriber) wrote :

*** Bug 337114 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Polidobj (polidobj) wrote :

*** Bug 343627 has been marked as a duplicate of this bug. ***

Revision history for this message
In , warhammerkid (perl-programmer) wrote :

For those of you still interested, I've created an extension to serve as a temporary fix to the problem. < https://addons.mozilla.org/firefox/2925/ >
Until I or someone else has the time to re-write nsWebBrowserPersist however, this is the best solution.

Revision history for this message
In , Philringnalda (philringnalda) wrote :

*** Bug 355366 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Ski (ski) wrote :

I'll throw $25 in to that bounty. It's the kind of thing that we need to make "just work".

(In reply to comment #36)
> I am putting up a $25 bounty for this bug. This is very basic functionality
> from an end user's standpoint -- something that people should expect to just
> work. The average joe isn't going to be able to manually download and modify
> the css files.
>
> Anybody else want to contribute to this?
>
> PS: If anybody does fix it, email me your paypal information.
>

Revision history for this message
In , Ria-klaassen (ria-klaassen) wrote :

*** Bug 370152 has been marked as a duplicate of this bug. ***

Revision history for this message
In , John Vandenberg (jayvdb) wrote :

I tried to use the "Save Complete" extension mentioned in comment 42, but it garbled the web page on disk in FF 2.0.0.3. I have found that ScrapBook (https://addons.mozilla.org/en-US/firefox/addon/427) works well, and "Mozilla Archive Format" (https://addons.mozilla.org/en-US/firefox/addon/212) looks interesting for Firefox 1.5.x users.

Revision history for this message
In , warhammerkid (perl-programmer) wrote :

The issue (and I don't know where I saw this comment in the first place) is that, for this problem to be corrected, either the CSS parser needs to be rewritten to allow its usage by nsWebBrowserPersist for properly parsing and replacing urls in stylesheets, or a simplified CSS parser must be written for nsWebBrowserPersist, which would result in duplicated code. The best solution seems to be a full rewrite of the CSS parser to allow proper search-and-replace for urls in stylesheets, although this will take a lot of work. If someone is willing to mentor me, I have done enough research that I am pretty sure I know what needs to be done.

As a side note, the issue in comment 46 with the "Save Complete" extension has been fixed, and although it has problems with a specific type of @import rule, it is a lot better than the save functionality provided by nsWebBrowserPersist. The extension can be found at <https://addons.mozilla.org/en-US/firefox/addon/4723>.

Revision history for this message
In , Nickolay Ponomarev (asqueella) wrote :

Stephen: see bug 115107 comment 88 (that suggestion also applies to this bug - the suggested fixup interface could collect the URLs it fixes up and pass them to StoreURI, thus causing the URIs referenced from @import to be saved). I actually half-implemented that suggestion and it seemed to work fine before it fell off my plate :(

I can help you with the easier things, as time permits, if you take that approach.

Revision history for this message
In , warhammerkid (perl-programmer) wrote :

The issue that I have with working through the DOM is that, currently, anything modified by Firefox before being saved is "corrected". If the page had html like <a href=http://www.google.com>google.com</a>, the address is now enclosed in quotes. Although this is perfectly acceptable for those simply trying to save a page for later, it is not acceptable for web developers, who would prefer that nothing was changed in their code before being saved. This is why I prefer the, albeit less effective, method of using regular expressions. The URLs can be easily collected through the DOM interfaces (see code in extension mentioned in comment 47 for example). However, replacing only those URLs that need to be replaced is difficult, as a regular expression cannot ever beat a full parser for sheer flexibility.
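
The "correction" effect mentioned above is a general property of parsing markup and writing it back out, and can be shown with Python's standard `html.parser`. The `Reserializer` class and `reserialize` helper below are hypothetical names for this sketch; they do not reflect how Firefox serializes a document.

```python
from html import escape
from html.parser import HTMLParser

class Reserializer(HTMLParser):
    """Parses markup and writes it back out with every attribute value quoted."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{escape(value)}"'
            for name, value in attrs
        )
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

def reserialize(source):
    parser = Reserializer()
    parser.feed(source)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.out)

original = "<a href=http://www.google.com>google.com</a>"
print(reserialize(original))
```

The output has the `href` value in double quotes even though the input had none, which is the kind of change to the author's source that the comment objects to.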

Revision history for this message
In , Nickolay Ponomarev (asqueella) wrote :

Web developers have lots of ways to get the exact source of the page.. that's not an issue in my opinion. Regular expressions are a no-go, since you can't parse HTML properly using them, plus we already have a working parser - why would we write and maintain another, regexp-based one? For a single questionable web developer use-case?

Revision history for this message
In , Bugzilla-dolphinling (bugzilla-dolphinling) wrote :

*** Bug 388565 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Jruderman (jruderman) wrote :

*** Bug 398839 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Seno-aiko (seno-aiko) wrote :

*** Bug 428046 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Polidobj (polidobj) wrote :

*** Bug 431605 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Webdesign-semnanweb (webdesign-semnanweb) wrote :

Firefox 3 has been released, but this bug has not been fixed. Look at the date the bug was opened. Why?

Revision history for this message
In , Mike Frysinger (vapier) wrote :

the original *simple* example may be "questionable", but there is plenty of recursive usage out there where one .css imports others via @import() ... i'm not suggesting that creating a dedicated parser is a good idea, just that the idea of writing off css @import() as questionable is ridiculous.
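
Recursive imports of this kind mean a saver has to follow @import rules transitively. A minimal Python sketch of that traversal follows; the `collect_imports` function, the `fetch` callback, and the `example.test` URLs are all assumptions made for illustration, and the regular expression ignores media queries and CSS comments.

```python
import re
from urllib.parse import urljoin

# Matches @import "x.css"; and @import url(x.css); with optional quotes.
IMPORT_RULE = re.compile(
    r"""@import\s+(?:url\(\s*)?(['"]?)([^'"()\s;]+)\1""", re.IGNORECASE
)

def collect_imports(url, fetch, seen=None):
    """Return every stylesheet URL reachable from `url` through @import rules."""
    seen = set() if seen is None else seen
    if url in seen:
        return seen  # already visited; this also guards against import cycles
    seen.add(url)
    for match in IMPORT_RULE.finditer(fetch(url)):
        collect_imports(urljoin(url, match.group(2)), fetch, seen)
    return seen

# Hypothetical in-memory "site" standing in for network fetches.
SHEETS = {
    "http://example.test/css/main.css":
        '@import "layout.css"; @import url(theme/dark.css);',
    "http://example.test/css/layout.css":
        "@import url('main.css'); body { margin: 0; }",
    "http://example.test/css/theme/dark.css":
        "body { color: white; }",
}

start = "http://example.test/css/main.css"
for sheet in sorted(collect_imports(start, SHEETS.__getitem__)):
    print(sheet)
```

All three sheets are found, and the cycle between `main.css` and `layout.css` terminates because visited URLs are tracked.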

Revision history for this message
In , Geompse (geompse) wrote :

I'm not getting your point. You expect your saved page to look like the original. Actually Firefox is not saving all files. There is no discussion here.

Revision history for this message
In , Mike Frysinger (vapier) wrote :

my comment was a specific reply to Comment #50. take a chill pill.

Revision history for this message
In , Ludo-aelbrecht+mozilla (ludo-aelbrecht+mozilla) wrote :

#57: then why does it save the images and re-write the paths to them? What's wrong with expecting to be able to have a saved page look the same as the original one? Or am I misunderstanding your comment?

Revision history for this message
In , Devotip-tiscalinet (devotip-tiscalinet) wrote :

For most people the aim is to make a local copy of the page as viewed; some would strongly prefer a copy of the file structure.

Both are unhappy because a part of the css is lost.

Revision history for this message
In , Geompse (geompse) wrote :

Nope.
People who want the files from the server will need the css file linked in the @import directive.
People who need a copy (a sort of screenshot) will want the same styles to be applied, including the imported ones.

The two goals do not conflict. Yes, the @import url should be rewritten locally for this to work properly.

(ps->Mike Frysinger=>i'm not so fluent, what's a chill pill? bad medicine probably...)

Revision history for this message
James Hurley (hurleyjames) wrote :
description: updated
description: updated
description: updated
Revision history for this message
In , Kevin Brosnan (kbrosnan) wrote :

*** Bug 498472 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Jo-hermans (jo-hermans) wrote :

*** Bug 524301 has been marked as a duplicate of this bug. ***

Revision history for this message
Monkey (monkey-libre) wrote :

I've assigned this bug to the firefox-3.0 package.

Thank You for making Ubuntu better.

affects: ubuntu → firefox-3.0 (Ubuntu)
Revision history for this message
John Vivirito (gnomefreak) wrote :

We are in the process of updating all Firefox versions to un-versioned packages for all Ubuntu versions

affects: firefox-3.0 (Ubuntu) → firefox (Ubuntu)
Revision history for this message
Draycen DeCator (ddecator) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. We are sorry that we do not always have the capacity to look at all reported bugs in a timely manner. I have been able to confirm that this bug still occurs with a daily build of Firefox 3.7. A report has also been found upstream, so I will be linking this report with that one.

Changed in firefox (Ubuntu):
status: New → Incomplete
Revision history for this message
Micah Gersten (micahg) wrote :

Thank you for your bug report. This bug has been reported to the developers of the software. You can track it and make comments at: https://bugzilla.mozilla.org/show_bug.cgi?id=126309
I'm going to mark it as Triaged and wait for upstream to work on this. Thanks for taking the time to make Ubuntu better! Please report any other issues you may find.

Changed in firefox (Ubuntu):
importance: Undecided → Low
status: Incomplete → Triaged
description: updated
Changed in firefox:
status: Unknown → Confirmed
Changed in firefox:
importance: Unknown → Medium
Revision history for this message
In , Vincent-moz (vincent-moz) wrote :

Until this bug is fixed, couldn't Firefox display a warning saying that some CSS files are missing when doing the "Save Page As" / "Web Page, complete"? (I don't know the Firefox internals, but since the page is already displayed, I suppose that Firefox could have this information quite easily.)

This would spare the user from wondering why a saved page looks wrong when opened, because Firefox doesn't display any error when a CSS file is not found.
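
A check of the kind suggested here can be approximated outside the browser. The Python sketch below scans the style sheets in a saved folder and reports local references that do not exist on disk. It is an assumption-laden illustration: `missing_references` is a hypothetical helper, the folder layout is made up, and nothing here reflects Firefox internals.

```python
import re
import tempfile
from pathlib import Path

# Captures the target of url(...) or of a quoted @import "..." rule.
REFERENCE = re.compile(
    r"""url\(\s*['"]?([^'")\s]+)|@import\s+['"]([^'"]+)""", re.IGNORECASE
)

def missing_references(folder):
    """List (stylesheet name, target) pairs whose local target is absent."""
    missing = []
    for sheet in sorted(Path(folder).glob("*.css")):
        for match in REFERENCE.finditer(sheet.read_text(encoding="utf-8")):
            target = match.group(1) or match.group(2)
            if "://" in target or target.startswith("data:"):
                continue  # remote or inline resource, nothing to check on disk
            if not (sheet.parent / target).exists():
                missing.append((sheet.name, target))
    return missing

# Demo on a throwaway folder: bg.png was saved, missing.css was not.
with tempfile.TemporaryDirectory() as tmp:
    saved = Path(tmp)
    (saved / "bg.png").write_bytes(b"")
    (saved / "style.css").write_text(
        '@import "missing.css"; body { background: url(bg.png); }',
        encoding="utf-8",
    )
    for sheet, target in missing_references(saved):
        print(f"warning: {sheet} refers to {target}, which was not saved")
```

Run against the demo folder, this prints a single warning for `missing.css` and stays silent about `bg.png`.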

Revision history for this message
In , Wingus (wingus) wrote :

Please ficks this bug. I am dying. I can't wait any longer. I want to see this bug ficked before I pass away. I want to enjoy my life. Please

Revision history for this message
In , Chrizilla (chrizilla) wrote :

@ Lil B: Others are not responsible for your life. Suggestion: Why don't you use the maff add-on ?? It not only saves an identical, faithful copy of the web page, it also saves disk space due to zip compression!

Revision history for this message
In , Geompse (geompse) wrote :

@chrizoo : You have no humor.
@lilb : I am only waiting since 2008, take the queue :)
@mozilla : This is a bug, please fix

Revision history for this message
In , Serhiy Zahoriya (xintx-ua) wrote :

A reminder: there's a $50 bounty for fixing this bug from https://bugzilla.mozilla.org/show_bug.cgi?id=126309#c36 and https://bugzilla.mozilla.org/show_bug.cgi?id=126309#c44.
Please hurry up while they are still alive. Are they?
P.S. The future is now.

Revision history for this message
In , Sjw (sjw) wrote :

Wow, no patch in 11 years for this annoying bug :o

Revision history for this message
In , Gijskruitbosch+bugs (gijskruitbosch+bugs) wrote :

*** Bug 1106261 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Geompse (geompse) wrote :

Happy New Year, bug !

Revision history for this message
In , Wingus (wingus) wrote :

Remember bug when we used to play? Let's get together again some time and work it out again, like we used to again, let's again. Merry new year to yours and of course your others. Promise, Lil B

Revision history for this message
In , Wingus (wingus) wrote :

The bounty should be adjusted for inflation, in fact I'll throw all my gold nuggets into a new river. I am going to invest in this bug instead. Promise, Lil B

Revision history for this message
In , Bugzilla-mozilla-org-d (bugzilla-mozilla-org-d) wrote :

If you are serious about the bounty, please consider putting it up at https://www.bountysource.com/issues/3508687-save-page-does-not-save-import-ed-css to streamline the process.

Changed in firefox:
importance: Medium → Unknown
Revision history for this message
In , Release-mgmt-account-bot (release-mgmt-account-bot) wrote :

The severity field for this bug is relatively low, S3. However, the bug has 34 duplicates, 113 votes and 99 CCs.
:Gijs, could you consider increasing the bug severity?

For more information, please visit [auto_nag documentation](https://wiki.mozilla.org/Release_Management/autonag#severity_underestimated.py).

Revision history for this message
In , Autonag-nomail-bot (autonag-nomail-bot) wrote :

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

Revision history for this message
In , Geompse (geompse) wrote :

(In reply to Release mgmt bot [:suhaib / :marco/ :calixte] from comment #75)
Good bot

Revision history for this message
In , Gijskruitbosch+bugs (gijskruitbosch+bugs) wrote :

*** Bug 1799127 has been marked as a duplicate of this bug. ***
