PDFs causing browser problems on arana

Bug #238572 reported by Richard H.
Affects: Document Library
Status: Fix Committed
Importance: Critical
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

This is odd. Last week we couldn't get PDF download links on a Silva page (via a DL Listing object) to work on specific PCs, namely Elisabeth's, mine and David's. The PDFs would appear to be loading, and then the machine would freeze (in IE and in Firefox). We thought it might be tied to Acrobat v6, as all three of us have the full v6 creator application installed and the browser plugin is v6 (even though we have later versions of the reader, e.g. v7 or v8, installed on the same machines). The browser always uses the v6 plugin from the full Adobe Acrobat application when you select the PDF download link.

Anyway, today we have noticed that the PDF links are now working, although there seems to be a considerable delay before the PDF displays.

Does anyone have any ideas what might be causing this? We tested using Vista with IE7 and Acrobat Reader 8 last week and the pdf links worked fine. We also tested out the links in OS-X using Firefox and no problems at all. Windows XP with Reader 8 only also behaved itself fine, yet before today they weren't working on our PCs and yet now they are!

Richard H. (richard-hewison) wrote :

I can confirm that if I download and install the latest Acrobat Reader (v8.x.x) then this overrides the browser plugin that makes v6 Professional load the pdf. Instead, the v8.x.x reader loads it and the pdf displays fine. Therefore there is an issue with v6 of Acrobat and pdf links from the DL.

Kit Blake (kitblake) wrote :

To be specific: this is an issue with v6 of Acrobat on PCs with Vista, correct? Is it a problem with pdf links from the DL, or with any pdf link?

Richard H. (richard-hewison) wrote : [Bug 238572] Re: PDFs causing browser problems on arana

This does appear to be an issue with DL Links and Acrobat v6 (in XP - we don't have many Vista machines and none of them have a v6 of Acrobat installed) when loading up PDF documents.

The very same browser plugin (which loads the pdf into the full Acrobat v6 Professional app) which has problems with DL pdfs always worked fine on other web sites with pdfs (including our current staff web site).

Now I have a v8.x.x reader installed, this now acts as the browser plugin and the pdfs load fine. It seems to be a specific issue with v6 of Acrobat but we can't be 100% certain that it's an XP only issue or not.

I will consult with David and see whether this is a big issue with us or not. We would always recommend people use the most recent Acrobat reader anyway.

(In reply to Kit Blake's comment of 11/06/2008 at 17:16.)

D Sparkes (david-sparkes) wrote :

New developments: Richard tried downloading one of the DL PDFs from home. He only has the latest version of Adobe Reader.

I think the reason the Mac has been consistently excluded from these issues is that it downloads files rather than trying to open them in the browser.

This leads me to think that Zope3 has its PDF mime type set slightly incorrectly. Where are the mime settings in Zope3?

Richard H. (richard-hewison) wrote :

Just to clarify what David said; I accessed devtrain from home and whilst the browser behaves as if it is loading the pdf, it doesn't actually display it. This is on an XP PC laptop, with only Acrobat Reader v8.x installed. Despite the v8.x reader working fine when I first installed it on my work PC (which also has v6 of the full Adobe Acrobat Professional), this is also now behaving in the same way.

As David asked previously, could this be an issue relating to MIME types and Zope 3?

Richard H. (richard-hewison) wrote :

To add a further complication to this problem, I *don't* experience any problems at all when I view the pdf from within the DL application itself. It only seems to be when viewing them via the DL Links created via the DL Listing object in Silva.

Kit Blake (kitblake) wrote :

We don't understand why this is happening, because the link is just a hyperlink, and in both cases it goes to the same resource. We do have reports from another OAI user of PDFs that are perfectly fine but visitors with the latest Acrobat plugin get a message that the PDF is corrupted. It's not.

Can you tell us what browser(s) you're using for these tests? We think it has to be a plugin problem. It's the same resource.

We'd like to ask you to try something when the browser doesn't display the PDF. Can you try to download it directly? Right-click on the link and choose "Save Link As..." or whatever your browser suggests. Then see if you can open it locally.
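
For anyone repeating this test, a saved copy can also be checked without opening it in a viewer. The following is only a rough sketch: the function name is made up for illustration, and the check (a `%PDF-` header at the start and an `%%EOF` marker near the end) catches truncated downloads or an HTML error page saved under a .pdf name, not every kind of damage.

```python
def looks_like_complete_pdf(data):
    """Rough sanity check: PDF header at the start, %%EOF marker near the end."""
    has_header = data.startswith(b"%PDF-")
    # The end-of-file marker is expected within roughly the last kilobyte.
    has_trailer = b"%%EOF" in data[-1024:]
    return has_header and has_trailer
```

It would be called on the bytes of the saved file, for example `looks_like_complete_pdf(open("uob-archiving-gw.pdf", "rb").read())`.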

Richard H. (richard-hewison) wrote :

Okay,

There is definitely something odd going on. I just tried the following (using Mozilla Firefox 2.0.0.14):

I visited http://devtrain.beds.ac.uk/isd/guides/guides/gw and selected the first DL pdf link (Archiving GroupWise items) and the pdf loaded fine. That doesn't happen very often!

I went back in the browser and I selected it a second time and again it loaded fine. I then selected the second DL pdf link (Introducing Novell GroupWise WebAccess v7.0.x) and I immediately got an 'Internal Server Error' which I definitely wasn't expecting!

I went back to the first pdf link again and this time the browser behaved as if it had loaded the pdf (showing 'uob-archiving-pdf (application/pdf object)' in the browser's title bar), but it was instead still displaying the web page with the links. The browser reports 'Stopped' in the browser status bar. I selected 'back' on the browser and visually there's no difference except that I was actually back at the page with the links.

The third pdf link on the webpage behaves in the same way (doesn't display the pdf) but the fourth link does display the pdf correctly for the first two times you select it. On the third click I get the 'Internal Server Error' again.

Please bear in mind that the server errors were not part of the problem before today, but this must be relevant somehow. The same pdf links are giving different results depending on when or how many times you select them!

I will continue to investigate and report further findings later this afternoon.
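
The "works twice, then fails" pattern described above could also be checked outside the browser. The sketch below is an assumption about how one might do that, not something used in this report: it requests the same URL several times and records the status code and Content-Type of each response. It bypasses the Acrobat plugin entirely, so it can only show server-side failures such as the 'Internal Server Error', not the plugin's display problem. The function names are hypothetical.

```python
import urllib.error
import urllib.request


def fetch_status(url):
    """Return (status code, Content-Type) for one GET, including error responses."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.status, resp.headers.get("Content-Type")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (e.g. an 'Internal Server Error') end up here.
        return err.code, err.headers.get("Content-Type")


def repeat_requests(url, attempts=5, fetch=fetch_status):
    """Request the same URL several times and collect what came back each time."""
    return [fetch(url) for _ in range(attempts)]
```

Calling `repeat_requests(...)` with one of the DL pdf URLs would return a list with one (status, content type) pair per attempt.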

(In reply to Kit Blake's comment of 19/06/2008 at 13:43.)

Richard H. (richard-hewison) wrote :

David is going to post some header information describing a number of different scenarios relevant to this problem. However, I will draw some conclusions here first:

We have placed a pdf directly into devtrain as a Silva file. It is located as a link on the very same page as the DL links that we are using to show the PDF problem. If you select the PDF from the direct link, it loads and displays perfectly every time. If you select a PDF link from the DL link, we get the problems described earlier. This is using the same server, the same Zope and Silva instance, the same browser, the same PC and the same adobe browser plug-in. The only difference is the method used to fetch the pdf and the location of the pdf itself.

So, it looks like we have a problem which is specifically related to fetching a PDF file from the OAI repository via a DL Listing object in Silva and displaying it via an Adobe Acrobat browser plugin.

If you can access the page, have a look at http://devtrain.beds.ac.uk/isd/guides/gw. The direct PDF link is displayed underneath the DL links.

D Sparkes (david-sparkes) wrote :

Below are the headers for Firefox 2 when it tries to display the pdf file. Saving the file always works. It is only when we have the acrobat reader plug-in on our browser that it has problems with the documents out of the document library.

When opening:
http://documents.beds.ac.uk/dl/training/handle/3736129497719782140/uob-archiving-gw.pdf
we get a 'stopped' message at the bottom of the browser and the headers look like this.

GET /dl/training/handle/3736129497719782140/uob-archiving-gw.pdf HTTP/1.1
Host: documents.beds.ac.uk
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://devtrain.beds.ac.uk/isd/guides/gw

HTTP/1.0 200 OK
Date: Thu, 19 Jun 2008 13:24:23 GMT
Server: Twisted/2.1.0 TwistedWeb/SVN-Trunk
Content-Length: 135386
X-Powered-By: Zope (www.zope.org), Python (www.python.org)
Accept-Ranges: bytes
Content-Type: application/pdf
X-Cache: MISS from ragno.beds.ac.uk
X-Cache-Lookup: HIT from ragno.beds.ac.uk:80
Connection: keep-alive

Another document from the main university site (that opens successfully in the Adobe Reader plug-in window in the browser):

http://www.beds.ac.uk/aboutus/qa/foi2005/gov/11legfra/1-legal-ia.pdf

GET /aboutus/qa/foi2005/gov/11legfra/1-legal-ia.pdf HTTP/1.1
Host: www.beds.ac.uk
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Range: bytes=65536-120586,65536-65537

HTTP/1.x 206 Partial Content
Date: Thu, 19 Jun 2008 13:58:49 GMT
Server: Zope/(Zope 2.8.8-final, python 2.3.6, linux2) ZServer/1.1
Content-Length: 55342
content-disposition: inline;filename=1-legal-ia.pdf
Accept-Ranges: bytes
Last-Modified: Wed, 01 Aug 2007 09:41:49 GMT
Content-Type: multipart/byteranges; boundary=127.0.0.2.1001.22317.1213883929.531.2250
X-Cache: MISS from ragno.beds.ac.uk
X-Cache-Lookup: HIT from ragno.beds.ac.uk:80
Connection: keep-alive
----------------------------------------------------------
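
The request just above carries a multi-part `Range` header, and the server answers with `206 Partial Content`. As a reading aid only, here is a small sketch (the function name is made up) that splits such a header into byte offsets so the size of each requested chunk can be seen.

```python
def parse_byte_ranges(header_value):
    """Split 'bytes=a-b,c-d' into a list of (start, end) pairs (inclusive offsets)."""
    unit, _, spec = header_value.partition("=")
    if unit.strip() != "bytes":
        raise ValueError("only the 'bytes' range unit is handled here")
    ranges = []
    for part in spec.split(","):
        start, _, end = part.strip().partition("-")
        # An empty start ('-500') or end ('500-') is legal HTTP; keep it as None.
        ranges.append((int(start) if start else None, int(end) if end else None))
    return ranges


# Header value copied from the request quoted above.
requested = parse_byte_ranges("bytes=65536-120586,65536-65537")
sizes = [end - start + 1 for start, end in requested]
```

For the header above the two parts come to 55,051 and 2 bytes. That is a little under the reported Content-Length of 55342, which is plausibly the multipart boundaries and per-part headers, although the capture does not show the body to confirm it.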

This is a successful pdf request on the same page that we added via the silva interface (below)

http://devtrain.beds.ac.uk/isd/guides/gw/uob-gw70

GET /isd/guides/gw/uob-gw70 HTTP/1.1
Host: devtrain.beds.ac.uk
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://devtrain.beds.ac.uk/isd/guides/gw
Authorization: Basic [credentials redacted]

HTTP/1.0 200 OK
Date: Thu, 19 Jun 2008 14:30:50 GMT
...


Kit Blake (kitblake) wrote :

Using Firefox (2.0.0.14 without an Adobe plugin) all of the above mentioned documents download fine.

Using Safari, which does open PDFs in the browser, but not with an Adobe plugin, I can't reproduce any of the problems either. I have clicked on all documents numerous times, and every time it appears in my browser as expected.

With a Windows machine and Explorer with Acrobat plugin, I can reproduce the problem, almost exactly as you described.

So, after more than an afternoon of investigation (2 people), we've found that there's a setting called "Allow fast web view" (this is in Adobe Reader, not the plugin, under "Internet" preferences). If you turn that off, the PDFs show up just fine. What's happening is the PDF was distilled (produced) in a 'linearized' Adobe format that allows the plugin to decide to download the file in chunks. You can see it happening in Explorer's status bar. Somewhere these chunks are not being handled properly.
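
Whether a given file was saved in the linearized ("fast web view") form can be checked from its first bytes, since the linearization dictionary with its `/Linearized` key has to sit within the first 1024 bytes of such a file. A minimal sketch, with a hypothetical function name, and only a heuristic (it does not validate the rest of the linearization data):

```python
def is_linearized(pdf_bytes):
    """Heuristic: a linearized PDF carries a /Linearized key near the very start."""
    return b"/Linearized" in pdf_bytes[:1024]
```

Running this over the DL documents would show which of them the plugin is likely to fetch in chunks.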

We could say that your users should turn that preference off. If you Google that phrase you'll find lots of hits with the same advice. But we don't like that solution and it's likely that it won't really work in your organization.

There is probably a way to solve the problem on the serving side, we just haven't figured out how. But we see lots of other applications out there with the same headache, all because of the Adobe plugin. We'll have to investigate and find how, and where (in Apache, in the module, in the application) this gets solved.

We'll need to put this on the ToDo list for the next phase as it's not a quick fix.
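
One server-side experiment that might help narrow this down is to stop advertising byte-range support, so that a client has no reason to ask for chunks. The sketch below is not the fix that was committed for this bug, and it rests on two assumptions the thread does not confirm: that the DL application can be reached through a WSGI entry point, and that the plugin respects `Accept-Ranges: none`. The wrapper name is made up.

```python
def without_byte_ranges(app):
    """Wrap a WSGI app so responses advertise 'Accept-Ranges: none'."""

    def wrapped(environ, start_response):
        # Drop any incoming Range header so the inner app always sends the full body.
        environ.pop("HTTP_RANGE", None)

        def patched_start(status, headers, exc_info=None):
            # Replace whatever Accept-Ranges value the app set with 'none'.
            kept = [(k, v) for k, v in headers if k.lower() != "accept-ranges"]
            kept.append(("Accept-Ranges", "none"))
            return start_response(status, kept, exc_info)

        return app(environ, patched_start)

    return wrapped
```

If the PDFs display reliably with range support switched off, that would point at range handling somewhere in the DL stack rather than at the files themselves.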

Kit Blake (kitblake) wrote :

Forgot to add this, the file sizes:
uob-archiving-gw.pdf 136k
uob-gw70-web.pdf 568k
uob-gw70.pdf 1000k
uob-use-gw-externally.pdf 184k
uob-gw-post-offices.pdf 76k
uob-gw70-diff-1.pdf 248k
uob-blackberry.pdf 616k

The smaller docs tend to show up fine, while the plugin tries to chunk the bigger ones, and that's failing.

Richard H. (richard-hewison) wrote :

Hi Kit,

Many thanks for all the effort put into investigating this.

We do still have one question though - on our example webpage at http://devtrain.beds.ac.uk/isd/guides/gw we have a list of 7 DL links. The third link is 'Using GroupWise 7.0.x at the University' and there is a pdf and a text version available. If you select the pdf two or three times then the PDF stops displaying as we have all now witnessed. However, after these 7 DL links there is a standard link to another copy of the very same PDF, which has been uploaded to Silva as a 'Silva file' rather than using the DL. This pdf loads and displays perfectly every time via the plugin, no matter how many times you select it. Surely this indicates a problem with the repository or the method used to fetch the file from the OAI, otherwise wouldn't this direct pdf link behave in the same way as the DL link?

(In reply to Kit Blake's comment of 20/06/2008 at 10:13.)

Kit Blake (kitblake) wrote :

Exactly, the PDF that works is in Silva and served from Silva/Zope/Apache/Squid and that combination is handling the chunking fine. The other PDF is in the DL and served from DL/Zope3/Apache/Squid and somewhere it's going wrong.
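
To compare what the two stacks send back, the response headers quoted earlier in this report can be set side by side. The helper below is an illustrative sketch with a made-up name; the header values are copied from the DL response and the main university site response above. Note the two captures are not like-for-like: one is the initial 200 response, the other a 206 answer to a Range request.

```python
def diff_headers(left, right):
    """Return {name: (left value, right value)} for headers that differ, ignoring case."""
    a = {k.lower(): v for k, v in left.items()}
    b = {k.lower(): v for k, v in right.items()}
    return {name: (a.get(name), b.get(name))
            for name in sorted(set(a) | set(b))
            if a.get(name) != b.get(name)}


# Values copied from the two responses quoted earlier in this report.
dl_response = {
    "Server": "Twisted/2.1.0 TwistedWeb/SVN-Trunk",
    "Content-Type": "application/pdf",
    "Accept-Ranges": "bytes",
}
www_response = {
    "Server": "Zope/(Zope 2.8.8-final, python 2.3.6, linux2) ZServer/1.1",
    "Content-Type": "multipart/byteranges; boundary=127.0.0.2.1001.22317.1213883929.531.2250",
    "Accept-Ranges": "bytes",
    "content-disposition": "inline;filename=1-legal-ia.pdf",
}
differences = diff_headers(dl_response, www_response)
```

Both responses advertise `Accept-Ranges: bytes`, so that header drops out of the result; the remaining differences are the server software, the content type, and the `content-disposition` header that only the Zope 2 response carries.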

Richard H. (richard-hewison) wrote :

Thanks. We're a little bit confused because your original explanation didn't mention Zope2 vs Zope3 at all in regards to this issue. So, just to clarify - this is specific to how Zope3 behaves with the Adobe plugin?

(In reply to Kit Blake's comment of 20/06/2008 at 11:50.)

Richard H. (richard-hewison) wrote :

Unfortunately switching the option in the Acrobat reader doesn't make any difference to the behaviour of the PDFs in Mozilla Firefox, and although it did appear to be making a difference in IE it soon reverted to erroring as before. :-(

This was after restarting both browsers, and then restarting the PC as well.

Kit Blake (kitblake) wrote :

We don't know what it's specific to, only that somewhere in that stack it's going wrong. Could be Zope3, could be the mod_python Apache component, we're not sure.

Strange that you're still getting errors. On the PC I used it fixed the problem immediately, without restarting anything. I see that I also turned off the Adobe Reader (Internet) preference for "Allow speculative downloading in the background".

Kit Blake (kitblake) wrote :

Here's an example page from a Google search where they recommend exactly the above (scroll to bottom):
http://www.co.napa.ca.us/common/adobe_help.asp

Richard H. (richard-hewison) wrote :

This problem with pdfs has to be fixed asap, because as it stands:

(i) pdfs load fine every time if you don't use the DL to supply the file to the viewer.

(ii) pdfs tend to lock or fail to complete loading almost all of the time via the DL when trying to supply the file to the viewer.

As you can see, this will seriously affect people's opinion of the DL as a worthwhile tool to be used on University web sites unless it is resolved.

Changed in documentlibrary:
importance: Undecided → Critical
status: New → Confirmed
Kit Blake (kitblake)
Changed in documentlibrary:
status: Confirmed → Fix Committed