TPAC: Search engines and browsers redirect endlessly on some URLs

Bug #1044132 reported by Dan Scott
This bug affects 2 people
Affects    Status   Importance  Assigned to  Milestone
Evergreen  Triaged  Undecided   Unassigned

Bug Description

* Evergreen 2.3.0 beta2-ish

I checked on Google's Webmaster Tools for our catalogue for the first time in a LONG time and it reported a significant increase in the number of errors when we cut over to TPAC.

One of the problems appears to be endless redirects that we run into when adding items to the temporary list, using the following URL:

http://laurentian.concat.ca/eg/opac/mylist/add?anchor=record_591425;record=591425

Chrome and Firefox both refuse to load the page, as it appears to result in endless redirects.

Here's what Google's Webmaster Tools "Fetch as Google" shows:
----------------------------------------------

Fetch as Google
This is how Googlebot fetched the page.
URL: http://laurentian.concat.ca/eg/opac/mylist/add?anchor=record_591425;record=591425
Date: Thursday, August 30, 2012 5:01:20 PM PDT
Googlebot Type: Web
Download Time (in milliseconds): 292
HTTP/1.1 302 Found
Date: Fri, 31 Aug 2012 00:01:20 GMT
Server: Apache/2.2.16 (Debian)
Set-Cookie: anoncache=e0d58680d5f1f6149962db43dc062002; path=/
Location: #record_591425
Cache-Control: max-age=5
Expires: Fri, 31 Aug 2012 00:01:25 GMT
Content-Length: 284
Keep-Alive: timeout=1, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved <a href="#record_591425">here</a>.</p>
<hr>
<address>Apache/2.2.16 (Debian) Server at laurentian.concat.ca Port 80</address>
</body></html>

----------------------------------------------
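A rough way to see why this loops, assuming the client resolves a relative Location value against the requested URL and drops the fragment before sending the next request (my reading of standard client behaviour, not something confirmed in this report). The helper name next_request_url is made up for illustration:

```python
from urllib.parse import urldefrag, urljoin

# URL from the report and the Location header value from the 302 response.
REQUESTED = (
    "http://laurentian.concat.ca/eg/opac/mylist/add"
    "?anchor=record_591425;record=591425"
)
LOCATION = "#record_591425"


def next_request_url(requested: str, location: str) -> str:
    """Return the URL a client would request after following the redirect.

    Assumption: the client resolves the Location value relative to the
    requested URL, then strips the fragment, because fragments are not
    sent to the server.
    """
    resolved = urljoin(requested, location)
    return urldefrag(resolved).url


# A fragment-only Location resolves back to the URL that was just requested,
# so the server would answer with the same 302 again.
print(next_request_url(REQUESTED, LOCATION) == REQUESTED)  # prints True
```

If that assumption holds, every hop asks for the same /mylist/add URL and gets the same redirect back, which matches what Chrome and Firefox report.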

To keep search engines from crawling links they have no reason to index, and to cut down on inadvertent crawler loops like this one, we can publish a robots.txt with something like:

User-agent: *
Disallow: /eg/opac/mylist/
Disallow: /eg/opac/myopac/
Disallow: /eg/opac/record/print/
Disallow: /eg/opac/record/email/

... and I believe we should probably add something like this to the default install (with appropriate notes in the docs).
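For what it's worth, the rules above can be sanity-checked offline with Python's standard urllib.robotparser. This is only a sketch: it shows how Python's parser reads the rules, not how Googlebot itself will behave, and the helper name is_crawlable and the /eg/opac/record/591425 URL are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules proposed in this report.
ROBOTS_TXT = """\
User-agent: *
Disallow: /eg/opac/mylist/
Disallow: /eg/opac/myopac/
Disallow: /eg/opac/record/print/
Disallow: /eg/opac/record/email/
"""


def is_crawlable(url: str, agent: str = "Googlebot") -> bool:
    """Check a URL against the proposed rules using Python's parser."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


# The looping URL from the report should be blocked.
print(is_crawlable(
    "http://laurentian.concat.ca/eg/opac/mylist/add"
    "?anchor=record_591425;record=591425"
))

# An ordinary record page (illustrative URL) should remain crawlable.
print(is_crawlable("http://laurentian.concat.ca/eg/opac/record/591425"))
```

The first call should report the mylist URL as blocked and the second should leave the plain record page crawlable, which is the split we want.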

However, we should also prevent creating these eternally redirecting links in the first place.

Tags: opac
Changed in evergreen:
status: New → Triaged
tags: added: opac