SuperCat SRU double-encodes UTF8 characters, does not set character encoding

Bug #1431541 reported by Dan Scott on 2015-03-12

Bug Description

* Evergreen master
* Ubuntu 12.04

When I tested against a 2.7 production system, the HTTP response header set the charset to ISO-8859-1 (per Apache defaults, I believe) because the SuperCat SRU methods were not setting an explicit header themselves. For an example, try:

curl -I ''

(replacing hostname / library shortname / search query as necessary).
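
To illustrate why the declared charset matters, here is a minimal Python sketch (not Evergreen code). It assumes the response body is UTF-8 while the header declares ISO-8859-1. The header values and the sample string are hypothetical, not captured from a real server, and `declared_charset` is a helper made up for this example.

```python
from email.message import Message


def declared_charset(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# Bytes as a server emitting UTF-8 would send them (sample text is made up).
body = "Montréal".encode("utf-8")

# Hypothetical header values for illustration only.
wrong = declared_charset("text/xml; charset=ISO-8859-1")
right = declared_charset("text/xml; charset=UTF-8")

print(body.decode(wrong))  # MontrÃ©al
print(body.decode(right))  # Montréal
```

A client that trusts the header decodes each UTF-8 byte as a separate Latin-1 character, so the data looks corrupted even though the bytes on the wire are fine.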

Also, perhaps due to changes in Encode or MARC::XML behaviour on Ubuntu 12.04, it seems that the encode_utf8() call for $marc->as_xml_record() is no longer necessary; in fact, it corrupts any non-ASCII characters, which isn't good.
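
The general double-encoding effect can be sketched in Python (again, not the Perl code in question). It assumes the record has already been encoded as UTF-8 bytes once, and that a second encode step treats each of those bytes as a separate character. The sample string is made up.

```python
title = "Bibliothèque"

# First (correct) encoding: "è" becomes the two bytes C3 A8.
encoded_once = title.encode("utf-8")

# Second (unwanted) encoding: each existing byte is treated as a character
# and encoded again, so the two bytes of "è" become four.
encoded_twice = encoded_once.decode("latin-1").encode("utf-8")

print(len(encoded_once))              # 13
print(len(encoded_twice))             # 15
print(encoded_twice.decode("utf-8"))  # BibliothÃ¨que

# Pure ASCII survives a double encode unchanged, which is why only
# non-ASCII characters show the corruption.
ascii_twice = "Library".encode("utf-8").decode("latin-1").encode("utf-8")
print(ascii_twice == b"Library")      # True
```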

Dan Scott (denials) wrote :

See;a=shortlog;h=refs/heads/user/dbs/lp1431541_supercat_sru_encoding for a very simple fix, tested and in production on an Ubuntu 12.04 system that has a whole ton of non-ASCII characters.

tags: added: pullrequest
Changed in evergreen:
milestone: none → 2.8.0
Mike Rylander (mrylander) wrote :

Dan, that looks entirely sane. Seems this is relevant for backport to 2.6+ ... no?

Dan Scott (denials) wrote :

Mike, yes, I believe this should be backported to 2.6. I'll set the milestones accordingly!

Bill Erickson (berick) on 2015-04-02
Changed in evergreen:
milestone: 2.8.0 → 2.8.1
Changed in evergreen:
milestone: 2.8.1 → 2.8.3
status: New → Triaged
importance: Undecided → Medium
Ben Shum (bshum) wrote :

Fix pushed to master and backported to rel_2_8 and rel_2_7.

Changed in evergreen:
status: Triaged → Fix Committed
milestone: 2.8.3 → 2.9-beta
Changed in evergreen:
status: Fix Committed → Fix Released