W3C wants query parameters standardized on ampersands; semicolons are now bad

Bug #1687545 reported by Dan Scott on 2017-05-02
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Evergreen
Undecided
Unassigned

Bug Description

Per https://www.w3.org/TR/2014/REC-html5-20141028/forms.html#url-encoded-form-data, step 8 of creating URL-encoded form data, where it tells you how to add a second or subsequent key/value pair is:

  If this is not the first entry, append a single U+0026 AMPERSAND character (&) to result.

That is, semicolons are no longer an acceptable option.

Why does this matter? Because modern tools and APIs are now treating semicolons as just plain data rather than an alternative to ampersand, and this has real impact. For example, the https://developer.mozilla.org/en-US/docs/Web/API/URLSearchParams browser API and corresponding polyfill https://github.com/WebReflection/url-search-params would be a handy way to replace our custom openils.CGI code.

But given a URL of https://example.org/eg/opac/advanced?bool=and;qtype=keyword, asking for the qtype parameter returns null, and asking for the bool parameter returns more than you would expect:

  var params = new URLSearchParams(location.search.slice(1));
  params.get('qtype');
  // null
  params.get('bool');
  // "and;qtype=keyword"

We could continue using openils.CGI and adding our own workarounds to modern tools, but it might be better to standardize on ampersands sooner rather than later.

Dan Scott (denials) wrote :

Per http://perldoc.perl.org/CGI.html, "-newstyle_urls
Separate the name=value pairs in CGI parameter query strings with semicolons rather than ampersands. [...] Semicolon-delimited query strings are always accepted, and will be emitted by self_url() and query_string(). newstyle_urls became the default in version 2.64."

It's possible to return the behaviour to ampersands using the -oldstyle_urls flag. To be investigated.

Dan Scott (denials) wrote :

Confirmed, adding the flag to OpenILS/WWW/EGCatLoader.pm does the trick for much of the TPAC:

  use CGI qw(:all -utf8 -oldstyle_urls);

In Evergreen 2.10, at least, there are a lot of invocations of "use CGI":

/usr/local/share/perl/5.18.2/OpenILS/Reporter/Proxy.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/BadDebt.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/TemplateBatchBibUpdate.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/FlatFielder.pm:use CGI qw(:all -utf8);
/usr/local/share/perl/5.18.2/OpenILS/WWW/Exporter.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/EGWeb/CGI_utf8.pm:use CGI qw(:all -utf8);
/usr/local/share/perl/5.18.2/OpenILS/WWW/SuperCat/Feed.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/EGCatLoader.pm:use CGI qw(:all -utf8 -oldstyle_urls);
/usr/local/share/perl/5.18.2/OpenILS/WWW/XMLRPCGateway.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/AddedContent.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/Proxy.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/AutoSuggest.pm:use CGI qw(:all -utf8);
/usr/local/share/perl/5.18.2/OpenILS/WWW/Proxy/Authen.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/IDL2js.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/SuperCat.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/Vandelay.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/Reporter.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/PhoneList.pm:use CGI;
/usr/local/share/perl/5.18.2/OpenILS/WWW/Redirect.pm:use CGI ();

Galen Charlton (gmc) wrote :

*grumble grumble grumble in the direction of the W3C*

Now that that's out of my system, I note that because of URL patterns like '/eg/opac/results?bookbag=123;page=0;locg=1;depth=0' that get used as permalinks, we'll need to make sure we don't break them coming in; going out is a different matter, of course.

Dan Scott (denials) wrote :

The other piece of the puzzle to getting the TT2 mkurl() macro to generate links with & is to set the -oldstyle_urls flag in Template::Plugin::CGI. I thought we might have to fork that, but note that Template::Toolkit itself just celebrated its first release in three years on April 15, 2017: http://search.cpan.org/~abw/Template-Toolkit-2.27/ - so the timing might be right to press for at least an option to invoke -oldstyle_urls in the CGI plugin!

On the bright side, incoming URLs such as https://example.org/eg/opac/results?query=fiction;qtype=keyword;fi%3Asearch_format=;locg=124;detail_record_view=1;_adv=1 work with either & or ; when the server is doing the parsing. It's when we do client-side parsing of the URL (as when we reconstructing the copy location search options by parsing out "fi:locations" parameters) where we run into trouble with ';' delimiters with the URLSearchParams API and friends.

Luckily it seems like openils.CGI is mostly used in conify, serials, and acquisitions--not too many situations that are user (and bookmark / saved link) facing. Copy location search appears to be the only case where that happens in the js/ui/default/opac space on a quick perusal.

Robert J Jackson (rjackson) wrote :

It looks like perhaps this impacts end users as shown by the following log entries?

gateway log line:

2017-08-02 07:11:58 app3-prod logger: xx.xxx.xxx.xx - - [02/Aug/2017:07:11:57 -0400] "GET /eg/opac/results?fi%3Afrom_metarecord=4277067%3Bmetarecord%3D4277067%3Bmodifier%3Dmetabib%3Bfi%3Aicon_format%3Ddvd HTTP/1.1" 200 113320

corresponding entry from the osrfsys log file:

osrfsys.07.log:2017-08-02 07:11:58 app3-prod open-ils.search: [INFO:10150:Application.pm:159:15016680882512309] CALL: open-ils.search open-ils.search.biblio.multiclass.query HASH(0x4814738), from_metarecord(4277067;metarecord=4277067;modifier=metabib;fi:icon_format=dvd) depth(0), 1

This is for Evergreen Indiana consortium running with 2.12.4

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers