u8_totitle() fails to handle "." as a word separator

Bug #1902088 reported by Bernard Moreton
This bug affects 1 person
Affects: libunistring (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

u8_totitle() fails to handle "." as a word separator, though the dot is commonly used without a following space where initials precede a surname.
For example, the street address "Largo A.Gemelli" is rendered as "Largo A.gemelli".

I find it necessary to include the dot (.), comma (,), opening and closing parentheses ( and ), hyphen (-), single and double quotes (' and "), and the forward slash (/) as word separators; the dot is the most common of these, after the correctly handled space.
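As an illustration of the behaviour I am asking for, here is a minimal ASCII-only sketch in C. It does not use libunistring at all; the function name ascii_totitle_sep and the EXTRA_SEPARATORS list are my own hypothetical names, and the separator set is simply the list given above plus the space.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical separator set: the characters listed above, plus the space. */
const char EXTRA_SEPARATORS[] = " .,()-'\"/";

/*
 * Title-case an ASCII string in place: the first character after any
 * separator is upper-cased and every other character is lower-cased.
 * This is a sketch only, not how u8_totitle() works internally.
 */
void ascii_totitle_sep(char *s, const char *separators)
{
    assert(s != NULL && separators != NULL);

    int at_word_start = 1;
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (strchr(separators, c) != NULL) {
            at_word_start = 1;          /* next character starts a word */
        } else if (at_word_start) {
            *s = (char)toupper(c);
            at_word_start = 0;
        } else {
            *s = (char)tolower(c);
        }
    }
}
```

With this separator set, "largo a.gemelli" becomes "Largo A.Gemelli". Treating the apostrophe as an unconditional separator also turns "adam's house" into "Adam'S House", which is clearly unwanted.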

I have libunistring 0.9.10-2 under Ubuntu 20.04 LTS:

 lsb_release -rd
Description: Ubuntu 20.04.1 LTS
Release: 20.04

apt-cache policy libunistring2
libunistring2:
  Installed: 0.9.10-2
  Candidate: 0.9.10-2
  Version table:
 *** 0.9.10-2 500
        500 http://gb.archive.ubuntu.com/ubuntu focal/main amd64 Packages
        100 /var/lib/dpkg/status

Bernard Moreton (bernard-moreton-1) wrote :

u8_totitle() accepts most of the non-alphanumeric ASCII characters as word separators, but besides the dot, it fails to recognize the colon (:) and the underscore (_) as separators (I can't see why), and also fails to recognize the apostrophe.

I can see a reason for this last one: one wouldn't want "Adam'S House", or (for that matter) "Wouldn'T".
But I suspect that there would be more cases like "M de l'Isle" or "Charles o'Hara", where the names do require capitalisation.
The abbreviative use would normally be followed by a single character (normally "s"), so it could be allowed for with a test on the length of the following "word".
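
The length test could look something like the following ASCII-only sketch in C. Again this is not libunistring code: letter_run_length and ascii_totitle_apostrophe are hypothetical names of my own, and the rule assumed here is that an apostrophe counts as a word separator only when more than one letter follows it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Number of consecutive ASCII letters starting at p. */
size_t letter_run_length(const char *p)
{
    size_t n = 0;
    while (isalpha((unsigned char)p[n]))
        n++;
    return n;
}

/*
 * Title-case an ASCII string in place. Characters in `separators`
 * always start a new word; an apostrophe starts a new word only if
 * the following "word" is longer than one letter (assumed heuristic).
 */
void ascii_totitle_apostrophe(char *s, const char *separators)
{
    assert(s != NULL && separators != NULL);

    int at_word_start = 1;
    for (size_t i = 0; s[i] != '\0'; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c == '\'') {
            /* A single trailing letter ('s, 't) is treated as abbreviative. */
            if (letter_run_length(s + i + 1) > 1)
                at_word_start = 1;
        } else if (strchr(separators, c) != NULL) {
            at_word_start = 1;
        } else if (at_word_start) {
            s[i] = (char)toupper(c);
            at_word_start = 0;
        } else {
            s[i] = (char)tolower(c);
        }
    }
}
```

Called with the separators " .,()-\"/", this gives "Adam's House", "Wouldn't" and "Charles O'Hara". It is only a heuristic: a two-letter contraction such as "we're" would still come out as "We'Re".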
