Designate DNS – create TLD using valid Unicode string

Bug #1918654 reported by Arkady Shtempler
Affects: Designate
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Scenario:
Try to create a new TLD using Unicode string, for example: “例え”

Actual Result:
API fails with:
Details: {'code': 400, 'type': 'invalid_object', 'message': 'Provided object is not valid. Got a ValueError error with message \u4f8b\u3048 is not an TLD', 'request_id': 'req-fa7cb43e-a6e9-41f2-b878-1ced6a853a2a'}

Expected Result:
The request should succeed and the TLD should be created.

Notes:
1) There is a Tempest patch that can be used to reproduce the scenario:
https://review.opendev.org/c/openstack/designate-tempest-plugin/+/779558

2) Documentation: https://docs.openstack.org/api-ref/dns/?expanded=get-the-name-servers-for-a-zone-detail,list-all-recordsets-owned-by-project-detail,create-zone-detail,create-tld-detail#create-tld

Tags: designate dns
affects: neutron → designate
Dr. Jens Harbott (j-harbott) wrote :

You likely need to use the ascii representation for unicode domains, see https://en.wikipedia.org/wiki/Internationalized_domain_name , same as you would need to do for creating a zone matching that TLD later.

Changed in designate:
status: New → Incomplete
Michael Johnson (johnsom) wrote :

I think this is a valid bug, as we should be supporting Internationalized Domain Names (IDN) in a user-friendly way.

There are already TLDs using unicode characters: https://www.iana.org/domains/root/db

There is no reason the Designate tools can't handle the unicode character conversion to punycode on behalf of the user.

For example, I can use dig to resolve "人民网.中国", so we should be able to handle those domains in Designate.

Michael Johnson (johnsom) wrote :

The Designate objects are performing a regex on the input value here: https://opendev.org/openstack/designate/src/branch/master/designate/objects/fields.py#L293

With the regex here:
https://opendev.org/openstack/designate/src/branch/master/designate/objects/fields.py#L98

That regex assumes the TLD will be ASCII-only.

This is inconsistent with the API, which declares that it supports UTF-8 and accepts such strings in other fields, such as the description of a TLD.
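
As a rough illustration of why the request is rejected, here is a minimal sketch using a hypothetical ASCII-only label pattern. ASCII_LABEL_RE and is_ascii_tld are invented for this example and are not copied from designate/objects/fields.py; the real regex may differ. A Unicode label fails this kind of check, while its punycode form passes.

```python
import re

# Hypothetical stand-in for an ASCII-only TLD check; NOT Designate's real regex.
# Allows 1-63 lowercase letters, digits or hyphens, with no leading or
# trailing hyphen.
ASCII_LABEL_RE = re.compile(r"(?!-)[a-z0-9-]{1,63}(?<!-)")


def is_ascii_tld(value: str) -> bool:
    """Return True if value matches the ASCII-only label pattern."""
    return ASCII_LABEL_RE.fullmatch(value) is not None


print(is_ascii_tld("例え"))        # False: no character is in [a-z0-9-]
print(is_ascii_tld("xn--fiqs8s"))  # True: punycode form of "中国"
```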

Dr. Jens Harbott (j-harbott) wrote :

Which version of dig supports this? 9.11.3-1ubuntu1.14-Ubuntu returns nxdomain for me.

While I think that the API should only deal with the punycode representation, as that is also the only form that actually exists in the DNS system itself, it may be reasonable to add conversion support to the client, although even that may be confusing. For example, when showing a zone or TLD, should the name be shown in punycode or UTF-8? Users may expect the latter, but that could break existing applications expecting the former. So my suggestion would be to stick to the existing state and just add some documentation showing users how to convert their UTF-8 labels to punycode.
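
For reference, one way a user could do that conversion today is with Python's built-in "idna" codec. This is only a sketch: to_ascii_label and to_unicode_label are hypothetical helper names, and the built-in codec implements the older IDNA 2003 rules (the third-party idna package implements IDNA 2008), so results may differ for some labels.

```python
def to_ascii_label(label: str) -> str:
    """Convert a single (possibly Unicode) label to its ASCII/punycode form."""
    return label.encode("idna").decode("ascii")


def to_unicode_label(label: str) -> str:
    """Convert an ASCII/punycode label back to Unicode for display."""
    return label.encode("ascii").decode("idna")


print(to_ascii_label("中国"))          # xn--fiqs8s
print(to_unicode_label("xn--fiqs8s"))  # 中国
print(to_ascii_label("例え"))          # the ASCII form to send to the API
```

Plain ASCII labels such as "com" pass through both helpers unchanged, so either representation could be chosen for display without touching what is stored.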

Michael Johnson (johnsom) wrote :

Here is the output:
$ dig 人民网.中国

; <<>> DiG 9.11.27-RedHat-9.11.27-1.fc33 <<>> 人民网.中国
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62795
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;人民网.中国. IN A

;; ANSWER SECTION:
人民网.中国. 7198 IN A 58.68.146.208

;; Query time: 3731 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Mon Mar 15 08:38:30 PDT 2021
;; MSG SIZE rcvd: 70

Or

$ dig 人民网.中国

; <<>> DiG 9.16.1-Ubuntu <<>> 人民网.中国
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29429
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;人民网.中国. IN A

;; ANSWER SECTION:
人民网.中国. 6787 IN A 58.68.146.208

;; Query time: 0 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Mon Mar 15 15:45:21 UTC 2021
;; MSG SIZE rcvd: 70

The problem I have here is that we are inconsistent in our API usage of UTF-8. JSON is UTF-8, the documentation lists that we return UTF-8 content, the API accepts and handles UTF-8 strings for other fields such as the description fields, and, by the RFCs, TLDs support UTF-8.
I think this can be corrected in Designate in a way that does not break existing usage, but we need to embrace other cultures and languages.
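
A minimal sketch of what a backward-compatible approach might look like, assuming the service normalizes the name to its ASCII form before the existing validation runs. normalize_tld is a hypothetical helper, not an existing Designate function, and it relies on Python's built-in IDNA 2003 codec.

```python
def normalize_tld(name: str) -> str:
    """Return the ASCII form of a TLD, accepting Unicode or ASCII input.

    ASCII and already-punycoded names are returned unchanged, so existing
    clients that send "com" or "xn--fiqs8s" would see no difference.
    """
    try:
        return name.encode("idna").decode("ascii")
    except UnicodeError as exc:
        # Raised for labels that cannot be encoded, e.g. longer than 63 octets.
        raise ValueError(f"{name!r} cannot be converted to an IDNA label") from exc


for candidate in ("com", "xn--fiqs8s", "中国"):
    print(candidate, "->", normalize_tld(candidate))
```

Because ASCII input passes through untouched, the stored value and any existing ASCII-only validation would stay as they are; only previously rejected Unicode input would start to be accepted.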

Launchpad Janitor (janitor) wrote :

[Expired for Designate because there has been no activity for 60 days.]

Changed in designate:
status: Incomplete → Expired
Changed in designate:
status: Expired → New