HTML entities interpreted incorrectly

Bug #1561849 reported by StSav012 on 2016-03-25
This bug affects 1 person
Affects: elinks (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Recently I found that when a link contains something that looks like an HTML entity but lacks the trailing semicolon, elinks still treats it as a valid HTML entity and substitutes the corresponding character, even inside link URLs. This breaks some sites.

Steps to reproduce:
• Launch elinks and navigate to https://m.vk.com/ . Make sure elinks asks for confirmation before sending POST requests, so that you can see the URL they are sent to. This is the default setting.
• Enter whatever you want into the “email” and “pass” fields.
• Hit the Log in button on the page.

The POST form there has an action URL that looks like
https://login.vk.com/?act=login&_origin=https://m.vk.com&ip_h=…&lg_h=…&role=pda&utf8=1

What actually happens:
• Although “&lg” has no semicolon after it, it is interpreted as the ‘≶’ sign, which you'll see in the POST request confirmation. This character exists only in Unicode, so in other code pages it is displayed as several different characters. Either way, this breaks the form. Checked with different terminals.
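
To illustrate why the substitution breaks the form, here is a minimal Python sketch. It is not elinks code: it only simulates the reported substitution with a plain string replace and uses the standard urllib.parse module to show what happens to the query parameters. The “…” values are kept elided exactly as in the URL above, and the names ACTION, query_params, and mangled are made up for this illustration.

```python
from urllib.parse import urlsplit, parse_qsl

# Action URL as quoted in the report; the elided values ("…") are kept as-is.
ACTION = ("https://login.vk.com/?act=login&_origin=https://m.vk.com"
          "&ip_h=…&lg_h=…&role=pda&utf8=1")


def query_params(url):
    """Return the query string of `url` as a name -> value dict."""
    return dict(parse_qsl(urlsplit(url).query))


# Simulate the reported behaviour: "&lg" is replaced by U+2276 (the '≶' sign).
mangled = ACTION.replace("&lg", "\u2276")

good = query_params(ACTION)
broken = query_params(mangled)

print(sorted(good))    # parameter names in the original URL
print(sorted(broken))  # parameter names after the substitution
```

With the “&” separator gone, the lg_h parameter no longer exists on its own; its text is absorbed into the value of ip_h, so the server receives a different set of parameters than the form intended.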

What is expected to happen:
• The sequence “&lg” is left as is.
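
As a point of comparison only, Python's standard html.unescape function follows the HTML5 rules for named character references, where only a fixed set of legacy names (such as “&lt” or “&amp”) is accepted without a semicolon. The sketch below shows that “lg” is not one of them, so the action URL comes out unchanged. This says nothing about how elinks is implemented; the variable name url is an assumption for the example.

```python
import html

# Action URL as quoted in the report; the elided values ("…") are kept as-is.
url = ("https://login.vk.com/?act=login&_origin=https://m.vk.com"
       "&ip_h=…&lg_h=…&role=pda&utf8=1")

# With the semicolon, "&lg;" is a valid HTML5 reference to U+2276 ('≶').
print(html.unescape("&lg;") == "\u2276")

# Without the semicolon, "lg" is not a recognised name, so nothing changes.
print(html.unescape(url) == url)

# Legacy names such as "lt" are still decoded without a semicolon.
print(html.unescape("&lt_h=1"))
```

The last line shows that omitting the semicolon is tolerated for a few legacy names, which is why a URL parameter that happens to start with one of those names should still be written with “&amp;” in the page source.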

System Information:
• Ubuntu 14.04 LTS 64bit
• elinks 0.12~pre6-4ubuntu1 (the newest version available at the time of this report)
