Whitespace is not preserved during collation
Bug #567212 reported by Gregor Middell
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
collatex | New | Undecided | Unassigned | |
Bug Description
The current default tokenizer greedily consumes whitespace, so the whitespace is lost before collation. Instead of being discarded at the tokenizer level, whitespace should be preserved in each token's content and stripped only during token normalization.
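
To illustrate the requested behavior, here is a minimal sketch in Python. It is not the CollateX tokenizer or its API: the names `Token`, `tokenize`, and `normalize`, and the regular expression, are assumptions made up for this example. Each token keeps the whitespace around its word in `content`, and the whitespace is stripped only when the normalized form is computed.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical token type, not a CollateX class."""
    content: str     # original text, including surrounding whitespace
    normalized: str  # form that would be used for comparison


# Assumed pattern: optional leading whitespace, a run of non-whitespace
# characters, then any whitespace that follows it.
_TOKEN_PATTERN = re.compile(r"\s*\S+\s*")


def normalize(content: str) -> str:
    """Strip whitespace at normalization time instead of at tokenization."""
    return content.strip()


def tokenize(text: str) -> list[Token]:
    """Split text into tokens without discarding any whitespace."""
    return [
        Token(content=m.group(), normalized=normalize(m.group()))
        for m in _TOKEN_PATTERN.finditer(text)
    ]


tokens = tokenize("The  quick\tfox\n")
# The original text can be rebuilt from the token contents.
assert "".join(t.content for t in tokens) == "The  quick\tfox\n"
# Comparison would use the normalized forms only.
assert [t.normalized for t in tokens] == ["The", "quick", "fox"]
```

With this split, concatenating the `content` fields reproduces the input exactly (for input containing at least one non-whitespace character), while matching can still ignore whitespace by working on `normalized`.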