Cannot parse symbol libs with non-english symbol names

Bug #1806206 reported by Aleksandr Sh on 2018-12-01
This bug affects 2 people
Assigned to: Wayne Stambaugh

Bug Description

I can create symbols with non-english names like "Резистор" in the symbol editor and save the lib.
But after reopening KiCad, it cannot load the lib (and cannot load the cache if the symbol was added to a schematic). Try to open the attached lib.

Non-english names in footprint libs seem to work fine.

Application: kicad
Version: (6.0.0-rc1-dev-1291-g61b749f0b), release build
    wxWidgets 3.0.4
    libcurl/7.61.1 OpenSSL/1.1.1 (WinSSL) zlib/1.2.11 brotli/1.0.6 libidn2/2.0.5 libpsl/0.20.2 (+libidn2/2.0.5) nghttp2/1.34.0
Platform: Windows 8 (build 9200), 64-bit edition, 64 bit, Little endian, wxMSW
Build Info:
    wxWidgets: 3.0.4 (wchar_t,wx containers,compatible with 2.8)
    Boost: 1.68.0
    OpenCASCADE Community Edition: 6.9.1
    Curl: 7.61.1
    Compiler: GCC 8.2.0 with C++ ABI 1013

Build settings:

Aleksandr Sh (dsa-t) wrote :
Maciej Suminski (orsonmmz) wrote :

I have managed to load the library without any problems on Linux, so the problem might be Windows specific. What is the error message you are getting?

Wayne Stambaugh (stambaughw) wrote :

I can confirm that the provided library does not open on Windows. What's odd is that it fails to open with an "expected Y or N in input/source at line 6, offset 23" error, but the file looks fine to me.

Changed in kicad:
status: New → Triaged
importance: Undecided → Medium
milestone: none → 5.1.0


It is not odd!
The parser is just broken.

This line is a UTF8 line, but the parser parses it as a char (ASCII8) line.

The parser stops at the middle of the UTF8 symbol name because it found a ' ', i.e. a separator, because the UTF8 "char" is not correctly parsed.
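
A minimal sketch of the failure mode described above, not KiCad's actual parser. The helper names (`read_token`, `legacy_is_separator`, `ascii_is_separator`), the sample line, and the choice of 0xA0 as the misclassified byte are assumptions made for illustration; the thread does not say which byte the parser mistook for a separator. The sketch only shows how a byte-wise separator test can cut a multi-byte UTF-8 name in half:

```python
# Hypothetical illustration: a tokenizer that scans bytes and stops at the
# first byte its separator predicate accepts.
def read_token(data: bytes, is_separator) -> bytes:
    """Return the leading bytes of data up to the first separator byte."""
    for i, b in enumerate(data):
        if is_separator(b):
            return data[:i]
    return data


def legacy_is_separator(b: int) -> bool:
    # Assumption: a single-byte classifier that also treats 0xA0
    # (no-break space in some 8-bit code pages) as whitespace.
    return b in (0x20, 0x09, 0xA0)


def ascii_is_separator(b: int) -> bool:
    # Only plain ASCII space and tab count as separators.
    return b in (0x20, 0x09)


name = "Резистор"
encoded = name.encode("utf-8")      # "Р" (U+0420) becomes the bytes D0 A0
line = encoded + b" R 0 40 Y N"     # made-up remainder of a definition line

broken = read_token(line, legacy_is_separator)
fixed = read_token(line, ascii_is_separator)

print(broken)                  # token cut inside the first character
print(fixed.decode("utf-8"))   # full symbol name
```

With the byte-wise classifier the token ends after the lead byte of the first character, so every later field on the line is shifted and a check such as "expected Y or N" lands on the wrong text.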

Wayne Stambaugh (stambaughw) wrote :

@JP, wouldn't this same problem occur with the old sscanf() based parser? If so, then we probably should not allow utf8 characters as symbol names which would effectively be a file format change. Fixing the parser is an option if we want to allow utf8 characters in legacy file formats.

Wayne Stambaugh (stambaughw) wrote :

@JP, I just tested 4.0.7 and the sscanf parser does indeed work with the library included in this bug report, so I will have to fix the parser. Given that this was released in 5.0.0, I can't believe it took this long to find the bug.


I am guessing the probability of encountering this bug is low: the parser stops reading a token if a space is found.
So encountering a UTF8 sequence that contains the 0x20 byte (and is not the "space" char) is perhaps not frequent.

Changed in kicad:
assignee: nobody → Wayne Stambaugh (stambaughw)
status: Triaged → In Progress
Wayne Stambaugh (stambaughw) wrote :

I just pushed the fix for this. Please let me know if you are still having issues.

Aleksandr Sh (dsa-t) wrote :

I wonder if these characters with 0A are handled correctly:
ĊȊЊԊ؊܊ࠊऊਊଊఊഊช༊ညᄊሊጊᐊᔊᘊᜊ᠊ᤊᨊᬊᴊḊἊ ℊ∊⌊␊┊☊✊⠊⤊⨊⬊Ⰺⴊ⸊⼊《ㄊ㈊㌊㐊㔊㘊

I'm not sure why you would use them in component fields though.

Wayne Stambaugh (stambaughw) wrote :

@Aleksandr, I just copied the character string you provided and successfully created a symbol with that name. Obviously KiCad cannot display those characters in the name field, but it does save and load the symbol with that name correctly. There are some differences between the characters I am seeing on this bug report page and what Windows is displaying, but my guess is this is a font mapping issue.

Aleksandr Sh (dsa-t) wrote :

Right, UTF-8 does not allow the 0x0A byte anywhere except for the actual line feed character, therefore there are no problems with line feeds in field values.
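
A quick way to check this property, as a sketch: the helper name `contains_line_feed_byte` and the sample range are chosen here for illustration. The code points all end in 0A, like the characters pasted earlier in the thread, yet none of their UTF-8 encodings contains a 0x0A byte, because every byte of a multi-byte UTF-8 sequence has its high bit set:

```python
def contains_line_feed_byte(text: str) -> bool:
    """True if the UTF-8 encoding of text contains a raw 0x0A byte."""
    return 0x0A in text.encode("utf-8")


# Code points U+010A, U+020A, ... U+340A: the low byte is always 0x0A.
samples = [chr(cp) for cp in range(0x010A, 0x3500, 0x0100)]

for ch in samples:
    assert not contains_line_feed_byte(ch)

# Only the real line feed character produces the byte.
assert contains_line_feed_byte("\n")

# For contrast, a fixed-width encoding such as UTF-16 does expose the byte.
assert 0x0A in "\u010a".encode("utf-16-le")
```

So a line-oriented reader that splits the file on 0x0A cannot break a UTF-8 encoded field value in the middle of a character.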

KiCad Janitor (kicad-janitor) wrote :

Fixed in revision a61a51f26e9e400431c9ac58994281263284010f

Changed in kicad:
status: In Progress → Fix Committed
Changed in kicad:
status: Fix Committed → Fix Released