Truncation of Cell Data When Using openpyxl

Bug #2022918 reported by Harvey
Affects:     lxml
Status:      New
Importance:  Undecided
Assigned to: Unassigned
Milestone:   (none listed)

Bug Description

When using openpyxl 2.5.12 (required in our case) to write data into an .xlsx cell, some string values are only partially written: the text in the cell is cut off at a seemingly arbitrary point.

openpyxl makes use of lxml, and this bug appears to occur only with lxml 4.7.0 or greater; earlier versions do not truncate the cell data.

The attached Python script writes repeated, randomly generated Bulgarian strings into a single cell and saves the result as 'test.xlsx'. In that file the cell data is cut off 4007 characters into the 15971-character original text, although the truncation point is not consistent across occurrences. The truncation may be directly affected by Cyrillic Unicode characters, but the issue does not appear when writing only the lines at which the truncation occurs; it usually shows up several lines deep into a text sample.
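For readers without access to the attachment, the following is a rough sketch of a comparable check, not the attached script itself. The helper names (`make_bulgarian_text`, `first_difference`, `roundtrip_through_xlsx`), the line and word counts, and the seed are all assumptions made for illustration. It only uses the basic openpyxl calls `Workbook`, `load_workbook`, `active` and `save`.

```python
import random

# The 30 lowercase letters of the Bulgarian alphabet.
BULGARIAN_LETTERS = "абвгдежзийклмнопрстуфхцчшщъьюя"


def make_bulgarian_text(n_lines=200, words_per_line=12, seed=0):
    """Build a multi-line string of pseudo-random Cyrillic 'words'.

    The sizes are arbitrary (hypothetical); they are not taken from the report.
    """
    rng = random.Random(seed)
    lines = []
    for _ in range(n_lines):
        words = [
            "".join(rng.choice(BULGARIAN_LETTERS) for _ in range(rng.randint(2, 10)))
            for _ in range(words_per_line)
        ]
        lines.append(" ".join(words))
    return "\n".join(lines)


def first_difference(expected, actual):
    """Return the index of the first differing character, or None if equal."""
    for i, (a, b) in enumerate(zip(expected, actual)):
        if a != b:
            return i
    if len(expected) != len(actual):
        # One string is a prefix of the other, e.g. a truncated copy.
        return min(len(expected), len(actual))
    return None


def roundtrip_through_xlsx(text, path="test.xlsx"):
    """Write text to cell A1, save, reload, and return what was stored.

    openpyxl is imported here so the helpers above work without it installed.
    """
    from openpyxl import Workbook, load_workbook

    wb = Workbook()
    wb.active["A1"] = text
    wb.save(path)
    return load_workbook(path).active["A1"].value
```

With openpyxl installed, `first_difference(text, roundtrip_through_xlsx(text))` for `text = make_bulgarian_text()` would be expected to return `None` when the cell is stored intact, and the offset of the cut when the stored value is a truncated copy. Whether this particular input triggers the truncation is not guaranteed, since the report notes the behaviour is inconsistent.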

This issue may stem from the following change, in which parsing is done by encoding directly to UTF-8 instead of using Py_UNICODE strings as in previous versions:
https://github.com/lxml/lxml/commit/02a49b1d6ad177c948652f8b4d72aa0e2b386b89
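As background to that hypothesis, the short sketch below shows how the character count and the UTF-8 byte count of a Cyrillic string differ. It is only an illustration of the two lengths involved; it does not demonstrate that lxml confuses them, and the function name `length_report` and the sample strings are made up for this example.

```python
def length_report(text):
    """Return the length of text in characters and in UTF-8 bytes."""
    return {"chars": len(text), "utf8_bytes": len(text.encode("utf-8"))}


# Each basic Cyrillic letter is encoded as two bytes in UTF-8,
# whereas each ASCII letter is encoded as one byte.
cyrillic = length_report("абв")
ascii_only = length_report("abc")
print(cyrillic)
print(ascii_only)
```

For text made up mostly of Cyrillic letters, the byte length is therefore close to twice the character count, which is the kind of mismatch that could matter if a length were computed in one unit and applied in the other.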

The attached script only reproduces the error with openpyxl 2.5.12 and lxml 4.7.0 (or greater). It creates a new .xlsx file in the current working directory from which the script is run.

Report Information:
Python : sys.version_info(major=3, minor=8, micro=2, releaselevel='final', serial=0)
lxml.etree : (4, 7, 0, 0)
libxml used : (2, 9, 12)
libxml compiled : (2, 9, 12)
libxslt used : (1, 1, 34)
libxslt compiled : (1, 1, 34)
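To compare another environment against the one above, a small sketch along these lines can print the same fields and flag whether the installed lxml falls in the range this report describes. The helper names and the `FIRST_AFFECTED` constant are assumptions for illustration; the threshold is taken from this report only and says nothing about whether later releases still show the problem.

```python
import sys

# Lowest lxml version this report associates with the truncation.
FIRST_AFFECTED = (4, 7, 0)


def is_reported_affected(lxml_version):
    """True if a version tuple is at or above the threshold from the report."""
    return tuple(lxml_version[:3]) >= FIRST_AFFECTED


def print_report():
    """Print version details in the same layout as the block above."""
    from lxml import etree  # imported here so the helper above works without lxml

    print("%-20s: %s" % ("Python", sys.version_info))
    print("%-20s: %s" % ("lxml.etree", etree.LXML_VERSION))
    print("%-20s: %s" % ("libxml used", etree.LIBXML_VERSION))
    print("%-20s: %s" % ("libxml compiled", etree.LIBXML_COMPILED_VERSION))
    print("%-20s: %s" % ("libxslt used", etree.LIBXSLT_VERSION))
    print("%-20s: %s" % ("libxslt compiled", etree.LIBXSLT_COMPILED_VERSION))
    print("%-20s: %s" % ("in reported range", is_reported_affected(etree.LXML_VERSION)))
```

Calling `print_report()` requires lxml to be installed; `is_reported_affected` can be used on its own with any version tuple.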

Tags: truncation
Harvey (harvey240) wrote :