Activity log for bug #1666195

Date Who What changed Old value New value Message
2017-02-20 11:50:24 Daniel PUIU bug added bug
2017-02-20 11:50:24 Daniel PUIU attachment added xml used https://bugs.launchpad.net/bugs/1666195/+attachment/4822669/+files/example1.arxml
2017-02-20 11:56:49 Daniel PUIU description
Old value:
Python : sys.version_info(major=3, minor=4, micro=4, releaselevel='final', serial=0)
lxml.etree : (3, 6, 0, 0)
libxml used : (2, 9, 3)
libxml compiled : (2, 9, 3)
libxslt used : (1, 1, 29)
libxslt compiled : (1, 1, 29)

For very long XML files, the returned sourceline is wrong. I have built a custom XML file to show my problem (it is attached to this bug):

<?xml version="1.0" encoding="UTF-8"?>
<root>
<child>
<grandchild>GC0</grandchild>
<grandchild>GC1</grandchild>
<grandchild>GC2</grandchild>
<grandchild>GC3</grandchild>
<grandchild>GC4</grandchild>
<grandchild>GC5</grandchild>
<grandchild>GC6</grandchild>
<grandchild>GC7</grandchild>
<grandchild>GC8</grandchild>
<grandchild>GC9</grandchild>
</child>
...
<child>
<grandchild>GC0</grandchild>
<grandchild>GC1</grandchild>
<grandchild>GC2</grandchild>
<grandchild>GC3</grandchild>
<grandchild>GC4</grandchild>
<grandchild>GC5</grandchild>
<grandchild>GC6</grandchild>
<grandchild>GC7</grandchild>
<grandchild>GC8</grandchild>
<grandchild>GC9</grandchild>
</child>
</root>

I have 32768 child nodes. Starting with the child at index 5461 (children[5461]), the returned sourceline is wrong by at least 1 line: that node is at line 65535, but sourceline returns 65536.

The following code:

for grandchild in children[5461].getchildren():
    print(grandchild.getparent().sourceline, grandchild.sourceline)

prints

65536 65536
65536 65537
65536 65538
65536 65539
65536 65540
65536 65541
65536 65542
65536 65543
65536 65544
65536 65545

For a higher level of element nesting, the difference between the real source line and the returned sourceline grows. In this case, the following code:

for grandchild in children[-1].getchildren():
    print(grandchild.getparent().sourceline, grandchild.sourceline)

prints

393208 393208
393208 393209
393208 393210
393208 393211
393208 393212
393208 393213
393208 393214
393208 393215
393208 393216
393208 393217

so the difference is still 1.
New value:
Python : sys.version_info(major=3, minor=4, micro=4, releaselevel='final', serial=0)
lxml.etree : (3, 6, 0, 0)
libxml used : (2, 9, 3)
libxml compiled : (2, 9, 3)
libxslt used : (1, 1, 29)
libxslt compiled : (1, 1, 29)
Windows 10 x64, Python x64, lxml x64

For very long XML files, the returned sourceline is wrong. I have built a custom XML file to show my problem (it is attached to this bug):

<?xml version="1.0" encoding="UTF-8"?>
<root>
 <child>
  <grandchild>GC0</grandchild>
  <grandchild>GC1</grandchild>
  <grandchild>GC2</grandchild>
  <grandchild>GC3</grandchild>
  <grandchild>GC4</grandchild>
  <grandchild>GC5</grandchild>
  <grandchild>GC6</grandchild>
  <grandchild>GC7</grandchild>
  <grandchild>GC8</grandchild>
  <grandchild>GC9</grandchild>
 </child>
 ...
 <child>
  <grandchild>GC0</grandchild>
  <grandchild>GC1</grandchild>
  <grandchild>GC2</grandchild>
  <grandchild>GC3</grandchild>
  <grandchild>GC4</grandchild>
  <grandchild>GC5</grandchild>
  <grandchild>GC6</grandchild>
  <grandchild>GC7</grandchild>
  <grandchild>GC8</grandchild>
  <grandchild>GC9</grandchild>
 </child>
</root>

I have 32768 child nodes. Starting with the child at index 5461 (children[5461]), the returned sourceline is wrong by at least 1 line: that node is at line 65535, but sourceline returns 65536.

The following code:

for grandchild in children[5461].getchildren():
    print(grandchild.getparent().sourceline, grandchild.sourceline)

prints

65536 65536
65536 65537
65536 65538
65536 65539
65536 65540
65536 65541
65536 65542
65536 65543
65536 65544
65536 65545

For a higher level of element nesting, the difference between the real source line and the returned sourceline grows. In this case, the following code:

for grandchild in children[-1].getchildren():
    print(grandchild.getparent().sourceline, grandchild.sourceline)

prints

393208 393208
393208 393209
393208 393210
393208 393211
393208 393212
393208 393213
393208 393214
393208 393215
393208 393216
393208 393217

so the difference is still 1.
2017-02-20 14:21:34 Daniel PUIU attachment added second xml https://bugs.launchpad.net/lxml/+bug/1666195/+attachment/4822745/+files/example2.xml
2017-02-24 16:58:21 scoder lxml: status New Invalid
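
The line numbers quoted in the description can be cross-checked without the attachment. The following is a minimal sketch, not the reporter's script: it assumes the attached file has exactly the layout shown in the description (declaration on line 1, <root> on line 2, then 12 lines per <child>), and it uses the standard-library expat parser as an independent reference for the real line of each <child> start tag. The helper names build_document and child_start_lines are hypothetical.

```python
import xml.parsers.expat


def build_document(num_children, grandchildren=10):
    """Build an XML document shaped like the example in the description.

    The exact layout (one tag per line, 10 grandchildren per child) is an
    assumption based on the excerpt shown in the bug report.
    """
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', "<root>"]
    for _ in range(num_children):
        lines.append(" <child>")
        for i in range(grandchildren):
            lines.append("  <grandchild>GC%d</grandchild>" % i)
        lines.append(" </child>")
    lines.append("</root>")
    return "\n".join(lines) + "\n"


def child_start_lines(xml_text):
    """Return the 1-based line of every <child> start tag, as seen by expat."""
    found = []
    parser = xml.parsers.expat.ParserCreate()

    def on_start(name, attrs):
        # CurrentLineNumber refers to the event being handled,
        # i.e. the line on which this start tag begins.
        if name == "child":
            found.append(parser.CurrentLineNumber)

    parser.StartElementHandler = on_start
    parser.Parse(xml_text.encode("utf-8"), True)
    return found


if __name__ == "__main__":
    # 5462 children are enough to reach index 5461; the report uses 32768.
    reference = child_start_lines(build_document(5462))
    index = 5461
    print(reference[index])   # line reported by expat
    print(3 + 12 * index)     # closed-form line under the assumed layout
```

Under the assumed layout both printed values are 65535, which matches the real line stated in the description. If lxml is installed, the same document can be parsed with lxml.etree.fromstring(doc.encode("utf-8")) and the sourceline of root[5461] compared against this reference value.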