file-position is confused by utf-8 buffering
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
SBCL | Fix Released | Medium | Unassigned |
Bug Description
file-position seems to be computed by (- position-
# cat ~/bug/sbcl-
(in-package :cl-user)
(defparameter *u* "/tmp/u")
(with-open-file (s *u* :direction :output :if-exists :rename-and-delete)
  (princ "Faré λ 自由 foo" s))
(with-open-file (s *u* :direction :input)
  (format t "~&file length: ~D~%" (file-length s))
  (loop :for pos = (file-position s)
        :for c = (read-char s nil nil)
        :for nil = (format t "~&pos ~2D ~S~%" pos c)
        :while c))
(delete-file *u*)
(quit)
# sbcl --load ~/bug/sbcl-
This is SBCL 1.0.42.37, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://
SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
file length: 19
pos 0 #\F
pos 7 #\a
pos 8 #\r
pos 9 #\LATIN_
pos 10 #\
pos 11 #\GREEK_
pos 12 #\
pos 13 #\U81EA
pos 14 #\U7531
pos 15 #\
pos 16 #\f
pos 17 #\o
pos 18 #\o
pos 19 NIL
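For comparison, here is a rough sketch of the byte offsets one would expect if file-position tracked the UTF-8 encoded length of each character already read. The helper names utf8-length and expected-byte-positions are made up for this illustration and are not part of SBCL. The sketch assumes that char-code returns the Unicode code point and that file-position on a file stream counts bytes, which the standard leaves up to the implementation.

```lisp
;; Hypothetical helpers, for illustration only (not SBCL internals).
;; Assumes CHAR-CODE returns the Unicode code point of the character.

(defun utf8-length (char)
  "Number of bytes needed to encode CHAR in UTF-8."
  (let ((code (char-code char)))
    (cond ((< code #x80) 1)
          ((< code #x800) 2)
          ((< code #x10000) 3)
          (t 4))))

(defun expected-byte-positions (string)
  "Byte offset before each character of STRING, followed by the total length."
  (let ((pos 0)
        (positions '()))
    (loop :for c :across string
          :do (push pos positions)
              (incf pos (utf8-length c)))
    ;; The final entry is the offset at end of file.
    (nreverse (cons pos positions))))
```

For the test string "Faré λ 自由 foo" this gives (0 1 2 3 5 6 8 9 12 15 16 17 18 19). The last entry agrees with the reported file length of 19, while the transcript above jumps from 0 straight to 7 and only catches up with these offsets at the trailing " foo".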
This is on Linux amd64, which shouldn't matter, with a recentish SBCL.
I'm building the latest SBCL to test it, but from the git log
I don't think it has been magically solved.
NB: I found this problem while writing a function using file-position
to read backwards in a file and portably find the stream-line-column
at the current position of a file stream despite encoding issues. It
was quite annoying to not have read/write consistency for the position.
PS: thanks to Nikodemus for updating ASDF to 2.009.
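The read consistency mentioned in the NB above could be checked with something like the following sketch. The function name position-round-trip-p is hypothetical, and the :utf-8 external format keyword is an assumption: SBCL accepts it, but the standard does not require it. For every character, the sketch records the position before the read, seeks back to it, and verifies that the same character is read again.

```lisp
;; Hypothetical check, for illustration only.
;; Assumes the implementation accepts :EXTERNAL-FORMAT :UTF-8.

(defun position-round-trip-p (path)
  "True if seeking back to each reported position re-reads the same character."
  (with-open-file (s path :direction :input :external-format :utf-8)
    (loop :for pos = (file-position s)
          :for c = (read-char s nil nil)
          :while c
          :always (let ((next (file-position s)))
                    ;; Seek back to where C was reported to start.
                    (file-position s pos)
                    (prog1 (eql c (read-char s nil nil))
                      ;; Restore the position so the scan can continue.
                      (file-position s next))))))
```

With positions like those in the transcript above, the seek would land in the wrong place for most characters, so this check should only pass on a build where file-position accounts for multi-byte characters.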
Changed in sbcl:
assignee: nobody → Nikodemus Siivola (nikodemus)
status: Triaged → In Progress
Changed in sbcl:
status: Fix Committed → Fix Released
Note that the code implementing this was correct back when CMUCL only had 8-bit encodings, where every character occupies exactly one byte.