octets-to-string use-value does not function properly with EUC-JP encoding

Bug #314939 reported by Tapio Saarinen
This bug affects 1 person
Affects: SBCL
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

(defvar *euc-jp*
  (drakma:http-request "http://blogs.yahoo.co.jp/akira_w0325/27287392.html"
                       :external-format-in :euc-jp
                       :external-format-out :euc-jp))

;; The following works as expected:
(handler-bind ((sb-int:character-decoding-error
                 #'(lambda (c)
                     (declare (ignore c))
                     (invoke-restart 'use-value #\?))))
  (sb-ext:octets-to-string *euc-jp* :external-format :ascii))

;; This, however, does not, complaining that "?" is not a character:
(handler-bind ((sb-int:character-decoding-error
                 #'(lambda (c)
                     (declare (ignore c))
                     (invoke-restart 'use-value #\?))))
  (sb-ext:octets-to-string *euc-jp* :external-format :euc-jp))

Found on 1.0.18; the same behavior is exhibited by 1.0.23 and CVS HEAD.
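
For anyone who wants to reproduce this without fetching the page, here is a minimal sketch that does not depend on Drakma or the network. The helper name decode-with-fallback and the sample vector are hypothetical and are not part of SBCL or of the original report. The byte 255 is assumed to be rejected by both decoders (it is outside the ASCII range and, as far as I know, not a valid EUC-JP lead byte), so on an affected SBCL the :euc-jp call should hit the same use-value problem described above.

```lisp
;; Hypothetical helper for illustration; not an SBCL function.
(defun decode-with-fallback (octets external-format
                             &optional (replacement #\?))
  "Decode OCTETS with EXTERNAL-FORMAT, invoking the USE-VALUE restart
with REPLACEMENT whenever a decoding error is signalled."
  (handler-bind ((sb-int:character-decoding-error
                   (lambda (condition)
                     (declare (ignore condition))
                     (invoke-restart 'use-value replacement))))
    (sb-ext:octets-to-string octets :external-format external-format)))

;; Two ASCII bytes ("Hi") followed by 255, which is assumed here to be
;; undecodable in both :ascii and :euc-jp.
(defparameter *sample-octets*
  (coerce #(72 105 255) '(simple-array (unsigned-byte 8) (*))))

;; The :ascii call corresponds to the case that works in the report;
;; the :euc-jp call corresponds to the case that fails on affected versions.
(decode-with-fallback *sample-octets* :ascii)
(decode-with-fallback *sample-octets* :euc-jp)
```

On a fixed SBCL both calls should return a string. Later SBCL manuals also describe a :replacement option on external-format designators, along the lines of '(:euc-jp :replacement #\?), which may avoid the handler entirely; I have not checked which release first supports it for the multibyte formats.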

uname -a:
Linux Whimsy 2.6.27-ARCH #1 SMP PREEMPT Sat Nov 8 10:11:43 UTC 2008 i686 Intel(R) Pentium(R) Dual CPU T2370 @ 1.73GHz GenuineIntel GNU/Linux

*FEATURES* (SBCL-1.0.18)
(:CHUNGA :SPLIT-SEQUENCE CFFI-FEATURES:FLAT-NAMESPACE CFFI-FEATURES:X86
 CFFI-FEATURES:UNIX :CFFI CFFI-SYS::FLAT-NAMESPACE :LITTLE-ENDIAN
 :FLEXI-STREAMS :SB-BSD-SOCKETS-ADDRINFO :ASDF :SB-THREAD :ANSI-CL :COMMON-LISP
 :SBCL :SB-DOC :SB-TEST :SB-LDB :SB-PACKAGE-LOCKS :SB-UNICODE :SB-EVAL
 :SB-SOURCE-LOCATIONS :IEEE-FLOATING-POINT :X86 :UNIX :ELF :LINUX :LARGEFILE
 :GENCGC :STACK-GROWS-DOWNWARD-NOT-UPWARD :C-STACK-IS-CONTROL-STACK
 :COMPARE-AND-SWAP-VOPS :UNWIND-TO-FRAME-AND-CALL-VOP :RAW-INSTANCE-INIT-VOPS
 :STACK-ALLOCATABLE-CLOSURES :ALIEN-CALLBACKS :CYCLE-COUNTER :LINKAGE-TABLE
 :OS-PROVIDES-DLOPEN :OS-PROVIDES-PUTWC :OS-PROVIDES-SUSECONDS-T)

description: updated
Changed in sbcl:
importance: Undecided → Medium
status: New → Confirmed
Matthias Benkard (mulk) wrote :

For my brief analysis of the problem, see: http://stackoverflow.com/questions/420300/lisp-need-help-getting-correct-behaviour-from-sbcl-when-converting-octet-stream/425000#425000

In short, I think there might be a bug in mb-util.lisp. I haven't looked at the current version of the source, so this may or may not have been fixed since 1.0.18.

Christophe Rhodes (csr21-cantab) wrote : Fix in my external-formats branch

Hi,

 status inprogress
 tag external-formats octets
 done

I have committed what I think is a fix for this into my external-formats
branch, available from git at http://rvw.doc.gold.ac.uk/sullivan/git/sbcl.git
and viewable at http://rvw.doc.gold.ac.uk/sullivan/cgi-bin/gitweb.cgi

I'd appreciate it if you could check to see that my fix actually works
for your purposes; I am not a user of any of the multibyte encodings
defined through mb-util, so it would be good to know that I haven't
broken anything else and that this does in fact fix this problem.

Best,

Christophe

Changed in sbcl:
status: Confirmed → In Progress
Christophe Rhodes (csr21-cantab) wrote : Branch external-formats committed

Hi,

 status fixcommitted
 done

I merged my external-formats branch as the commits between 1.0.32.15 and
1.0.32.25. I think the fix for this bug was in sbcl-1.0.32.16; I hope
it works for you.

Christophe

Changed in sbcl:
status: In Progress → Fix Committed
Tapio Saarinen (q-admin-bitlong-org) wrote :

Hi,
Sorry for the delay in responding. Everything seems to be working fine with the latest SBCL sources, which is great.

Thanks!

Tapio

Changed in sbcl:
status: Fix Committed → Fix Released