Comment 3 for bug 256453

Derick Eddington (derick-eddington) wrote : Re: [Bug 256453] [NEW] (command-line) cannot handle multi-byte command-line arguments

On Sat, 2008-08-09 at 20:05 +0000, Derick Eddington wrote:
> There's another issue I noticed that led me to discover the above:
>
> ikarus-main.c, line 79:
>
> string_set(str, i, integer_to_char(s[i]));
>
> This transcodes the command-line arguments as Latin-1. Instead, I
> suggest putting the command-line arguments in bytevectors and then
> transcoding them in Scheme code using (native-transcoder).
> I'll try it out.

Here's what I've done:

[Applying this requires manually running makefile.ss with an old, working
ikarus to build a new ikarus.boot. The old initialization expression for
command-line-arguments in ikarus.boot.4.prebuilt does not work with the
new ikarus process: it segfaults when the guard for
command-line-arguments attempts to call die. This patch also fixes that
(rare) issue.]

=== modified file 'scheme/ikarus.command-line.ss'
--- scheme/ikarus.command-line.ss 2008-01-29 05:34:34 +0000
+++ scheme/ikarus.command-line.ss 2008-08-10 03:45:25 +0000
@@ -22,10 +22,12 @@

   (define (command-line) (command-line-arguments))
   (define command-line-arguments
- (make-parameter ($arg-list)
+ (make-parameter
+ (map (lambda (bv)
+ (bytevector->string bv (native-transcoder)))
+ ($arg-list))
       (lambda (x)
         (if (and (list? x) (andmap string? x))
             x
- (die 'command-list
- "invalid command-line-arguments ~s\n" x))))))
+ (die 'command-line-arguments "not a list of strings" x))))))

=== modified file 'scheme/makefile.ss'
--- scheme/makefile.ss 2008-08-08 15:29:18 +0000
+++ scheme/makefile.ss 2008-08-10 04:02:57 +0000
@@ -74,7 +74,6 @@
     "ikarus.numerics.ss"
     "ikarus.conditions.ss"
     "ikarus.guardians.ss"
- "ikarus.command-line.ss"
     "ikarus.codecs.ss"
     "ikarus.bytevectors.ss"
     "ikarus.posix.ss"
@@ -105,6 +104,7 @@
     "ikarus.promises.ss"
     "ikarus.enumerations.ss"
     "ikarus.not-yet-implemented.ss"
+ "ikarus.command-line.ss"
     "ikarus.main.ss"
     ))

=== modified file 'src/ikarus-main.c'
--- src/ikarus-main.c 2008-08-09 12:47:44 +0000
+++ src/ikarus-main.c 2008-08-10 03:02:00 +0000
@@ -70,17 +70,12 @@
     while(i > 0){
       char* s = argv[i];
       int n = strlen(s);
- ikptr str = ik_unsafe_alloc(pcb, align(n*string_char_size+disp_string_data+1))
- + string_tag;
- ref(str, off_string_length) = fix(n);
- {
- int i;
- for(i=0; i<n; i++){
- string_set(str, i, integer_to_char(s[i]));
- }
- }
+ ikptr bv = ik_unsafe_alloc(pcb, align(disp_bytevector_data+n+1))
+ + bytevector_tag;
+ ref(bv, off_bytevector_length) = fix(n);
+ memcpy((char*)(bv+off_bytevector_data), s, n+1);
       ikptr p = ik_unsafe_alloc(pcb, pair_size);
- ref(p, disp_car) = str;
+ ref(p, disp_car) = bv;
       ref(p, disp_cdr) = arg_list;
       arg_list = p+pair_tag;
       i--;

$ ./command-line ქართული 한국어 Ελληνικ
("./command-line" "ქართული" "한국어" "Ελληνικ")
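
For anyone wondering why the old per-byte loop could not work for these
arguments, here is a minimal standalone sketch. It is not part of the patch
and assumes the arguments arrive as UTF-8 bytes. The function names are
made up for illustration. It compares how many characters you get when
every byte becomes one character (the Latin-1 reading) with how many code
points the same bytes hold when read as UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of characters produced when every byte is
   turned into one character, as a per-byte (Latin-1) loop does. */
size_t count_chars_per_byte(const char *s) {
    assert(s != NULL);
    return strlen(s);
}

/* Hypothetical helper: number of code points if the bytes are valid UTF-8.
   Every byte that is not a continuation byte (bit pattern 10xxxxxx)
   starts a new code point. The cast to unsigned char avoids negative
   values on platforms where plain char is signed. */
size_t count_chars_utf8(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; p++) {
        if ((*p & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For the 9-byte UTF-8 encoding of 한국어, the first function reports 9 and
the second reports 3, which is why the decoding is better left to
bytevector->string with (native-transcoder) on the Scheme side.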