Logical pathname namestrings don't (and can't) all round-trip through unparsing & parsing

Bug #1789108 reported by Richard M Kreuter
Affects: SBCL
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

There's a subtle lossiness in logical pathname namestring production & parsing:

--------
* (namestring (make-pathname :host "SYS" :directory nil :name "A"))

"SYS:A"
* (pathname-directory (make-pathname :host "SYS" :directory nil :name "A"))

NIL
* (pathname-directory
   (parse-namestring (namestring (make-pathname :host "SYS" :directory nil :name "A"))))

(:ABSOLUTE)
--------

Due to the semantics of MERGE-PATHNAMES, these are functionally quite different pathnames:

--------
* (let ((p (make-pathname :host "SYS" :directory nil :name "A"))
        (d "sys:x;y.z.newest"))
    (values (merge-pathnames p d)
            (merge-pathnames (parse-namestring (namestring p)) d)))

#P"SYS:X;A.Z.NEWEST"
#P"SYS:A.Z.NEWEST"
--------

The consequence is that one of the two directory values, NIL or (:ABSOLUTE), should not have a standard namestring. Probably the most SBCL-like thing to do is to signal an error when trying to produce a namestring for one of them, and to print that one with #<>. (It would also be possible to extend LPN syntax, but that's not how SBCL usually does things.)
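
A minimal sketch of a check for this kind of lossiness. The helper name DIRECTORY-ROUND-TRIPS-P is hypothetical (it is not part of SBCL or ANSI CL); it only compares the directory component before and after unparsing and reparsing.

```lisp
;; Hypothetical helper for illustration; not an SBCL or ANSI CL function.
(defun directory-round-trips-p (pathname)
  "Return true if PATHNAME's directory component is unchanged after
NAMESTRING followed by PARSE-NAMESTRING."
  (let ((reparsed (parse-namestring (namestring pathname))))
    (equal (pathname-directory pathname)
           (pathname-directory reparsed))))
```

Going by the transcript above, calling it on (make-pathname :host "SYS" :directory nil :name "A") would return NIL on the build in this report, since the directory goes in as NIL and comes back as (:ABSOLUTE).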

[Editorial: if anyone cares to work on this, I believe it's a superior choice to say that (:ABSOLUTE) is the one that does not have a standard namestring and shouldn't be the result of a parse. Here's my analysis of why.

1. ANSI 19.3.1.1.3 paragraph 1 appears to motivate many extant implementations to parse a logical pathname namestring containing no directory markers to have a directory of (:ABSOLUTE). However, nothing explicitly indicates that 19.3.1.1.3 is required to apply to logical pathname namestrings that do not contain directories. IOW, the section, inclusive of its title, "The Directory part of a Logical Pathname Namestring", can be understood to be a description of how to parse logical pathname namestrings containing a directory part, rather than a prescription for all logical pathname namestrings. (Cf., the remainder of 19.3.1.1 does not explicitly state that the absence of the name, type, or version parts of a logical pathname namestring are supposed to parse to NIL, though that's a reasonable and IMO correct choice.)

2. As for the intent of the standard, the issue PATHNAME-LOGICAL indicates that "The Genera facility uses the same logical pathname syntax". However, Genera's logical pathnames appear to have parsed logical pathname namestrings not containing a directory part to have NIL for the directory component. (If anyone has access to the MIT CADR sources circa 1980, you'll find a source file called "pathnm" that defines a slightly simpler logical pathname namestring syntax that also parses the absence of directory words to a NIL directory component. It's phylogenetically plausible to suppose that the Genera capability was upwardly compatible with the CADR one.)

Of course, CLHS issues and authorial intent are non-binding, but I think they are reasonable evidence to the point that 19.3.1.1.3 is of ambiguous applicability to LPN namestrings that do not contain a directory part.

3. So if one grants that the parse of "SYS:A" might differ across implementations (indeed it does, and, per points 1 & 2, conformingly so within the range of valid interpretations of ANSI), then it's worth noting that the two options have different consequences for how users can employ LPN namestrings and set up LPN translations on a given implementation, and that users who expect either set of consequences to hold on all implementations have always been mistaken.

For example, consider

  (open "FOO:A.B.NEWEST")

On implementations that parse that namestring to have NIL for the directory, OPEN will operate on a pathname that has inherited the directory component of *DEFAULT-PATHNAME-DEFAULTS* due to the implicit merge; on implementations that parse the namestring to have (:ABSOLUTE), OPEN will operate on a pathname that does not inherit the directory component of *DEFAULT-PATHNAME-DEFAULTS*. A user who expects the first parse, NIL, can probably be expected to key in such a form only when they've set their *DEFAULT-PATHNAME-DEFAULTS* to a logical pathname on host FOO. By contrast, a user who expects the second parse, (:ABSOLUTE), will likely not bother manipulating *DEFAULT-PATHNAME-DEFAULTS* before keying in such a form. Both users would likely be surprised by the consequences when they try an implementation that does things the other way.
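
The difference can be observed without touching the file system. The following is a sketch under assumptions: the logical host "FOO" and its translation to /tmp/foo/ are made up for illustration, and DIRECTORY-SEEN-BY-OPEN is a hypothetical helper that approximates OPEN's implicit merge with an explicit MERGE-PATHNAMES.

```lisp
;; "FOO" and its translation target are hypothetical, for illustration only.
(setf (logical-pathname-translations "FOO")
      '(("**;*.*.*" "/tmp/foo/**/*.*")))

;; Hypothetical helper; approximates the merge against
;; *DEFAULT-PATHNAME-DEFAULTS* that OPEN performs implicitly.
(defun directory-seen-by-open (namestring defaults)
  "Parse NAMESTRING, merge it against DEFAULTS, and return the
resulting directory component."
  (let ((*default-pathname-defaults* (pathname defaults)))
    (pathname-directory
     (merge-pathnames (parse-namestring namestring)))))

(directory-seen-by-open "FOO:A.B.NEWEST" "FOO:X;Y.Z.NEWEST")
```

On an implementation that parses the namestring to a NIL directory, the result should be the directory inherited from the defaults; on one that parses it to (:ABSOLUTE), the result should be (:ABSOLUTE) itself.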

4. However, there is one, IMO significant, way in which the NIL parse is superior to (:ABSOLUTE): it makes logical pathname namestrings more like physical pathname namestrings for the most important use case for namestrings, inputting just the name part. In particular, there has probably never been a physical file system that /required/ a syntax to indicate ``please construe this as relative to the current device & directory in the environment'', and so

  (open "a")

has probably always and everywhere opened a file whose name part was "a" or "A" in the current environment when *DEFAULT-PATHNAME-DEFAULTS* is a physical pathname, for every physical file system that's ever existed. And in logical pathname implementations that parse "a" to have a directory component of NIL, the user can use identical input for the name part (or parse the same input from a file, etc.) whether they're addressing files directly with physical pathnames or indirectly with logical pathnames.

But for logical pathname implementations that parse "a" to have a directory component of (:ABSOLUTE), it's necessary to key in

  (open ";a")

to get the desired defaulting semantic when *DEFAULT-PATHNAME-DEFAULTS* is a logical pathname. For interactive use (when you're typing in a namestring at the toplevel and you're familiar with or can see your current environment) this is at worst a minor drudge, but for programmatic handling of arbitrary inputs it's a source of potential errors.
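
For programmatic use, one way to sidestep the parse difference is to build the pathname from components instead of from a namestring, as the MAKE-PATHNAME forms in the description do. A sketch, assuming the "SYS" logical host is defined (as it is in the SBCL session in this report); NAME-ONLY-LOGICAL-PATHNAME is a hypothetical helper.

```lisp
;; Hypothetical helper for illustration; not an SBCL or ANSI CL function.
(defun name-only-logical-pathname (host name)
  "Build a pathname on HOST with only a name part, leaving the directory
component NIL so that MERGE-PATHNAMES can fill it from the defaults."
  (make-pathname :host host :directory nil :name name))

;; The directory is taken from the defaults, as in the MERGE-PATHNAMES
;; transcript in the description.
(merge-pathnames (name-only-logical-pathname "SYS" "A")
                 "sys:x;y.z.newest")
```

This does not depend on how a given implementation parses a directory-less namestring, because no namestring for the name-only pathname is ever produced or parsed.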

IOW, LPNs don't need to be quite so clunky and weird as many implementors appear to have interpreted them to be; and this particular clunkiness and weirdness should, IMO, serve as an indication that ANSI should not be interpreted that way.]

Bug report boilerplate:

$ uname -a
Darwin m5.localdomain 14.5.0 Darwin Kernel Version 14.5.0: Wed Jul 29 02:26:53 PDT 2015; root:xnu-2782.40.9~1/RELEASE_X86_64 x86_64
$ !sh
sh ./run-sbcl.sh --no-userinit --no-sysinit
This is SBCL 1.4.10.145-0ec8b87b2-dirty, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
* *features*

(:X86-64 :64-BIT :64-BIT-REGISTERS :ALIEN-CALLBACKS :ANSI-CL :ASH-RIGHT-VOPS
 :BSD :C-STACK-IS-CONTROL-STACK :CALL-SYMBOL :COMMON-LISP
 :COMPACT-INSTANCE-HEADER :COMPARE-AND-SWAP-VOPS :COMPLEX-FLOAT-VOPS
 :CYCLE-COUNTER :DARWIN :DARWIN9-OR-BETTER :FLOAT-EQL-VOPS
 :FP-AND-PC-STANDARD-SAVE :GENCGC :IEEE-FLOATING-POINT :IMMOBILE-CODE
 :IMMOBILE-SPACE :INLINE-CONSTANTS :INODE64 :INTEGER-EQL-VOP :LINKAGE-TABLE
 :LITTLE-ENDIAN :MACH-EXCEPTION-HANDLER :MACH-O :MEMORY-BARRIER-VOPS
 :MULTIPLY-HIGH-VOPS :OS-PROVIDES-BLKSIZE-T :OS-PROVIDES-DLADDR
 :OS-PROVIDES-DLOPEN :OS-PROVIDES-PUTWC :OS-PROVIDES-SUSECONDS-T
 :PACKAGE-LOCAL-NICKNAMES :RAW-INSTANCE-INIT-VOPS :RAW-SIGNED-WORD
 :RELOCATABLE-HEAP :SB-DOC :SB-EVAL :SB-LDB :SB-PACKAGE-LOCKS :SB-SIMD-PACK
 :SB-SOURCE-LOCATIONS :SB-THREAD :SB-UNICODE :SBCL :STACK-ALLOCATABLE-CLOSURES
 :STACK-ALLOCATABLE-FIXED-OBJECTS :STACK-ALLOCATABLE-LISTS
 :STACK-ALLOCATABLE-VECTORS :STACK-GROWS-DOWNWARD-NOT-UPWARD :SYMBOL-INFO-VOPS
 :UD2-BREAKPOINTS :UNBIND-N-VOP :UNDEFINED-FUN-RESTARTS :UNIX
 :UNWIND-TO-FRAME-AND-CALL-VOP)
