Partly-wild logical pathname components don't work

Bug #1888552 reported by Tim Bradshaw on 2020-07-22
This bug affects 1 person
Affects: SBCL
Importance: Undecided
Assigned to: Unassigned

Bug Description

Given a logical host defined as

(setf (logical-pathname-translations "FUNGE")
      `(("SPOT;**;*.*.*"
         ,(merge-pathnames
           (make-pathname
            :directory '(:relative "funge" "spot" :wild-inferiors)
            :name :wild :type :wild :version :wild)
           (user-homedir-pathname)))))

Then

(parse-namestring "FUNGE:SPOT;*.LISP")

Works. But

(make-pathname :host "FUNGE"
               :device nil
               :directory '(:absolute "SPOT")
               :name "*"
               :type "LISP"
               :version nil
               :defaults nil)

Raises an error (sb-kernel:namestring-parse-error), complaining that the name is not valid, because it's got wildcard characters in it.

But I am almost sure that it is in fact valid: from http://www.lispworks.com/documentation/HyperSpec/Body/19_ca.htm it's clear that a name is either a word or a wildcard-word, and "*" is a wildcard-word.
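For reference, the wildcard-word rule from that section can be written down as a small predicate. This is only a sketch of the grammar as I read it (one or more asterisks, uppercase letters, digits and hyphens, at least one asterisk, no two asterisks adjacent); the function name WILDCARD-WORD-P is made up for illustration and is not part of the standard or of SBCL.

```lisp
;; Hypothetical helper: checks a string against the wildcard-word
;; grammar of CLHS 19.3.1. It only looks at the characters; it does
;; not touch the pathname system at all.
(defun wildcard-word-p (string)
  "Return T if STRING is a wildcard-word, NIL otherwise."
  (and (plusp (length string))
       ;; at least one asterisk
       (find #\* string)
       ;; no two asterisks adjacent
       (not (search "**" string))
       ;; only asterisks, uppercase letters, digits and hyphens
       (every (lambda (char)
                (find char "*ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-"))
              string)
       t))
```

By this reading both "*" and "*-FOO" qualify, while "FOO" is a plain word and "**" is neither.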

It's possible that the intention is that any string for a name (or other component) can never denote a wildcard, but that makes constructing pathnames extremely painful if they have name components like "*-FOO" say: I can't see any way of constructing such a pathname which does not involve jumping through enormous hoops (parsing a namestring, extracting the name from it, putting this back into another pathname).

If that is the case then this isn't a bug.
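The hoop-jumping described above looks roughly like the following. This is a sketch, not a recommendation: the host name "DEMO" and the function MAKE-WILD-LOGICAL-PATHNAME are invented for the example, and it assumes the implementation accepts back, as a :NAME argument, whatever object PATHNAME-NAME returned from the parse.

```lisp
;; Hypothetical logical host, defined the same way as FUNGE above.
(setf (logical-pathname-translations "DEMO")
      `(("**;*.*.*"
         ,(merge-pathnames
           (make-pathname
            :directory '(:relative "demo" :wild-inferiors)
            :name :wild :type :wild :version :wild)
           (user-homedir-pathname)))))

(defun make-wild-logical-pathname (host directory name-word type)
  "Build a logical pathname on HOST whose name is the (possibly wild)
NAME-WORD, by parsing NAME-WORD as a namestring and reusing the
parsed name component instead of passing the string directly."
  (let ((parsed (parse-namestring (format nil "~A:~A" host name-word))))
    (make-pathname :host host
                   :directory directory
                   ;; whatever the parser produced for the wild name
                   :name (pathname-name parsed)
                   :type type)))
```

Something like (make-wild-logical-pathname "DEMO" '(:absolute "SPOT") "*-FOO" "LISP") then stands in for the MAKE-PATHNAME call that would otherwise take the name as a string.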

SBCL version: 2.0.6

uname -a:

Linux ts 5.4.0-40-generic #44-Ubuntu SMP Tue Jun 23 00:01:04 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

*features*

(:swank :org.tfeb.tools.require-module :quicklisp :asdf3.3 :asdf3.2 :asdf3.1
 :asdf3 :asdf2 :asdf :os-unix :non-base-chars-exist-p :asdf-unicode :x86-64
 :gencgc :64-bit :ansi-cl :common-lisp :elf :ieee-floating-point :linux
 :little-endian :package-local-nicknames :sb-ldb :sb-package-locks :sb-thread
 :sb-unicode :sbcl :unix)

Richard M Kreuter (kreuter) wrote :

Here's a go at the argument that this is a conformance issue:

A.1. 19.3.1.6 says wildcard-words parse as strings, and so it ought to be the case that

(pathname-name #P"SYS:A;B*")
=> "B*"

A.2. 19.2.2.5 says ``Any component can be taken from the corresponding component of another pathname.''

Therefore, it seems that the domain of NAME arguments for logical pathnames constructible by MAKE-PATHNAME [*] ought to be at least the range of name components that can be the result of a logical pathname parse. And so

(make-pathname :name "B*" :defaults #P"SYS:A;")

ought to work.
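A small probe makes it easy to see what a given implementation does with this. It is only a sketch: the host "DEMO" and the function TRY-STRING-NAME are invented for the example, and the probe reports the outcome rather than asserting what the outcome should be.

```lisp
;; Hypothetical logical host so the example does not depend on SYS.
(setf (logical-pathname-translations "DEMO")
      `(("**;*.*.*"
         ,(merge-pathnames
           (make-pathname
            :directory '(:relative "demo" :wild-inferiors)
            :name :wild :type :wild :version :wild)
           (user-homedir-pathname)))))

(defun try-string-name (name defaults)
  "Attempt (make-pathname :name NAME :defaults DEFAULTS).
Return the pathname and NIL on success, or NIL and the signalled
condition if the implementation rejects NAME."
  (handler-case
      (values (make-pathname :name name :defaults defaults) nil)
    (error (condition)
      (values nil condition))))

;; Usage: compare a plain word with a wildcard-word.
;; (try-string-name "B"  (parse-namestring "DEMO:A;"))
;; (try-string-name "B*" (parse-namestring "DEMO:A;"))
```

If the second call comes back with a condition where the first returns a pathname, the implementation is rejecting a name that the parser itself is able to produce.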

I guess one could swing some counterargument that 19.3.1.6's phrase ``other wildcard-words parse into strings'' doesn't explicitly say that the resulting component will /be/ (or contain, for the directory) a string. That seems a little tortured.

[*] It's anybody's guess whether MAKE-PATHNAME is required to be able to construct logical pathnames. The HOST argument is defined to be a ``valid physical pathname host''. It would be intuitive to suppose, based on the concepts' names, that ``valid physical pathname host'' and ``valid logical pathname host'' ought to be a partition of ``valid pathname host'', but they're not explicitly said to be disjoint, and ``valid physical pathname host'' is defined as an object ``that is recognized by the implementation as the name of a host'', and so could be construed to include all valid logical pathname hosts (making the ``or a valid logical pathname host'' in the definition of ``valid pathname host'' redundant). But that's an orthogonal defect in the standard.

Further, if you suppose that MAKE-PATHNAME really isn't supposed to be able to take a logical host designator, i.e., it's permitted of an implementation to have that be a type error, then it's unclear what's supposed to happen if the DEFAULTS argument is a logical pathname and HOST is unsupplied. That's maybe a second orthogonal defect.

(Btw., that :DEFAULTS NIL in your example is definitely not kosher.)

Tim Bradshaw (tfeb) wrote :

Agree the NIL default is wrong, sorry about that.

I think (having now read more carefully) that the spec is unclear on whether MAKE-PATHNAME can make logical pathnames. However I think that's a deficiency in the spec: I think implementations should be able to reason about and construct logical pathnames at least as well as they can about other pathnames (and in fact better, because more of them is specified). So if the spec is read so MAKE-PATHNAME does not need to be able to make logical pathnames, then I think this isn't a conformance problem but a feature request, but I think it's a feature worth having.

(In my actual use-case I've solved the problem by simply having namestrings with wildcards in which then get turned into logical pathnames and then TRANSLATE-PATHNAME works on them fine. This makes the thing a little more limited but in fact I was never using the flexibility of having components broken out, so it does not matter.)
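For anyone wanting the same workaround, it amounts to something like this. A sketch only: the host "DEMO" and the function name are made up, and it uses TRANSLATE-LOGICAL-PATHNAME to go from the wild logical namestring to a wild physical pathname.

```lisp
;; Hypothetical logical host, set up like FUNGE in the report.
(setf (logical-pathname-translations "DEMO")
      `(("**;*.*.*"
         ,(merge-pathnames
           (make-pathname
            :directory '(:relative "demo" :wild-inferiors)
            :name :wild :type :wild :version :wild)
           (user-homedir-pathname)))))

(defun wild-namestring-to-physical (logical-namestring)
  "Parse LOGICAL-NAMESTRING (which may contain wildcards) and translate
the result to a physical pathname, never building the components
by hand with MAKE-PATHNAME."
  (translate-logical-pathname (parse-namestring logical-namestring)))

;; Usage:
;; (wild-namestring-to-physical "DEMO:SPOT;*.LISP")
```

The cost is that the wildcard pattern has to exist as a whole namestring up front; there is no point at which the directory, name and type are available separately.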

Richard M Kreuter (kreuter) wrote :

I agree it's good for MAKE-PATHNAME to be able to make logical pathnames. Nothing precludes implementations doing that, and SBCL does, so that's not at issue.

What is seemingly non-conforming for SBCL is the use of PATTERNs rather than strings for wildcard words for logical pathnames.

In case it matters to you, there are as of this writing a number of unresolved logical pathname bugs in SBCL. A launchpad search for "logical pathname" uncovers at least a half dozen, a couple of which have bitten me over time.

Richard M Kreuter (kreuter) wrote :

(This is just a followup on the comment about the type of the HOST parameter to MAKE-PATHNAME. SBCL's use of PATTERNs in logical pathname components is still a prospective conformance issue.)

Upon further examination, I think there are a few ways to make sense of the peculiarities here:

1. Noting that MAKE-PATHNAME uses the term ``valid physical pathname host'', suppose that's simply an editing error, and that it ought to say ``valid pathname host''.

2. Interpret the definition of ``valid physical pathname host'' as an editing error, since it appears to describe all host names. I think there are two ways it could have been written differently:

2a. Suppose that it ought to be read only to describe the names of hosts that aren't logical hosts.
2b. Suppose that it ought to be read only to describe names that aren't the names of logical hosts.

The difference between 2a and 2b is that 2a would permit an implementation to have two host namespaces, while 2b wouldn't. Probably nobody cares about this, but in the Cleanup committee discussion that got logical pathnames into ANSI, there was some consideration of the desirability of multiple host namespaces so that logical host names need not conflict with physical file system names.

But the real "eureka!" for me here is that if an implementation wanted to have two host namespaces, then ``valid physical pathname host'' and ``valid logical pathname host'' should NOT comprise a partition of ``valid pathname host'', since certain names could name each kind of host simultaneously.

3. But maybe the simplest interpretation is to note that since the definition of ``valid physical pathname host'' doesn't qualify the kind of host the object might name, if you forego analysis of the name of the concept and don't imagine extra words that aren't there, then MAKE-PATHNAME is required to accept all names for all hosts as a HOST argument.

I think this turns out to be the best(?) interpretation practically, and it's also perhaps compatible with what the Cleanup committee did agree the standard was supposed to mean (or say, it's a little unclear):

``If an attempt is made to construct a pathname specifying a host name that is defined both as a physical and a logical host, the effects are implementation defined.''

http://www.lispworks.com/documentation/HyperSpec/Issues/iss258_w.htm

This interpretation is (IMO) a slightly sloppier way to get to the same effect as option 1 plus option 2a, though.

Nevertheless, I guess it'd be reasonable for some implementor to make a reading like option 2b without option 1, and so have MAKE-PATHNAME only accept names for physical hosts. If this is a reasonable way to read the standard, programs that supply MAKE-PATHNAME a valid logical pathname host or a logical host for the HOST argument might be implicitly non-portable, all the same.

Tim Bradshaw (tfeb) wrote :

OK, I can see the problem now. If I say (make-pathname ... :host h) on a system where hosts make sense for physical pathnames (which they still might even on Unixy machines, and did on Windows last time I looked), then if h names *both* a physical and a logical host you're in trouble, because you need to decide which. And whether or not h exists as a physical host can change depending on (say) what's in DNS or the hosts file this week.

But I am clearly confused about the difference between the name of a logical host and a logical host. There seems to be an implication that these might be different things (different types), but the spec implies that they are both strings (or lists of strings or :unspecific for physical pathname hosts), I think. If that's correct then the only way there could be two namespaces is that make-pathname needs not to work with logical hosts at all, since there is no way apart from that to decide which namespace "FOO" belongs to. That seriously reduces the utility of make-pathname for anyone wanting to reason about logical pathnames. In particular it means that

(make-pathname ... :host (pathname-host <pathname-which-may-be-logical>))

should return a physical pathname.
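What an implementation actually does here can be checked directly. The following is a sketch with invented names (the host "DEMO" and the function HOST-ROUND-TRIP-LOGICAL-P); it assumes MAKE-PATHNAME accepts whatever PATHNAME-HOST returns, which need not be a string.

```lisp
;; Hypothetical logical host for the experiment.
(setf (logical-pathname-translations "DEMO")
      `(("**;*.*.*"
         ,(merge-pathnames
           (make-pathname
            :directory '(:relative "demo" :wild-inferiors)
            :name :wild :type :wild :version :wild)
           (user-homedir-pathname)))))

(defun host-round-trip-logical-p (pathname)
  "Take the host out of PATHNAME, hand it back to MAKE-PATHNAME, and
report whether the result is a logical pathname."
  (let ((rebuilt (make-pathname :host (pathname-host pathname)
                                :name "X")))
    (typep rebuilt 'logical-pathname)))

;; Usage:
;; (host-round-trip-logical-p (parse-namestring "DEMO:SPOT;FOO.LISP"))
```

A true result means the host survives the round trip as a logical host; a false result would be the two-namespace reading in action.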

My personal take would be make-pathname should accept both things, but that if there is a defined logical host then it should take precedence, because this is something the program did, rather than an accident of the environment in which the program is running. I think that's compatible with PATHNAME-HOST-PARSING (however there are errors in that: they talk about (make-pathname (pathname-host ...)) and that can't ever make sense: I *think* they must mean (make-pathname :host (pathname-host ...))), although it decides what 'implementation defined' means.

I had forgotten how many worms there were in all this.
