SB-IMPL::CANONICALIZE-PATHNAME handles :UP incorrectly

Bug #1740777 reported by Richard M Kreuter on 2018-01-01
Affects: SBCL
Importance: Undecided
Assigned to: Unassigned

Bug Description

SB-IMPL::CANONICALIZE-PATHNAME handles :UP poorly.

1. SB-IMPL::CANONICALIZE-PATHNAME treats :UP the same as :BACK:

--
$ sh ./run-sbcl.sh --no-userinit --no-sysinit
(running SBCL from: .)
This is SBCL 1.4.3.46-2f2210d, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
* (sb-impl::canonicalize-pathname (make-pathname :directory '(:absolute "tmp" :up)))

#P"/"
* (sb-impl::canonicalize-pathname (make-pathname :directory '(:absolute "tmp" :back)))

#P"/"
--

This treatment misses the difference between :UP and :BACK, specifically ":UP differs from :BACK only in file systems that support multiple names for directories, perhaps via symbolic links" [Hyperspec 19.2.2.4.3].
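
To make the distinction concrete, here is a minimal sketch of a purely syntactic cleanup that stays safe in the presence of symlinks: it removes each :BACK together with the name before it and leaves :UP alone. COLLAPSE-BACK is a hypothetical helper written for illustration, not SBCL's implementation, and it deliberately keeps a :BACK that has no plain name in front of it.

```lisp
;; Hypothetical helper, for illustration only; not part of SBCL.
(defun collapse-back (directory)
  "Return a copy of the directory list DIRECTORY in which every :BACK that
follows a string component is removed along with that component.
:UP is never touched, because resolving it requires asking the file system."
  (let ((result '()))
    (dolist (element directory (nreverse result))
      (if (and (eq element :back)
               (stringp (first result)))
          (pop result)                ; drop the name that :BACK cancels
          (push element result)))))   ; keep everything else, including :UP

(collapse-back '(:absolute "tmp" :back))              ; => (:ABSOLUTE)
(collapse-back '(:absolute "tmp" :up))                ; => (:ABSOLUTE "tmp" :UP)
(collapse-back '(:absolute "a" "b" :back :back "c"))  ; => (:ABSOLUTE "c")
```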

The punchline is that on Unix (at least since the introduction of symlinks), it's incorrect to elide :UP as a syntactic operation. So, for example, on my OS X host, where /tmp is a symbolic link to /private/tmp:

--
* (truename "/tmp/../")

#P"/private/"
* (let* ((tmp-up (pathname "/tmp/../")) (true-up (truename tmp-up)))
    (null (set-difference (directory tmp-up) (directory true-up)
                          :test #'equal)))

NIL
* (let* ((tmp-up (pathname "/tmp/../")) (true-up (truename tmp-up)))
    (null (set-difference (directory (merge-pathnames "*.*" tmp-up))
                          (directory (merge-pathnames "*.*" true-up))
                          :test #'equal)))

NIL
--

I would expect the last two forms to return T.

As for what to do about it: on Unix, don't elide :UP in the argument to DIRECTORY. When DIRECTORY is called with :RESOLVE-SYMLINKS true (the default), the directory will get cleaned up "on the way out" by realpath(3). When :RESOLVE-SYMLINKS is NIL, I think it's appropriate to return pathnames having :UP in the directory. (If it's desired to "clean up" those directories, one can map

(lambda (pathname)
  (merge-pathnames
   (truename (make-pathname :name nil :type nil :version nil
                            :defaults pathname))
   pathname))

over the result of DIRECTORY. If anyone cares a bunch about this idea, it could be another keyword to DIRECTORY, maybe.)
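
Spelled out as named functions, the idea might look like the sketch below. WITH-TRUE-DIRECTORY and DIRECTORY-WITH-TRUE-DIRECTORIES are hypothetical names chosen for illustration. The second function relies on SBCL's :RESOLVE-SYMLINKS keyword and assumes the proposed behaviour, in which DIRECTORY no longer elides :UP on its own.

```lisp
;; Hypothetical helpers, for illustration only; not part of SBCL.
(defun with-true-directory (pathname)
  "Return PATHNAME with its directory replaced by the truename of that
directory, keeping the name, type and version of PATHNAME itself."
  (merge-pathnames
   (truename (make-pathname :name nil :type nil :version nil
                            :defaults pathname))
   pathname))

#+sbcl
(defun directory-with-true-directories (wild-pathname)
  "List the files matching WILD-PATHNAME without resolving the entries
themselves, then clean up only the directory part of each result."
  (mapcar #'with-true-directory
          (directory wild-pathname :resolve-symlinks nil)))
```

With this, a call such as (directory-with-true-directories "/tmp/../*.*") would still report a symlink found in the listing under its own name, while the directory it was found in is given by its true name.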

For Windows, I don't really know the platform. My understanding is that recent versions support symbolic links to directories; if so, I'd guess that eliding :UP would be inappropriate there, too.

2. CANONICALIZE-PATHNAME checks for :UP immediately after :ABSOLUTE, presumably as an implementation of the sentence "[u]sing :ABSOLUTE or :WILD-INFERIORS immediately followed by :UP or :BACK signals an error of type FILE-ERROR." [Hyperspec 19.2.2.4.3] This has a couple of problems:

2a. This validation combines poorly with the :UP elision. In particular, any absolute directory having more :UP elements than non-:UP elements, in any order, will signal this FILE-ERROR, because once you've elided the non-:UP elements, you'll have a directory starting with (:ABSOLUTE :UP):

--
* (directory "/bin/../../")
  0: (SB-IMPL::CANONICALIZE-PATHNAME #P"/bin/../../")

debugger invoked on a SB-INT:SIMPLE-FILE-ERROR in thread
#<THREAD "main thread" RUNNING {1001968083}>:
  Invalid use of :UP after :ABSOLUTE.

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit debugger, returning to top level.

(SB-IMPL::CANONICALIZE-PATHNAME #P"/bin/../../")
0] :abort
--

Getting rid of the :UP elision resolves this defect.
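
The interaction in 2a can be reproduced with a toy model. The sketch below is not SBCL's actual code: ELIDE-UP-LIKE-BACK and ABSOLUTE-THEN-UP-P are hypothetical stand-ins for the elision step and the validation step, written only to show why running the check after the elision rejects a directory that passed the check beforehand.

```lisp
;; Toy model only; NOT SBCL's implementation. Both names are hypothetical.
(defun elide-up-like-back (directory)
  "Remove each :UP or :BACK together with the string component before it."
  (let ((result '()))
    (dolist (element directory (nreverse result))
      (if (and (member element '(:up :back))
               (stringp (first result)))
          (pop result)                ; cancel the preceding name
          (push element result)))))   ; nothing to cancel, so keep the element

(defun absolute-then-up-p (directory)
  "True if DIRECTORY starts with :ABSOLUTE immediately followed by :UP or :BACK."
  (and (eq (first directory) :absolute)
       (member (second directory) '(:up :back))
       t))

;; "/bin/../../" as a directory list:
(absolute-then-up-p '(:absolute "bin" :up :up))                       ; => NIL
(elide-up-like-back '(:absolute "bin" :up :up))                       ; => (:ABSOLUTE :UP)
(absolute-then-up-p (elide-up-like-back '(:absolute "bin" :up :up)))  ; => T
```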

2b. I think SBCL on Unix ought to deviate from ANSI on the requirement that :ABSOLUTE followed immediately by :UP signal an error, because that is the obvious pathname representation of a valid (albeit not canonical) name for an absolute directory.

[Editorial: as a matter of history, the Cleanup committee that produced this prohibition didn't uniformly grok Unix's "dotdot" links. The prohibition on (:ABSOLUTE :BACK) makes a decent amount of sense given how Multics, the LispMs, and VMS file specifications worked; I believe the prohibition on (:ABSOLUTE :UP) was an overzealous analogy.]

The attached patch gets rid of the :UP elision, and so solves problems 1 and 2a; but only for Unix, because I do not profess to know what the correct thing to do on Windows is.

If anyone reading this would like me to also submit a change to effect 2b, please let me know.
