SB-IMPL::CANONICALIZE-PATHNAME handles :UP incorrectly

Bug #1740777 reported by Richard M Kreuter
Affects: SBCL
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

SB-IMPL::CANONICALIZE-PATHNAME handles :UP poorly.

1. SB-IMPL::CANONICALIZE-PATHNAME treats :UP the same as :BACK:

--
$ sh ./run-sbcl.sh --no-userinit --no-sysinit
(running SBCL from: .)
This is SBCL 1.4.3.46-2f2210d, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
* (sb-impl::canonicalize-pathname (make-pathname :directory '(:absolute "tmp" :up)))

#P"/"
* (sb-impl::canonicalize-pathname (make-pathname :directory '(:absolute "tmp" :back)))

#P"/"
--

This treatment misses the difference between :UP and :BACK, specifically ":UP differs from :BACK only in file systems that support multiple names for directories, perhaps via symbolic links" [Hyperspec 19.2.2.4.3].
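
A minimal sketch of that distinction, in portable Common Lisp and not SBCL's actual code: the hypothetical helper ELIDE-BACK-ONLY cancels a directory name against a following :BACK using nothing but the directory list, and leaves :UP alone, on the assumption that resolving :UP requires consulting the file system.

```lisp
;; Hypothetical helper for illustration only; it is not part of SBCL.
(defun elide-back-only (directory)
  "Return DIRECTORY with each (name :BACK) pair removed; :UP is left in place."
  (let ((result '()))
    (dolist (element directory (nreverse result))
      (if (and (eq element :back)
               ;; Only a literal directory name can be cancelled syntactically.
               (stringp (first result)))
          (pop result)
          (push element result)))))

;; :BACK is syntactic, so it can be resolved from the list alone.
(elide-back-only '(:absolute "tmp" :back))  ; => (:ABSOLUTE)
;; :UP is semantic, so it survives until the file system is consulted.
(elide-back-only '(:absolute "tmp" :up))    ; => (:ABSOLUTE "tmp" :UP)
```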

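The punchline is that on Unix (at least since the introduction of symlinks), it's incorrect to elide :UP as a purely syntactic operation, because the parent of a symlinked directory need not be the directory that contains the link. So, for example, on my OS X host, where /tmp is a symbolic link to /private/tmp: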

--
* (truename "/tmp/../")

#P"/private/"
* (let* ((tmp-up (pathname "/tmp/../")) (true-up (truename tmp-up)))
    (null (set-difference (directory tmp-up) (directory true-up)
                          :test #'equal)))

NIL
* (let* ((tmp-up (pathname "/tmp/../")) (true-up (truename tmp-up)))
    (null (set-difference (directory (merge-pathnames "*.*" tmp-up))
                          (directory (merge-pathnames "*.*" true-up))
     :test #'equal)))

NIL
--

I would expect the last two forms to return T.

As for what to do about it: on Unix, don't elide :UP in the argument to DIRECTORY. When DIRECTORY is called with :RESOLVE-SYMLINKS true (the default), the directory will get cleaned up "on the way out" by realpath(3). When :RESOLVE-SYMLINKS is NIL, I think it's appropriate to return pathnames having :UP in the directory. (If it's desired to "clean up" those directories, one can map

(lambda (pathname)
  (merge-pathnames
   (truename (make-pathname :name nil :type nil :version nil
                            :defaults pathname))
   pathname))

over the result of DIRECTORY. If anyone cares a bunch about this idea, it could be another keyword to DIRECTORY, maybe.)
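
Spelled out as a sketch, assuming SBCL (the :RESOLVE-SYMLINKS keyword to DIRECTORY is an SBCL extension); CLEAN-DIRECTORY-OF and DIRECTORY-CLEANED are hypothetical names, not existing functions:

```lisp
;; Hypothetical helpers for illustration only; neither exists in SBCL.
(defun clean-directory-of (pathname)
  "Resolve only the directory part of PATHNAME, keeping its name, type and version."
  (merge-pathnames
   (truename (make-pathname :name nil :type nil :version nil
                            :defaults pathname))
   pathname))

(defun directory-cleaned (pathspec)
  "List PATHSPEC without resolving symlinks, then clean up each directory part."
  (mapcar #'clean-directory-of
          (directory pathspec :resolve-symlinks nil)))
```

A call such as (directory-cleaned "/tmp/../*.*") would then return entries whose directories no longer contain :UP, while the entries themselves are not resolved through TRUENAME.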

For Windows, I don't really know the platform. My understanding is that recent versions support symbolic links to directories; if so, I'd guess that eliding :UP would be inappropriate there, too.

2. CANONICALIZE-PATHNAME checks for :UP immediately after :ABSOLUTE, presumably as an implementation of the sentence "[u]sing :ABSOLUTE or :WILD-INFERIORS immediately followed by :UP or :BACK signals an error of type FILE-ERROR." [Hyperspec 19.2.2.4.3] This has a couple of problems:

2a. This validation combines poorly with the :UP elision. In particular, any absolute directory having more :UP elements than non-:UP elements, in any order, will signal this FILE-ERROR, because once each non-:UP element has been elided together with an :UP, what remains is a directory starting with (:ABSOLUTE :UP):

* (directory "/bin/../../")
  0: (SB-IMPL::CANONICALIZE-PATHNAME #P"/bin/../../")

debugger invoked on a SB-INT:SIMPLE-FILE-ERROR in thread
#<THREAD "main thread" RUNNING {1001968083}>:
  Invalid use of :UP after :ABSOLUTE.

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit debugger, returning to top level.

(SB-IMPL::CANONICALIZE-PATHNAME #P"/bin/../../")
0] :abort

Getting rid of the :UP elision resolves this defect.
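
To see why the elision and the validation collide, the interaction can be modeled on plain directory lists. This is only an illustrative sketch, not the code in SBCL: ELIDE-UP-SYNTACTICALLY and ABSOLUTE-THEN-UP-P are hypothetical names, and the model assumes that each name is cancelled against a following :UP or :BACK before the validation runs.

```lisp
;; Hypothetical model of the current behavior; not SBCL's implementation.
(defun elide-up-syntactically (directory)
  "Cancel each directory name against a following :UP or :BACK."
  (let ((result '()))
    (dolist (element directory (nreverse result))
      (if (and (member element '(:up :back))
               ;; Only a literal name can be cancelled; markers are kept.
               (stringp (first result)))
          (pop result)
          (push element result)))))

(defun absolute-then-up-p (directory)
  "True if DIRECTORY begins with :ABSOLUTE immediately followed by :UP or :BACK."
  (and (eq (first directory) :absolute)
       (member (second directory) '(:up :back))
       t))

;; "/bin/../../" is a valid Unix name, but after elision only the
;; prohibited (:ABSOLUTE :UP) prefix is left for the validation to see.
(elide-up-syntactically '(:absolute "bin" :up :up))   ; => (:ABSOLUTE :UP)
(absolute-then-up-p
 (elide-up-syntactically '(:absolute "bin" :up :up))) ; => T
```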

2b. I think SBCL on Unix ought to deviate from ANSI on the requirement that :ABSOLUTE followed immediately by :UP signal an error, because that is the obvious pathname representation of a valid (albeit not canonical) name for an absolute directory.

[Editorial: as a matter of history, the Cleanup committee that produced this prohibition didn't uniformly grok Unix's "dotdot" links. The prohibition on (:ABSOLUTE :BACK) makes a decent amount of sense given how Multics, the LispMs, and VMS file specifications worked; I believe the prohibition on (:ABSOLUTE :UP) was an overzealous analogy.]

The attached patch gets rid of the :UP elision, and so solves problems 1 and 2a; but only for Unix, because I do not profess to know what the correct thing to do on Windows is.

If anyone reading this would like me to also submit a change to effect 2b, please let me know.
