DELETE-DIRECTORY loses when supplied an "as-file" pathname containing special characters in the name or type.

Bug #1740624 reported by Richard M Kreuter on 2017-12-30
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
SBCL
Medium
Jan Moringen

Bug Description

DELETE-DIRECTORY attempts to operate on the wrong file when supplied an argument having a name or type that contains characters that have special namestring syntax:

--
$ sh ./run-sbcl.sh --no-userinit --no-sysinit
(running SBCL from: .)
This is SBCL 1.4.3.20-1af2446-dirty, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
* (setq *default-pathname-defaults* (ensure-directories-exist #p"/tmp/delete-directory/" :verbose t))

creating directory: /tmp/delete-directory/
#P"/tmp/delete-directory/"
* (ensure-directories-exist "a\\*b/")

"a\\*b/"
T
* (probe-file "a\\*b")

#P"/private/tmp/delete-directory/a\\*b/"
* (delete-directory "a\\*b")

debugger invoked on a SB-INT:SIMPLE-FILE-ERROR in thread
#<THREAD "main thread" RUNNING {1001968083}>:
  Could not delete directory /tmp/delete-directory/a\*b:
    No such file or directory

Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

restarts (invokable by number or by possibly-abbreviated name):
  0: [ABORT] Exit debugger, returning to top level.

((LABELS SB-IMPL::DELETE-DIR :IN DELETE-DIRECTORY) #P"/tmp/delete-directory/a\\\\\\*b/")
0]
--

The problem is SB-IMPL::DIRECTORIZE-PATHNAME, which appends a Lisp namestring to the directory component; but SBCL's pathname components represent parsed namestring fields:

--
* (in-package "SB-IMPL")

#<PACKAGE "SB-IMPL">
* (directorize-pathname (merge-pathnames "a\\*b" "/"))

#P"/a\\\\\\*b/"
* (pathname-directory (directorize-pathname (merge-pathnames "a\\*b" "/")))

(:ABSOLUTE "a\\*b")
* (pathname-name "a\\*b")

"a*b"
--

Given its docstring, I'd expect DELETE-DIRECTORY to operate on the directory identified by the pathname.

I propose redefining DIRECTORIZE-PATHNAME (perhaps as a local function) with body

(nth-value 0 (parse-native-namestring (native-namestring pathname)
                                      nil pathname :as-directory t))

and document that DELETE-DIRECTORY can't take a wild pathname.

Rationale: in defense of going through the "native namestring" routines, this way of doing things ensures that the result of DIRECTORIZE-PATHNAME agrees with filename denotation semantics (because NATIVE-NAMESTRING implements filename denotation).

As for restricting DIRECTORIZE-PATHNAME and its clients to non-wild pathnames, it's not clear that there's a correct semantic for DIRECTORIZE-PATHNAME over the domain of wild pathnames, at least if it's a goal that the resulting directory-form pathname should match the directories that would be matched by the file-form pathname. Consider:

(setq p (make-pathname :defaults #p"/" :name "a" :type :wild)
      d (directorize-pathname p))

Since P matches #P"/a" and, say, #P"/a.b", but not, for example, #P"/ab", if it's desired that D match #P"/a/" and #P"/a.b/" and not "/ab/", the trouble is that that kind of matching in a directory is not expressible with SBCL's pathname functionality. (And I don't think it's worth entertaining adding that expressiveness for this reason.)

Other requested info:

--
$ uname -a
Darwin m5.localdomain 14.5.0 Darwin Kernel Version 14.5.0: Wed Jul 29 02:26:53 PDT 2015; root:xnu-2782.40.9~1/RELEASE_X86_64 x86_64

* *features*

(:64-BIT :64-BIT-REGISTERS :ALIEN-CALLBACKS :ANSI-CL :ASH-RIGHT-VOPS :BSD
 :C-STACK-IS-CONTROL-STACK :COMMON-LISP :COMPACT-INSTANCE-HEADER
 :COMPARE-AND-SWAP-VOPS :COMPLEX-FLOAT-VOPS :CYCLE-COUNTER :DARWIN
 :DARWIN9-OR-BETTER :FLOAT-EQL-VOPS :FP-AND-PC-STANDARD-SAVE :GENCGC
 :IEEE-FLOATING-POINT :IMMOBILE-CODE :IMMOBILE-SPACE :INLINE-CONSTANTS :INODE64
 :INTEGER-EQL-VOP :LINKAGE-TABLE :LITTLE-ENDIAN :MACH-EXCEPTION-HANDLER :MACH-O
 :MEMORY-BARRIER-VOPS :MULTIPLY-HIGH-VOPS :OS-PROVIDES-BLKSIZE-T
 :OS-PROVIDES-DLADDR :OS-PROVIDES-DLOPEN :OS-PROVIDES-PUTWC
 :OS-PROVIDES-SUSECONDS-T :PACKAGE-LOCAL-NICKNAMES :PRECISE-ARG-COUNT-ERROR
 :RAW-INSTANCE-INIT-VOPS :RAW-SIGNED-WORD :RELOCATABLE-HEAP :SB-DOC :SB-EVAL
 :SB-LDB :SB-PACKAGE-LOCKS :SB-SIMD-PACK :SB-SOURCE-LOCATIONS :SB-THREAD
 :SB-UNICODE :SBCL :STACK-ALLOCATABLE-CLOSURES :STACK-ALLOCATABLE-FIXED-OBJECTS
 :STACK-ALLOCATABLE-LISTS :STACK-ALLOCATABLE-VECTORS
 :STACK-GROWS-DOWNWARD-NOT-UPWARD :SYMBOL-INFO-VOPS :UD2-BREAKPOINTS
 :UNBIND-N-VOP :UNDEFINED-FUN-RESTARTS :UNIX :UNWIND-TO-FRAME-AND-CALL-VOP
 :X86-64)
--

Jan Moringen (scymtym) on 2017-12-31
Changed in sbcl:
assignee: nobody → Jan Moringen (scymtym)
status: New → Confirmed
importance: Undecided → Medium
Richard M Kreuter (kreuter) wrote :

Hi Jan,

I see you've made a commit, 179bc57a, that changes the definition of DIRECTORIZE-PATHNAME to

--
(defun directorize-pathname (pathname)
  (cond
    ((wild-pathname-p pathname)
     (simple-file-perror
      "Cannot compute directory pathname for wild pathname ~S"
      pathname))
    ((or (pathname-name pathname)
         (pathname-type pathname))
     (let ((from-file (format nil "~@[~A~]~@[.~A~]"
                              (pathname-name pathname)
                              (pathname-type pathname))))
       (make-pathname
        :host (pathname-host pathname)
        :device (pathname-device pathname)
        :directory (append (pathname-directory pathname)
                           (list from-file)))))
    (t
     pathname)))
--

This will produce incorrect results when the name or type is :UNSPECIFIC.

This kind of thing is why I suggested running the argument through the "native namestring" unparsing and parsing routines, as these are the only routines that implement filename denotation in SBCL. Even if this one routine can be made to agree with those other routines for now, having this extra bit of logic in one place is guaranteed to be a source of bugs as other things evolve.

Thanks,
Richard

Jan Moringen (scymtym) wrote :

Hi,

thanks for pointing out that problem. I will reply properly later.

Jan Moringen (scymtym) wrote :

I wanted DIRECTORIZE-PATHNAME to behave like UNPARSE-PHYSICAL-FILE except for the roundtripping restrictions and the escaping.

IMHO, implementing this behavior directly is the easiest way (I have to admit that it is apparently still too hard for me as the problem you pointed out demonstrates). In particular, I find going through native namestring unparsing and parsing hard to reason about.

Viewed from a different angle, going through native namestring unparsing and parsing basically does the same: 1) unparse the pathname, turning the name and type components (if present) into NAME.TYPE, without escaping and almost without roundtripping 2) parse the native namestring putting the final "/"-delimited substring into the directory component.

While working on this, I also noted the following difference:

  (native-namestring (make-pathname :type "foo" :directory '(:absolute "foo" "bar"))) |= NO-NATIVE-NAMESTRING-ERROR
  (directorize-pathname (make-pathname :type "foo" :directory '(:absolute "foo" "bar"))) => #P"/foo/bar/.foo/"

I'm not sure what the correct behavior is for this case.

I will try one more iteration using the direct approach, sorry.

All that said, I think designing DELETE-DIRETORY's interface like this was a mistake. IMHO, it should just signal an error when a name, type or version component is present. DIRECTORIZE-PATHNAME could maybe be exported as an extension function for users who want that behavior.

Jan Moringen (scymtym) on 2018-01-04
Changed in sbcl:
status: Confirmed → Fix Committed
Richard M Kreuter (kreuter) wrote :

I would say that the disagreement between DIRECTORIZE-PATHNAME and NATIVE-NAMESTRING is evidence that this approach is the wrong thing.

As for exposing DIRECTORIZE-PATHNAME, that's up to you; but I'll go on round-tripping through the native namestring routines, since those won't result in pathnames that ever disagree with OPEN and friends. :-)

Thank you for your attention to this issue.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers