UIOP: Improve handling of XDG_CONFIG_DIRS

Bug #1396847 reported by Jan Moringen
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
ASDF
Fix Released
Undecided
Robert P. Goldman

Bug Description

What I do:
(sb-posix:setenv "XDG_CONFIG_DIRS" "/foo:" 1) ; note trailing ":"
(uiop:user-configuration-directories)
|- SIMPLE-ERROR: In (UIOP/FILESYSTEM:GETENV-PATHNAMES "XDG_CONFIG_DIRS"), invalid pathname #P"": Expected an absolute pathname

(sb-posix:setenv "XDG_CONFIG_DIRS" "/foo::/bar" 1)
(uiop:user-configuration-directories)
|- same error

What happens:
The above error is signaled, presumably because the directory list is split into "/foo" and "" (and "/bar" in the second example).

What did I expect to happen:
I couldn't find anything about empty directory list entries in http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html, but I assume either signaling a more informative error or ignoring such entries (maybe signaling a STYLE-WARNING) are the two options.

I prefer ignoring empty entries since signaling an error can make it impossible to execute certain Lisp programs when the environment is not under the user's control.

Versions:
(asdf:component-version (asdf:find-system :asdf)) => "3.1.3"
(lisp-implementation-type) => "SBCL"
(lisp-implementation-version) => "1.2.4.debian"

Revision history for this message
Faré (fahree) wrote :

Yes, split-native-pathnames-string should have an (and (not (emptyp string)) ...) clause. Or split-string should have an option to do that and/or to remove empty entries altogether, that split-native-pathnames-string would use.

Robert, are you on it, or do you want me to do it?

Changed in asdf:
status: New → In Progress
assignee: nobody → Robert P. Goldman (rpgoldman)
Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

I can replicate this on ACL using
 (setf (sys:getenv "XDG_CONFIG_DIRS") "/foo:")
(uiop:user-configuration-directories)

I note this snippet since ACL provides more convenient source-finding when debugging ASDF.

Let's define what we think is the Right Thing here.

I am reluctant to comply with Jan's request for DWIMing this. I understand the desire to have something nice happen when the user has a defective environment, but (a) we are just hurting the user by not telling her that her environment is defective and (b) what happens when someone else uses this function and the error is not benign (i.e., there is not an empty entry as we think, but a garbage string in this environment variable)?

For that reason, I am inclined to raise an error in this case. Some things I could do to mitigate this:

1. I could raise an exported, identifiable condition, so that the caller could catch it and provide a restart allowing the user to fix his environment and continue.
2. Provide an optional ON-EMPTY-ENTRY argument, defaulting to :ERROR, but also accepting :REMOVE.
3. Both of the above.

Please let me know your opinions.

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

What was I thinking?
4. [Less fussy:] Make an empty entry cause a CERROR with continuation removing the empty entries.

This is the one I will go with unless someone objects.

Revision history for this message
Stelian Ionescu (sionescu) wrote : Re: [Bug 1396847] Re: UIOP: Improve handling of XDG_CONFIG_DIRS

On Thu, 2014-11-27 at 17:39 +0000, Robert P. Goldman wrote:
> What was I thinking?
> 4. [Less fussy:] Make an empty entry cause a CERROR with continuation removing the empty entries.
>
> This is the one I will go with unless someone objects.

Empty entries are a very common occurrence in shell programming, so
please ignore them.

--
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.

Revision history for this message
Faré (fahree) wrote :

I agree with Jan and Stelian that ignoring empty entries, *in this case*, is the best thing to do.

The best way to do it, in my opinion is to parse an empty entry "" as NIL instead of #P"". See my proposed fix above.

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

OK, just to clarify: don't you use an empty pathname specifically to be semantically meaningful when you are processing CL_SOURCE_REGISTRY:
"When considering environment variable 'CL_SOURCE_REGISTRY' ASDF will skip to next configuration if it's an empty string."
?
I see that you don't use this separator function to process the contents of CL_SOURCE_REGISTRY, but this function proclaims that it gets absolute directory names, and it's exported. So jamming what seems to me an arbitrary semantics onto the empty entries -- especially when this is an arbitrary semantics that we ourselves abandon at times -- seems wrong to me.

If this was an internal function that wasn't advertised, I wouldn't care, but it's not. *AND* we're breaking the behavior way down at the bottom of the call tree, making code un-reusable (e.g., we couldn't reuse this code to process CL_SOURCE_REGISTRY) if you disagree with the policy.

So I propose to make discard-empty-pathnames be the default behavior, but not to force it.

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

Please see if the patch I just pushed is acceptable.

Changed in asdf:
status: In Progress → Fix Committed
Revision history for this message
Faré (fahree) wrote :

I was thinking about the simpler:

 (defun split-native-pathnames-string (string &rest constraints &key &allow-other-keys)
     "Given a string of pathnames specified in native OS syntax, separate them in a list,
check constraints and normalize each one as per ENSURE-PATHNAME,
where an empty string denotes NIL." ;;; <---- slight change in semantics, but pretty conservative in previously supported cases.
     (loop :for namestring :in (split-string string :separator (string (inter-directory-separator)))
          :collect (unless (emptyp namestring) (apply 'parse-native-namestring namestring constraints))))

This *should* work with both user-configuration-directories (which uses subpathname*) and with parse-source-registry, modulo changing (equal "" s) to (emptyp s). parse-output-translations-string already uses an (or (null string) (equal string "")), which could also be replaced by a call to emptyp.

See attached patch. Only tested on sbcl-1.2.3.62-989a1d6-linux-x64.

Revision history for this message
Faré (fahree) wrote :

If you go the way of another keyword arguments, it would be nice to "just" make it a new "constraint" argument of ensure-pathname, say empty-is-nil, or something.

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

Faré wrote:
> If you go the way of another keyword arguments, it would be nice to
> "just" make it a new "constraint" argument of ensure-pathname, say
> empty-is-nil, or something.
>

But that means we (potentially) get NILs spattered around, instead of getting empty strings simply discarded, which is the requested behavior.

Revision history for this message
Faré (fahree) wrote :

The discarding is done at a later stage in processing: the source-registry can already handle NILs (it *has* to, and uses subpathname* as appropriate), and the output-translations wants to know about empty entries, so you shouldn't just remove them.

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

Faré wrote:
> The discarding is done at a later stage in processing: the source-
> registry can already handle NILs (it *has* to, and uses subpathname* as
> appropriate), and the output-translations wants to know about empty
> entries, so you shouldn't just remove them.
>

OK, I see that now.

OK, will try again with EMPTY-IS-NIL option.

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

Unfortunately, simply adding EMPTY-IS-NIL doesn't work straightforwardly, because PARSE-NATIVE-NAMESTRING crashes (in PARSE-NAMESTRING). The reason is that it treats the empty string as something with which defaults should be merged, misinterpreting it as a relative pathname. This means that by the time we get to ENSURE-PATHNAME :EMPTY-IS-NIL T, we have turned "" into #P"./" and crash.

That means, AFAICT, that we cannot use EMPTY-AS-NIL as another ENSURE-PATHNAME constraint. We must handle it at least in PARSE-NATIVE-NAMESTRING.

This is starting to seem extremely messy, because there seem like multiple entry points. Potentially then, at every level (i.e., at all of the exported filename-handling functions from filesystem.lisp) one might need to handle an empty string. Guessing what's the right policy for each entry point here seems very tricky. These are the functions that are exported:

   ;; Native namestrings
   #:native-namestring #:parse-native-namestring
   ;; Probing the filesystem
   #:truename* #:safe-file-write-date #:probe-file* #:directory-exists-p #:file-exists-p
   #:directory* #:filter-logical-directory-results #:directory-files #:subdirectories
   #:collect-sub*directories
   ;; Resolving symlinks somewhat
   #:truenamize #:resolve-symlinks #:*resolve-symlinks* #:resolve-symlinks*
   ;; merging with cwd
   #:get-pathname-defaults #:call-with-current-directory #:with-current-directory
   ;; Environment pathnames
   #:split-native-pathnames-string
   #:getenv-pathname #:getenv-pathnames
   #:getenv-absolute-directory #:getenv-absolute-directories
   #:lisp-implementation-directory #:lisp-implementation-pathname-p
   ;; Simple filesystem operations
   #:ensure-all-directories-exist
   #:rename-file-overwriting-target

Unless someone can suggest a policy that covers the cases, I'm close to throwing up my hands, and going back to treating an empty string as an error.

Revision history for this message
Faré (fahree) wrote :

Well, one solution is the patch I posted above.

Obviously, if you add an EMPTY-IS-NIL flag to ENSURE-PATHNAME, it would have to be checked before calling any PARSE*NAMESTRING function.

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

The problem is that merging occurs -- wrongly, since the config directories are required to be absolute pathnames -- before we call ENSURE-PATHNAME. So the empty string is mangled before we get to ENSURE-PATHNAME.

I'll see if I can propose an alternative. I think the problem is that GETENV-ABSOLUTE-PATHNAMES invokes PARSE-UNIX-NAMESTRING before we get to ENSURE-PATHNAME, causing this incorrect behavior:

 0[9]: (UIOP/PATHNAME:PARSE-UNIX-NAMESTRING "" :ENSURE-DIRECTORY T)
 0[9]: returned #P"./"

Put differently, although we call GETENV-ABSOLUTE-DIRECTORIES, we lose track of the fact that we must have *absolute* pathnames only, so treat "" as something with which we can merge.

This suggests that WANT-ABSOLUTE must be passed to PARSE-UNIX-NAMESTRING, and checked there. Or, as in my earlier proposed patch, we must check for EMPTY earlier in the call tree, because we need to check for the special empty case of a non-absolute pathname (an empty string is a non-absolute pathname, but a non-absolute pathname need not be an empty string). Actually, I believe the answer is "both": we must check for WANT-ABSOLUTE in PARSE-UNIX-NAMESTRING *and* we must check the empty case early.

So even if we take your fix to SPLIT-NATIVE-PATHNAME-STRING, we have found a separate bug: we must check for WANT-ABSOLUTE in PARSE-NATIVE-NAMESTRING, before PARSE-UNIX-NAMESTRING takes a pathname and "absoutizes" it.

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

3.1.4.2 has a proposed solution and test.

Note that this proposed solution does NOT work on CMUCL. It seems like absolute pathnames may not be properly detected on CMUCL because of something involving its search paths.

I will look into this a bit further. I've been using 20e, which is now obsolete, anyway.

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

OK, CMUCL problem was due to use of the wrong function for UIOP:GETENV. Pushed a bugfix for this.

Revision history for this message
Faré (fahree) wrote :

Looking at 3.1.4.2.

Don't we always want empty-is-nil in getenv-pathnames and even more so in getenv-absolute-pathnames?
There's no way "" can denote an absolute pathname (although, maybe, on genera, or using logical-pathnames, but environment variables don't make sense in such a context).

You will probably want to get rid of comments around split-native-pathnames-string before release, including the <---- comment.

The absolute-pathname-p in parse-native-namestring seems completely bogus: the pathname will be parsed via pathname by parse-namestring, which can have a sensibly different syntax than parse-native-namestring (notably on SCL). I don't understand what's special about the want-absolute constraint that makes you want to check it early.

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

* Fare: Don't we always want empty-is-nil in getenv-pathnames and even more so in getenv-absolute-pathnames [directories, I think you meant]?

Both of those default to EMPTY-IS-NIL T in the latest version.

But not that it doesn't matter: in both cases the EMPTY-IS-NIL aspect is handled by SPLIT-NATIVE-PATHNAMES-STRING, so that EMPTY-IS-NIL turns out not to matter in the case of these functions. Indeed, arguably I should make :EMPTY-IS-NIL NIL be an error case (which I guess is what you are suggesting). OK, I'll eliminate the possibility of specifying EMPTY-IS-NIL NIL here, since SPLIT-NATIVE-PATHNAMES-STRING preempts that.

There's no way "" can denote an absolute pathname (although, maybe, on genera, or using logical-pathnames, but environment variables don't make sense in such a context).

* Fare: You will probably want to get rid of comments around split-native-pathnames-string before release, including the <---- comment.

Yes, I will do that. Wanted to leave that there for review purposes, but cutting it out now.

* Fare: The absolute-pathname-p in parse-native-namestring seems completely bogus: the pathname will be parsed via pathname by parse-namestring, which can have a sensibly different syntax than parse-native-namestring (notably on SCL). I don't understand what's special about the want-absolute constraint that makes you want to check it early.

AFAICT, I must check it early because otherwise it seems like PARSE-UNIX-NAMESTRING may munge a pathname there. Here's an example:

UIOP/FILESYSTEM> (parse-native-namestring "")
#P"./"

I was definitely getting some names passed through here earlier, before I winnowed out the "" coming in. I don't understand well enough the desired behavior of parse-unix-namestring (e.g., I don't understand why a pathname is always returned, ignoring ENSURE-DIRECTORY -- shouldn't a pathname be passed to ENSURE-PATHNAME, instead of just being returned?), or what happens when *NIL-PATHNAME* is merged (presumably "" turned into a directory pathname because it had an empty name?).

Unfortunately, I have lost track of the precise rationale for trapping this early. All I can recall is that in the earlier version some non-absolute pathnames were slipping through. I don't know if this only happened before we eliminated the empty strings, or if it could otherwise happen.

If I had more time, I'd remove the early check, and test thoroughly, but I don't. So I'm going to leave it in out of an excess of caution for now.

Revision history for this message
Faré (fahree) wrote :

Interpreting #p"" as (make-pathname :directory '(:relative)) and printing it #P"./" is perfectly reasonable. If want-absolute is true, ensure-pathname will then correctly complain that it's not absolute.

parse-unix-namestring is called by ensure-pathname, and the other way around. That's a bit ugly, but oh well. ensure-directory isn't ignored at all, it's the first thing used after the check-type for the type argument. It's passed to ensure-pathname as part of the keys, redundantly with the fact that we already ensured it's a directory.

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

OK, simplified away the unnecessary early check for absolute pathname in PARSE-NATIVE-NAMESTRING. Seems no longer necessary.

Revision history for this message
Faré (fahree) wrote :

Dear Robert,

your latest patch removed the necessary (when string ...) condition in PARSE-NATIVE-NAMESTRING. It was breaking everything for me inside the getenv-pathname "XDG_CACHE_HOME", so I restored it. Also added a tweak to getenv-pathnames. I didn't bump the version — tell me if I should. Please pull.

Changed in asdf:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.