Symlinks can cause infinite loop in INITIALIZE-SOURCE-REGISTRY

Bug #1533766 reported by Robert P. Goldman
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
ASDF
Fix Released
Undecided
Robert P. Goldman

Bug Description

Symlinks can cause a loop in the directory structure that causes INITIALIZE-SOURCE-REGISTRY to loop infinitely.

See attached tarball with directory structure.

Then do something like:

(asdf:initialize-source-registry '(:source-registry (:tree (:home "Downloads/asdf-loop/")) :inherit-configuration))

Goes away forever on SBCL and ACL. Does *not* go away forever on CCL. I suspect this may have something to do with differences in treatment of symlinks, for which see other ASDF bugs.

Hat tip to Jeff Rye for the recipe for reproduction.

Do we need a loop-detector?

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :
Revision history for this message
Faré (fahree) wrote :

This is a known (to me) issue. DIRECTORY just doesn't let you control what to do or not with symlinks.

That's the reason why I believe NOT following symlinks is the correct behavior, and why I am telling CCL to not follow them, since it has this capability.

Revision history for this message
Faré (fahree) wrote :

I'm tempted to mark that as "WONTFIX".

Sure, we could write some kind of circularity detector and use it. Meh. I won't do it. Patch welcome.

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

As I understand things, the search is done under collect-sub*directories-asd-files.

This is a search function; would that be an appropriate place to add a VISITED (closed) set?

The one thing I'm not sure about is what the DIRECTORY argument is? Is this always an absolute path? Is this ever a symbolic link, or are symbolic links always resolved?

The visited list will only work (or only be easy to implement, anyway!) if the directories are canonically named. If there could be aliasing, things will be much more difficult.

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

Related question: why does TRUENAMIZE leave logical pathnames as logical pathnames, instead of resolving them? I was thinking of using that to get as close to a canonical name as possible.

Actually, I'm also puzzled about why ENSURE-ABSOLUTE-PATHNAME doesn't invoke ENSURE-PATHNAME...

If you have any recommendations for how to canonicalize pathnames, I would be very grateful.

I suppose given all those slots -- many of which may be don't cares -- working with strings might be better than working with CL PATHNAMEs.

Revision history for this message
Faré (fahree) wrote :

Well, yes, there could be another argument VISITED that when NIL does nothing, when T gets replaced by a fresh hash-table, and when a hash-table contains a set of visited filenames.

No, I don't think the DIRECTORY argument is always absolute, physical, or anything.

Whether symbolic links are resolved is implementation-dependent. I try not to on supported implementations (like CCL), but most implementations follow symlinks.

Yes, you will need to canonicalize pathnames. Note that although the CLHS says DIRECTORY returns TRUENAMEs, some implementations (e.g. CLISP, I believe) actually have different canonicalization rules for DIRECTORY and for TRUENAME — and of course, if you don't follow symlinks or return truenames, as would be nice, then don't expect a truename to be returned. Therefore, if/when you use VISITED, be sure to canonicalize with TRUENAME except on supported implementations — but then again, TRUENAME may fail (e.g. some ancestor .. lacks access rights), and what do you do then?

It's a cesspool of fail to try to do that portably, which is why I didn't even try. It works well enough for ASDF purposes. For more, go IOlib.

Revision history for this message
Faré (fahree) wrote :

TRUENAMIZE does not resolve TRUENAMEs, so that ASDF may work at all with "logical" pathnames. Otherwise, the .asd file (which is always truenamized?) get physicalized, and logical pathnames don't behave as expected.

If it were me, I'd declare "logical" pathnames dead, and tell janderson and pcos to suck it (only reported asdf users to use logical pathnames). I'd also lobby for their being removed from the next CL standard (as well as wilcard characters). I'd expose the OS interface, in an OS-dependent but otherwise implementation-independent way. The only parts of CL that really need access to a file are LOAD and COMPILE-FILE; and OS-independent primitives could be offered for these, based on open streams.

As for ensure-absolute-pathname, it's not calling ensure-pathname, and instead, ensure-pathname is calling it. What would you have it call ensure-pathname for?

Revision history for this message
Luís Oliveira (luismbo) wrote :

FWIW, I just got bitten by this and was about to report it. :-)

Revision history for this message
Robert P. Goldman (rpgoldman) wrote :

I have just pushed 3.1.6.16 with a proposed fix for this bug. If you have test cases, please try them.

Changed in asdf:
status: New → Fix Committed
Revision history for this message
Luís Oliveira (luismbo) wrote :

Robert, I confirm that my test case (which is simply to insert a backward link like `ln -s .. foo` somewhere in the tree that ASDF traverses) works with 3.1.5.16. Thanks!

Changed in asdf:
assignee: nobody → Robert P. Goldman (rpgoldman)
milestone: none → 3.1.7
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.