Symbolic Link Rant

See SymbolicLink.

[Moved here from SymbolicLink]

Symbolic links are a hack to get around the fact that Unix doesn't allow multiple hardlinks to directories, doesn't provide back-references to hard links, and other limitations of its semantics.

I think that ignores several symlink benefits. 1) A hardlink to an object on another physical device is meaningless. 2) It can be useful to preserve a link to an object that may be deleted and recreated. 3) It can be useful to have different permission settings on the file and the link to the file. 4) you can always be sure you're not deleting the last hard link to a file when deleting a symlink.

At a fundamental level, symbolic links are just delayed binding of the link name. Instead of evaluating the object which the link will refer to at link creation time, it will be evaluated at lookup time. The problem with this is that the computation that's allowed in the creation of "symbolic links" is broken. All you can do is specify a pathname to lookup. You can't do generic computation, that's simply not allowed.

What people really want isn't symbolic links. They want multiple hardlinks to directories, bidirectional links (that is, actual back-references), and the ability to dynamically generate objects at a specific point in the "file"system by registering a process at that particular location. (Not what PlanNine provides but more like Hurd's translators.)

Sounds like what people really want is a DirectedAcyclicGraph of persistent objects, instead of a hierarchical file system. Files (and directories) want to be in a multiarchy, of which a hierarchy is a special case.

Directed Acyclic Graph? Why limit it thus?

True enough. The motivation is to avoid cycles that force a bit more complexity in code to handle the analogs of things like directory listings. Enumerating the contents of node A, for example, when those contents include a cycle that includes A, requires the routine to maintain something like a "recursion set" (the name of the Smalltalk variable used for this) to break the cycle. More challenging, typically, is the question of how the results of such an enumeration (including the cycle) should be conveyed to the user/caller.

Cyclic graphs also have synchronization problems with concurrent access, as it is difficult to define a precise locking order for the nodes in a graph. (If two nodes are locked by two different processes in a different order, DeadLock might result. Not good inside an OS kernel...

And you need locking because?? Locking against deletion is unnecessary in a generic directed graph. So what operation in the kernel would need to lock objects?

[If it was RK who asked that, I assume it's a rhetorical question. If it wasn't RK who asked, see discussion over in HardLink.]

IOW, there is no such operation? And since processes can define their own locking orders, or disregard locks completely, defining a locking order in the filesystem is solving a problem that doesn't exist by castrating FS semantics? -- RK


I have this DirectedAcyclicGraph as a filesystem, it doesn't deadlock, but it is slow. Any takers?


It would be very slow to limit the graph to be acyclic, especially if the filesystem spanned machines across the network.


EditText of this page (last edited February 3, 2007) or FindPage with title or text search