Reference Counting

In each object, keep a count of the number of references to the object. (Add 1 when copying the reference, subtract 1 when clearing or changing a reference.) When the count drops to zero, kill the object.

This is a form of automatic memory management. That is, it works without you having to think (very much) about it.

However, it fails to free (IE: orphans) groups of objects that have a circular relationship of pointers. See GarbageCollection to deal with this problem. (Some systems use both. RC can also be a part of the implementation of GC, but not the whole thing, unless the app is to have memory leaks.)

Despite what some purists say, ReferenceCounting is, in fact, one form of GarbageCollection. There are even certain circumstances where it can be the preferable method - for instance, with data structures that are guaranteed to be non-cyclical and the system is expected to be using almost all available RAM most of the time (see GarbageCollectionBook).


Reference counting can be surprisingly inefficient, especially in a naive multi-threaded implementation. A pointer assignment becomes something like:

and that has to be done when passing pointers as parameters to functions, too. GarbageCollection has a different set of costs, which makes comparisons hard, but which is usually a net win.

A less naive implementation may be able to avoid some of the work. For example, when passing a pointer to a function the caller will probably hold onto a reference for the duration, so the increment and decrement at the function boundary can be omitted. To do this kind of thing automatically, it helps if the compiler is aware of what is going on.


Reference counting has the advantage that it gives more precise lifetimes than garbage collection. Objects get finalized immediately the last reference to them is dropped. With garbage collection, objects may not be finalized until some indefinite period later; the delay makes finalization useless for doing real work, like closing O/S file or window handles. With reference counting, you can do that.


Reference counting can lead to long chains of deletes. If you have a singly-linked list, for example, dropping the last reference to the head will cause a ripple of decrements and deletes down the chain. This can be a problem for real time work because it means a simple pointer assignment can take unbounded time.

Less naive systems, when an object's count hits zero, defer decrementing the counts of objects it refers to until it is more convenient. Essentially they keep a list of objects waiting to be deleted, and process them in short, incremental slices. (Of course, this can lose the timeliness property mentioned above.)


Known Usages:


RC works in real life, too. If you are a contractor working at a site, and the count of folks with roles for you drops to zero, you depart. Similarly, if a company's number of customers drops to zero, it suspends its activities.

Garbage collection, by contrast, tends to leave dead wood lying around filling up cubes. -- PCP


Reference counting is nice in a multithreading application because if the ownership of an object is "shared" between threads, it's simpler to make the object own itself. The alternative is to synchronize all the threads every time you want to delete something; sort of "Hello, do you need this object still?" See SynchronizedSmartPointers. -- SunirShah

You mean: it is nice compared to manual memory management. I grant you that; however, implementing ReferenceCounting efficiently in a multi-threaded environment is extremely difficult. That's why PythonLanguage (which uses ReferenceCounting) has a global interpreter lock: otherwise it would be necessary to perform a locking operation every time a refcount gets bumped up or down. -- StephanHouben

Naive reference counting cannot work in a multiple process environment that has separate memory spaces, because cycles that cross address boundaries cannot be recognized in bounded time. There are techniques to get around this (most involve passing timestamps around), but they are complex. -- TomStambaugh


See also: CppCountedPointerImplementation

http://linux.rice.edu/~rahul/hbaker/LRefCounts.html talks about safe ways to make reference counting faster. Searching around should turn up papers on LinearTypes, which are a variation. These ideas show how sophisticated implementations of reference counting can differ from naive ones.

The ReCycler is a very non-naive implementation of ReferenceCounting.


Most modern unixes don't allow hard links to directories. The ln utility may allow for this, but the operating system will most likely disallow it, leading to some ambiguity in `man ln`. [mc0e] Supposedly Mac OSX (Leopard) does allow hard links to directories, but I haven't personally confirmed this [mc0e]

Early versions of Unix put the safety check into the "ln" program, but you could write your own program to use the link() system call to do any linking you pleased, so long as you ran it as root. (I did this under version 6 Unix just because of the novelty of creating e.g. directories that contained themselves.)

Many later versions of Unix or its clones migrated the safety check into the link() system call and/or into the filesystem itself, to give a firmer guarantee.

[The early version of Unix I used (AT&T System V release 3 mostly) didn't have a safety check in ln. It would happily let you create cyclical directory structures and even let you disconnect them from /, at which point they still existed for processes that had handles to them, but became inaccessible for anyone else. -- EricHodges]


CategoryGarbageCollection


EditText of this page (last edited December 4, 2012) or FindPage with title or text search