In order of locality:
ThreadLocalVariables may be a fine place to put a ContextObject in a concurrent process, avoiding the syntactic inconvenience and coupling of passing it around explicitly.
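A minimal sketch of that idea in Java (8 or later), assuming a hypothetical RequestContext class as the ContextObject. The names RequestContext, ContextHolder and ContextDemo are invented for illustration; only java.lang.ThreadLocal comes from the standard library.

```java
// Hypothetical context object; the single field is illustrative only.
final class RequestContext {
    final String user;
    RequestContext(String user) { this.user = user; }
}

// Hypothetical holder: one slot per thread, reachable from anywhere.
final class ContextHolder {
    private static final ThreadLocal<RequestContext> CURRENT = new ThreadLocal<>();

    static void set(RequestContext ctx) { CURRENT.set(ctx); }
    static RequestContext get() { return CURRENT.get(); }
    static void clear() { CURRENT.remove(); }
}

final class ContextDemo {
    // Deep in the call chain: note that no context parameter is passed in.
    static String greet() {
        return "hello, " + ContextHolder.get().user;
    }

    // Runs greet() on a fresh thread with its own context and returns the result.
    static String greetOnThread(String user) {
        String[] out = new String[1];
        Thread worker = new Thread(() -> {
            ContextHolder.set(new RequestContext(user));
            try {
                out[0] = greet();
            } finally {
                ContextHolder.clear(); // do not leak the context if the thread is reused
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return out[0];
    }

    public static void main(String[] args) {
        System.out.println(greetOnThread("alice")); // hello, alice
        System.out.println(greetOnThread("bob"));   // hello, bob
    }
}
```

The cost of the convenience is that greet() now has a hidden dependency: calling it on a thread where nothing was set fails with a NullPointerException.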
Java supports ThreadLocalVariables in its standard library through the java.lang.ThreadLocal class (http://java.sun.com/j2se/1.4.2/docs/api/java/lang/ThreadLocal.html). They can be added to some other languages, such as C/C++, through libraries or compiler extensions.
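As a rough illustration of the Java class, the sketch below keeps one counter per thread. PerThreadCounter is a made-up name, and ThreadLocal.withInitial and the lambdas assume Java 8 or later rather than the 1.4.2 API linked above.

```java
final class PerThreadCounter {
    // withInitial supplies each thread's starting value the first time it calls get().
    private static final ThreadLocal<Integer> COUNT = ThreadLocal.withInitial(() -> 0);

    // Increments the calling thread's own counter and returns the new value.
    static int increment() {
        int next = COUNT.get() + 1;
        COUNT.set(next);
        return next;
    }

    // Calls increment() `times` times on a new thread and returns the last value seen there.
    static int countOnNewThread(int times) {
        int[] last = new int[1];
        Thread worker = new Thread(() -> {
            for (int i = 0; i < times; i++) {
                last[0] = increment();
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return last[0];
    }

    public static void main(String[] args) {
        System.out.println(countOnNewThread(3)); // 3
        System.out.println(countOnNewThread(5)); // 5: a new thread starts again from zero
        System.out.println(increment());         // 1: the main thread has its own counter
    }
}
```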
Microsoft Visual C++ lets you declare global and static variables as thread-local, using the __declspec(thread) storage-class attribute.
If a link-loader had access to all the potential thread-local variables before any threads started, it would not be difficult to bind each variable to an offset from an initial frame pointer and to allocate space for them just below the start of each thread's stack. With a dedicated register locating the bottom of the stack, access to these would be exactly as expensive as access to parameters and local variables (i.e. an offset from a frame pointer). MSVC++ doesn't go quite this far, instead creating a per-DLL global array of data blocks.
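The hypothetical scheme can be modelled as a toy, with no claim that any real linker works this way. In the sketch below, StaticTlvLayout is an invented class: declare() stands in for the link step that assigns each variable a fixed offset, and newThreadBlock() stands in for reserving the per-thread space at thread start. A declaration arriving after the first block exists is rejected, since blocks already handed out would be too small.

```java
final class StaticTlvLayout {
    private int nextOffset = 0;     // words assigned so far
    private boolean frozen = false; // true once any per-thread block exists

    // "Link time": reserve `words` slots and return the variable's fixed offset.
    int declare(int words) {
        if (frozen) {
            throw new IllegalStateException("layout is fixed: threads already started");
        }
        int offset = nextOffset;
        nextOffset += words;
        return offset;
    }

    // "Thread start": allocate this thread's block, sized from the final layout.
    long[] newThreadBlock() {
        frozen = true;
        return new long[nextOffset];
    }

    public static void main(String[] args) {
        StaticTlvLayout layout = new StaticTlvLayout();
        int errorSlot = layout.declare(1);  // offset 0
        int bufferSlot = layout.declare(2); // offset 1

        long[] threadA = layout.newThreadBlock();
        long[] threadB = layout.newThreadBlock();

        // Access is a plain indexed load or store: base of block plus fixed offset.
        threadA[errorSlot] = 7;
        threadB[errorSlot] = 9;
        System.out.println(threadA[errorSlot] + " " + threadB[errorSlot]); // 7 9
        System.out.println(bufferSlot + " of " + threadA.length);          // 1 of 3

        try {
            layout.declare(1); // too late: existing blocks have no room for it
        } catch (IllegalStateException e) {
            System.out.println("late declaration rejected");
        }
    }
}
```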
But that precondition does not hold for MSVC++, which supports dynamic linking. The optimization described above is not possible if new thread-local variables (TLVs) can come into existence through a dynamic link after threads have started; there is simply no way to know how much space to reserve below the first stack frame for the TLVs. Actually, reasoning about this is pointless when it is so easy to just look it up:
In a multithreaded CommonLisp, SpecialVariables can be used as ThreadLocalVariables by binding them at the entry point of the thread. The same is true of DynamicScoping in general (which is what SpecialVariables offer).
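For readers who do not use Lisp, the idea can be approximated in Java (8 or later). This is only an analogy: DynamicVar and its bind method are invented names, and the sketch assumes that a thread which never rebinds the variable sees the global value, which a given Lisp implementation may or may not do.

```java
// Hypothetical dynamically scoped variable built on ThreadLocal.
final class DynamicVar<T> {
    private final ThreadLocal<T> slot;

    DynamicVar(T globalValue) {
        // Threads that never rebind the variable see the global value.
        this.slot = ThreadLocal.withInitial(() -> globalValue);
    }

    T get() { return slot.get(); }

    // Binds `value` on the current thread while `body` runs, then restores the old value.
    void bind(T value, Runnable body) {
        T saved = slot.get();
        slot.set(value);
        try {
            body.run();
        } finally {
            slot.set(saved);
        }
    }
}

final class DynamicVarDemo {
    static final DynamicVar<String> PREFIX = new DynamicVar<>("global");

    // Called from anywhere; reads whatever binding is in effect on this thread.
    static String describe() { return PREFIX.get(); }

    static String runThread(String binding) {
        String[] out = new String[1];
        Thread worker = new Thread(() ->
            // Entry point of the thread: bind once, and everything called below sees it.
            PREFIX.bind(binding, () -> out[0] = describe()));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return out[0];
    }

    public static void main(String[] args) {
        System.out.println(runThread("worker-1")); // worker-1
        System.out.println(runThread("worker-2")); // worker-2
        System.out.println(describe());            // global
    }
}
```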
See also: ThreadLocalStorage, ContextObject, ExplicitManagementOfImplicitContext