The scoping rules of a ProgrammingLanguage dictate how FreeVariables - symbol names which are found in the body of a scope (a function, macro, class, whatever) but not defined there - are resolved.
Several different strategies exist:
- Disallowing them altogether. Very rare; though some languages may disallow free references to all but a pre-defined set of symbols (often called keywords or special forms) which are provided by the language.
- Only two scoping levels - global and local. CeeLanguage and many assemblers use this rule. A FreeVariable used in a C function must refer to a "global" symbol (meaning one defined at file or global scope, as opposed to within a function body). This simplifies the implementation of C greatly (and many C programmers don't miss more advanced scope rules at all). CeePlusPlus sticks to the C tradition in many ways, though the presence of features such as classes and namespaces causes C++ to relax the scoping rules quite a bit. (Still, no C++ function can bind to the local variables of another C++ function.) Java, with InnerClasses, relaxes the scoping rules further.
- LexicalScoping. Used by CommonLisp, SchemeLanguage, AlgolLanguage, PascalLanguage, and many others. If a variable isn't found in a given scope, the enclosing scope is searched, repeating until the outermost scope is reached. Two important variants of this are DeepBinding and ShallowBinding. With DeepBinding, variables are bound to the environment where the function is defined; with ShallowBinding, variables are bound to the environment where the function is called. Most languages which support LexicalScoping support DeepBinding for functions; most macro systems (excluding Scheme's DefineSyntax) use ShallowBinding. It is possible to divide DeepBinding further into two separate forms (I am not aware of any generally accepted terminology to describe these forms). In one form, a FreeVariable is essentially an alias for the actual variable in the enclosing scope; if that variable changes, then the value seen by the function being considered also changes. (If the enclosing scope has already exited, which is possible when FirstClass functions are mixed with LexicalScoping, the variable lives on with the value it had at exit, and later changes apply to that surviving variable.) Most languages which support DeepBinding support this form. In the other form, the FreeVariable takes the value that it has at the point when the function in question is first defined, and does not change. (In other words, the function using the FreeVariable makes a copy of the value provided by the enclosing scope.) I'm not aware of any languages which do this for FreeVariables, though objects used as closures have this behavior. Java InnerClasses (when defined within a function) sidestep this issue by only allowing references to variables in the enclosing function which are declared to be final - in other words, those whose value does not change.
- DynamicScoping (early dialects of Lisp, CommonLisp special variables, exported environment variables in UnixOs): The caller is checked for a binding for the variable; if one is found, it is used. Otherwise, the caller's caller is checked, and so on. If no definition is found, it is either an error or a default value is used, depending on the semantics of the language.
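The two forms of DeepBinding described above (aliasing versus copying) can be sketched with C++11 lambdas, which let the programmer choose the form per captured variable. This is only an illustration under the assumption of a C++11 compiler; capture_demo is a hypothetical name and is not taken from the discussion above.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: builds two closures over the same local variable,
// changes the variable, then reports what each closure sees.
std::pair<std::string, std::string> capture_demo() {
    std::string scope = "by copy";

    // Copying form: the closure stores its own snapshot of scope.
    auto by_copy = [scope]() { return scope; };

    // Aliasing form: the closure refers to the enclosing variable itself.
    auto by_alias = [&scope]() { return scope; };

    scope = "aliasing";  // modified after both closures were created

    // first: value seen by the copying closure, second: by the aliasing one
    return std::make_pair(by_copy(), by_alias());
}
```

Calling capture_demo() should yield the pair ("by copy", "aliasing"). Note that in C++ the aliasing closure must not outlive the enclosing function, since the referenced variable is destroyed on exit; languages with garbage-collected environments do not have that restriction.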
The following C-like program illustrates the different scoping rules: whichever rule is applied determines which string gets printed. Of course, this is not legal C in real life, as CeeLanguage doesn't allow nested functions.
 int main (void)
 {
     const char *scope = "Lexical, deep, by copy ";
     void print_scope (void)
     {
         printf ("%s\n", scope);
     }
     scope = "Lexical, deep, aliasing";
     void (*)(void) helper_func (void) /* Returns ptr to function; pretend it's a closure */
     {
         const char *scope = "Lexical, shallow scoping";
         void do_print_scope (void)
         {
             print_scope();
         }
         return do_print_scope;
     }
     void do_it (void)
     {
         const char *scope = "Dynamic Scoping";
         helper_func()(); /* Call the function returned by helper_func(); */
     }
     do_it(); /* Print what scoping we are using */
 }
Another example, in pseudo PascalLanguage: (from 'Compilers: Principles, Techniques, and Tools')
 program scoping;
 var r : string;
   procedure show;
   begin
     writeln(r);
   end;
   procedure scope;
   var r : string;
   begin
     r:='Dynamic';
     show;
   end;
 begin
   r:='Scope';
   show;
   r:='Lexical';
   scope;
 end.
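Under LexicalScoping the Pascal program above prints "Scope" and then "Lexical"; under DynamicScoping it prints "Scope" and then "Dynamic", because show picks up the r belonging to its caller. The following C++ sketch mirrors both readings. C++ itself is lexically scoped, so the dynamic variant is emulated with an explicit stack of bindings; that stack and all the function names (show_lexical, scope_dynamic, run_lexical and so on) are assumptions made for illustration only.

```cpp
#include <string>
#include <vector>

// --- Lexical reading: show always resolves r to the global variable. ---
std::string r;

std::string show_lexical() { return r; }

std::string scope_lexical() {
    std::string r = "Dynamic";  // shadows the global r inside this function only
    (void)r;                    // unused: show_lexical cannot see it
    return show_lexical();
}

// --- Dynamic reading (emulated): the newest active binding wins. ---
std::vector<std::string> r_bindings;

std::string show_dynamic() { return r_bindings.back(); }

std::string scope_dynamic() {
    r_bindings.push_back("Dynamic");      // binding lasts for the duration of the call
    std::string result = show_dynamic();
    r_bindings.pop_back();                // binding vanishes when scope returns
    return result;
}

// Each run_* function replays the body of the Pascal main program and
// collects what would have been written by writeln.
std::vector<std::string> run_lexical() {
    std::vector<std::string> out;
    r = "Scope";
    out.push_back(show_lexical());
    r = "Lexical";
    out.push_back(scope_lexical());
    return out;
}

std::vector<std::string> run_dynamic() {
    std::vector<std::string> out;
    r_bindings.clear();
    r_bindings.push_back("Scope");        // the outermost (global) binding
    out.push_back(show_dynamic());
    r_bindings[0] = "Lexical";
    out.push_back(scope_dynamic());
    return out;
}
```

The second element of each result is where the two rules diverge: run_lexical() gives "Lexical" there, while run_dynamic() gives "Dynamic".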
Note: C does use LexicalScoping but does not allow function definitions to be nested (Standard C doesn't, but GNU C does). C++ also uses LexicalScoping. Namespaces and classes are just lexical scopes, like any other, as are Java InnerClasses.
For example:
 #include <stdio.h>

 int func() {
     int outer_local = 1;
     { // this block introduces a new lexical scope
         int inner_local = 2;
         printf( "%i %i\n", inner_local, outer_local );
     }
     // inner_local is no longer in scope.
     return 0;
 }
Will print:
2 1
The important point about C/C++ is that it doesn't allow FreeVariables in one function to refer to anything defined in another function; this eliminates the need for a StaticChain and/or closures.
So this perfectly valid ISO 9899:1999 code does not use LexicalScoping?
 int main() {
     for (int i = 0; i < 10; i++) {
         dosomething(i);
     }
     // i is no longer accessible.
 }
This is not a GNU extension, this is valid C99.
AFAIK the standard does say that blocks group a set of declarations and statements into a syntactic unit, and a compound statement is a block.
Actually, C++ does allow nested functions; you just have to spell the internal function differently. The following is not legal in C++:
 void outer (int x)
 {
     void inner (int y)
     {
         cout << y+1;
     }
     inner(x);
 }
The following, however, is.
 void outer (int x)
 {
     struct {  // struct rather than class, so that operator() is public
         void operator () (int y)
         {
             cout << y+1;
         }
     } inner;
     inner(x);
 }
In other words, functors are a great way of faking nested functions. Of course, the functor (still) has no access to the local variables defined in outer.
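Since the local functor cannot see the locals of outer, any value it needs has to be handed over explicitly, which gives the "copy" behaviour mentioned earlier for objects used as closures. A minimal sketch, assuming a hypothetical member called offset and a string stream in place of cout so that the result can be inspected:

```cpp
#include <sstream>
#include <string>

std::string outer(int x) {
    std::ostringstream out;

    // Local functor type: it may not name outer's automatic variables,
    // so everything it needs is stored in its own members.
    struct Inner {
        int offset;                  // copy of a value from the enclosing function
        std::ostringstream *sink;    // where the output goes
        void operator()(int y) const { *sink << y + offset; }
    };

    int offset = 1;
    Inner inner = { offset, &out };  // the value of offset is copied here
    offset = 100;                    // later changes are not seen by inner

    inner(x);
    return out.str();
}
```

With this sketch, outer(4) should produce the string "5": the functor keeps using the value that offset had when inner was constructed.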
CategoryLanguageFeature ScopeAndClosures