http://en.wikipedia.org/wiki/File_scope
Scope can vary from as little as a single expression to as much as the entire program, with many possible gradations in between. The simplest scoping rule is global scope – all entities are visible throughout the entire program. The most basic modular scoping rule is two-level scoping, with a global scope anywhere in the program, and local scope within a function. More sophisticated modular programming allows a separate module scope, where names are visible within the module (private to the module) but not visible outside it. Within a function, some languages, such as C, allow block scope to restrict scope to a subset of a function; others, notably functional languages, allow expression scope, to restrict scope to a single expression. Other scopes include file scope (notably in C), which functions similarly to module scope, and block scope outside of functions (notably in Perl).
A subtle issue is exactly when a scope begins and ends. In some languages, such as C, a scope starts at declaration, and thus different names declared within a given block can have different scopes. This requires declaring functions before use, though not necessarily defining them, and requires forward declaration in some cases, notably for mutual recursion. In other languages, such as JavaScript or Python, a name's scope begins at the start of the relevant block (such as the start of a function), regardless of where it is defined, and all names within a given block have the same scope; in JavaScript this is known as variable hoisting. However, the point at which the name is bound to a value varies, and the behavior of in-context names that do not yet have a value differs: in Python, use of a local variable before it is assigned raises a runtime error (UnboundLocalError), while in JavaScript a variable declared with var is usable before its assignment (with the value undefined). Function declarations in JavaScript are also hoisted to the top of the containing function, together with their bodies, and are usable throughout the function.
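The Python case can be checked with a short sketch. The names `label`, `reads_global`, and `reads_before_assignment` are hypothetical, chosen only for illustration. Because `label` is assigned somewhere in the second function, the name is local for the whole function body, so in Python 3 reading it before the assignment raises UnboundLocalError at run time instead of falling back to the global.

```python
label = "global"


def reads_global():
    # No assignment to `label` in this function, so the name refers to the global.
    return label


def reads_before_assignment():
    # `label` is assigned further down, which makes it local to the whole
    # function, including this read, so the read fails at run time.
    try:
        seen = label
    except UnboundLocalError:
        seen = "unbound"
    label = "local"
    return seen, label


print(reads_global())
print(reads_before_assignment())
```

The error is only raised when the offending line executes; the function definition itself is accepted without complaint.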
Many languages, especially functional languages, offer a feature called let-expressions, which allow a declaration's scope to be a single expression. This is convenient if, for example, an intermediate value is needed for a computation. For example, in Standard ML, if f() returns 12, then let val x = f() in x * x end is an expression that evaluates to 144, using a temporary variable named x to avoid calling f() twice. Some languages with block scope approximate this functionality by offering syntax for a block to be embedded into an expression; for example, the aforementioned Standard ML expression could be written in Perl as do { my $x = f(); $x * $x }, or in GNU C as ({ int x = f(); x * x; }).
In Python, auxiliary variables in generator expressions and list comprehensions (in Python 3) have expression scope.
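A minimal sketch of this behavior in Python 3 follows; the names `values`, `item`, `squares`, `total`, and `leaked` are illustrative only. The auxiliary variable `item` exists while each expression is evaluated and is not bound in the surrounding scope afterwards.

```python
values = [1, 2, 3]

# `item` exists only while the list comprehension is evaluated.
squares = [item * item for item in values]

# The same holds for a generator expression.
total = sum(item * item for item in values)

# After both expressions, looking up `item` in the enclosing scope fails.
try:
    item
    leaked = True
except NameError:
    leaked = False

print(squares, total, leaked)
```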
In C, variable names in a function prototype have expression scope, known in this context as function prototype scope. As the variable names in the prototype are not referred to (they may be different in the actual definition) – they are just dummies – these are often omitted, though they may be used for generating documentation, for instance.
Many, but not all, block-structured programming languages allow scope to be restricted to a block, which is known as block scope. This began with ALGOL 60, where "[e]very declaration ... is valid only for that block.",[4] and today is particularly associated with C and languages influenced by C. Most often this block is contained within a function, thus restricting the scope to a part of a function, but in some cases, such as Perl, the block may not be within a function.
    unsigned int sum_of_squares(unsigned int n) {
        unsigned int ret = 0;
        for (unsigned int i = 0; i <= n; ++i) {
            unsigned int n_squared = i * i;
            ret += n_squared;
        }
        return ret;
    }
A representative example of the use of block scope is the C code above, where two variables are scoped to the loop: the loop variable i, which is initialized once and incremented on each iteration of the loop, and the auxiliary variable n_squared, which is initialized at each iteration. The purpose is to avoid adding variables to the function scope that are only relevant to a particular block – for example, this prevents errors where the generic loop variable i has accidentally already been set to another value. In this example the expression i * i would generally not be assigned to an auxiliary variable, and the body of the loop would simply be written ret += i * i, but in more complicated examples auxiliary variables are useful.
Blocks are primarily used for control flow, such as with if, while, and for loops, and in these cases block scope means the scope of a variable depends on the structure of a function's flow of execution. However, languages with block scope typically also allow the use of "naked" blocks, whose sole purpose is to allow fine-grained control of variable scope. For example, an auxiliary variable may be defined in a block, then used (say, added to a variable with function scope) and discarded when the block ends, or a while loop might be enclosed in a block that initializes variables used inside the loop that should only be initialized once.
    if (int y = f(x); y > x) {
        ...  // statements involving x and y
    }
A subtlety of C, standardized since C99, is that block-scope variables can be declared not only within the body of the block, but also within the control statement of a for loop. This is analogous to function parameters, which are declared in the function declaration (before the block of the function body starts), and in scope for the whole function body. This is primarily used in for loops, which have an initialization statement separate from the loop condition, unlike while loops, and is a common idiom. A rarer use is in an if statement: C99 does not permit a declaration there, but C++ does, and since C++17 an if statement may carry an initializer followed by a semicolon and a separate test, as in the example above, so that the auxiliary variable has block scope.
Block scope can be used for shadowing. In the sum_of_squares example above, inside the block the auxiliary variable could also have been called n, shadowing the parameter name, but this is considered poor style due to the potential for errors. Furthermore, some descendants of C, such as Java and C#, despite having support for block scope (in that a local variable can be made to go out of scope before the end of a function), do not allow one local variable to hide another. In such languages, the attempted declaration of the second n would result in a compile-time error, and one of the n variables would have to be renamed.
If a block is used to set the value of a variable, block scope requires that the variable be declared outside of the block. This complicates the use of conditional statements with single assignment. For example, in Python, which does not use block scope, one may initialize a variable as such:
    if c:
        a = 'foo'
    else:
        a = ''
where a is accessible after the if statement.
In Perl, which has block scope, this instead requires declaring the variable prior to the block:
    my $a;
    if ($c) {
        $a = 'foo';
    } else {
        $a = '';
    }
Often this is instead rewritten using multiple assignment, initializing the variable to a default value. In Python (where it is not necessary) this would be:
    a = ''
    if c:
        a = 'foo'
while in Perl this would be:
    my $a = '';
    if ($c) {
        $a = 'foo';
    }
In case of a single variable assignment, an alternative is to use the ternary operator to avoid a block, but this is not in general possible for multiple variable assignments, and is difficult to read for complex logic.
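For the single-assignment case, a minimal Python sketch using a conditional expression (Python's form of the ternary operator) might look as follows. The wrapper function `choose` is hypothetical and exists only to make the example callable; `c` and `a` follow the naming of the snippets above.

```python
def choose(c):
    # One assignment, no if/else block: `a` is bound exactly once.
    a = 'foo' if c else ''
    return a


print(repr(choose(True)))
print(repr(choose(False)))
```

This keeps the variable bound by a single statement, at the cost of readability once the condition or the alternatives grow more complex.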
This is a more significant issue in C, notably for string assignment, as string initialization can automatically allocate memory, while string assignment to an already initialized variable requires allocating memory, a string copy, and checking that these are successful.
    {
        my $counter = 0;
        sub increment_counter() {
            $counter = $counter + 1;
            return $counter;
        }
    }
Some languages allow the concept of block scope to be applied, to varying extents, outside of a function. For example, in the Perl snippet above, $counter is a variable name with block scope (due to the use of the my keyword), while increment_counter is a function name with global scope. Each call to increment_counter will increase the value of $counter by one, and return the new value. Code outside of this block can call increment_counter, but cannot otherwise obtain or alter the value of $counter. This idiom allows one to define closures in Perl.
Most of the commonly used programming languages offer a way to create a local variable in a function or subroutine: a variable whose scope ends (that goes out of context) when the function returns. In most cases the lifetime of the variable is the duration of the function call – it is an automatic variable, created when the function starts (or the variable is declared), destroyed when the function returns – while the scope of the variable is within the function, though the meaning of "within" depends on whether scoping is lexical or dynamic. However, some languages, such as C, also provide for static local variables, where the lifetime of the variable is the entire lifetime of the program, but the variable is only in context when inside the function. In the case of static local variables, the variable is created when the program initializes, and destroyed only when the program terminates, as with a static global variable, but is only in context within a function, like an automatic local variable.
Importantly, in lexical scoping a variable with function scope has scope only within the lexical context of the function: it moves out of context when another function is called within the function, and moves back into context when the function returns – called functions have no access to the local variables of calling functions, and local variables are only in context within the body of the function in which they are declared. By contrast, in dynamic scoping, the scope extends to the runtime context of the function: local variables stay in context when another function is called, only moving out of context when the defining function ends, and thus local variables are in context of the function in which they are defined and all called functions. In languages with lexical scoping and nested functions, local variables are in context for nested functions, since these are within the same lexical context, but not for other functions that are not lexically nested. A local variable of an enclosing function is known as a non-local variable for the nested function. Function scope is also applicable to anonymous functions.
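The difference between a called function and a lexically nested function can be sketched in Python, which is lexically scoped. The names `helper`, `outer`, `nested`, and `secret` are hypothetical and serve only as an illustration.

```python
def helper():
    # `helper` is not lexically nested in `outer`, so the local variable
    # `secret` of its caller is not in context here.
    try:
        return secret
    except NameError:
        return "not visible"


def outer():
    secret = "visible"

    def nested():
        # `nested` is defined inside `outer`, so `secret` is a non-local
        # variable that it can read.
        return secret

    return helper(), nested()


print(outer())
```

Under dynamic scoping the call to `helper` would instead have found the caller's `secret`.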
    def square(n):
        return n * n

    def sum_of_squares(n):
        total = 0
        i = 0
        while i <= n:
            total += square(i)
            i += 1
        return total
For example, in the snippet of Python code above, two functions are defined: square and sum_of_squares. square computes the square of a number; sum_of_squares computes the sum of all squares up to a number. (For example, square(4) is 4² = 16, and sum_of_squares(4) is 0² + 1² + 2² + 3² + 4² = 30.)
Each of these functions has a variable named n that represents the argument to the function. These two n variables are completely separate and unrelated, despite having the same name, because they are lexically scoped local variables, with function scope: each one's scope is its own, lexically separate, function, so they don't overlap. Therefore, sum_of_squares can call square without its own n being altered. Similarly, sum_of_squares has variables named total and i; these variables, because of their limited scope, will not interfere with any variables named total or i that might belong to any other function. In other words, there is no risk of a name collision between these identifiers and any unrelated identifiers, even if they are identical.
Note also that no name masking is occurring: only one variable named n is in context at any given time, as the scopes do not overlap. By contrast, were a similar fragment to be written in a language with dynamic scope, the n in the calling function would remain in context in the called function – the scopes would overlap – and would be masked ("shadowed") by the new n in the called function.
Function scope is significantly more complicated if functions are first-class objects and can be created locally to a function and then returned. In this case any variables in the nested function that are not local to it (unbound variables in the function definition, that resolve to variables in an enclosing context) create a closure, as not only the function itself, but also its environment (of variables) must be returned, and then potentially called in a different context. This requires significantly more support from the compiler, and can complicate program analysis.
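A minimal Python sketch of such a closure follows; `make_counter`, `increment`, and `count` are illustrative names, not taken from any library. Each call to `make_counter` returns a function together with its own environment, so the variable `count` survives after `make_counter` has returned.

```python
def make_counter():
    count = 0

    def increment():
        # `count` is not local to `increment`; it resolves to the variable
        # of the enclosing call to `make_counter`, kept alive by the closure.
        nonlocal count
        count += 1
        return count

    return increment


first = make_counter()
second = make_counter()

# `first` and `second` carry separate environments.
results = [first(), first(), second()]
print(results)
```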
A scoping rule largely particular to C (and C++) is file scope, where scope of variables and functions declared at the top level of a file (not within any function) is for the entire file – or rather for C, from the declaration until the end of the source file, or more precisely translation unit (internal linkage). This can be seen as a form of module scope, where modules are identified with files, and in more modern languages is replaced by an explicit module scope. Due to the presence of include statements, which add variables and functions to the internal context and may themselves call further include statements, it can be difficult to determine what is in context in the body of a file.
In the C code snippet above, the function name sum_of_squares has file scope.
In modular programming, the scope of a name can be an entire module, however it may be structured across various files. In this paradigm, modules are the basic unit of a complex program, as they allow information hiding and exposing a limited interface. Module scope was pioneered in the Modula family of languages, and Python (which was influenced by Modula) is a representative contemporary example.
In some object-oriented programming languages that lack direct support for modules, such as C++, a similar structure is instead provided by the class hierarchy, where classes are the basic unit of the program, and a class can have private methods. This is properly understood in the context of dynamic dispatch rather than name resolution and scope, though they often play analogous roles. In some cases both these facilities are available, such as in Python, which has both modules and classes, and code organization (as a module-level function or a conventionally private method) is a choice of the programmer.
A declaration has global scope if it has effect throughout an entire program. Variable names with global scope — called global variables — are frequently considered bad practice, at least in some languages, due to the possibility of name collisions and unintentional masking, together with poor modularity, and function scope or block scope are considered preferable. However, global scope is typically used (depending on the language) for various other sorts of identifiers, such as names of functions, and names of classes and other data types. In these cases mechanisms such as namespaces are used to avoid collisions.
In the Python code snippet above, the function names square and sum_of_squares have global scope.
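The risk of unintentional masking of a global variable can be sketched in Python as follows. The names `counter`, `masked_update`, and `global_update` are hypothetical. An assignment inside a function creates a new local name unless the function declares the name with the `global` statement.

```python
counter = 0


def masked_update():
    # Assignment without a `global` declaration creates a new local name
    # that masks the module-level `counter`.
    counter = 100
    return counter


def global_update():
    # The `global` statement makes the assignment target the module-level name.
    global counter
    counter += 1
    return counter


local_result = masked_update()
after_masked = counter          # the global is unchanged by masked_update
global_result = global_update()
print(local_result, after_masked, global_result, counter)
```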
In C, scope is traditionally known as linkage or visibility, particularly for variables. C is a lexically scoped language with global scope (known as external linkage), a form of module scope or file scope (known as internal linkage), and local scope (within a function); within a function scopes can further be nested via block scope. However, standard C does not support nested functions.
The lifetime and visibility of a variable are determined by its storage class. There are three types of lifetimes in C: static (program execution), automatic (block execution, allocated on the stack), and manual (allocated on the heap). Only static and automatic are supported for variables and handled by the compiler, while manually allocated memory must be tracked manually across different variables. There are three levels of visibility in C: external linkage (global), internal linkage (roughly file), and block scope (which includes functions); block scopes can be nested, and different levels of internal linkage are possible by use of includes. Internal linkage in C is visibility at the translation unit level, namely a source file after being processed by the C preprocessor, notably including all relevant includes.
C programs are compiled as separate object files, which are then linked into an executable or library via a linker. Thus name resolution is split across the compiler, which resolves names within a translation unit (more loosely, "compilation unit", but this is properly a different concept), and the linker, which resolves names across translation units; see linkage for further discussion.
In C, variables with block scope enter scope when they are declared (not at the top of the block), move out of scope if any (non-nested) function is called within the block, move back into scope when the function returns, and move out of scope at the end of the block. In the case of automatic local variables, they are also allocated on declaration and deallocated at the end of the block, while for static local variables, they are allocated at program initialization and deallocated at program termination.
The following program demonstrates a variable with block scope coming into scope partway through the block, then exiting scope (and in fact being deallocated) when the block ends:
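    #include <stdio.h>

    int main(void) {
        char x = 'm';
        printf("%c\n", x);      /* prints m */
        {
            printf("%c\n", x);  /* prints m: the inner x is not yet declared */
            char x = 'b';
            printf("%c\n", x);  /* prints b: the inner x shadows the outer x */
        }
        printf("%c\n", x);      /* prints m: the inner x is out of scope */
    }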
There are other levels of scope in C.[8] Variable names used in a function prototype have function prototype visibility, and exit scope at the end of the function prototype. Since the name is not used, this is not useful for compilation, but may be useful for documentation. Label names for goto statements have function scope, while case label names for switch statements have block scope (the block of the switch).
Static local variables do not lose their value between function calls. In other words, they behave like global variables, but are scoped to the function in which they are defined.
Static global variables are not visible outside of the C file they are defined in.
Static functions are not visible outside of the C file they are defined in.
As well as specifying static lifetime, declaring a variable as static can have other effects depending on where the declaration occurs:
A variable declared static at the top level of a source file (outside any function definitions) is only visible throughout that file ("file scope", also known as "internal linkage").
Variables declared static inside a function are statically allocated, and thus keep their memory cell throughout all program execution, while having the same scope of visibility as automatic local variables, meaning that they remain local to the function. Hence whatever values the function puts into its static local variables during one call will still be present when the function is called again.
In C++, member variables declared static inside class definitions are class variables (shared between all class instances, as opposed to instance variables).
static and extern
static const unsigned int VAL = 42;
const unsigned int ANOTHER_VAL = 37;
The static and extern tags on file-scoped variables determine whether they are accessible in other translation units (i.e. other .c or .cpp files).
static gives the variable internal linkage, hiding it from other translation units. However, variables with internal linkage can be defined in multiple translation units.
extern gives the variable external linkage, making it visible to other translation units. Typically this means that the variable must only be defined in one translation unit.
The default (when you don't specify static or extern) is one of those areas in which C and C++ differ.
In C, file-scoped variables are extern (external linkage) by default. If you're using C, VAL is static and ANOTHER_VAL is extern.
In C++, file-scoped variables are static (internal linkage) by default if they are const, and extern by default if they are not. If you're using C++, both VAL and ANOTHER_VAL are static.
From a draft of the C specification:
6.2.2 Linkages of identifiers ... -5- If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern. If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.
From a draft of the C++ specification:
7.1.1 - Storage class specifiers [dcl.stc] ... -6- A name declared in a namespace scope without a storage-class-specifier has external linkage unless it has internal linkage because of a previous declaration and provided it is not declared const. Objects declared const and not explicitly declared extern have internal linkage.