Declarations are an important part of C++. Function, variable, type, and namespace declarations, among others, give C++ its power and flexibility. In the C++ compiler, initializing the machinery for this part is important but complex, and it also builds the runtime environment of the language.
cxx_init (continue)
410 cxx_init_decl_processing ();
2942 void
2943 cxx_init_decl_processing (void) in decl.c
2944 {
2945 tree void_ftype;
2946 tree void_ftype_ptr;
2947
2948 /* Create all the identifiers we need. */
2949 initialize_predefined_identifiers ();
Besides the reserved words of the language, the compiler needs a set of predefined identifiers whose meaning it knows in advance, because it inserts code of its own to maintain the runtime environment. These names cannot be redeclared in user source code, but they can be used with care (without any forward declaration).
2900 static void
2901 initialize_predefined_identifiers (void) in decl.c
2902 {
2903 const predefined_identifier *pid;
2904
2905 /* A table of identifiers to create at startup. */
2906 static const predefined_identifier predefined_identifiers[] = {
2907 { "C++", &lang_name_cplusplus, 0 },
2908 { "C", &lang_name_c, 0 },
2909 { "Java", &lang_name_java, 0 },
2910 { CTOR_NAME, &ctor_identifier, 1 },
2911 { "__base_ctor", &base_ctor_identifier, 1 },
2912 { "__comp_ctor", &complete_ctor_identifier, 1 },
2913 { DTOR_NAME, &dtor_identifier, 1 },
2914 { "__comp_dtor", &complete_dtor_identifier, 1 },
2915 { "__base_dtor", &base_dtor_identifier, 1 },
2916 { "__deleting_dtor", &deleting_dtor_identifier, 1 },
2917 { IN_CHARGE_NAME, &in_charge_identifier, 0 },
2918 { "nelts", &nelts_identifier, 0 },
2919 { THIS_NAME, &this_identifier, 0 },
2920 { VTABLE_DELTA_NAME, &delta_identifier, 0 },
2921 { VTABLE_PFN_NAME, &pfn_identifier, 0 },
2922 { "_vptr", &vptr_identifier, 0 },
2923 { "__vtt_parm", &vtt_parm_identifier, 0 },
2924 { "::", &global_scope_name, 0 },
2925 { "std", &std_identifier, 0 },
2926 { NULL, NULL, 0 }
2927 };
2928
2929 for (pid = predefined_identifiers; pid->name; ++pid)
2930 {
2931 *pid->node = get_identifier (pid->name);
2932 if (pid->ctor_or_dtor_p)
2933 IDENTIFIER_CTOR_OR_DTOR_P (*pid->node) = 1;
2934 }
2935 }
Each predefined identifier is described by an entry of type predefined_identifier, shown below. Programmers cannot redefine these identifiers and the table itself never changes, so its fields, including node, are declared const.
2888 typedef struct predefined_identifier in decl.c
2889 {
2890 /* The name of the identifier. */
2891 const char *const name;
2892 /* The place where the IDENTIFIER_NODE should be stored. */
2893 tree *const node;
2894 /* Nonzero if this is the name of a constructor or destructor. */
2895 const int ctor_or_dtor_p;
2896 } predefined_identifier;
In the definition of predefined_identifiers at line 2906, the values assigned to the node field all point to members of cp_global_trees, so each one is a unique node across the whole compiler. The macro at line 2933 marks the node as the name of a constructor or destructor.
436 #define IDENTIFIER_CTOR_OR_DTOR_P(NODE) \ in cp-tree.h
437 TREE_LANG_FLAG_3 (NODE)
The flag that gets set is lang_flag_3 in tree_common.
In C++, code is by default written in the global namespace. This namespace is created automatically, without being named explicitly, and is ready before any source code is fed to the compiler. It is initialized by the next part of cxx_init_decl_processing.
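The table-driven setup above can be sketched in isolation. The following is a minimal sketch, not GCC code: the ident type, get_ident, the slot variables and the entry "demo_ctor" are all hypothetical stand-ins (for IDENTIFIER_NODE, get_identifier, members of cp_global_trees and CTOR_NAME respectively), and a linked list replaces the real identifier hash table.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an IDENTIFIER_NODE. */
typedef struct ident {
  const char *name;
  unsigned ctor_or_dtor_p : 1;   /* plays the role of lang_flag_3 */
  struct ident *next;
} ident;

static ident *ident_table;       /* every identifier interned so far */

/* Return the unique node for NAME, creating it on first use.
   NAME must outlive the table (string literals here). */
static ident *get_ident (const char *name)
{
  ident *p;
  for (p = ident_table; p; p = p->next)
    if (strcmp (p->name, name) == 0)
      return p;
  p = calloc (1, sizeof *p);
  assert (p != NULL);
  p->name = name;
  p->next = ident_table;
  ident_table = p;
  return p;
}

/* Well-known slots, analogous to members of cp_global_trees. */
static ident *this_id, *ctor_id, *std_id;

typedef struct {
  const char *const name;        /* spelling of the identifier */
  ident **const node;            /* where to store the interned node */
  const int ctor_or_dtor_p;      /* nonzero for ctor/dtor names */
} predefined_ident;

/* Walk a sentinel-terminated table and fill in every slot. */
static void init_predefined_idents (void)
{
  static const predefined_ident table[] = {
    { "this", &this_id, 0 },
    { "demo_ctor", &ctor_id, 1 },   /* made-up name for illustration */
    { "std", &std_id, 0 },
    { NULL, NULL, 0 }
  };
  const predefined_ident *pid;

  for (pid = table; pid->name; ++pid)
    {
      *pid->node = get_ident (pid->name);
      if (pid->ctor_or_dtor_p)
        (*pid->node)->ctor_or_dtor_p = 1;
    }
}
```

After init_predefined_idents () has run, any later get_ident ("std") returns the same pointer as std_id, which is what lets the rest of a compiler compare identifiers by address instead of by string.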
cxx_init_decl_processing (continue)
2951 /* Fill in back-end hooks. */
2952 lang_missing_noreturn_ok_p = &cp_missing_noreturn_ok_p;
2953
2954 /* Create the global variables. */
2955 push_to_top_level ();
2956
2957 current_function_decl = NULL_TREE;
2958 current_binding_level = NULL;
2959 /* Enter the global namespace. */
2960 my_friendly_assert (global_namespace == NULL_TREE, 375);
2961 global_namespace = build_lang_decl (NAMESPACE_DECL, global_scope_name,
2962 void_type_node);
2963 begin_scope (sk_namespace, global_namespace);
2964
2965 current_lang_name = NULL_TREE;
To cope with name conflicts and pollution of the name space, C++ introduces namespaces. With them come more complex rules for determining the binding scope of a declaration. For example:
1 namespace A {
2 int f;
3 namespace B {
4 struct Z {
5 static float f;
6 void method () { f... }; // use Z::f
7 };
8 float Z::f = 0.0f;
9 extern "C" int ci; // push into global namespace
10 void gFunc() { float f; ... } // use local f
11 void gFunc(int) { f.... } // use A::f
12 void gFunc(float) { Z::f... } // use Z::f
13 }
14 }
Variables named "f" are declared in a namespace, in a class, and in function bodies. These instances can coexist because, according to the C++ standard, only one declaration is selected at each place where "f" is used. The one declared within a function body is a local variable, since its visibility is confined to that body, so the compiler needn't maintain extra information for it (the data kept in the function's own data structure is good enough). Declarations within a namespace or class, however, may still be reached from outside through an appropriately qualified name. For these the compiler needs to keep their information globally, so that at every point of reference the right entity can be found or a correct error message can be generated. The data structure used is saved_scope, below.
688 struct saved_scope GTY(()) in cp-tree.h
689 {
690 cxx_saved_binding *old_bindings;
691 tree old_namespace;
692 tree decl_ns_list;
693 tree class_name;
694 tree class_type;
695 tree access_specifier;
696 tree function_decl;
697 varray_type lang_base;
698 tree lang_name;
699 tree template_parms;
700 tree x_previous_class_type;
701 tree x_previous_class_values;
702 tree x_saved_tree;
703
704 HOST_WIDE_INT x_processing_template_decl;
705 int x_processing_specialization;
706 bool x_processing_explicit_instantiation;
707 int need_pop_function_context;
708
709 struct stmt_tree_s x_stmt_tree;
710
711 struct cp_binding_level *class_bindings;
712 struct cp_binding_level *bindings;
713
714 struct saved_scope *prev;
715 };
Back to the example above: at line 9, extern "C" int ci; makes the compiler add "ci" to the global namespace, so the compiler needs to go back to the global namespace temporarily. At line 10, the compiler needs to restore the previous environment. Whenever the compiler swaps environments like this, it is better to cache the former environment so that it can be restored on return. cxx_saved_binding, used at line 690 above, is defined for this purpose. It forms a list in which the saved declarations are linked together.
4721 struct cxx_saved_binding GTY(()) in name-lookup.c
4722 {
4723 /* Link that chains saved C++ bindings for a given name into a stack. */
4724 cxx_saved_binding *previous;
4725 /* The name of the current binding. */
4726 tree identifier;
4727 /* The binding we're saving. */
4728 cxx_binding *binding;
4729 tree class_value;
4730 tree real_type_value;
4731 };
In this structure, the previous field points to the previously saved record, and so on, so that the saved declarations of the scope are linked into a stack. Besides, several declarations of the same name are allowed as long as every reference can be resolved uniquely, so the binding field at line 4728 ties the declaration to its scope. Its definition is:
73 struct cxx_binding GTY(()) in name-lookup.h
74 {
75 /* Link to chain together various bindings for this name. */
76 cxx_binding *previous;
77 /* The non-type entity this name is bound to. */
78 tree value;
79 /* The type entity this name is bound to. */
80 tree type;
81 /* The scope at which this binding was made. */
82 cxx_scope *scope;
83 unsigned value_is_inherited : 1;
84 unsigned is_local : 1;
85 };
Here, the previous field at line 76 points to another binding structure for the same name, so all declarations of a name are linked together through the previous field of their cxx_binding nodes. A declaration may name a non-type entity or a type; the fields value (line 78) and type (line 80) hold these respectively. A link to the data structure of the scope is also necessary, so scope at line 82 points to the entity standing for the scope, which has the type below.
63 typedef struct cp_binding_level cxx_scope; in name-lookup.h
Besides namespace and class scopes, this type also represents the other kinds of scope in C++: a function (treated the same as a {} block), a try/catch block, function parameters, a for-init-statement, template parameters, and specialized template parameters. The kind is kept in the kind field at line 214. The field level_chain at line 200 refers to the enclosing scope (note that declarations in enclosing scopes remain in effect).
144 struct cp_binding_level GTY(()) in name-lookup.h
145 {
146 /* A chain of _DECL nodes for all variables, constants, functions,
147 and typedef types. These are in the reverse of the order
148 supplied. There may be OVERLOADs on this list, too, but they
149 are wrapped in TREE_LISTs; the TREE_VALUE is the OVERLOAD. */
150 tree names;
151
152 /* Count of elements in names chain. */
153 size_t names_size;
154
155 /* A chain of NAMESPACE_DECL nodes. */
156 tree namespaces;
157
158 /* An array of static functions and variables (for namespaces only) */
159 varray_type static_decls;
160
161 /* A chain of VTABLE_DECL nodes. */
162 tree vtables;
163
164 /* A dictionary for looking up user-defined-types. */
165 binding_table type_decls;
166
167 /* A list of USING_DECL nodes. */
168 tree usings;
169
170 /* A list of used namespaces. PURPOSE is the namespace,
171 VALUE the common ancestor with this binding_level's namespace. */
172 tree using_directives;
173
174 /* If this binding level is the binding level for a class, then
175 class_shadowed is a TREE_LIST. The TREE_PURPOSE of each node
176 is the name of an entity bound in the class. The TREE_TYPE is
177 the DECL bound by this name in the class. */
178 tree class_shadowed;
179
180 /* Similar to class_shadowed, but for IDENTIFIER_TYPE_VALUE, and
181 is used for all binding levels. In addition the TREE_VALUE is the
182 IDENTIFIER_TYPE_VALUE before we entered the class. */
183 tree type_shadowed;
184
185 /* A TREE_LIST. Each TREE_VALUE is the LABEL_DECL for a local
186 label in this scope. The TREE_PURPOSE is the previous value of
187 the IDENTIFIER_LABEL VALUE. */
188 tree shadowed_labels;
189
190 /* For each level (except not the global one),
191 a chain of BLOCK nodes for all the levels
192 that were entered and exited one level down. */
193 tree blocks;
194
195 /* The entity (namespace, class, function) the scope of which this
196 binding contour corresponds to. Otherwise NULL. */
197 tree this_entity;
198
199 /* The binding level which this one is contained in (inherits from). */
200 struct cp_binding_level *level_chain;
201
202 /* List of VAR_DECLS saved from a previous for statement.
203 These would be dead in ISO-conforming code, but might
204 be referenced in ARM-era code. These are stored in a
205 TREE_LIST; the TREE_VALUE is the actual declaration. */
206 tree dead_vars_from_for;
207
208 /* Binding depth at which this level began. */
209 int binding_depth;
210
211 /* The kind of scope that this object represents. However, a
212 SK_TEMPLATE_SPEC scope is represented with KIND set to
213 SK_TEMPALTE_PARMS and EXPLICIT_SPEC_P set to true. */
214 ENUM_BITFIELD (scope_kind) kind : 4;
215
216 /* True if this scope is an SK_TEMPLATE_SPEC scope. This field is
217 only valid if KIND == SK_TEMPLATE_PARMS. */
218 BOOL_BITFIELD explicit_spec_p : 1;
219
220 /* true means make a BLOCK for this level regardless of all else. */
221 unsigned keep : 1;
222
223 /* Nonzero if this level can safely have additional
224 cleanup-needing variables added to it. */
225 unsigned more_cleanups_ok : 1;
226 unsigned have_cleanups : 1;
227
228 /* 22 bits left to fill a 32-bit word. */
229 };
When declarations are made within class scope, they need to be recorded in the fields at lines 178 and 183 so that the compiler can return quickly to the enclosing scope. The field blocks at line 193 refers to the blocks that form local scopes within this scope, and this_entity at line 197 refers to the entity to which the scope belongs.
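The relation between an identifier, its chain of bindings, and the chain of scopes can be illustrated with a small model. This is only a sketch under simplifying assumptions: level, binding, ident, push_binding, pop_binding and lookup are hypothetical names, an int stands in for the bound declaration, and each identifier keeps just one chain (the real cxx_binding separates value and type).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, much reduced counterpart of cp_binding_level. */
typedef struct level {
  int kind;                    /* e.g. 0 = namespace, 1 = class, 2 = block */
  struct level *level_chain;   /* enclosing scope */
} level;

/* Hypothetical, reduced counterpart of cxx_binding. */
typedef struct binding {
  struct binding *previous;    /* same name, bound in an outer scope */
  int value;                   /* stands in for the declaration */
  level *where;                /* scope in which the binding was made */
} binding;

typedef struct ident {
  const char *name;
  binding *bindings;           /* innermost binding of this name */
} ident;

/* Entering a declaration: the new binding hides the older ones. */
static void push_binding (ident *id, int value, level *where)
{
  binding *b = malloc (sizeof *b);
  assert (b != NULL);
  b->value = value;
  b->where = where;
  b->previous = id->bindings;
  id->bindings = b;
}

/* Leaving the scope: the outer binding becomes visible again. */
static void pop_binding (ident *id)
{
  binding *b = id->bindings;
  assert (b != NULL);
  id->bindings = b->previous;
  free (b);
}

/* Unqualified lookup only has to read the head of the chain. */
static int lookup (const ident *id, int fallback)
{
  return id->bindings ? id->bindings->value : fallback;
}
```

With this layout, looking up a name is cheap while entering and leaving a scope costs one chain update per declared name, which is the trade-off the rest of this section keeps running into.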
Function push_to_top_level switches to the global namespace. As long as no function is being compiled, the global variable scope_chain always refers to the current scope.
current_binding_level, used at line 4796, has the following definition. If cfun is non-null, a function is being processed and the macro cp_function_chain accesses the language field of cfun; otherwise scope_chain maintains the non-function scopes.
233 #define current_binding_level \ in name-lookup.h
234 (*(cfun && cp_function_chain->bindings \
235 ? &cp_function_chain->bindings \
236 : &scope_chain->bindings))
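The macro expands to an lvalue, so the same spelling can be read from and assigned to, and the storage it names depends on whether a function is being compiled. A minimal sketch of that idiom follows; fn_state, scope_state, cur_fn, cur_scope, set_level and get_level are hypothetical names chosen for this illustration, not GCC's.

```c
#include <assert.h>
#include <stddef.h>

struct level { int depth; };

/* Stand-ins for the per-function data and for saved_scope. */
struct fn_state    { struct level *bindings; };
struct scope_state { struct level *bindings; };

static struct fn_state *cur_fn;               /* NULL outside any function */
static struct scope_state top_scope;
static struct scope_state *cur_scope = &top_scope;

/* Dereferencing the selected address yields an lvalue: the macro names
   cur_fn->bindings when a function with bindings is active, and
   cur_scope->bindings otherwise. */
#define current_level                 \
  (*(cur_fn && cur_fn->bindings       \
     ? &cur_fn->bindings              \
     : &cur_scope->bindings))

/* Writes go to whichever slot the macro currently selects. */
static void set_level (struct level *l)
{
  assert (cur_scope != NULL);
  current_level = l;
}

static struct level *get_level (void)
{
  assert (cur_scope != NULL);
  return current_level;
}
```

Note that a function whose bindings slot is still null falls through to the non-function slot, which mirrors the second operand of && in the original macro.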
During compilation the compiler often needs to switch scope temporarily. For example, for the statement extern "C" int ci; at line 9 of the example above, the compiler needs to go back to the global namespace, create the tree nodes for ci, and then return to the original scope. In this procedure scope_chain also caches the scope that was temporarily left, and when the compiler returns it restores the original scope from the cache.
4785 void
4786 push_to_top_level (void) in name-lookup.c
4787 {
4788 struct saved_scope *s;
4789 struct cp_binding_level *b;
4790 cxx_saved_binding *old_bindings;
4791 int need_pop;
4792
4793 timevar_push (TV_NAME_LOOKUP);
4794 s = ggc_alloc_cleared (sizeof (struct saved_scope));
4795
4796 b = scope_chain ? current_binding_level : 0;
4797
4798 /* If we're in the middle of some function, save our state. */
4799 if (cfun)
4800 {
4801 need_pop = 1;
4802 push_function_context_to (NULL_TREE);
4803 }
4804 else
4805 need_pop = 0;
4806
4807 old_bindings = NULL;
4808 if (scope_chain && previous_class_type)
4809 old_bindings = store_bindings (previous_class_values, old_bindings);
4810
4811 /* Have to include the global scope, because class-scope decls
4812 aren't listed anywhere useful. */
4813 for (; b; b = b->level_chain)
4814 {
4815 tree t;
4816
4817 /* Template IDs are inserted into the global level. If they were
4818 inserted into namespace level, finish_file wouldn't find them
4819 when doing pending instantiations. Therefore, don't stop at
4820 namespace level, but continue until :: . */
4821 if (global_scope_p (b))
4822 break;
4823
4824 old_bindings = store_bindings (b->names, old_bindings);
4825 /* We also need to check class_shadowed to save class-level type
4826 bindings, since pushclass doesn't fill in b->names. */
4827 if (b->kind == sk_class)
4828 old_bindings = store_bindings (b->class_shadowed, old_bindings);
4829
4830 /* Unwind type-value slots back to top level. */
4831 for (t = b->type_shadowed; t; t = TREE_CHAIN (t))
4832 SET_IDENTIFIER_TYPE_VALUE (TREE_PURPOSE (t), TREE_VALUE (t));
4833 }
4834 s->prev = scope_chain;
4835 s->old_bindings = old_bindings;
4836 s->bindings = b;
4837 s->need_pop_function_context = need_pop;
4838 s->function_decl = current_function_decl;
4839
4840 scope_chain = s;
4841 current_function_decl = NULL_TREE;
4842 VARRAY_TREE_INIT (current_lang_base, 10, "current_lang_base");
4843 current_lang_name = lang_name_cplusplus;
4844 current_namespace = global_namespace;
4845 timevar_pop (TV_NAME_LOOKUP);
4846 }
Note line 4794: when creating the global namespace, the compiler also allocates an instance of saved_scope, so scope_chain always has an empty saved_scope node at its tail. If a function is currently being processed, cfun at line 4799 is non-null, and push_function_context_to saves the function's environment into outer_function_chain, as if a nested function were being entered. This environment is restored too when the compiler returns.
We have seen that declarations of the same name are chained together through cxx_binding nodes. Later we will see that the current implementation removes declarations that are no longer valid from this chain and, on entering a new scope, inserts new cxx_binding nodes. Entering a new scope is therefore usually not a cheap operation. Consider class definitions: almost every program using classes has method definitions like the following:
void A::m1 () { ... }
// back to global namespace
void A::m2 () { ... }
// back to global namespace
....
When m1 is finished, the compiler withdraws to the enclosing scope (generally the global namespace); the next statement introduces m2, and the compiler needs to push the scope of A again. Since it is not uncommon for A to have many methods, it is worth caching the scope. At line 4808 above, previous_class_type points to the outermost class most recently exited (at that time, the current scope is either the same as that of previous_class_type or a non-class scope outside it; otherwise previous_class_type and previous_class_values are both null), and previous_class_values refers to the declarations within that class.
The FOR loop at line 4813 caches all declarations belonging to the scopes from the current scope up to, but not including, the global namespace. At line 4831, type_shadowed records the declarations shadowed by scope b; since the compiler is returning to the top level, these shadowed declarations must be restored (SET_IDENTIFIER_TYPE_VALUE sets up the corresponding type).
403 #define SET_IDENTIFIER_TYPE_VALUE(NODE,TYPE) (TREE_TYPE (NODE) = (TYPE))
Function store_bindings saves those declarations into the chain of cxx_saved_binding, which is finally attached to scope_chain at line 4835 above.
4740 static cxx_saved_binding *
4741 store_bindings (tree names, cxx_saved_binding *old_bindings) in name-lookup.c
4742 {
4743 tree t;
4744 cxx_saved_binding *search_bindings = old_bindings;
4745
4746 timevar_push (TV_NAME_LOOKUP);
4747 for (t = names; t; t = TREE_CHAIN (t))
4748 {
4749 tree id;
4750 cxx_saved_binding *saved;
4751 cxx_saved_binding *t1;
4752
4753 if (TREE_CODE (t) == TREE_LIST)
4754 id = TREE_PURPOSE (t);
4755 else
4756 id = DECL_NAME (t);
4757
4758 if (!id
4759 /* Note that we may have an IDENTIFIER_CLASS_VALUE even when
4760 we have no IDENTIFIER_BINDING if we have left the class
4761 scope, but cached the class-level declarations. */
4762 || !(IDENTIFIER_BINDING (id) || IDENTIFIER_CLASS_VALUE (id)))
4763 continue;
4764
4765 for (t1 = search_bindings; t1; t1 = t1->previous)
4766 if (t1->identifier == id)
4767 goto skip_it;
4768
4769 my_friendly_assert (TREE_CODE (id) == IDENTIFIER_NODE, 135);
4770 saved = cxx_saved_binding_make ();
4771 saved->previous = old_bindings;
4772 saved->identifier = id;
4773 saved->binding = IDENTIFIER_BINDING (id);
4774 saved->class_value = IDENTIFIER_CLASS_VALUE (id);;
4775 saved->real_type_value = REAL_IDENTIFIER_TYPE_VALUE (id);
4776 IDENTIFIER_BINDING (id) = NULL;
4777 IDENTIFIER_CLASS_VALUE (id) = NULL_TREE;
4778 old_bindings = saved;
4779 skip_it:
4780 ;
4781 }
4782 POP_TIMEVAR_AND_RETURN (TV_NAME_LOOKUP, old_bindings);
4783 }
Declarations can share one name, but the corresponding identifier node is unique across the compilation. Which declaration it stands for is determined by the scope. At line 4773, id is the identifier node, and IDENTIFIER_BINDING points to a node on the cxx_binding chain, namely the instance valid in the current scope, whose previous field refers to the instance valid in the enclosing scope. Further, if the current scope is also a class scope, IDENTIFIER_CLASS_VALUE also points to the cxx_binding node referred to by IDENTIFIER_BINDING.
In our current scenario, after entering the global namespace the compiler needn't return, and has nowhere to return to (so scope_chain keeps an empty node and is no longer null). In other scenarios, however, it needs to return to the original scope, which is done by the following function.
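The save-and-clear pattern of store_bindings can be reduced to a few lines. The sketch below rests on loud simplifications: an int replaces the cxx_binding pointer (0 meaning "no binding"), the names are passed as an array instead of a tree chain, and store_bindings_sketch and restore_bindings_sketch are hypothetical names. As in the original, only the records that existed on entry are searched for duplicates; a name repeated within one call is skipped because its binding has already been cleared.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ident {
  const char *name;
  int binding;                  /* 0 = unbound; stands in for IDENTIFIER_BINDING */
} ident;

typedef struct saved_binding {
  struct saved_binding *previous;   /* older saved record */
  ident *identifier;
  int binding;                      /* the value being preserved */
} saved_binding;

/* Save and clear the binding of every identifier in NAMES[0..N-1],
   pushing one record per identifier onto OLD.  Returns the new head. */
static saved_binding *
store_bindings_sketch (ident **names, size_t n, saved_binding *old)
{
  saved_binding *search = old;      /* records present on entry */
  size_t i;

  for (i = 0; i < n; i++)
    {
      ident *id = names[i];
      saved_binding *t, *s;
      int seen = 0;

      if (!id || !id->binding)      /* nothing to save */
        continue;
      for (t = search; t; t = t->previous)
        if (t->identifier == id)
          {
            seen = 1;
            break;
          }
      if (seen)
        continue;

      s = malloc (sizeof *s);
      assert (s != NULL);
      s->previous = old;
      s->identifier = id;
      s->binding = id->binding;
      id->binding = 0;              /* the name is now unbound */
      old = s;
    }
  return old;
}

/* Put every saved binding back and release the records. */
static void restore_bindings_sketch (saved_binding *saved)
{
  while (saved)
    {
      saved_binding *prev = saved->previous;
      saved->identifier->binding = saved->binding;
      free (saved);
      saved = prev;
    }
}
```

Because each identifier is recorded at most once, the order in which the records are replayed on restore does not matter.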
4848 void
4849 pop_from_top_level (void) in name-lookup.c
4850 {
4851 struct saved_scope *s = scope_chain;
4852 cxx_saved_binding *saved;
4853
4854 timevar_push (TV_NAME_LOOKUP);
4855 /* Clear out class-level bindings cache. */
4856 if (previous_class_type)
4857 invalidate_class_lookup_cache ();
4858
4859 current_lang_base = 0;
4860
4861 scope_chain = s->prev;
4862 for (saved = s->old_bindings; saved; saved = saved->previous)
4863 {
4864 tree id = saved->identifier;
4865
4866 IDENTIFIER_BINDING (id) = saved->binding;
4867 IDENTIFIER_CLASS_VALUE (id) = saved->class_value;
4868 SET_IDENTIFIER_TYPE_VALUE (id, saved->real_type_value);
4869 }
4870
4871 /* If we were in the middle of compiling a function, restore our
4872 state. */
4873 if (s->need_pop_function_context)
4874 pop_function_context_from (NULL_TREE);
4875 current_function_decl = s->function_decl;
4876 timevar_pop (TV_NAME_LOOKUP);
4877 }
In other scenarios, the class referred to by previous_class_type may have changed and no longer be needed, so the cache is cleared here. Its original content, however, is recovered by the FOR loop at line 4862.
5548 void
5549 invalidate_class_lookup_cache (void) in class.c
5550 {
5551 tree t;
5552
5553 /* The IDENTIFIER_CLASS_VALUEs are no longer valid. */
5554 for (t = previous_class_values; t; t = TREE_CHAIN (t))
5555 IDENTIFIER_CLASS_VALUE (TREE_PURPOSE (t)) = NULL_TREE;
5556
5557 previous_class_values = NULL_TREE;
5558 previous_class_type = NULL_TREE;
5559 }
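Taken together, push_to_top_level and pop_from_top_level behave like a stack of environments linked through prev. The following is a rough sketch of that discipline only; env, chain, enter_top_level and leave_top_level are hypothetical names, a string stands in for the current namespace, and the sketch assumes that each node carries the state of the environment it represents, so the interrupted state simply stays untouched in the previous node.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily reduced counterpart of saved_scope. */
typedef struct env {
  struct env *prev;             /* environment that was interrupted */
  const char *namespace_name;   /* current namespace of this environment */
} env;

static env *chain;              /* plays the role of scope_chain */

/* Start a fresh top-level environment on top of the current one. */
static void enter_top_level (void)
{
  env *e = calloc (1, sizeof *e);
  assert (e != NULL);
  e->prev = chain;
  e->namespace_name = "::";
  chain = e;
}

/* Drop the top-level environment; the interrupted one is current again. */
static void leave_top_level (void)
{
  env *e = chain;
  assert (e != NULL);
  chain = e->prev;
  free (e);
}
```

The very first call, made during initialization, leaves a single node whose prev is null, which corresponds to the node that is never popped in the scenario described above.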