Study notes on the GCC-3.4.6 source (52)

4.3.1.7. Initializing declaration processing

Declarations are an important part of C++. Function, variable, type, and namespace declarations, among others, give C++ its power and flexibility. In the C++ compiler, initializing the machinery for this part is important but complex; it also builds the runtime environment of the language.

 

cxx_init (continued)

 

410    cxx_init_decl_processing ();

 

2942 void

2943 cxx_init_decl_processing (void)                                                                 in decl.c

2944 {

2945   tree void_ftype;

2946   tree void_ftype_ptr;

2947

2948   /* Create all the identifiers we need.  */

2949   initialize_predefined_identifiers ();

4.3.1.7.1. Predefined identifiers

Besides the language's reserved words, the compiler needs a set of predefined identifiers whose meaning is fixed and known to it; it uses them in the code it inserts to maintain the runtime environment. User code should not redeclare the compiler-internal names among them, although they can be used with care (without any forward declaration).

 

2900 static void

2901 initialize_predefined_identifiers (void)                                                               in decl.c

2902 {

2903   const predefined_identifier *pid;

2904

2905   /* A table of identifiers to create at startup.  */

2906   static const predefined_identifier predefined_identifiers[] = {

2907     { "C++", &lang_name_cplusplus, 0 },

2908     { "C", &lang_name_c, 0 },

2909     { "Java", &lang_name_java, 0 },

2910     { CTOR_NAME, &ctor_identifier, 1 },

2911     { "__base_ctor", &base_ctor_identifier, 1 },

2912     { "__comp_ctor", &complete_ctor_identifier, 1 },

2913     { DTOR_NAME, &dtor_identifier, 1 },

2914     { "__comp_dtor", &complete_dtor_identifier, 1 },

2915     { "__base_dtor", &base_dtor_identifier, 1 },

2916     { "__deleting_dtor", &deleting_dtor_identifier, 1 },

2917     { IN_CHARGE_NAME, &in_charge_identifier, 0 },

2918     { "nelts", &nelts_identifier, 0 },

2919     { THIS_NAME, &this_identifier, 0 },

2920     { VTABLE_DELTA_NAME, &delta_identifier, 0 },

2921     { VTABLE_PFN_NAME, &pfn_identifier, 0 },

2922     { "_vptr", &vptr_identifier, 0 },

2923     { "__vtt_parm", &vtt_parm_identifier, 0 },

2924     { "::", &global_scope_name, 0 },

2925     { "std", &std_identifier, 0 },

2926     { NULL, NULL, 0 }

2927   };

2928

2929   for (pid = predefined_identifiers; pid->name; ++pid)

2930   {

2931     *pid->node = get_identifier (pid->name);

2932     if (pid->ctor_or_dtor_p)

2933       IDENTIFIER_CTOR_OR_DTOR_P (*pid->node) = 1;

2934   }

2935 }

 

Each entry in the table has type predefined_identifier, shown below. The table is fixed when the compiler is built, so every field is declared const. Note that node is a const pointer to a non-const tree slot, which is what allows line 2931 to store the identifier node through it.

 

2888 typedef struct predefined_identifier                                                                    in decl.c

2889 {

2890   /* The name of the identifier.  */

2891   const char *const name;

2892   /* The place where the IDENTIFIER_NODE should be stored.  */

2893   tree *const node;

2894   /* Nonzero if this is the name of a constructor or destructor.  */

2895   const int ctor_or_dtor_p;

2896 } predefined_identifier;

 

In the definition of predefined_identifiers at line 2906, the values assigned to the node field all point to members of cp_global_trees, so each is a unique node across the system. The macro at line 2933 marks a node as naming a constructor or destructor.

 

436    #define IDENTIFIER_CTOR_OR_DTOR_P(NODE) \                                in cp-tree.h

437      TREE_LANG_FLAG_3 (NODE)

 

The flag that gets set is lang_flag_3 in tree_common.

4.3.1.7.2. Global namespace

In C++, code is written in the global namespace by default. This namespace is created automatically, without being declared, and is ready before any source code is fed to the compiler. It is initialized by the next part of cxx_init_decl_processing.
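The table-driven initialization above can be sketched in a few lines of standalone C. This is only an illustration under simplifying assumptions: struct ident, intern, init_predefined, and the three global slots are hypothetical stand-ins (a linked list replaces GCC's hashed identifier table), and the names in the table are illustrative.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an IDENTIFIER_NODE: one node per distinct name. */
struct ident {
    const char *name;
    int ctor_or_dtor_p;          /* plays the role of IDENTIFIER_CTOR_OR_DTOR_P */
    struct ident *next;          /* chain of all interned identifiers */
};

static struct ident *all_idents; /* toy identifier table (a linked list) */

/* Return the unique node for NAME, creating it on first use.
   Same contract as get_identifier, without the hashing. */
static struct ident *intern(const char *name)
{
    struct ident *p;
    for (p = all_idents; p; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;
    p = calloc(1, sizeof *p);
    assert(p != NULL);
    p->name = name;
    p->next = all_idents;
    all_idents = p;
    return p;
}

/* Global slots, analogous to members of cp_global_trees. */
static struct ident *ctor_id, *this_id, *std_id;

struct predefined {
    const char *const name;
    struct ident **const slot;   /* const pointer, writable target */
    const int ctor_or_dtor_p;
};

static void init_predefined(void)
{
    static const struct predefined table[] = {
        { "__ctor", &ctor_id, 1 },   /* illustrative names only */
        { "this",   &this_id, 0 },
        { "std",    &std_id,  0 },
        { NULL, NULL, 0 }            /* sentinel: a null name ends the loop */
    };
    const struct predefined *p;

    for (p = table; p->name; ++p) {
        *p->slot = intern(p->name);  /* store through the const pointer */
        if (p->ctor_or_dtor_p)
            (*p->slot)->ctor_or_dtor_p = 1;
    }
}
```

After init_predefined runs, any later intern("std") returns the same node as std_id, which is the property that lets the compiler compare identifiers by pointer.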

 

cxx_init_decl_processing (continued)

 

2951   /* Fill in back-end hooks.  */

2952   lang_missing_noreturn_ok_p = &cp_missing_noreturn_ok_p;

2953

2954   /* Create the global variables.  */

2955   push_to_top_level ();

2956

2957   current_function_decl = NULL_TREE;

2958   current_binding_level = NULL;

2959   /* Enter the global namespace.  */

2960   my_friendly_assert (global_namespace == NULL_TREE, 375);

2961   global_namespace = build_lang_decl (NAMESPACE_DECL, global_scope_name,

2962                                       void_type_node);

2963   begin_scope (sk_namespace, global_namespace);

2964

2965   current_lang_name = NULL_TREE;

4.3.1.7.2.1. Data structure

To cope with name conflicts and pollution of the name space, C++ introduces namespaces. With them come more complex rules for determining the binding scope of declarations. For example:

1     namespace A {

2        int f;

3        namespace B {

4           struct Z {

5              static float f;

6              void method () { f... };              // use Z::f

7           };

8         float Z::f = 0.0f;

9         extern "C" int ci;                            // push into global namespace

10          void gFunc() { float f; ... }              // use local f

11          void gFunc(int) { f.... }            // use A::f

12          void gFunc(float) { Z::f... }             // use Z::f

13       }

14    }

Variables named "f" are declared at namespace, class, and function scope. These can coexist because, under the C++ standard, exactly one declaration is visible at each place "f" is used. The one declared within a function body is a local variable: its visibility is confined to that body, so the compiler needs no extra information for it (the data kept in the function's own data structure is enough). Those declared within a namespace or class, however, can still be reached from outside through an appropriate qualified name. For declarations at namespace and class scope, the compiler therefore needs to keep information globally, so that at every point of reference the right entity can be found or a correct error message can be generated. The data structure for this is saved_scope, shown below.

 

688  struct saved_scope GTY(())                                                                         in cp-tree.h

689  {

690    cxx_saved_binding *old_bindings;

691    tree old_namespace;

692    tree decl_ns_list;

693    tree class_name;

694    tree class_type;

695    tree access_specifier;

696    tree function_decl;

697    varray_type lang_base;

698    tree lang_name;

699    tree template_parms;

700    tree x_previous_class_type;

701    tree x_previous_class_values;

702    tree x_saved_tree;

703 

704    HOST_WIDE_INT x_processing_template_decl;

705    int x_processing_specialization;

706    bool x_processing_explicit_instantiation;

707    int need_pop_function_context;

708 

709    struct stmt_tree_s x_stmt_tree;

710 

711    struct cp_binding_level *class_bindings;

712    struct cp_binding_level *bindings;

713 

714    struct saved_scope *prev;

715  };

 

Back to the example above: at line 9, extern "C" int ci; makes the compiler add "ci" to the global namespace, so the compiler needs to go back to the global namespace temporarily. At line 10 it needs to restore the previous environment. Whenever the compiler swaps environments like this, it caches the former one so that it can be restored on return. cxx_saved_binding, used at line 690, is defined for this purpose; it forms a list that links the saved bindings together.

 

4721 struct cxx_saved_binding GTY(())                                                       in name-lookup.c

4722 {

4723   /* Link that chains saved C++ bindings for a given name into a stack.  */

4724   cxx_saved_binding *previous;

4725   /* The name of the current binding.  */

4726   tree identifier;

4727   /* The binding we're saving.  */

4728   cxx_binding *binding;

4729   tree class_value;

4730   tree real_type_value;

4731 };

 

In this structure, the previous field points to the previously saved entry, so the entries saved for one scope switch form a chain. Several declarations of the same name are allowed as long as every reference resolves uniquely, so the binding field at line 4728 records the binding that ties the name to a scope. Its definition is:

 

73      struct cxx_binding GTY(())                                                                      in name-lookup.h

74      {

75        /* Link to chain together various bindings for this name.  */

76        cxx_binding *previous;

77        /* The non-type entity this name is bound to.  */

78        tree value;

79        /* The type entity this name is bound to.  */

80        tree type;

81        /* The scope at which this binding was made.  */

82        cxx_scope *scope;

83        unsigned value_is_inherited : 1;

84        unsigned is_local : 1;

85      };

 

Here the previous field at line 76 points to another binding of the same name, so all bindings of a name are linked together through the previous field of each cxx_binding. A name may be bound to a type or to a non-type entity; fields value and type at lines 78 and 80 hold these. A link to the scope's data structure is also necessary, so scope at line 82 points to the object representing the scope, which has the type below.

 

63      typedef struct cp_binding_level cxx_scope;                                         in name-lookup.h

 

Besides namespace and class scopes, this type also represents the other kinds of scope in C++: function bodies (treated the same as {} blocks), try/catch blocks, function parameters, for-init-statements, template parameters, and template specialization parameters. The kind is stored in the kind field at line 214. Field level_chain at line 200 refers to the enclosing scope (note that the enclosing scopes remain in effect).

 

144    struct cp_binding_level GTY(())                                                         in name-lookup.h

145    {

146      /* A chain of _DECL nodes for all variables, constants, functions,

147        and typedef types. These are in the reverse of the order

148        supplied. There may be OVERLOADs on this list, too, but they

149        are wrapped in TREE_LISTs; the TREE_VALUE is the OVERLOAD.  */

150        tree names;

151   

152        /* Count of elements in names chain.  */

153        size_t names_size;

154   

155        /* A chain of NAMESPACE_DECL nodes.  */

156        tree namespaces;

157   

158        /* An array of static functions and variables (for namespaces only) */

159        varray_type static_decls;

160   

161        /* A chain of VTABLE_DECL nodes.  */

162        tree vtables;

163   

164        /* A dictionary for looking up user-defined-types.  */

165        binding_table type_decls;

166   

167        /* A list of USING_DECL nodes.  */

168        tree usings;

169   

170        /* A list of used namespaces. PURPOSE is the namespace,

171          VALUE the common ancestor with this binding_level's namespace.  */

172        tree using_directives;

173   

174        /* If this binding level is the binding level for a class, then

175          class_shadowed is a TREE_LIST. The TREE_PURPOSE of each node

176          is the name of an entity bound in the class. The TREE_TYPE is

177          the DECL bound by this name in the class.  */

178        tree class_shadowed;

179   

180        /* Similar to class_shadowed, but for IDENTIFIER_TYPE_VALUE, and

181          is used for all binding levels. In addition the TREE_VALUE is the

182          IDENTIFIER_TYPE_VALUE before we entered the class.  */

183        tree type_shadowed;

184   

185        /* A TREE_LIST. Each TREE_VALUE is the LABEL_DECL for a local

186          label in this scope. The TREE_PURPOSE is the previous value of

187          the IDENTIFIER_LABEL VALUE.  */

188        tree shadowed_labels;

189   

190        /* For each level (except not the global one),

191          a chain of BLOCK nodes for all the levels

192          that were entered and exited one level down.  */

193        tree blocks;

194   

195        /* The entity (namespace, class, function) the scope of which this

196          binding contour corresponds to. Otherwise NULL.  */

197        tree this_entity;

198   

199       /* The binding level which this one is contained in (inherits from).  */

200        struct cp_binding_level *level_chain;

201   

202        /* List of VAR_DECLS saved from a previous for statement.

203          These would be dead in ISO-conforming code, but might

204          be referenced in ARM-era code. These are stored in a

205          TREE_LIST; the TREE_VALUE is the actual declaration.  */

206        tree dead_vars_from_for;

207   

208        /* Binding depth at which this level began.  */

209        int binding_depth;

210   

211         /* The kind of scope that this object represents. However, a

212          SK_TEMPLATE_SPEC scope is represented with KIND set to

213          SK_TEMPLATE_PARMS and EXPLICIT_SPEC_P set to true.  */

214        ENUM_BITFIELD (scope_kind) kind : 4;

215   

216       /* True if this scope is an SK_TEMPLATE_SPEC scope. This field is

217          only valid if KIND == SK_TEMPLATE_PARMS.  */

218        BOOL_BITFIELD explicit_spec_p : 1;

219   

220        /* true means make a BLOCK for this level regardless of all else.  */

221        unsigned keep : 1;

222   

223        /* Nonzero if this level can safely have additional

224          cleanup-needing variables added to it.  */

225        unsigned more_cleanups_ok : 1;

226        unsigned have_cleanups : 1;

227   

228        /* 22 bits left to fill a 32-bit word.  */

229      };

 

Declarations at class scope are recorded in the fields at lines 178 and 183 so that the enclosing scope can be restored quickly on leaving the class. Field blocks at line 193 refers to the blocks forming local scopes within this scope, and this_entity at line 197 refers to the entity that the scope belongs to.

4.3.1.7.2.2. Enter global namespace

Function push_to_top_level switches to the global namespace. As long as no function is being compiled, the global variable scope_chain always refers to the current scope.

current_binding_level, used at line 4796, is defined as follows. If cfun is non-null, a function is being processed, and the macro cp_function_chain accesses the language field of cfun; otherwise, scope_chain maintains the non-function scopes.
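The relationship between an identifier, its chain of bindings, and the chain of scopes can be sketched as follows. This is a minimal model, not GCC's code: struct scope, struct binding, struct ident, and the three helper functions are hypothetical, and an int stands in for the declaration tree.

```c
#include <assert.h>
#include <stddef.h>

struct scope {
    const char *label;           /* for illustration only */
    struct scope *level_chain;   /* enclosing scope */
};

struct binding {
    struct binding *previous;    /* binding of the same name in an outer scope */
    int value;                   /* stand-in for the bound declaration */
    struct scope *scope;         /* scope in which the binding was made */
};

struct ident {
    const char *name;
    struct binding *bindings;    /* innermost binding first */
};

/* Bind ID to VALUE in scope S; B is caller-provided storage. */
static void push_binding(struct ident *id, struct binding *b,
                         struct scope *s, int value)
{
    b->previous = id->bindings;
    b->value = value;
    b->scope = s;
    id->bindings = b;
}

/* Drop the innermost binding of ID, as happens on leaving its scope. */
static void pop_binding(struct ident *id)
{
    assert(id->bindings != NULL);
    id->bindings = id->bindings->previous;
}

/* Unqualified lookup: the innermost binding wins; NULL if unbound. */
static const struct binding *lookup(const struct ident *id)
{
    return id->bindings;
}
```

With "f" bound first in namespace A and then in class Z, lookup returns the class binding; after pop_binding it returns the namespace binding again, which mirrors how the example's uses of f resolve.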

 

233    #define current_binding_level                  \                                         in name-lookup.h

234      (*(cfun && cp_function_chain->bindings     \

235         ? &cp_function_chain->bindings            \

236          : &scope_chain->bindings))
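The macro selects one of two pointer slots and dereferences the chosen address, so the result can be both read and assigned to. A reduced sketch of that idiom, with hypothetical names (fn_chain, outer_chain, current_level) in place of the GCC ones:

```c
#include <assert.h>
#include <stddef.h>

struct level { int depth; };

struct fn_state    { struct level *bindings; };  /* per-function state */
struct saved_state { struct level *bindings; };  /* non-function state */

struct fn_state    *fn_chain;     /* NULL when no function is being compiled */
struct saved_state *outer_chain;  /* must be non-NULL before the macro is used */

/* Expands to an lvalue: the function's slot if it is in use,
   otherwise the slot of the saved (non-function) state. */
#define current_level                       \
  (*(fn_chain && fn_chain->bindings         \
     ? &fn_chain->bindings                  \
     : &outer_chain->bindings))
```

One consequence worth noting: while the function's slot is still null, an assignment through the macro writes to the outer slot, because the condition is evaluated before the store.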

 

During compilation, the compiler often needs to switch scope temporarily. For example, for the statement extern "C" int ci; at line 9 of the example above, the compiler goes back to the global namespace, creates the tree nodes for ci, and then returns to the original scope. In this procedure, scope_chain also caches the scope that was temporarily left, and when the compiler returns it restores the original scope from the cache.

 

4785 void

4786 push_to_top_level (void)                                                                     in name-lookup.c

4787 {

4788   struct saved_scope *s;

4789   struct cp_binding_level *b;

4790   cxx_saved_binding *old_bindings;

4791   int need_pop;

4792

4793   timevar_push (TV_NAME_LOOKUP);

4794   s = ggc_alloc_cleared (sizeof (struct saved_scope));

4795

4796   b = scope_chain ? current_binding_level : 0;

4797

4798   /* If we're in the middle of some function, save our state.  */

4799   if (cfun)

4800   {

4801     need_pop = 1;

4802     push_function_context_to (NULL_TREE);

4803   }

4804   else

4805     need_pop = 0;

4806

4807   old_bindings = NULL;

4808   if (scope_chain && previous_class_type)

4809     old_bindings = store_bindings (previous_class_values, old_bindings);

4810

4811   /* Have to include the global scope, because class-scope decls

4812     aren't listed anywhere useful.  */

4813   for (; b; b = b->level_chain)

4814   {

4815     tree t;

4816

4817     /* Template IDs are inserted into the global level. If they were

4818       inserted into namespace level, finish_file wouldn't find them

4819       when doing pending instantiations. Therefore, don't stop at

4820       namespace level, but continue until :: .  */

4821     if (global_scope_p (b))

4822       break;

4823

4824     old_bindings = store_bindings (b->names, old_bindings);

4825     /* We also need to check class_shadowed to save class-level type

4826       bindings, since pushclass doesn't fill in b->names.  */

4827     if (b->kind == sk_class)

4828       old_bindings = store_bindings (b->class_shadowed, old_bindings);

4829

4830     /* Unwind type-value slots back to top level.  */

4831     for (t = b->type_shadowed; t; t = TREE_CHAIN (t))

4832       SET_IDENTIFIER_TYPE_VALUE (TREE_PURPOSE (t), TREE_VALUE (t));

4833   }

4834   s->prev = scope_chain;

4835   s->old_bindings = old_bindings;

4836   s->bindings = b;

4837   s->need_pop_function_context = need_pop;

4838   s->function_decl = current_function_decl;

4839

4840   scope_chain = s;

4841   current_function_decl = NULL_TREE;

4842   VARRAY_TREE_INIT (current_lang_base, 10, "current_lang_base");

4843   current_lang_name = lang_name_cplusplus;

4844   current_namespace = global_namespace;

4845   timevar_pop (TV_NAME_LOOKUP);

4846 }

 

Note line 4794: when the global namespace is set up, the compiler also allocates an instance of saved_scope. Thus scope_chain always has an empty saved_scope node at its tail. If a function is currently being processed, cfun at line 4799 is non-null, and push_function_context_to saves the function's environment into outer_function_chain, as if a nested function were being entered. This environment is restored too when the compiler returns.

We have seen that declarations of the same name are chained together through cxx_binding nodes. Later we will see that the compiler's current implementation removes bindings that are no longer valid from this chain and, on entering a new scope, inserts new cxx_binding nodes. Entering a new scope is therefore usually not a cheap operation. Consider class definitions: almost every program that uses classes has method definitions like the following:

void A::m1 () { ... }

// back to global namespace

void A::m2 () { ... }

// back to global namespace

....

After finishing m1, the compiler returns to the enclosing scope (usually the global namespace); the next statement introduces m2, and the compiler has to push the scope of A again. Since it is common for A to have many methods, it is worth caching the scope. At line 4808 above, previous_class_type points to the outermost class most recently exited (at that point the current scope is either the scope of previous_class_type or a non-class scope outside it; otherwise previous_class_type and previous_class_values are both null), and previous_class_values refers to the declarations within it.

The FOR loop at line 4813 saves all declarations belonging to the scopes from the current scope up to, but not including, the global namespace. At line 4831, type_shadowed records the declarations shadowed by scope b; as the compiler returns to the upper level, these shadowed declarations must be recovered (SET_IDENTIFIER_TYPE_VALUE restores the corresponding type).

 

403  #define SET_IDENTIFIER_TYPE_VALUE(NODE,TYPE) (TREE_TYPE (NODE) = (TYPE))
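The unwinding of type_shadowed at lines 4831 and 4832 can be modeled as below. The sketch assumes a simplified representation: struct shadow plays the role of a TREE_LIST node, an int stands in for a type, and shadow_type_value and unwind_type_values are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

struct ident { const char *name; int type_value; };   /* 0 = no type */

struct shadow {
    struct ident *id;        /* like TREE_PURPOSE: the identifier */
    int old_type_value;      /* like TREE_VALUE: its type value before this scope */
    struct shadow *chain;    /* like TREE_CHAIN */
};

struct level {
    struct shadow *type_shadowed;
    struct level *level_chain;   /* enclosing level */
    int is_global;
};

/* Give ID a new type value inside level B, remembering the old one.
   S is caller-provided storage for the record. */
static void shadow_type_value(struct level *b, struct shadow *s,
                              struct ident *id, int new_value)
{
    s->id = id;
    s->old_type_value = id->type_value;
    s->chain = b->type_shadowed;
    b->type_shadowed = s;
    id->type_value = new_value;
}

/* Walk from level B outward, stopping at the global level, and put
   back every type value that was shadowed along the way. */
static void unwind_type_values(struct level *b)
{
    for (; b && !b->is_global; b = b->level_chain) {
        struct shadow *t;
        for (t = b->type_shadowed; t; t = t->chain)
            t->id->type_value = t->old_type_value;
    }
}
```

Because the walk goes from the innermost level outward, the last value written for an identifier is the one it had before the outermost shadowing scope was entered.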

 

Function store_bindings saves the declarations of each scope into the chain of cxx_saved_binding, which is finally attached to scope_chain at line 4835 above.

 

4740 static cxx_saved_binding *

4741 store_bindings (tree names, cxx_saved_binding *old_bindings)              in name-lookup.c

4742 {

4743   tree t;

4744   cxx_saved_binding *search_bindings = old_bindings;

4745

4746   timevar_push (TV_NAME_LOOKUP);

4747   for (t = names; t; t = TREE_CHAIN (t))

4748   {

4749     tree id;

4750     cxx_saved_binding *saved;

4751     cxx_saved_binding *t1;

4752

4753     if (TREE_CODE (t) == TREE_LIST)

4754       id = TREE_PURPOSE (t);

4755     else

4756       id = DECL_NAME (t);

4757

4758     if (!id

4759        /* Note that we may have an IDENTIFIER_CLASS_VALUE even when

4760          we have no IDENTIFIER_BINDING if we have left the class

4761          scope, but cached the class-level declarations.  */

4762        || !(IDENTIFIER_BINDING (id) || IDENTIFIER_CLASS_VALUE (id)))

4763       continue;

4764

4765     for (t1 = search_bindings; t1; t1 = t1->previous)

4766       if (t1->identifier == id)

4767         goto skip_it;

4768

4769     my_friendly_assert (TREE_CODE (id) == IDENTIFIER_NODE, 135);

4770     saved = cxx_saved_binding_make ();

4771     saved->previous = old_bindings;

4772     saved->identifier = id;

4773     saved->binding = IDENTIFIER_BINDING (id);

4774     saved->class_value = IDENTIFIER_CLASS_VALUE (id);;

4775     saved->real_type_value = REAL_IDENTIFIER_TYPE_VALUE (id);

4776     IDENTIFIER_BINDING (id) = NULL;

4777     IDENTIFIER_CLASS_VALUE (id) = NULL_TREE;

4778     old_bindings = saved;

4779   skip_it:

4780       ;

4781   }

4782   POP_TIMEVAR_AND_RETURN (TV_NAME_LOOKUP, old_bindings);

4783 }

 

Declarations can share a name, but the corresponding identifier node is unique across the compilation. Which declaration it denotes is determined by scope. At line 4773, id is the identifier node, and IDENTIFIER_BINDING points to a node on the cxx_binding chain, namely the binding valid in the current scope; its previous field refers to the binding valid in the enclosing scope. Further, if the current scope is a class scope, IDENTIFIER_CLASS_VALUE also records the class-level value of the name, which line 4774 saves in the class_value field.

4.3.1.7.2.3. Return from global namespace

In the current scenario, after entering the global namespace the compiler need not return and has nowhere to return to (so scope_chain keeps one empty node and is no longer null). In other scenarios it must return to the original scope, which is done by the following function.
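The save-and-clear logic of store_bindings, together with the matching restore, can be sketched as follows. It is a reduced model under stated assumptions: an int replaces the cxx_binding pointer (0 meaning "no binding"), the names come from an array instead of a tree chain, and store and restore are hypothetical helpers.

```c
#include <assert.h>
#include <stdlib.h>

struct ident {
    const char *name;
    int binding;               /* stand-in for IDENTIFIER_BINDING; 0 = none */
};

struct saved {
    struct saved *previous;    /* next older saved entry */
    struct ident *id;
    int binding;               /* the binding being saved */
};

/* Save and clear the binding of each identifier in IDS[0..N), pushing
   the records onto OLD. Identifiers without a binding, or already
   present in the list passed in, are skipped. */
static struct saved *store(struct ident **ids, size_t n, struct saved *old)
{
    struct saved *search = old;   /* only entries present on entry are searched */
    size_t i;

    for (i = 0; i < n; i++) {
        struct ident *id = ids[i];
        struct saved *s;
        int seen = 0;

        if (!id->binding)
            continue;
        for (s = search; s; s = s->previous)
            if (s->id == id) { seen = 1; break; }
        if (seen)
            continue;

        s = malloc(sizeof *s);
        assert(s != NULL);
        s->previous = old;
        s->id = id;
        s->binding = id->binding;
        id->binding = 0;          /* the name is now unbound */
        old = s;
    }
    return old;
}

/* Put every saved binding back and release the records. */
static void restore(struct saved *list)
{
    while (list) {
        struct saved *next = list->previous;
        list->id->binding = list->binding;
        free(list);
        list = next;
    }
}
```

A name that appears twice in one call is saved only once, because its binding is already cleared when it is met the second time.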

 

4848 void

4849 pop_from_top_level (void)                                                                 in name-lookup.c

4850 {

4851   struct saved_scope *s = scope_chain;

4852   cxx_saved_binding *saved;

4853

4854   timevar_push (TV_NAME_LOOKUP);

4855   /* Clear out class-level bindings cache.  */

4856   if (previous_class_type)

4857     invalidate_class_lookup_cache ();

4858

4859   current_lang_base = 0;

4860

4861   scope_chain = s->prev;

4862   for (saved = s->old_bindings; saved; saved = saved->previous)

4863   {

4864     tree id = saved->identifier;

4865

4866     IDENTIFIER_BINDING (id) = saved->binding;

4867     IDENTIFIER_CLASS_VALUE (id) = saved->class_value;

4868     SET_IDENTIFIER_TYPE_VALUE (id, saved->real_type_value);

4869   }

4870

4871   /* If we were in the middle of compiling a function, restore our

4872     state.  */

4873   if (s->need_pop_function_context)

4874     pop_function_context_from (NULL_TREE);

4875   current_function_decl = s->function_decl;

4876   timevar_pop (TV_NAME_LOOKUP);

4877 }

 

In other scenarios, the class referred to by previous_class_type may have changed and may no longer be needed, so the cache is cleared here. Its original content, however, is recovered by the FOR loop at line 4862.

 

5548 void

5549 invalidate_class_lookup_cache (void)                                                          in class.c

5550 {

5551   tree t;

5552   

5553   /* The IDENTIFIER_CLASS_VALUEs are no longer valid.  */

5554   for (t = previous_class_values; t; t = TREE_CHAIN (t))

5555     IDENTIFIER_CLASS_VALUE (TREE_PURPOSE (t)) = NULL_TREE;

5556

5557   previous_class_values = NULL_TREE;

5558   previous_class_type = NULL_TREE;

5559 }
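The pairing of push_to_top_level and pop_from_top_level amounts to a stack of saved states linked through prev. A minimal sketch of that pairing, leaving out the bindings and using hypothetical names (push_top, pop_top, in_function as a stand-in for a non-null cfun, and an int for the current function):

```c
#include <assert.h>
#include <stdlib.h>

struct saved_scope {
    int function_decl;               /* saved current function; 0 = none */
    int need_pop_function_context;   /* was a function being compiled? */
    struct saved_scope *prev;        /* state saved by the previous push */
};

struct saved_scope *scope_chain;     /* top of the stack of saved states */
int current_function;                /* 0 = not inside any function */
int in_function;                     /* stand-in for cfun != NULL */

/* Save the current state and start again from the top level. */
void push_top(void)
{
    struct saved_scope *s = calloc(1, sizeof *s);
    assert(s != NULL);
    s->need_pop_function_context = in_function;
    s->function_decl = current_function;
    s->prev = scope_chain;
    scope_chain = s;
    current_function = 0;
    in_function = 0;
}

/* Return to the state saved by the matching push_top. */
void pop_top(void)
{
    struct saved_scope *s = scope_chain;
    assert(s != NULL);
    scope_chain = s->prev;
    in_function = s->need_pop_function_context;
    current_function = s->function_decl;
    free(s);
}
```

The first push_top, made during initialization, is never popped, so one node with a null prev stays at the bottom of the stack, as described for scope_chain above.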
