4.1.3.1.2.3.1.1. #pragma interface & #pragma implementation
As extensions to C++ provided by GCC, #pragma interface “filename” and #pragma implementation “filename” (the filename argument is optional in both) are described in [6], quoted below.
6.3 Vague Linkage
There are several constructs in C++ which require space in the object file but are not clearly tied to a single translation unit. We say that these constructs have “vague linkage”. Typically such constructs are emitted wherever they are needed, though sometimes we can be cleverer.
Inline Functions
Inline functions are typically defined in a header file which can be included in many different compilations. Hopefully they can usually be inlined, but sometimes an out-of-line copy is necessary, if the address of the function is taken or if inlining fails. In general, we emit an out-of-line copy in all translation units where one is needed. As an exception, we only emit inline virtual functions with the vtable, since it will always require a copy.
Local static variables and string constants used in an inline function are also considered to have vague linkage, since they must be shared between all inlined and out-of-line instances of the function.
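As an illustration of the paragraph above (not part of [6]), here is a minimal sketch of an inline function that owns a local static. The names next_id, next_id_address and check_shared_counter are hypothetical. Taking the function's address is one of the cases that forces an out-of-line copy, and the counter has to be the same object whether the call is inlined or goes through the pointer.

```cpp
#include <cassert>

typedef int (*id_fn)();

// Would normally sit in a header included by many translation units.
// The local static has vague linkage: every inlined or out-of-line
// instance of next_id must share this single counter.
inline int next_id()
{
    static int count = 0;
    return ++count;
}

// Taking the address means an out-of-line copy must exist somewhere.
inline id_fn next_id_address()
{
    return &next_id;
}

// A direct call and a call through the pointer advance the same counter.
inline void check_shared_counter()
{
    int a = next_id();
    int b = next_id_address()();
    assert(b == a + 1);
}
```

If two source files both include such a header, the linker (via COMDAT or weak symbols, as described later in this section) is what keeps a single copy of the function and of count.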
VTables
C++ virtual functions are implemented in most compilers using a lookup table, known as a vtable. The vtable contains pointers to the virtual functions provided by a class, and each object of the class contains a pointer to its vtable (or vtables, in some multiple-inheritance situations). If the class declares any noninline, non-pure virtual functions, the first one is chosen as the “key method” for the class, and the vtable is only emitted in the translation unit where the key method is defined.
Note: If the chosen key method is later defined as inline, the vtable will still be emitted in every translation unit which defines it. Make sure that any inline virtuals are declared inline in the class body, even if they are not defined there.
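To make the key method rule concrete, here is a small sketch with hypothetical classes Shape and Square (not taken from [6]). The comments mark which virtual function would be picked under the rule quoted above. In a real project the out-of-line definitions would sit in one source file, and that is the translation unit where the vtable would be emitted.

```cpp
#include <cassert>

class Shape
{
public:
    virtual ~Shape() {}            // inline (defined in the class body)
    virtual int sides() const = 0; // pure virtual, cannot be the key method
    virtual int area() const;      // first non-inline, non-pure virtual: key method
    virtual int perimeter() const; // non-inline too, but not the first
};

// These definitions would live in a single source file, e.g. shape.cc.
int Shape::area() const { return 0; }
int Shape::perimeter() const { return 0; }

class Square : public Shape
{
public:
    explicit Square(int side) : side_(side) {}
    int sides() const { return 4; }              // inline
    int area() const;                            // key method of Square
    int perimeter() const { return 4 * side_; }  // inline
private:
    int side_;
};

int Square::area() const { return side_ * side_; }

// Calls through a base reference are dispatched through the vtable.
inline void check_dispatch()
{
    Square sq(3);
    const Shape& s = sq;
    assert(s.area() == 9);
}
```

A class whose virtual functions are all inline or pure has no key method, so its vtable falls back to the vague linkage treatment.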
type info objects
C++ requires information about types to be written out in order to implement ‘dynamic_cast’, ‘typeid’ and exception handling. For polymorphic classes (classes with virtual functions), the type info object is written out along with the vtable so that ‘dynamic_cast’ can determine the dynamic type of a class object at runtime. For all other types, we write out the type info object when it is used: when applying ‘typeid’ to an expression, throwing an object, or referring to a type in a catch clause or exception specification.
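The two cases in the paragraph above can be sketched as follows, using hypothetical types Animal, Dog and Point. For the polymorphic types, typeid and dynamic_cast consult the dynamic type at run time; for Point, the type info object is needed only because typeid is applied to it.

```cpp
#include <cassert>
#include <typeinfo>

struct Animal { virtual ~Animal() {} };  // polymorphic: has a vtable
struct Dog : Animal {};
struct Point { int x, y; };              // not polymorphic

// typeid on a reference to a polymorphic class yields the dynamic type.
inline bool is_dog(const Animal& a)
{
    return typeid(a) == typeid(Dog);
}

// dynamic_cast yields a null pointer when the object is not a Dog.
inline const Dog* as_dog(const Animal* a)
{
    return dynamic_cast<const Dog*>(a);
}

// For a non-polymorphic type the result is the static type.
inline bool is_point(const Point& p)
{
    return typeid(p) == typeid(Point);
}

inline void check_type_info()
{
    Dog d;
    Animal a;
    assert(is_dog(d));
    assert(!is_dog(a));
    assert(as_dog(&a) == 0);
}
```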
Template Instantiations
Most everything in this section also applies to template instantiations, but there are other options as well.
When used with GNU ld version 2.8 or later on an ELF system such as GNU/Linux or Solaris 2, or on Microsoft Windows, duplicate copies of these constructs will be discarded at link time. This is known as COMDAT support. On targets that don’t support COMDAT, but do support weak symbols, GCC will use them. This way one copy will override all the others, but the unused copies will still take up space in the executable.
For targets which do not support either COMDAT or weak symbols, most entities with vague linkage will be emitted as local symbols to avoid duplicate definition errors from the linker. This will not happen for local statics in inlines, however, as having multiple copies will almost certainly break things.
6.4 #pragma interface and implementation
#pragma interface and #pragma implementation provide the user with a way of explicitly directing the compiler to emit entities with vague linkage (and debugging information) in a particular translation unit.
Note: As of GCC 2.7.2, these #pragmas are not useful in most cases, because of COMDAT support and the “key method” heuristic mentioned in Section 6.3 [Vague Linkage].
Using them can actually cause your program to grow due to unnecessary out-of-line copies of inline functions. Currently (3.4) the only benefit of these #pragmas is reduced duplication of debugging information, and that should be addressed soon on DWARF 2 targets with the use of COMDAT groups.
#pragma interface
#pragma interface "subdir/objects.h"
Use this directive in header files that define object classes, to save space in most of the object files that use those classes. Normally, local copies of certain information (backup copies of inline member functions, debugging information, and the internal tables that implement virtual functions) must be kept in each object file that includes class definitions. You can use this pragma to avoid such duplication. When a header file containing ‘#pragma interface’ is included in a compilation, this auxiliary information will not be generated (unless the main input source file itself uses ‘#pragma implementation’). Instead, the object files will contain references to be resolved at link time.
The second form of this directive is useful for the case where you have multiple headers with the same name in different directories. If you use this form, you must specify the same string to ‘#pragma implementation’.
#pragma implementation
#pragma implementation "objects.h"
Use this pragma in a main input file, when you want full output from included header files to be generated (and made globally visible). The included header file, in turn, should use ‘#pragma interface’. Backup copies of inline member functions, debugging information, and the internal tables used to implement virtual functions are all generated in implementation files.
If you use ‘#pragma implementation’ with no argument, it applies to an include file with the same basename as your source file. For example, in ‘allclass.cc’, giving just ‘#pragma implementation’ by itself is equivalent to ‘#pragma implementation "allclass.h"’.
In versions of GNU C++ prior to 2.6.0 ‘allclass.h’ was treated as an implementation file whenever you would include it from ‘allclass.cc’ even if you never specified ‘#pragma implementation’. This was deemed to be more trouble than it was worth, however, and disabled.
Use the string argument if you want a single implementation file to include code from multiple header files. (You must also use ‘#include’ to include the header file; ‘#pragma implementation’ only specifies how to use the file—it doesn’t actually include it).
There is no way to split up the contents of a single header file into multiple implementation files.
‘#pragma implementation’ and ‘#pragma interface’ also have an effect on function inlining.
If you define a class in a header file marked with ‘#pragma interface’, the effect on an inline function defined in that class is similar to an explicit extern declaration—the compiler emits no code at all to define an independent version of the function. Its definition is used only for inlining with its callers.
Conversely, when you include the same header file in a main source file that declares it as ‘#pragma implementation’, the compiler emits code for the function itself; this defines a version of the function that can be found via pointers (or by callers compiled without inlining). If all calls to the function can be inlined, you can avoid emitting the function by compiling with ‘-fno-implement-inlines’. If any calls were not inlined, you will get linker errors.
So in GCC, a nonzero interface_only means that we are in an "interface" section of the compiler, and a nonzero interface_unknown means that we cannot trust the value of interface_only.
The lexer's handlers, which are invoked on seeing “#pragma interface” and “#pragma implementation” respectively, are helpful for understanding this.
526  static void
527  handle_pragma_interface (cpp_reader* dfile ATTRIBUTE_UNUSED )      in lex.c
528  {
529    tree fname = parse_strconst_pragma ("interface", 1);
530    struct c_fileinfo *finfo;
531    const char *main_filename;
532
533    if (fname == (tree)-1)
534      return;
535    else if (fname == 0)
536      main_filename = lbasename (input_filename);
537    else
538      main_filename = TREE_STRING_POINTER (fname);
539
540    finfo = get_fileinfo (input_filename);
541
542    if (impl_file_chain == 0)
543      {
544        /* If this is zero at this point, then we are
545           auto-implementing. */
546        if (main_input_filename == 0)
547          main_input_filename = input_filename;
548      }
549
550    interface_only = interface_strcmp (main_filename);
551  #ifdef MULTIPLE_SYMBOL_SPACES
552    if (! interface_only)
553  #endif
554      interface_unknown = 0;
555
556    finfo->interface_only = interface_only;
557    finfo->interface_unknown = interface_unknown;
558  }
As we know, GCC uses a splay tree to store the constructed path of each file it tries to open; associated with each constructed path is a value of type c_fileinfo, shown below.
1284  struct c_fileinfo                                                  in c-common.h
1285  {
1286    int time;                  /* Time spent in the file. */
1287    short interface_only;      /* Flags - used only by C++ */
1288    short interface_unknown;
1289  };
Note the initial values given to interface_only and interface_unknown at lines 125 and 126 below.
113  struct c_fileinfo *
114  get_fileinfo (const char *name)                                     in lex.c
115  {
116    splay_tree_node n;
117    struct c_fileinfo *fi;
118
119    n = splay_tree_lookup (file_info_tree, (splay_tree_key) name);
120    if (n)
121      return (struct c_fileinfo *) n->value;
122
123    fi = xmalloc (sizeof (struct c_fileinfo));
124    fi->time = 0;
125    fi->interface_only = 0;
126    fi->interface_unknown = 1;
127    splay_tree_insert (file_info_tree, (splay_tree_key) name,
128                       (splay_tree_value) fi);
129    return fi;
130  }
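The lookup-or-create behaviour of get_fileinfo can be modelled in a few lines. This is only a sketch for study purposes: FileInfo and FileInfoTable are hypothetical names, and std::map keyed by the file name stands in for GCC's splay tree. The point it shows is that a file seen for the first time starts with interface_only = 0 and interface_unknown = 1, and that later lookups return the same record, so values stored by the pragma handlers persist.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Same fields as c_fileinfo in the listing above.
struct FileInfo
{
    int time;
    short interface_only;
    short interface_unknown;
};

class FileInfoTable
{
public:
    // Return the record for `name`; on first use create it with the
    // defaults that get_fileinfo assigns at lines 124-126.
    FileInfo& get(const std::string& name)
    {
        std::map<std::string, FileInfo>::iterator it = table_.find(name);
        if (it != table_.end())
            return it->second;
        FileInfo fresh = { 0, 0, 1 };
        return table_.insert(std::make_pair(name, fresh)).first->second;
    }

    std::size_t size() const { return table_.size(); }

private:
    std::map<std::string, FileInfo> table_;  // stand-in for the splay tree
};

inline void check_defaults()
{
    FileInfoTable t;
    FileInfo& fi = t.get("objects.h");
    assert(fi.interface_only == 0);
    assert(fi.interface_unknown == 1);
}
```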
Then, at line 550, interface_strcmp returns zero if a file with the same basename has also been declared as an implementation. In fact, interface_strcmp searches impl_file_chain which, as its name indicates, is the chain of implementation files. This chain is, of course, filled by the handler of “#pragma implementation”.
568  static void
569  handle_pragma_implementation (cpp_reader* dfile ATTRIBUTE_UNUSED )  in lex.c
570  {
571    tree fname = parse_strconst_pragma ("implementation", 1);
572    const char *main_filename;
573    struct impl_files *ifiles = impl_file_chain;
574
575    if (fname == (tree)-1)
576      return;
577
578    if (fname == 0)
579      {
580        if (main_input_filename)
581          main_filename = main_input_filename;
582        else
583          main_filename = input_filename;
584        main_filename = lbasename (main_filename);
585      }
586    else
587      {
588        main_filename = TREE_STRING_POINTER (fname);
589        if (cpp_included (parse_in, main_filename))
590          warning ("#pragma implementation for %s appears after file is included",
591                   main_filename);
592      }
593
594    for (; ifiles; ifiles = ifiles->next)
595      {
596        if (! strcmp (ifiles->filename, main_filename))
597          break;
598      }
599    if (ifiles == 0)
600      {
601        ifiles = xmalloc (sizeof (struct impl_files));
602        ifiles->filename = main_filename;
603        ifiles->next = impl_file_chain;
604        impl_file_chain = ifiles;
605      }
606  }
Now look back at handle_pragma_interface. Because it calls interface_strcmp to decide whether the file is interface only, a file that is meant to be the implementation part must place “#pragma implementation” at the head of the file, before any “#include” directive. See also line 590 in handle_pragma_implementation: a warning is issued when “#pragma implementation” names a file that has already been included.
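The ordering requirement can be seen in a small model of the two handlers. This is a simplified sketch, not GCC's code: ImplChain and stem are hypothetical names, a std::vector replaces the linked impl_file_chain, and the matching rule (same name, or same name once the final extension is dropped) is only an assumption standing in for the real interface_strcmp. As in GCC, the query returns 0 on a match and 1 otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// File name without its final extension: "allclass.cc" -> "allclass".
inline std::string stem(const std::string& file)
{
    std::string::size_type dot = file.rfind('.');
    return dot == std::string::npos ? file : file.substr(0, dot);
}

class ImplChain
{
public:
    // Models lines 594-605: record the name unless it is already present.
    void add_implementation(const std::string& file)
    {
        for (std::size_t i = 0; i < files_.size(); ++i)
            if (files_[i] == file)
                return;
        files_.push_back(file);
    }

    // Simplified stand-in for interface_strcmp: 0 if some recorded
    // implementation file matches `header`, 1 (interface only) otherwise.
    int interface_only_for(const std::string& header) const
    {
        for (std::size_t i = 0; i < files_.size(); ++i)
            if (files_[i] == header || stem(files_[i]) == stem(header))
                return 0;
        return 1;
    }

    std::size_t size() const { return files_.size(); }

private:
    std::vector<std::string> files_;
};

// A header queried before the implementation is recorded is seen as
// interface only; the same query afterwards finds the match.
inline void check_order_matters()
{
    ImplChain chain;
    assert(chain.interface_only_for("allclass.h") == 1);
    chain.add_implementation("allclass.cc");
    assert(chain.interface_only_for("allclass.h") == 0);
}
```

In this model, a header that is processed before the implementation file is recorded gets interface_only = 1, which is why the pragma has to come before the “#include”.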
These pragmas are handled on the fly, during macro expansion, as the source file is read in. So when reading of a file finishes, the interface-related information can be retrieved from the c_fileinfo associated with that file by extract_interface_info.
436  void
437  extract_interface_info (void)                                       in lex.c
438  {
439    struct c_fileinfo *finfo;
440
441    finfo = get_fileinfo (input_filename);
442    interface_only = finfo->interface_only;
443    interface_unknown = finfo->interface_unknown;
444  }
Back in cb_file_change: as we are now creating the ENTER block for the main file, the condition at line 1509 is not satisfied.