ClearCase supports a well-organized, controlled development environment by maintaining two kinds of data storage:
Permanent data repository — a globally-accessible, shared data resource that can be modified only by ClearCase commands. The repository contains both historical and current development data.
Working data storage — any number of distinct areas which provide “scratchpad storage” for day-to-day development activities. A typical area belongs to an individual user, or to a small group working on the same task.
This chapter and the next describe the contents of versioned object bases, which implement the permanent data repository. Working data storage, implemented by views, is discussed briefly in this chapter and more fully in Chapter 4, “ClearCase Views.”
A versioned object base, or VOB, is the permanent data repository for a development tree or subtree. A VOB stores file system objects: directories, files, symbolic links, and hard links. (It also stores non-file-system information, meta-data, which we discuss in Chapter 3, “ClearCase Meta-Data.”)
Many version-control systems require users to perform their day-to-day work on copies of data, only occasionally accessing the permanent data repository. ClearCase allows users to work directly with the repository — that is, directly with VOBs. Direct access is implemented coherently and securely in the multiple-user, multiple-host environment by combining several mechanisms (Figure 2-1):
View context — Any program, not just a ClearCase program, can read a VOB's data. But a program must use a ClearCase view to access a VOB; otherwise, the VOB appears to be empty. Through the view, the VOB appears to be a standard directory tree.
Client programs — A VOB can be modified only by special ClearCase client programs. Most version-control operations are implemented by cleartool (command-line interface) and by a graphical user interface program. Audited builds are performed with clearmake and clearaudit.
VOB activation — Typically, a VOB is located on a remote host, rather than on the user's own host. A VOB is made available on the user's host by activating it there, through operating system networking facilities. For example, on a UNIX system, a VOB is activated by mounting it as a file system of type MVFS — ClearCase's multiversion file system type.
Server programs — Only ClearCase server programs, running on the host where the VOB physically resides, perform the “real” work: retrieving data from the VOB and updating it. Client programs communicate with server programs using operating system remote-procedure-call (RPC) facilities.
Typically, a user works on his or her own client host, accessing VOBs that are physically located on one or more remote VOB server hosts.
A VOB is implemented as a VOB storage directory, a directory tree whose principal contents are a set of storage pools and an embedded database (Figure 2-2).
A VOB storage pool is a subdirectory that stores users' file system data. Some storage pools provide a repository for historical and currently-used versions of source files. Other storage pools provide a repository for object modules, executables, and other derived objects created during the development process.
Each VOB is created with an initial set of pools, located within the VOB storage directory. These can be supplemented (or replaced) with pools located on the same disk, on another local disk, or on a remote host. This affords administrators great flexibility, enabling data storage to be placed on hosts where ClearCase itself is not installed (for example, a very fast file server machine).
ClearCase can store individual versions of files in several ways, using such techniques as data compression and line-by-line deltas. The data storage/retrieval facility is extensible — users can supply their own type managers, implementing customized methods for storing and retrieving development data. (See “Element Types and Type Managers”.)
Each VOB has a VOB database, a subdirectory containing information managed by the database management system embedded in ClearCase. The VOB database stores version-control information, along with a wealth of other information, including:
user-defined annotations on source file versions
complete “bill-of-materials” records of software builds
event records that chronicle the changes that users have made to the VOB: creation of new versions, adding of annotations, renaming of source files, and so on
This information is termed meta-data, to distinguish it from file system data. We describe it in more detail in Chapter 3, “ClearCase Meta-Data.”
Each ClearCase VOB stores version-controlled file system objects, termed elements. An element is a file or directory for which ClearCase maintains multiple versions. The versions of an element are logically organized into a hierarchical version tree, which can include multiple branches and subbranches (Figure 2-3).
Some elements may have version trees with a single branch — the versions form a simple linear sequence. But typically, users define additional branches in some of the elements, in order to isolate work on such tasks as bug fixing, code reorganization, experimentation, and platform-specific development.
Figure 2-3 illustrates several features of ClearCase version trees:
Each element is created with a single branch, named main, which has an empty version, numbered 0.
ClearCase automatically assigns integer version numbers to versions. Each version can also have one or more user-defined version labels (for example, REL1, REL2_BETA, REL2).
One or more branches can be created at any version, each with a user-defined name. (Branches are not numbered in any way.)
ClearCase supports multiple branching levels.
Version 0 on a branch is identical to the version at the branch point.
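The rules in this list can be sketched as a small in-memory model. This is an illustration only: the `Element` and `Branch` classes, their methods, and the sample file contents are hypothetical and do not correspond to any ClearCase interface.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """One branch of a hypothetical in-memory version tree."""
    path: str                                     # branch pathname, e.g. "/main/bugs"
    versions: list = field(default_factory=list)  # list index == version number


class Element:
    """Toy model of an element; not a ClearCase data structure."""

    def __init__(self):
        # Every element starts with a branch named main holding empty version 0.
        self.branches = {"/main": Branch("/main", [b""])}

    def checkin(self, branch_path, data):
        """Append a version; its integer number is assigned automatically."""
        branch = self.branches[branch_path]
        branch.versions.append(data)
        return f"{branch_path}/{len(branch.versions) - 1}"

    def make_branch(self, version_id, name):
        """Create a user-named branch at an existing version."""
        parent_path, _, number = version_id.rpartition("/")
        contents = self.branches[parent_path].versions[int(number)]
        new_path = f"{parent_path}/{name}"
        # Version 0 on the new branch is identical to the branch point.
        self.branches[new_path] = Branch(new_path, [contents])
        return new_path


elem = Element()
v1 = elem.checkin("/main", b"first draft\n")
v2 = elem.checkin("/main", b"second draft\n")
bugs = elem.make_branch(v1, "bugs")
fix = elem.checkin(bugs, b"first draft, patched\n")
print(v1, v2, bugs, fix)
```

Because a branch can be created at any version of any branch, repeating `make_branch` on a subbranch version yields the multiple branching levels described above.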
Each version of an element has a unique version-ID, which indicates its location in the version tree. A version-ID takes the form of a branch-pathname followed by a version-number (Figure 2-4).
For example, the shaded versions at the end of the branches in Figure 2-3 have these version-IDs:
    /main/5
    /main/motif/2
    /main/bugs/3
    /main/bugs/bug404/1
    /main/bugs/bug417/1
Version-IDs look like pathnames because an element's version tree has the same hierarchical structure as a directory tree. Version-IDs work like pathnames, too, as described below.
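Since a version-ID is a branch-pathname followed by a version-number, it can be taken apart with ordinary string handling. The helper names below are invented for this sketch.

```python
def parse_version_id(version_id):
    """Split a version-ID into (branch_pathname, version_number)."""
    branch_path, sep, number = version_id.rpartition("/")
    if not sep or not branch_path.startswith("/") or not number.isdigit():
        raise ValueError(f"not a version-ID: {version_id!r}")
    return branch_path, int(number)


def branch_levels(version_id):
    """Return the branch names from main down to the version's own branch."""
    branch_path, _ = parse_version_id(version_id)
    return branch_path.strip("/").split("/")


for vid in ["/main/5", "/main/motif/2", "/main/bugs/bug404/1"]:
    print(vid, parse_version_id(vid), branch_levels(vid))
```

A string such as /main/REL2 is rejected here, because its last component is a label rather than a version number.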
Typically, users wish to access only one version of an element at a time. A ClearCase view provides a work environment in which at most one version (or perhaps no version at all) of each version-controlled object appears. Thus, a particular view might resolve a simple file name or a full pathname:
    util.c ... or ... /vobs/vega/src/util.c
... to version /main/5 of file element util.c. Another view might resolve the same pathname to version /main/motif/2.
The ability to reference any version with a standard operating system pathname is a very important ClearCase feature, termed transparency. This is what makes a VOB appear to be a standard directory tree, as illustrated in Figure 2-1. Standard operating system utilities and third-party development tools “just work” with VOB data, without requiring any modification, conversion, or wrapper routines. For more on transparency, see Chapter 4, “ClearCase Views.”
Any version of an element can be uniquely specified by appending its version-ID to its standard pathname, forming a version-extended pathname. Figure 2-5 shows an example.
A version-extended pathname can also use a version label; this enables users to reference versions mnemonically:
    % cat util.c@@/REL2
    % diff messages.c messages.c@@/REL3_BSLVL2
    % /usr/local/bin/validate bgrs.h@@/NEW_STRUCT_TEST
Version-extended pathnames can be used in standard commands, either alone or in conjunction with standard pathnames. For example, this command searches for a character string in the “current” version and in two “historical” versions:
    % grep "InLine" monet.c monet.c@@/main/13 monet.c@@/RLS2.5
The extended naming symbol (by default, @@) suppresses ClearCase's automatic version-selection mechanism. It causes the MVFS to interpret the pathname as a reference to the element itself, or to some location in its version tree:
    util.c            (the version selected by the view)
    util.c@@          (the element itself)
    util.c@@/main     (one of the element's branches)
    util.c@@/main/2   (a specific version of the element)
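The four forms above can be told apart by what follows the extended naming symbol. The sketch below classifies a pathname by syntax alone. The function name and the "kind" strings are made up for illustration, and a real MVFS would also consult the element's version tree.

```python
EXTENDED_NAMING_SYMBOL = "@@"  # the default symbol mentioned in the text


def split_extended(pathname, symbol=EXTENDED_NAMING_SYMBOL):
    """Return (standard_pathname, kind, selector) for a pathname."""
    standard, sep, selector = pathname.partition(symbol)
    if not sep:
        # No symbol: the view's version-selection mechanism applies.
        return standard, "view-selected version", None
    if selector == "":
        return standard, "element", None
    last_component = selector.rsplit("/", 1)[-1]
    if last_component.isdigit():
        return standard, "version", selector
    # Syntax alone cannot tell a branch (/main) from a label (/REL2).
    return standard, "branch or label", selector


for name in ["util.c", "util.c@@", "util.c@@/main", "util.c@@/main/2"]:
    print(split_extended(name))
```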
The entire version tree of an element is embedded under its standard pathname. This enables users to access any (or all) historical versions through pathnames, using any program. The following variant of the above grep command shows the power of this file system extension:
    % grep "InLine" monet.c@@/main/*
In this command, a standard shell wildcard character (*) specifies all the versions on an element's main branch.
This section describes ClearCase facilities for managing the creation of new versions and branches in version-controlled elements.
Like many version-control systems, ClearCase uses a checkout-edit-checkin model to manage the growth of elements' version trees:
In the “steady-state”, an element is read-only: users can neither modify it nor remove it.
To modify an element, a user establishes a view context, then enters a checkout command. This seems simply to change the element from read-only to read-write; in actuality, it makes an editable copy — a checked-out version. (For details, see “Revising a Source File / Checkout-Edit-Checkin”.)
The user revises the checked-out version using any available system-supplied or third-party tools.
The user enters a checkin command. This creates a new, permanent version of the element, which then reverts to the “steady-state” of being read-only.
When entering a checkout command, the user implicitly expresses an intention to create a new version of the element at some future time. ClearCase supports either a “first-come-first-served” approach or an “exclusive privilege” approach to managing the creation of new versions.
A reserved checkout grants the exclusive privilege to create the next version on the branch; a branch can have only one reserved checkout at a time. But a branch can have any number of unreserved checkouts. If several users, in different views, have unreserved checkouts of a branch, the first one to perform a checkin “wins”; the others must combine the changes in the newly-created version into their own work before they can perform a checkin. (See “Merging Versions of an Element”.)
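The interplay of reserved and unreserved checkouts can be sketched as a small state machine for a single branch. The class, its methods, and the view names are hypothetical. The sketch also assumes, based on the “exclusive privilege” wording above, that an unreserved checkout cannot be checked in while another view holds the reservation.

```python
class CheckoutError(Exception):
    pass


class BranchState:
    """Toy model of one branch; views are identified by plain strings."""

    def __init__(self):
        self.latest = 0          # number of the newest version on the branch
        self.reserved_by = None  # view holding the reserved checkout, if any
        self.checkouts = {}      # view -> version number its checkout is based on

    def checkout(self, view, reserved=True):
        if view in self.checkouts:
            raise CheckoutError(f"{view} already has this branch checked out")
        if reserved:
            if self.reserved_by is not None:
                raise CheckoutError("branch already has a reserved checkout")
            self.reserved_by = view
        self.checkouts[view] = self.latest

    def merge(self, view):
        """Stand-in for a real merge: only records the new base version."""
        self.checkouts[view] = self.latest

    def checkin(self, view):
        base = self.checkouts[view]
        if self.reserved_by not in (None, view):
            raise CheckoutError("another view holds the reserved checkout")
        if base != self.latest:
            raise CheckoutError("merge from the newest version first")
        del self.checkouts[view]
        if self.reserved_by == view:
            self.reserved_by = None
        self.latest += 1
        return self.latest


branch = BranchState()
branch.checkout("view_a", reserved=False)
branch.checkout("view_b", reserved=False)
print(branch.checkin("view_a"))          # the first checkin wins
try:
    branch.checkin("view_b")
except CheckoutError as err:
    print("view_b:", err)
branch.merge("view_b")                   # combine the newly created version
print(branch.checkin("view_b"))
```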
In a parallel development environment, the same element can be modified by several projects concurrently; each project creates versions on its own branch, isolated from the changes being made by other projects on other branches.
Figure 2-6 shows an element with two subbranches: the r2_fix branch is for maintenance work; the gopher_port branch is for porting the application to a new architecture. This figure also shows how the checkout command supports parallel development, by localizing its effects:
The checkout command operates on a particular branch of an element, not on the entire element. Several different branches might be checked out at the same time, each for a different programming task. (checkout always checks out the last version on a branch; revising an intermediate version requires creation of a subbranch at that version.)
An element's branch is checked out to one particular view. The checked-out version is visible only in that view; the checkout command leaves all other views unaffected. Each view can check out only one branch at a time of any given element, because the view can see only one object at the element's pathname.
In a parallel development environment, different branches of an element can remain active for arbitrarily long periods. The contents of the branches inevitably diverge more and more as time passes. In some cases, this may be acceptable; but most organizations wish to keep an element's main branch up-to-date — with bugfixes, with architecture-dependent constructs, and so on. This means that changes made on subbranches must periodically be merged back into the main branch. It is also typical for data to be merged onto other branches — for example, a branch used to develop a port of an application.
ClearCase includes flexible and powerful merge tools. Often, the process of merging all the work performed for a particular development project is completely automatic. When conflicting changes to the same section of code prevent a merge from being completely automatic, users can invoke the xcleardiff graphical merge utility to resolve the conflicts and perform any necessary cleanup (Figure 2-7).
ClearCase supports many variants of the basic merge scenario. For example:
Any version of an element can be merged into any other version; the target need not be on the main branch.
A selective merge can incorporate some, but not all, of the changes made on one branch into another branch.
A subtractive merge can remove the changes made in a set of versions.
A version is actually a two-part compound object:
Object in the VOB database — There is a version object in the VOB database. Pointers to branch and element objects establish the location of the version object in some element's version tree. The version object can have user annotations (for example, version labels); it is referenced by various ClearCase-generated event records and build configuration records.
File in a VOB storage pool — There is a data container file in one of the VOB's source storage pools. This file contains the version's file system data.
Figure 2-8 illustrates the relationship between the two components of a version. In general, ClearCase commands operate directly on the database object; standard programs operate on the associated data container.
In general, users need not be concerned with the dual nature of a version. Some ClearCase features, however, are best understood by keeping the duality in mind. For example:
Users can create versions that consist only of a VOB database object (along with its user annotations and ClearCase-generated records). Similarly, the data container of an existing version can be discarded, leaving only the database object.
Administrators can move the data containers of existing versions to different storage pools.
A view can be configured so that the file system data of some versions disappear, but the corresponding database objects remain accessible to ClearCase client programs.
Each element has an element type, which can be used for either or both of these purposes:
to provide a logical or functional classification for the element
to determine how the element's versions are to be stored and retrieved
ClearCase has several predefined element types, each of which represents a different physical file format for data container file(s) in a source storage pool. Each element type has its own data-access method, which optimizes either performance or storage requirements:
    file: arbitrary sequence of bytes; each version is stored in whole-copy format (“as-is”) in a separate data container file
    text_file: sequence of non-NULL bytes, separated into “text lines” by <NL> characters; all versions are stored efficiently as deltas (incremental differences) in a single structured data container file
    compressed_file and compressed_text_file
The element types file and compressed_file can be used to version-control any file, including executables, programming libraries, databases, bitmap images, and non-ASCII desktop-publishing documents.
ClearCase implements a performance optimization for all file elements whose versions are not stored in whole-copy format. This includes text files (stored in delta format) and compressed files. ClearCase attempts to amortize the cost of “extracting” a particular version (by applying deltas, or by uncompressing) over multiple accesses to the version:
The first time an existing version is extracted from its data container, the “clear text” of the version is stored in a data container in a cleartext storage pool.
On subsequent accesses to the same version, the cleartext data container is used again, eliminating the need for another “extraction”.
Thus, a cleartext storage pool acts as a cache of the file system data for recently-accessed versions.
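The caching behavior can be sketched in a few lines. Here zlib decompression stands in for the “extraction” step, and the class name, dictionary-based pools, and sample data are assumptions made for the sketch. Actual storage pools are directories of data container files.

```python
import zlib


class CleartextCache:
    """Toy cache keyed by (element name, version-ID)."""

    def __init__(self, source_pool):
        self.source_pool = source_pool  # key -> compressed container bytes
        self.cleartext_pool = {}        # key -> extracted bytes
        self.extractions = 0            # how many times extraction really ran

    def read(self, key):
        if key not in self.cleartext_pool:
            # First access: pay the extraction cost once and keep the result.
            self.cleartext_pool[key] = zlib.decompress(self.source_pool[key])
            self.extractions += 1
        return self.cleartext_pool[key]


pool = {("util.c", "/main/5"): zlib.compress(b"int main(void) { return 0; }\n")}
cache = CleartextCache(pool)
first = cache.read(("util.c", "/main/5"))
second = cache.read(("util.c", "/main/5"))
print(first == second, cache.extractions)
```

The second read is served from the cleartext pool, so the extraction counter stays at one.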
Users can define their own element types to classify elements functionally, enabling such operations as:
easily identifying and defining operations on all of a project's header files
using a particular icon to represent header files in the ClearCase GUI
using different version-selection criteria for different element types
requiring that C-language source files (but not header files) be subjected to a lint(1) check when they are modified
ClearCase includes an automatic file typing mechanism, which enables element types to be applied efficiently and consistently. For example, new file elements whose names end with .c might automatically be assigned the user-defined element type c_source.
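The idea behind automatic file typing can be sketched as an ordered list of name patterns. The rule table, the c_header type, and the function name are hypothetical. This is not the syntax or mechanism ClearCase itself uses to express typing rules.

```python
from fnmatch import fnmatchcase

# Hypothetical rules, checked in order; the first matching pattern wins.
TYPING_RULES = [
    ("*.c", "c_source"),     # user-defined type named in the text
    ("*.h", "c_header"),     # assumed user-defined type
    ("*.txt", "text_file"),  # predefined type
]
DEFAULT_TYPE = "file"        # predefined type for arbitrary bytes


def element_type_for(name):
    """Pick an element type for a new file element from its name."""
    for pattern, element_type in TYPING_RULES:
        if fnmatchcase(name, pattern):
            return element_type
    return DEFAULT_TYPE


for name in ["util.c", "util.h", "README.txt", "logo.bmp"]:
    print(name, "->", element_type_for(name))
```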
Another reason for ClearCase sites to define their own element types is to handle different physical or logical file formats. Each element type has an associated suite of programs, its type manager, which handles the task of accessing the element's versions in a VOB storage pool. For example, the type manager text_file_delta implements the efficient storage of version-to-version changes as deltas in a single, structured data container file.
A user-defined element type can have its own, user-supplied type manager suite. For example:
A type manager for elements of type bitmap might compute incremental differences between versions of a bitmap file, and store all the versions in a single, structured data container file.
A type manager for elements of type frame_file might use special routines to compare two versions of a FrameMaker® document.
A type manager for elements of type manual_page might compare two versions of a manual page source file by first formatting them with the standard nroff program.
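Conceptually, a type manager pairs a method for storing a version with a method for retrieving it. The sketch below shows that shape with two simple strategies. The interface, class names, and lookup table are hypothetical, and a real type manager is a suite of programs rather than a Python class.

```python
import zlib
from abc import ABC, abstractmethod


class TypeManager(ABC):
    """Hypothetical interface: version data to container bytes and back."""

    @abstractmethod
    def store(self, data: bytes) -> bytes:
        """Return the bytes to write to a data container."""

    @abstractmethod
    def retrieve(self, container: bytes) -> bytes:
        """Return the version's file system data from a container."""


class WholeCopyManager(TypeManager):
    """Stores each version as-is."""

    def store(self, data):
        return data

    def retrieve(self, container):
        return container


class CompressedManager(TypeManager):
    """Stores each version compressed with zlib."""

    def store(self, data):
        return zlib.compress(data)

    def retrieve(self, container):
        return zlib.decompress(container)


# Keyed by element type for this sketch only.
MANAGERS = {"file": WholeCopyManager(), "compressed_file": CompressedManager()}

data = b"A" * 1000
for element_type, manager in MANAGERS.items():
    container = manager.store(data)
    print(element_type, len(container), manager.retrieve(container) == data)
```

A user-supplied manager for a type such as bitmap would implement the same two operations with its own storage format.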
Most implementations of UNIX support two kinds of links:
A hard link is a directory entry that provides an alternative name for an existing file. The file's “original” name and all its additional hard links are completely equivalent.
A symbolic link is a directory entry whose contents is a character string that specifies a full or relative pathname. The link acts as a pointer to a file or directory (or another symbolic link). Many UNIX functions and commands “chase” this pointer: a reference to a symbolic link becomes a reference to the object at the full or partial pathname specified by the link.
As counterparts to standard UNIX links, ClearCase supports VOB hard links and VOB symbolic links:
A VOB hard link is a directory entry that provides an additional name for an existing element in the same VOB.
A VOB symbolic link is an object whose contents is a character string that specifies a pathname. The pathname can specify a location in the same VOB, in a different VOB, or a non-VOB location.
An essential difference between these kinds of objects is that VOB symbolic links do not have version trees (since they do not name elements). Version-control of VOB symbolic links is accomplished through directory versioning, as described in the next section.
Some version-control systems are good at tracking the changes to the contents of files, but have little or no ability to handle changes to the names of files. Name changes are a fact of life in long-lived projects. So are such changes as creating new files, deleting obsolete files, moving a file to a different directory, merging source files together, and completely reorganizing a multiple-level directory structure. ClearCase addresses these needs by providing version-control of directories.
Each version of a ClearCase directory element is analogous to a standard UNIX directory:
A UNIX directory is a list of names, each of which points (through an inode number) to a file system object — file, directory, and so on.
A version of a ClearCase directory element is a list of names, each of which points (through a ClearCase-internal object-ID, or OID) to a file element, a directory element, or a VOB symbolic link (Figure 2-9).
In many respects, a directory element resembles a file element:
It has a version tree.
Any directory version can be accessed with a version-extended pathname. For example, /vobs/vega/src@@/main/3 references a particular version of directory element src.
A ClearCase view selects a particular version of each directory element. The selected versions of all directory elements in a VOB constitute a namespace; different views can instantiate different namespaces.
Figure 2-10 shows the critical difference: each version of a file element is a regular file; each version of a directory element is a list of element and VOB symbolic link names.
A version of a file element is accessed by first traversing a hierarchy of directory versions. Accessing a particular version of element /vobs/vega/src/util.c involves:
accessing some version of /vobs/vega (the VOB's top-level or root directory)
accessing a version of directory element src
accessing the desired version of util.c
Although directory versions are essential for managing long-lived projects, users need not think about them on a day-to-day basis. ClearCase's transparency feature, discussed for file elements in “Letting the View Select Versions”, works for directory elements, also. Thus, a view would resolve the standard pathname /vobs/vega/src/util.c automatically by selecting a version of directory vega, a version of directory src, and a version of file util.c. (For more, see “Transparency and Its Alternatives”.)
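The resolution steps can be sketched with a table of directory versions, each mapping names to object-IDs. All OIDs, version-IDs, and the rename from util.c to utility.c are invented for illustration. The `selected` mapping stands in for the choices a view would make.

```python
# Hypothetical data: each directory version maps names to object-IDs (OIDs).
DIRECTORY_VERSIONS = {
    ("oid-vega", "/main/2"): {"src": "oid-src"},
    ("oid-src", "/main/3"): {"util.c": "oid-util"},
    ("oid-src", "/main/4"): {"utility.c": "oid-util"},  # file renamed here
}


def resolve(root_oid, relative_path, selected):
    """Walk a pathname using the selected version of each directory.

    Returns (oid, version_id) for the final object, or None if a name is
    absent from the selected directory version.
    """
    oid = root_oid
    for name in relative_path.split("/"):
        entries = DIRECTORY_VERSIONS.get((oid, selected[oid]))
        if entries is None or name not in entries:
            return None
        oid = entries[name]
    return oid, selected[oid]


old_view = {"oid-vega": "/main/2", "oid-src": "/main/3", "oid-util": "/main/5"}
new_view = {"oid-vega": "/main/2", "oid-src": "/main/4", "oid-util": "/main/5"}
print(resolve("oid-vega", "src/util.c", old_view))
print(resolve("oid-vega", "src/util.c", new_view))
print(resolve("oid-vega", "src/utility.c", new_view))
```

The same file element is reached under different names depending on which version of directory src is selected, which is how a rename is recorded without losing the element's history.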