A graph is a set of vertices and a collection of edges that each connect a pair of vertices. Vertex names are not important to the definition, but we need a way to refer to vertices. By convention, we use the names 0 through V-1 for the vertices in a V-vertex graph.
Glossary
A substantial amount of nomenclature is associated with graphs. Most of the terms have straightforward definitions, and, for reference, we consider them in one place: here.
When there is an edge connecting two vertices, we say that the vertices are adjacent to one another and that the edge is incident to both vertices. The degree of a vertex is the number of edges incident to it. A subgraph is a subset of a graph’s edges (and associated vertices) that constitutes a graph. Many computational tasks involve identifying subgraphs of various types. Of particular interest are edges that take us through a sequence of vertices in a graph.
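To make adjacency and degree concrete, here is a minimal sketch that stores a small undirected graph as lists of neighbors. The class name `DegreeSketch` and its methods are hypothetical names chosen for illustration; they are not part of the text's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the text): neighbor lists for an undirected graph.
class DegreeSketch {
    // Builds neighbor lists for a V-vertex graph from edge pairs {v, w}.
    static List<List<Integer>> build(int V, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);   // the edge is incident to both endpoints
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Degree of v: the number of edges incident to v.
    static int degree(List<List<Integer>> adj, int v) {
        return adj.get(v).size();
    }

    // Two vertices are adjacent when an edge connects them.
    static boolean adjacent(List<List<Integer>> adj, int v, int w) {
        return adj.get(v).contains(w);
    }

    public static void main(String[] args) {
        int[][] edges = { {0, 1}, {0, 2}, {1, 2}, {2, 3} };
        List<List<Integer>> adj = build(4, edges);
        System.out.println(degree(adj, 2));       // prints 3
        System.out.println(adjacent(adj, 0, 3));  // prints false
    }
}
```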
A path in a graph is a sequence of vertices connected by edges. A simple path is one with no repeated vertices. A cycle is a path with at least one edge whose first and last vertices are the same. A simple cycle is a cycle with no repeated edges or vertices (except the requisite repetition of the first and last vertices). The length of a path or a cycle is its number of edges.
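These definitions can be checked mechanically on a given vertex sequence. The sketch below is one possible reading of them, assuming the graph is given as a boolean matrix in which `adj[v][w]` is true when an edge connects v and w; `PathSketch` and its method names are illustrative only.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (not from the text): tests vertex sequences against the definitions.
class PathSketch {
    // Builds a symmetric boolean matrix from edge pairs {v, w}.
    static boolean[][] matrix(int V, int[][] edges) {
        boolean[][] adj = new boolean[V][V];
        for (int[] e : edges) {
            adj[e[0]][e[1]] = true;
            adj[e[1]][e[0]] = true;
        }
        return adj;
    }

    // A path: every consecutive pair of vertices is connected by an edge.
    static boolean isPath(boolean[][] adj, int[] seq) {
        for (int i = 0; i + 1 < seq.length; i++)
            if (!adj[seq[i]][seq[i + 1]]) return false;
        return seq.length > 0;
    }

    // A simple path: a path with no repeated vertices.
    static boolean isSimplePath(boolean[][] adj, int[] seq) {
        Set<Integer> seen = new HashSet<>();
        for (int v : seq)
            if (!seen.add(v)) return false;
        return isPath(adj, seq);
    }

    // A cycle: a path with at least one edge whose first and last vertices coincide.
    static boolean isCycle(boolean[][] adj, int[] seq) {
        return seq.length >= 2 && seq[0] == seq[seq.length - 1] && isPath(adj, seq);
    }

    // Length of a path or cycle: its number of edges.
    static int length(int[] seq) {
        return seq.length - 1;
    }

    public static void main(String[] args) {
        boolean[][] adj = matrix(4, new int[][] { {0, 1}, {1, 2}, {2, 0}, {2, 3} });
        System.out.println(isSimplePath(adj, new int[] {0, 1, 2, 3})); // prints true
        System.out.println(isCycle(adj, new int[] {0, 1, 2, 0}));      // prints true
    }
}
```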
A graph is connected if there is a path from every vertex to every other vertex in the graph. A graph that is not connected consists of a set of connected components, which are maximal connected subgraphs.
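One common way to identify connected components is to run a depth-first search from each vertex that has not yet been reached and to label everything that search visits. The following is a sketch under that assumption, not necessarily the approach the text develops later; `ComponentsSketch` and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical helper (not from the text): labels connected components with an iterative DFS.
class ComponentsSketch {
    // Returns id[], where id[v] is the component number of vertex v.
    static int[] componentIds(int V, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        int[] id = new int[V];
        Arrays.fill(id, -1);                   // -1 marks an unvisited vertex
        int count = 0;
        for (int s = 0; s < V; s++) {
            if (id[s] != -1) continue;         // already in an earlier component
            Deque<Integer> stack = new ArrayDeque<>();
            stack.push(s);
            id[s] = count;
            while (!stack.isEmpty()) {
                int v = stack.pop();
                for (int w : adj.get(v)) {
                    if (id[w] == -1) {
                        id[w] = count;
                        stack.push(w);
                    }
                }
            }
            count++;
        }
        return id;
    }

    // Number of connected components.
    static int count(int V, int[][] edges) {
        int max = -1;
        for (int c : componentIds(V, edges)) max = Math.max(max, c);
        return max + 1;
    }

    // Connected: exactly one component.
    static boolean isConnected(int V, int[][] edges) {
        return count(V, edges) == 1;
    }
}
```

For example, a 6-vertex graph with edges 0-1, 1-2, and 3-4 has three components under this sketch: {0, 1, 2}, {3, 4}, and the isolated vertex {5}.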
An acyclic graph is a graph with no cycles. Several of the algorithms that we consider are concerned with finding acyclic subgraphs of a given graph that satisfy certain properties. We need additional terminology to refer to these structures: A tree is an acyclic connected graph. A disjoint set of trees is called a forest. A spanning tree of a connected graph is a subgraph that contains all of that graph’s vertices and is a single tree. A spanning forest of a graph is the union of spanning trees of its connected components.
This definition of tree is quite general: with suitable refinements it embraces the trees that we typically use to model program behavior (function-call hierarchies) and data structures (BSTs, 2-3 trees, and so forth). A graph G with V vertices is a tree if and only if it satisfies any of the following five conditions:
■ G has V - 1 edges and no cycles.
■ G has V - 1 edges and is connected.
■ G is connected, but removing any edge disconnects it.
■ G is acyclic, but adding any edge creates a cycle.
■ Exactly one simple path connects each pair of vertices in G.
Several of the algorithms that we consider find spanning trees and forests, and these properties play an important role in their analysis and implementation.
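As an illustration of how these conditions can be tested, the sketch below checks the first one (V - 1 edges and no cycles) with a simple union-find structure: an edge whose endpoints are already in the same set would close a cycle. The union-find technique is swapped in here for illustration, and `TreeCheckSketch` is a hypothetical name.

```java
// Hypothetical helper (not from the text): tests "V - 1 edges and no cycles".
class TreeCheckSketch {
    static boolean isTree(int V, int[][] edges) {
        if (edges.length != V - 1) return false;   // wrong number of edges
        int[] parent = new int[V];
        for (int v = 0; v < V; v++) parent[v] = v; // each vertex starts in its own set
        for (int[] e : edges) {
            int a = find(parent, e[0]);
            int b = find(parent, e[1]);
            if (a == b) return false;              // endpoints already connected: a cycle
            parent[a] = b;                         // merge the two sets
        }
        return true;
    }

    // Finds the set representative of v, halving the path along the way.
    private static int find(int[] parent, int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]];
            v = parent[v];
        }
        return v;
    }

    public static void main(String[] args) {
        System.out.println(isTree(4, new int[][] { {0, 1}, {1, 2}, {1, 3} })); // prints true
        System.out.println(isTree(4, new int[][] { {0, 1}, {1, 2}, {2, 0} })); // prints false
    }
}
```

The second call shows why the edge count alone is not enough: the graph has V - 1 = 3 edges, but they form a cycle and leave vertex 3 disconnected.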
The density of a graph is the proportion of possible pairs of vertices that are connected by edges. A sparse graph has relatively few of the possible edges present; a dense graph has relatively few of the possible edges missing. Generally, we think of a graph as being sparse if its number of different edges is within a small constant factor of V and as being dense otherwise.
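As a small worked computation, the sketch below evaluates density as E divided by the number of possible vertex pairs, V(V - 1)/2. It assumes a simple graph (no self-loops or parallel edges); `DensitySketch` is a hypothetical name.

```java
// Hypothetical helper (not from the text): density of a simple undirected graph.
class DensitySketch {
    // Fraction of the V(V - 1)/2 possible vertex pairs that are joined by an edge.
    static double density(int V, int E) {
        if (V < 2) throw new IllegalArgumentException("need at least two vertices");
        long possiblePairs = (long) V * (V - 1) / 2;  // long avoids int overflow for large V
        return (double) E / possiblePairs;
    }

    public static void main(String[] args) {
        System.out.println(density(4, 6));  // complete graph on 4 vertices: prints 1.0
        System.out.println(density(5, 4));  // a 5-vertex tree: 4 of 10 pairs
    }
}
```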
A bipartite graph is a graph whose vertices we can divide into two sets such that all edges connect a vertex in one set with a vertex in the other set. The figure below gives an example of a bipartite graph, where one set of vertices is colored red and the other set of vertices is colored black. Bipartite graphs arise in a natural way in many situations.
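One standard way to test whether a graph is bipartite is to try to two-color it: give a start vertex one color, give each neighbor the opposite color, and report failure if an edge ever joins two vertices of the same color. The sketch below does this with breadth-first search; `BipartiteSketch` is a hypothetical name, and the approach is offered as an illustration rather than as the text's own method.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper (not from the text): two-coloring test for bipartiteness.
class BipartiteSketch {
    static boolean isBipartite(int V, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        int[] color = new int[V];              // 0 = uncolored; 1 and 2 are the two sets
        Deque<Integer> queue = new ArrayDeque<>();
        for (int s = 0; s < V; s++) {          // restart in every connected component
            if (color[s] != 0) continue;
            color[s] = 1;
            queue.add(s);
            while (!queue.isEmpty()) {
                int v = queue.remove();
                for (int w : adj.get(v)) {
                    if (color[w] == 0) {
                        color[w] = 3 - color[v];   // opposite set
                        queue.add(w);
                    } else if (color[w] == color[v]) {
                        return false;              // edge inside one set
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isBipartite(4, new int[][] { {0, 1}, {1, 2}, {2, 3}, {3, 0} })); // prints true
        System.out.println(isBipartite(3, new int[][] { {0, 1}, {1, 2}, {2, 0} }));         // prints false
    }
}
```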
Undirected graph data type
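Representation alternatives. A graph representation must meet two requirements: it must fit in the space available for the graphs we are likely to encounter, and it must support time-efficient implementations of the graph operations, such as adj(). Three alternatives suggest themselves: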
■ An adjacency matrix, where we maintain a V-by-V boolean array, with the entry in row v and column w defined to be true if there is an edge connecting vertex v and vertex w in the graph, and to be false otherwise. This representation fails on the first count: graphs with millions of vertices are common, and the space cost of the V² boolean values needed is prohibitive.
■ An array of edges, using an Edge class with two instance variables of type int. This direct representation is simple, but it fails on the second count: implementing adj() would involve examining all the edges in the graph.
■ An array of adjacency lists, where we maintain a vertex-indexed array of lists of the vertices adjacent to each vertex. This data structure satisfies both requirements for typical applications and is the one that we will use throughout this chapter.
Beyond these performance objectives, a detailed examination reveals other considerations that can be important in some applications. For example, allowing parallel edges precludes the use of an adjacency matrix, since the adjacency matrix has no way to represent them.
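The point about parallel edges can be seen in a few lines. In the sketch below, the same edge list (with the edge 0-1 given twice) is stored both as a boolean matrix and as neighbor lists; the matrix records only that the edge exists, while the lists keep both copies. `ParallelEdgeSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the text): compares two representations on parallel edges.
class ParallelEdgeSketch {
    // V-by-V boolean adjacency matrix: V * V entries regardless of the number of edges.
    static boolean[][] matrix(int V, int[][] edges) {
        boolean[][] m = new boolean[V][V];
        for (int[] e : edges) {
            m[e[0]][e[1]] = true;   // a second copy of the same edge changes nothing
            m[e[1]][e[0]] = true;
        }
        return m;
    }

    // Vertex-indexed neighbor lists: one list entry per edge endpoint.
    static List<List<Integer>> lists(int V, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    // Number of copies of edge v-w recorded in the neighbor lists.
    static int copies(List<List<Integer>> adj, int v, int w) {
        int n = 0;
        for (int x : adj.get(v))
            if (x == w) n++;
        return n;
    }

    public static void main(String[] args) {
        int[][] edges = { {0, 1}, {0, 1}, {1, 2} };     // 0-1 appears twice
        System.out.println(matrix(3, edges)[0][1]);      // prints true (multiplicity lost)
        System.out.println(copies(lists(3, edges), 0, 1)); // prints 2
    }
}
```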
// Bag, Stack, and In are the book's accompanying library classes, not JDK classes.
public class Graph {
    private final int V;
    private int E;
    private Bag<Integer>[] adj;

    /**
     * Create an empty graph with V vertices.
     *
     * @throws java.lang.IllegalArgumentException if V < 0
     */
    public Graph(int V) {
        if (V < 0) throw new IllegalArgumentException("Number of vertices must be nonnegative");
        this.V = V;
        this.E = 0;
        adj = (Bag<Integer>[]) new Bag[V];
        for (int v = 0; v < V; v++) {
            adj[v] = new Bag<Integer>();
        }
    }

    /**
     * Create a random graph with V vertices and E edges.
     * Expected running time is proportional to V + E.
     *
     * @throws java.lang.IllegalArgumentException if either V < 0 or E < 0
     */
    public Graph(int V, int E) {
        this(V);
        if (E < 0) throw new IllegalArgumentException("Number of edges must be nonnegative");
        for (int i = 0; i < E; i++) {
            int v = (int) (Math.random() * V);
            int w = (int) (Math.random() * V);
            addEdge(v, w);
        }
    }

    /**
     * Create a graph from input stream.
     */
    public Graph(In in) {
        this(in.readInt());
        int E = in.readInt();
        for (int i = 0; i < E; i++) {
            int v = in.readInt();
            int w = in.readInt();
            addEdge(v, w);
        }
    }

    /**
     * Copy constructor.
     */
    public Graph(Graph G) {
        this(G.V());
        this.E = G.E();
        for (int v = 0; v < G.V(); v++) {
            // reverse so that adjacency list is in same order as original
            // (relies on the library Stack iterating in LIFO order, unlike java.util.Stack)
            Stack<Integer> reverse = new Stack<Integer>();
            for (int w : G.adj[v]) {
                reverse.push(w);
            }
            for (int w : reverse) {
                adj[v].add(w);
            }
        }
    }

    /**
     * Return the number of vertices in the graph.
     */
    public int V() { return V; }

    /**
     * Return the number of edges in the graph.
     */
    public int E() { return E; }

    /**
     * Add the undirected edge v-w to graph.
     *
     * @throws java.lang.IndexOutOfBoundsException unless both 0 <= v < V and 0 <= w < V
     */
    public void addEdge(int v, int w) {
        if (v < 0 || v >= V) throw new IndexOutOfBoundsException();
        if (w < 0 || w >= V) throw new IndexOutOfBoundsException();
        E++;
        adj[v].add(w);
        adj[w].add(v);
    }

    /**
     * Return the list of neighbors of vertex v as an Iterable.
     *
     * @throws java.lang.IndexOutOfBoundsException unless 0 <= v < V
     */
    public Iterable<Integer> adj(int v) {
        if (v < 0 || v >= V) throw new IndexOutOfBoundsException();
        return adj[v];
    }

    /**
     * Return a string representation of the graph.
     */
    public String toString() {
        StringBuilder s = new StringBuilder();
        String NEWLINE = System.getProperty("line.separator");
        s.append(V + " vertices, " + E + " edges " + NEWLINE);
        for (int v = 0; v < V; v++) {
            s.append(v + ": ");
            for (int w : adj[v]) {
                s.append(w + " ");
            }
            s.append(NEWLINE);
        }
        return s.toString();
    }
}
Adjacency-lists data structure.
The standard graph representation for graphs that are not dense is called the adjacency-lists data structure, where we keep track of all the vertices adjacent to each vertex on a linked list that is associated with that vertex. This Graph implementation achieves the following performance characteristics:
■ Space usage proportional to V + E
■ Constant time to add an edge
■ Time proportional to the degree of v to iterate through vertices adjacent to v (constant time per adjacent vertex processed)
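To see where these costs come from, here is a stripped-down variant of the same idea that depends only on the JDK. `MiniGraph` is a hypothetical class written for this illustration, and it substitutes `java.util.ArrayList` for the linked-list `Bag`; as a result, adding an edge is amortized constant time rather than worst-case constant time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical JDK-only variant (not the text's Graph class) of the adjacency-lists idea.
class MiniGraph {
    private final int V;
    private int E;
    private final List<List<Integer>> adj;   // one list per vertex: V lists in total

    MiniGraph(int V) {
        if (V < 0) throw new IllegalArgumentException("Number of vertices must be nonnegative");
        this.V = V;
        adj = new ArrayList<>(V);
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
    }

    int V() { return V; }
    int E() { return E; }

    // Two list appends per edge (amortized constant time with ArrayList).
    void addEdge(int v, int w) {
        adj.get(v).add(w);
        adj.get(w).add(v);
        E++;
    }

    // Iterating takes time proportional to the degree of v.
    Iterable<Integer> adj(int v) { return adj.get(v); }

    // Counts neighbors by iterating, touching each adjacent vertex once.
    int degree(int v) {
        int d = 0;
        for (int w : adj(v)) d++;
        return d;
    }

    // Total list entries: each edge contributes two, so the space is proportional to V + E.
    int totalEntries() {
        int n = 0;
        for (List<Integer> list : adj) n += list.size();
        return n;
    }

    public static void main(String[] args) {
        MiniGraph g = new MiniGraph(4);
        g.addEdge(0, 1);
        g.addEdge(0, 2);
        g.addEdge(2, 3);
        System.out.println(g.totalEntries());  // prints 6, which is 2 * E
        System.out.println(g.degree(0));       // prints 2
    }
}
```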