Best-First Search Algorithm (Draft by Andrew Jungwirth)

Objectives

  • To show how the Best-First Search Algorithm uses a heuristic function to find a path to the goal node in a graph.

  • To emphasize that the solution found by the Best-First Search is a quick estimate of the optimal solution.

Preparation

To understand this activity, you must be familiar with the concept of a graph as a collection of nodes/vertices and edges connecting these nodes. It is also helpful to know how a tree is used to show the progress of a search on a graph. Furthermore, you should understand how algorithms such as the Breadth-First and Depth-First searches use both an open list and a closed list, because the Best-First Search Algorithm also uses this two-list structure. A knowledge of the A* Algorithm (insert link here) may also be helpful in working with this algorithm.

Best-First Search Algorithm

The Best-First Search Algorithm is a fast algorithm for finding a path (usually not optimal) from a start node to a goal node in a weighted graph. Similar to the Breadth-First and Depth-First search algorithms, the Best-First Search maintains a closed list of nodes to which it has found paths and an open list of nodes connected to these closed nodes. However, unlike these "uninformed" search algorithms, the Best-First Algorithm uses a heuristic function (or quickly computed estimate of the cost to reach the goal from each node), called ĥ, to guide its search for the goal node. This heuristic allows the Best-First Search to find the goal more quickly than these "uninformed" searches.

Much like in Dijkstra's Algorithm and the A* Algorithm, the Best-First Search implements the open list as a priority queue. The difference is that the Best-First Search Algorithm sorts the nodes in the open queue only by their heuristic values, whereas the A* Algorithm uses both heuristic values and costs to order the nodes in the open queue, and Dijkstra's Algorithm sorts the open queue only by the nodes' costs. This usually allows the Best-First Search to find the goal node in the graph quickly, but the path it finds to the goal node may not be the optimal path to reach the goal.
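The difference between the three orderings can be sketched in a few lines of Python. This is only an illustration: the node names, costs, and heuristic values below are made up for the example, and `pop_order` is a hypothetical helper, not part of any of the algorithms.

```python
import heapq

# Hypothetical open-list entries: (name, cost so far, heuristic estimate).
entries = [("X", 2, 7), ("Y", 9, 1), ("Z", 4, 4)]

def pop_order(key):
    """Return node names in the order a priority queue with this key releases them."""
    heap = [(key(cost, h), name) for name, cost, h in entries]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]

best_first_order = pop_order(lambda cost, h: h)       # heuristic only
dijkstra_order = pop_order(lambda cost, h: cost)      # cost only
a_star_order = pop_order(lambda cost, h: cost + h)    # cost plus heuristic

print(best_first_order, dijkstra_order, a_star_order)
```

The same three entries leave the queue in a different order under each key, which is the only structural difference between the three open lists.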

Each time through its main loop, the Best-First Algorithm moves the node at the front of the open list to the closed list. It then adds to the open list all of the nodes that are connected to this closing node and not already on the closed list. For each of these nodes added to the open list, the new open node's predecessor is set as the closed node, and its cost is assigned the closing node's cost plus the weight of the edge from the closing node to the new open node.

However, because the Best-First Algorithm only considers the heuristic values and not the node costs, it may close a node with a cost that is greater than optimal. Figure 1 illustrates this shortcoming:

Figure 1

In this simple example, when node D is closed, node A will be added to the open list with a cost of 3 and a heuristic value of 1, and node C will be added to the open list with a cost of 1 and a heuristic value of 2. Since the Best-First Search only considers heuristic values, node A will be taken off the open list first and will be closed with a cost of 3. This is not the optimal cost because the path DCA reaches node A with a cost of 2, but the Best-First Algorithm does not find this shortest path because it does not consider the costs of the paths it explores.
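The Figure 1 situation can be reproduced with a small Python sketch. The costs and heuristic values come from the paragraph above; the weight of 1 on the edge from C to A is inferred from the statement that path DCA costs 2, and the variable names are illustrative only.

```python
import heapq

# Values from the Figure 1 discussion. The C-A weight of 1 is inferred
# from the statement that path DCA costs 2.
edges_from_d = {"A": 3, "C": 1}
edge_c_to_a = 1
h = {"A": 1, "C": 2}

# Open list after closing D, keyed by heuristic value only.
open_list = [(h[name], name, cost) for name, cost in edges_from_d.items()]
heapq.heapify(open_list)

# The entry with the smallest heuristic value is closed first.
_, first_closed, closed_cost = heapq.heappop(open_list)
optimal_cost_to_a = min(edges_from_d["A"], edges_from_d["C"] + edge_c_to_a)

print(first_closed, closed_cost, optimal_cost_to_a)
```

Node A leaves the queue first because its heuristic value is smaller, so it is closed with cost 3 even though a cost of 2 is available through C.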

Notice that the way nodes are added to the open list can cause the same node to be added to the open queue multiple times, once for each predecessor that reaches it. This means the algorithm must check that the node taken off the front of the open queue is not already closed before adding it to the closed list. If the node is already in the closed list, the algorithm continues with the next iteration of the main loop without attempting to close the node again.

These traits of the Best-First Search Algorithm are shown by the pseudocode below:

Pseudocode for the Best-First Search Algorithm

/* initialization */
startVertex.cost = 0;
startVertex.pred = null;
HashTable closed = { };                   // the empty set
PriorityQueue open = { startVertex };

/* main loop */
while(!open.empty()){
  closingVertex = open.remove_front();    // open vertex with minimum ĥ value
  
  if(closingVertex == goalVertex){
    closed.add(closingVertex);
    use pred values and closed list to construct a backtrace of the path found;
    return the backtrace of the path found;
  }

  if(closingVertex ∉ closed){             // closingVertex has not already been closed and is not the goal
    closed.add(closingVertex);

    /* inner loop - generate new open vertices */
    for(each connectingVertex with an edge from closingVertex){
      if(connectingVertex ∉ closed){
        // connectingVertex is not on the closed list
        connectingVertex.cost = (closingVertex.cost + edge weight from closingVertex to connectingVertex);
        connectingVertex.pred = closingVertex;
        open.add(connectingVertex);
      }
    }
  }
}

// goalVertex was not found, and open list is empty - Best-First failed to find the goal
return failure;
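
The pseudocode above might be turned into runnable Python roughly as follows. This is a sketch under stated assumptions, not a definitive implementation: the graph is assumed to be a dictionary of dictionaries, each open entry carries its own cost and predecessor (so the same node can be queued more than once), and ties on the heuristic value are broken by lower cost and then by name. The example graph and its heuristic values are hypothetical.

```python
import heapq

def best_first_search(graph, h, start, goal):
    """Greedy best-first search following the structure of the pseudocode.

    graph: dict mapping node -> dict of neighbor -> edge weight (assumed format)
    h:     dict mapping node -> heuristic value
    Returns (path, cost), or None if the goal cannot be reached.
    """
    closed = {}  # node -> (predecessor, cost) recorded when the node is closed
    open_heap = [(h[start], 0, start, None)]
    while open_heap:
        _, cost, node, pred = heapq.heappop(open_heap)
        if node in closed:
            continue  # already closed through an earlier entry
        closed[node] = (pred, cost)
        if node == goal:
            # Follow predecessors back to the start to build the path.
            path = []
            while node is not None:
                path.append(node)
                node = closed[node][0]
            return path[::-1], cost
        for neighbor, weight in graph[node].items():
            if neighbor not in closed:
                heapq.heappush(open_heap, (h[neighbor], cost + weight, neighbor, node))
    return None

# Hypothetical undirected graph, not one of the figures in this document.
graph = {
    "S": {"A": 10, "B": 1},
    "A": {"S": 10, "G": 1},
    "B": {"S": 1, "G": 2},
    "G": {"A": 1, "B": 2},
}
h = {"S": 3, "A": 1, "B": 2, "G": 0}

result = best_first_search(graph, h, "S", "G")
print(result)
```

In this example the search returns the path S, A, G with cost 11, although S, B, G costs only 3. The heuristic values used here happen to equal the true remaining costs, so the detour is caused purely by ignoring the cost already paid.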

Example Trace of the Best-First Search Algorithm

The following trace of the Best-First Search Algorithm shows the vertex closed in every loop of the algorithm's execution and displays the current states of the open and closed lists. Nodes in the open list are displayed as node_name(predecessor_name cost+ĥ_value), and closed nodes are shown as node_name(predecessor_name cost). Notice that the nodes in the open list are only sorted by their ĥ values; the cost is included only to help associate the nodes with the accompanying search tree. Also, whenever two nodes have the same ĥ values, the node that has the lower cost will be placed nearer to the front of the queue. If they have the same costs, the node that comes first in alphabetical order will appear first in the queue. Each line of the trace is numbered, and this number appears next to the corresponding closed node in the accompanying search tree. In the graph, the numbers under the node names are the heuristic values, and the numbers on the edges are the edge weights. The numbers beneath the node names in the accompanying search tree are the costs to reach the nodes on the corresponding paths in the tree. This trace of the Best-First Algorithm finds a path from node A to node E in the graph shown in Figure 2:

Figure 2

loop number | closing vertex | open list | closed list
- | - | A(null 0+6) | -
1 | A | D(A 6+5), G(A 5+6) | A(null 0)
2 | D | F(D 9+3), H(D 9+4), C(D 10+6), G(A 5+6) | A(null 0), D(A 6)
3 | F | B(F 12+1), H(D 9+4), H(F 14+4), C(D 10+6), G(A 5+6) | A(null 0), D(A 6), F(D 9)
4 | B | E(B 14+0), H(D 9+4), H(F 14+4), H(B 14+4), C(D 10+6), G(A 5+6) | A(null 0), D(A 6), F(D 9), B(F 12)
5 | E | H(D 9+4), H(F 14+4), H(B 14+4), C(D 10+6), G(A 5+6) | A(null 0), D(A 6), F(D 9), B(F 12), E(B 14)

Figure 3

Figure 3 shows the search tree that represents this trace of the Best-First Search Algorithm. The numbers to the left of the nodes in the tree indicate which node in the tree was closed in each numbered loop of the trace above. The numbers under the node names show the costs to reach those nodes in the corresponding paths in the tree. Note that the Best-First Algorithm finds the solution ADFBE, which has a cost of 14. However, the shortest path from A to E is ADHBE with a cost of 13, so the Best-First Algorithm failed to find the optimal solution in this example.
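
The two path costs can be checked with a short Python sketch. The edge weights below are recovered from the cost differences in the trace (for example, D is closed at cost 6 and F is opened from D at cost 9, so the D-F edge weighs 3); Figure 2 may contain further edges that the trace never touches. Treating the edges as undirected is an assumption, and `path_cost` is a hypothetical helper.

```python
# Edge weights recovered from the trace. Figure 2 may contain other edges.
weights = {
    ("A", "D"): 6, ("A", "G"): 5,
    ("D", "F"): 3, ("D", "H"): 3, ("D", "C"): 4,
    ("F", "B"): 3, ("F", "H"): 5,
    ("B", "E"): 2, ("B", "H"): 2,
}

def path_cost(path):
    """Sum the edge weights along a path given as a string of node names."""
    total = 0
    for u, v in zip(path, path[1:]):
        # Edges are assumed undirected, so look the pair up in either order.
        total += weights[(u, v)] if (u, v) in weights else weights[(v, u)]
    return total

found = path_cost("ADFBE")    # path returned by the Best-First Search
shorter = path_cost("ADHBE")  # shorter path named in the text
print(found, shorter)
```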

Efficiency/Algorithm Analysis

It is generally effective to analyze graph-search algorithms by considering four traits:

  • Completeness: A search algorithm is complete if it will find a solution (goal node) when a solution exists.

  • Optimality: A search algorithm is optimal if it finds the optimal solution. In the case of the Best-First Search Algorithm, this means that the algorithm must find the shortest path from the start node to the goal node.

  • Time complexity: This is an order-of-magnitude estimate of the speed of the algorithm. The time complexity is determined by analyzing the number of nodes that are generated during the algorithm's execution.

  • Space complexity: This is an order-of-magnitude estimate of the memory consumption of the algorithm. The space complexity is determined by the maximum number of nodes that must be stored at any one time during the algorithm's execution.

Completeness

The Best-First Search Algorithm is a complete algorithm on finite graphs. This means that, given enough time and memory, the algorithm will always find the goal state if the goal can possibly be reached in the graph. Because the closed list prevents any node from being expanded twice, even a highly inaccurate heuristic function cannot keep the search away from the goal forever: the goal state will eventually be added to the open list and will be closed in some finite amount of time.

Optimality

In general, the Best-First Search Algorithm is not optimal. For example, in the trace above, the Best-First Search failed to find the optimal solution. The Best-First Search Algorithm is not even guaranteed to find the shortest path from the start node to the goal node when the heuristic function perfectly estimates the remaining cost to reach the goal from each node. Therefore, the solutions found by this algorithm must be considered to be quick estimates of the optimal solutions.

Time Complexity

For the best performance, the open list is implemented as a min-heap, and the closed list is usually stored as a hash table so that a node can be quickly checked to see if it has been closed. With these structures, insertions into and deletions from the open list take O(log V) time, where V is the number of vertices in the graph, and closing a node or checking whether a node has been closed takes O(1) time. Even with these structures, the time complexity of the Best-First Search Algorithm depends largely on the accuracy of the heuristic function. An inaccurate ĥ does not guide the algorithm toward the goal quickly, increasing the time required to find the goal. For this reason, in the worst case, the Best-First Search runs in exponential time because it must expand many nodes at each level. This is expressed as O(b^m), where b is the branching factor (i.e., the average number of nodes added to the open list at each level), and m is the maximum length of any path in the search space. If the heuristic closely estimates the true cost to reach the goal from each node, the Best-First Algorithm can run in polynomial time because the accurate heuristic guides it to the goal much more quickly. Generally, the Best-First Algorithm finds the goal more quickly than the A* Algorithm, but the path it finds is usually not as short as the one found by the A* Algorithm.
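
To get a feel for the exponential worst case, O(b^m) with b raised to the power m, the following Python sketch counts the nodes in a uniform tree in which every node has exactly b children down to depth m. The uniform tree is a simplifying assumption, and `nodes_generated` is a hypothetical helper.

```python
def nodes_generated(b, m):
    """Nodes in a uniform tree with branching factor b and depth m, root included."""
    return sum(b ** depth for depth in range(m + 1))

# Growth for a fixed branching factor of 3 as the depth increases.
for m in (5, 10, 15):
    print(m, nodes_generated(3, m))
```

The last level alone contributes b^m nodes, which is why that term dominates the estimate.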

Space Complexity

The memory consumption of the Best-First Algorithm tends to be a bigger restriction than its time complexity. Like many graph-search algorithms, the Best-First Search rapidly increases the number of nodes that are stored in memory as the search moves deeper into the graph. One modification that can improve the memory consumption of the algorithm is to store each node in the open list only once, keeping only the best cost and predecessor. This reduces the number of nodes stored in memory but requires more time to search the open list when nodes are inserted into it. Even after this change, the space complexity of the Best-First Algorithm is exponential. This is stated as O(b^m), where b is the branching factor, and m is the maximum length of any path in the search space. To use this algorithm on large problems, significant modifications are required to prevent the algorithm from quickly running out of memory.
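
One way the single-entry modification might look in Python is sketched below. It assumes a dictionary (`best`, a hypothetical name) that records the best known cost and predecessor of every node seen so far, so that the check for an existing open entry is a dictionary lookup rather than a scan of the queue. The graph format and the example graph are also assumptions made for illustration.

```python
import heapq

def best_first_single_entry(graph, h, start, goal):
    """Best-first search that keeps at most one open entry per node (a sketch).

    graph: dict mapping node -> dict of neighbor -> edge weight (assumed format)
    h:     dict mapping node -> heuristic value
    """
    best = {start: (0, None)}  # node -> (best known cost, predecessor)
    open_heap = [(h[start], start)]
    closed = set()
    while open_heap:
        _, node = heapq.heappop(open_heap)
        closed.add(node)
        cost = best[node][0]
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = best[node][1]
            return path[::-1], cost
        for neighbor, weight in graph[node].items():
            if neighbor in closed:
                continue
            new_cost = cost + weight
            if neighbor not in best:
                best[neighbor] = (new_cost, node)
                heapq.heappush(open_heap, (h[neighbor], neighbor))
            elif new_cost < best[neighbor][0]:
                # Already open: its queue position depends only on the heuristic,
                # so only the stored cost and predecessor need updating.
                best[neighbor] = (new_cost, node)
    return None

# Hypothetical graph in which B is first reached from S (cost 5), then from A (cost 2).
graph = {
    "S": {"A": 1, "B": 5},
    "A": {"S": 1, "B": 1},
    "B": {"S": 5, "A": 1, "G": 1},
    "G": {"B": 1},
}
h = {"S": 5, "A": 1, "B": 2, "G": 0}

single_entry_result = best_first_single_entry(graph, h, "S", "G")
print(single_entry_result)
```

Each node enters the queue at most once here, so the open list never holds more entries than there are nodes in the graph.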

Exploring the Algorithm's Dynamic Behavior

Explore the Algorithm within JHAVÉ

You can practice with the Best-First Search Algorithm using the algorithm visualization system JHAVÉ. If you have not used JHAVÉ before, please take the time to view the instructions on using JHAVÉ first.

Launch a visualization of Best First Search

Step through the examples generated by the visualization to practice with the Best-First Algorithm. Answer the questions that appear during the visualization to assess your understanding of the algorithm. When you can consistently answer the questions correctly, try the exercises below.

Exercises

  1. Trace the Best-First Search Algorithm on the graph in Figure 4 from start vertex E to goal vertex A.

    Figure 4

  2. Did your trace of the Best-First Search Algorithm on the graph above (Figure 4) find the optimal solution? If not, change the heuristic values so that the Best-First Search will find the shortest path from the start node E to the goal node A.

Designing Data Sets

Creating input for an algorithm is an effective way to demonstrate your understanding of the algorithm's behavior.

  1. Construct an example graph in which the Best-First Search Algorithm fails to find the shortest path from the start node to the goal node.

  2. Now, modify the heuristic values in the graph you created above so that the Best-First Search will find the optimal path.

Modifying the Algorithm

Consider changing the Best-First Search Algorithm described above so that the open list is ordered by each node's (cost + ĥ value). Is this modification of the Best-First Search Algorithm guaranteed to find the optimal path from the start node to the goal node? If yes, explain why; if no, give an example in which this modified algorithm fails to find the shortest path. (Hint: This new algorithm may seem like the A* Algorithm, but it is not quite the A* Algorithm. What change is missing?)

Create Your Own Visualization

Using your own source code, presentation software, or manually produced drawings, create your own visualization of the Best-First Search Algorithm.

Presentation

Develop a ten-minute presentation on the Best-First Search Algorithm that uses the visualization you developed above to explain the important traits of the algorithm and emphasize its weaknesses as a graph-search algorithm.
