Neural Computation of Decisions in Optimization Problems
J. J. Hopfield, D. W. Tank
Biological Cybernetics, 1985
Introduction
From the perspective of neuroscience, the authors first state their high-level goal, i.e. understanding how the structure and connectivity of neurons combine to produce computational power.
One of the central goals of research in neuroscience is to understand how the biophysical properties of neurons and neural organization combine to provide such impressive computing power and speed.
Then they propose their opinion on the key point of neuron organization.
Parallel analog computation in a network of neurons is thus a natural way to organize a nervous system to solve optimization problems.
Specifically, in this paper, they demonstrate the power of the proposed analog computational networks by testing them on the Traveling Salesman Problem (TSP).
We quantitatively demonstrate the computational power and speed of collective analog networks of neurons in solving optimization problems rapidly.
Background on Hopfield Network
The figure below illustrates a unit neuron in a Hopfield network.
According to Kirchhoff's Current Law, we have that

$$C_i \frac{\mathrm{d}u_i}{\mathrm{d}t} = \sum_j T_{ij} V_j - \frac{u_i}{R_i} + I_i,$$

where $V_j = g_j(u_j)$, $I_i$ is the current bias for neuron $i$, $g_i$ is the activation function for neuron $i$, and $R_i$ is a parallel combination of $\rho_i$ and the $R_{ij}$:

$$\frac{1}{R_i} = \frac{1}{\rho_i} + \sum_j \frac{1}{R_{ij}}.$$
For simplicity, we can further set $C_i = C$ and $R_i = R$ independent of $i$, which leads to the following equation:

$$\frac{\mathrm{d}u_i}{\mathrm{d}t} = -\frac{u_i}{\tau} + \sum_j \hat{T}_{ij} V_j + \hat{I}_i,$$

where $\tau = RC$, $\hat{T}_{ij} = T_{ij}/C$, and $\hat{I}_i = I_i/C$.
Given an initial value for , this equation of motion provides a full description of the time evolution of the state of the circuit.
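As a concrete illustration, here is a minimal sketch (my own code, not from the paper) that integrates this equation of motion with forward Euler; the names `T_hat`, `I_hat`, and the step-size defaults are assumptions:

```python
import numpy as np

def simulate(T_hat, I_hat, u_init, g, tau=1.0, dt=1e-4, steps=50_000):
    """Forward-Euler integration of du_i/dt = -u_i/tau + sum_j T_hat[i,j] V_j + I_hat[i]."""
    u = u_init.astype(float)
    for _ in range(steps):
        V = g(u)                                   # neuron outputs V_i = g_i(u_i)
        u = u + dt * (-u / tau + T_hat @ V + I_hat)
    return g(u)                                    # final output state
```

The original work analyzes the analog circuit directly; a simple explicit integrator like this is only a stand-in and may need a small `dt` for stability.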
There are two key properties of the Hopfield network as shown in earlier work:
- The equations of motion for a network with symmetric connections, i.e. $T_{ij} = T_{ji}$, always lead to a convergence to stable states;
- When the width of the amplifier gain curve is narrow - the high-gain limit - the stable states of a network comprised of $N$ neurons are the local minima of the quantity

$$E = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} T_{ij} V_i V_j - \sum_{i=1}^{N} V_i I_i.$$

And in the high-gain limit, the minima only occur at corners of the feasible space, which is an $N$-dimensional hypercube defined by $V_i = 0$ or $V_i = 1$.
Consequently,
Networks of neurons with this basic organization can be used to compute solutions to specific optimization problems by first choosing connectivities and input bias currents which appropriately represent the function to be minimized.
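To make the energy picture concrete, here is a one-line sketch (my own helper, not from the paper) of the high-gain energy; with symmetric $T$, its value should be non-increasing along trajectories produced by the `simulate` sketch above:

```python
def energy(T, I, V):
    """High-gain Hopfield energy E = -1/2 * V^T T V - V^T I."""
    return -0.5 * V @ T @ V - V @ I
```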
Network Formulation for TSP
For a TSP with $n$ cities, the representation scheme in this work uses $n^2$ neurons, as the route is represented as an $n \times n$ permutation matrix, in which the position of each city in the route is represented by the output states of $n$ neurons in a one-hot encoding fashion. Each neuron is labelled by two indices, in which the first index represents the city, and the second index represents its order in the route.
To solve TSP with the Hopfield network, the energy function should include two parts:
- The first part favors solutions in the form of a permutation matrix, which guarantees the feasibility of the solution;
- The second part favors solutions with shorter distance.
The first part of the energy function is defined as

$$E_1 = \frac{A}{2} \sum_X \sum_i \sum_{j \neq i} V_{Xi} V_{Xj} + \frac{B}{2} \sum_i \sum_X \sum_{Y \neq X} V_{Xi} V_{Yi} + \frac{C}{2} \Big( \sum_X \sum_i V_{Xi} - n \Big)^2.$$

The first term and second term mean that there is at most one non-zero element in each row and each column of the matrix respectively, and the third term means that there are exactly $n$ non-zero elements in the matrix. Only a permutation matrix can reach the minimum of this energy function with $E_1 = 0$.
(Two questions: 1. Why not use the row sum and column sum to represent the first and second terms? 2. How can this function guarantee that there won't be several distinct loops in the matrix?)
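Leaving the questions aside, the claim that $E_1 = 0$ exactly on permutation matrices is easy to check numerically. This vectorized sketch (with my own hypothetical helper name `constraint_energy`) uses the identity $\sum_{j \neq i} V_{Xi} V_{Xj} = (\sum_i V_{Xi})^2 - \sum_i V_{Xi}^2$:

```python
import numpy as np

def constraint_energy(V, A=1.0, B=1.0, C=1.0):
    """E1 = A/2 * (row cross terms) + B/2 * (column cross terms) + C/2 * (sum - n)^2."""
    n = V.shape[0]
    rows = np.sum(V.sum(axis=1) ** 2 - (V ** 2).sum(axis=1))  # sum_X sum_i sum_{j!=i} V_Xi V_Xj
    cols = np.sum(V.sum(axis=0) ** 2 - (V ** 2).sum(axis=0))  # sum_i sum_X sum_{Y!=X} V_Xi V_Yi
    count = (V.sum() - n) ** 2
    return A / 2 * rows + B / 2 * cols + C / 2 * count

print(constraint_energy(np.eye(4)))        # 0.0 for a permutation matrix
print(constraint_energy(np.ones((4, 4))))  # strictly positive otherwise
```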
The second part of the energy function is defined as

$$E_2 = \frac{D}{2} \sum_X \sum_{Y \neq X} \sum_i d_{XY} V_{Xi} \big( V_{Y,i+1} + V_{Y,i-1} \big).$$

For a valid tour, this function is proportional to the total length of the tour (each edge is counted once in each direction). For notational convenience, the position subscripts are taken modulo $n$ to represent the tour loop.
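A matching sketch of this term (again a hypothetical helper of mine; it assumes `d` is a symmetric distance matrix with zero diagonal, which makes the $Y \neq X$ restriction automatic). For a valid permutation matrix each tour edge is picked up once in each direction, so with the $D/2$ prefactor the function returns $D$ times the tour length:

```python
def tour_energy(V, d, D=1.0):
    """E2 = D/2 * sum_{X,Y,i} d[X,Y] * V[X,i] * (V[Y,i+1] + V[Y,i-1]), positions mod n."""
    V_next = np.roll(V, -1, axis=1)  # V[Y, i+1] with wrap-around
    V_prev = np.roll(V, +1, axis=1)  # V[Y, i-1] with wrap-around
    return D / 2 * np.einsum('xi,xy,yi->', V, d, V_next + V_prev)
```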
The final energy function is $E = E_1 + E_2$.
If $A$, $B$, and $C$ are sufficiently large, all the really low energy states of a network described by this function will have the form of a valid tour. The total energy of that state will be the length of the tour, and the states with the shortest path will be the lowest energy states.
Next, we map the energy function of TSP to the standard form of the energy function in the Hopfield network:

$$E = -\frac{1}{2} \sum_{X,i} \sum_{Y,j} T_{Xi,Yj} V_{Xi} V_{Yj} - \sum_{X,i} V_{Xi} I_{Xi},$$

where

$$T_{Xi,Yj} = -A\,\delta_{XY}(1-\delta_{ij}) - B\,\delta_{ij}(1-\delta_{XY}) - C - D\,d_{XY}\big(\delta_{j,i+1} + \delta_{j,i-1}\big), \qquad I_{Xi} = +Cn.$$
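This mapping transcribes directly into code; the helper name `build_network` and the flattening of $(X, i)$ index pairs are my own choices, not the paper's:

```python
def build_network(d, A, B, C, D):
    """Return the flattened weight matrix T and bias vector I for an n-city TSP."""
    n = d.shape[0]
    T = np.zeros((n, n, n, n))
    for X in range(n):
        for i in range(n):
            for Y in range(n):
                for j in range(n):
                    T[X, i, Y, j] = (
                        -A * (X == Y) * (i != j)   # inhibition within a row (same city)
                        - B * (i == j) * (X != Y)  # inhibition within a column (same position)
                        - C                        # global count term
                        - D * d[X, Y] * ((j == (i + 1) % n) + (j == (i - 1) % n))  # data term
                    )
    I = np.full(n * n, C * n)  # bias from expanding the (sum V - n)^2 term
    return T.reshape(n * n, n * n), I
```

Since $d$ is symmetric, the resulting `T` is symmetric, so the convergence property from the previous section applies.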
Simulation Results
- The activation function is set as $g(u) = \frac{1}{2}\left(1 + \tanh\frac{u}{u_0}\right)$;
- Each instance is generated by sampling points uniformly in a unit square;
- Different choices of the random initial state lead to convergence to different solutions;
- The proposed method achieves close-to-optimal solutions for 10 cities and good solutions for 30 cities; no larger instance sizes are considered. The method performs much better than random search, but is not compared with other human-designed heuristics. (A hypothetical end-to-end sketch follows this list.)
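Putting the sketches above together, a hypothetical end-to-end run (the constants are illustrative magnitudes, not the paper's tuned values; `simulate` and `build_network` are the helpers sketched earlier):

```python
n = 10
rng = np.random.default_rng(0)
cities = rng.random((n, 2))  # points sampled uniformly in the unit square
d = np.linalg.norm(cities[:, None, :] - cities[None, :, :], axis=-1)

T, I = build_network(d, A=500, B=500, C=200, D=500)
u_0 = 0.02                                    # gain-width parameter of the sigmoid
g = lambda u: 0.5 * (1.0 + np.tanh(u / u_0))  # activation function from above

V = simulate(T, I, u_init=0.01 * rng.standard_normal(n * n), g=g, dt=1e-7, steps=500_000)
tour = V.reshape(n, n).argmax(axis=1)  # position chosen for each city
print(tour)
```

In a raw run like this, the network can easily settle into an invalid state without careful tuning of `dt` and the penalty constants, which is consistent with the sensitivity to initial conditions noted above.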
Conclusion
I suppose the key idea of this paper is that we can relate certain kinds of optimization problems to the dynamics of a specific type of dynamical system, such as the Hopfield network. But I'm not convinced that this is a scalable approach to solving combinatorial optimization problems.
In backpropagation (BP) networks, our goal is encoded in the objective function, and we update the network weights to optimize this function. In Hopfield networks, our goal is encoded in the connection weights of the network, and the solution is obtained by letting the system dynamics evolve.
It would be worth investigating the current status of the Hopfield network and how we can combine it with the deep learning paradigm.