Solving the combinatorial optimization problem TSP with reinforcement learning

A Note on Learning Algorithms for Quadratic Assignment with Graph Neural Networks:
Uses graph neural networks to solve TSP.

Optimization on a Budget: A Reinforcement Learning Approach:
Introduces the application of reinforcement learning methods to optimization under a budget.

 

Pointer Network:
The first to propose pointer decoding, applied to solving TSP.
https://github.com/devsisters/pointer-network-tensorflow
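To make "pointer decoding" concrete, here is a minimal sketch of a single decoding step: the decoder state scores every input position, already visited cities are masked out, and a softmax over the remaining positions gives the distribution for the next city. This is only an illustration under stated assumptions: a plain dot product stands in for the learned scoring function, the names pointer_step, query and encoder_states are hypothetical, and at least one city must still be unvisited.

```python
import math

def pointer_step(query, encoder_states, visited):
    """One pointer-decoding step: score each input position against the
    decoder query, mask visited positions, softmax over the rest."""
    scores = []
    for i, h in enumerate(encoder_states):
        if i in visited:
            # Already in the tour, so it must get probability zero.
            scores.append(float("-inf"))
        else:
            # Assumption: a dot product stands in for the learned scorer.
            scores.append(sum(q * x for q, x in zip(query, h)))
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy encoder states for 4 cities (hypothetical numbers).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
query = [1.0, 0.5]
probs = pointer_step(query, encoder_states, visited={0})
next_city = max(range(len(probs)), key=probs.__getitem__)  # greedy choice
```

The output is a distribution over input positions rather than over a fixed vocabulary, which is why the same model can be applied to instances with different numbers of cities.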

Neural combinatorial optimization with reinforcement learning:
This paper from Google builds on the pointer network, adds an attention mechanism, optimizes with policy gradient, and trains with actor-critic. It also solves the knapsack problem.
https://github.com/pemami4911/neural-combinatorial-rl-pytorch
https://github.com/higgsfield/np-hard-deep-reinforcement-learning
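The policy-gradient idea can be sketched as follows: sample tours from the policy, measure their lengths, and weight each tour's log-probability by how much worse it is than a baseline. This is a rough sketch, not the paper's implementation: the batch mean stands in for the learned critic, the log-probabilities are made-up numbers, and tour_length and reinforce_terms are hypothetical helper names.

```python
import math

def tour_length(coords, tour):
    """Length of the closed tour that visits coords in the given order."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def reinforce_terms(lengths, log_probs, baseline):
    """Per-sample surrogate loss (L - b) * log p(tour). Its gradient with
    respect to the policy parameters is the REINFORCE estimate."""
    return [(L - baseline) * lp for L, lp in zip(lengths, log_probs)]

coords = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]  # unit square
tours = [[0, 1, 2, 3], [0, 2, 1, 3]]        # two sampled tours
log_probs = [math.log(0.5), math.log(0.5)]  # hypothetical policy outputs
lengths = [tour_length(coords, t) for t in tours]
baseline = sum(lengths) / len(lengths)      # assumption: batch mean, not a critic
terms = reinforce_terms(lengths, log_probs, baseline)
```

Minimizing these terms raises the log-probability of the shorter-than-baseline tour and lowers that of the longer one. In an actor-critic setup the baseline would come from a separately trained value network.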


Reinforcement learning for solving vehicle routing problem:
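This paper from Lehigh builds on the two papers above. It simplifies the pointer network's encoding step and works directly on embeddings of the inputs.
Its main extension is to VRP; it also solves TSP and compares against the earlier results.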
https://github.com/mveres01/pytorch-drl4vrp

Learning Combinatorial Optimization Algorithms over Graphs:
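This paper starts from graph embedding (structure to vector) and then trains with reinforcement learning.
A model trained on small-scale instances also transfers reasonably well to large-scale ones. The authors wrote the code in C++ and later released a PyTorch version, but the underlying layer is still C++.
Note that in the paper the graph embedding is itself part of what gets trained, and backward in PyTorch runs into problems with it.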
https://github.com/Hanjun-Dai/graph_comb_opt
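The graph embedding step can be pictured as a few rounds of message passing, where each node's embedding is recomputed from its own feature and the sum of its neighbors' embeddings. The sketch below is a heavily simplified version in that spirit. It makes several assumptions: scalar embeddings and weights instead of vectors and matrices, no edge-weight term, fixed rather than learned weights, and a hypothetical function name embed.

```python
def embed(adj, x, theta1=1.0, theta2=0.5, rounds=2):
    """Simplified message passing: each round, a node's embedding becomes
    relu(theta1 * own feature + theta2 * sum of neighbor embeddings)."""
    mu = [0.0] * len(adj)
    for _ in range(rounds):
        # The list comprehension reads the previous round's mu throughout,
        # so all nodes are updated synchronously.
        mu = [
            max(0.0, theta1 * x[v] + theta2 * sum(mu[u] for u in adj[v]))
            for v in range(len(adj))
        ]
    return mu

adj = [[1], [0, 2], [1]]  # path graph 0 - 1 - 2 as adjacency lists
x = [1.0, 0.0, 0.0]       # only node 0 carries a feature
mu2 = embed(adj, x, rounds=2)
mu3 = embed(adj, x, rounds=3)
```

Node 0's feature appears at node 1 after two rounds and at node 2 only after three, so the number of rounds controls how far structural information spreads. In the trained model the weights are learned together with the policy, which is why the embedding has to take part in backpropagation.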

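Attention, Learn to Solve Routing Problems!:
An ICLR 2019 paper. The overall approach is again an encoder (an elaborate attention model) plus a decoder.
It covers a lot of ground, solving many TSP and VRP variants as well as other problems, and it also compares against the pointer network.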
https://github.com/wouterkool/attention-learn-to-route

A Deep Q-Network for the Beer Game:
Uses deep reinforcement learning: a network is built for each of the 4 agents (manufacturing, distributor, warehouse, retailer),
and a feedback scheme then steers the agents toward a common goal.

 
