Paper notes: Decoupling Value and Policy for Generalization in Reinforcement Learning
Abstract: Standard deep reinforcement learning algorithms use a shared representation for the policy and value function, especially when training directly from images. However, we argue that more information is needed to accurately estima