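A while ago I was learning AirSim using DDQN, and over the past couple of days I have been trying out some reinforcement learning training.
On Papers With Code I came across the following game: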
FPS Games | Papers With Code
GitHub - mwydmuch/ViZDoom: Doom-based AI Research Platform for Reinforcement Learning from Raw Visual Information.
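Then, on the reinforcement learning page (reinforcement-learning | Papers With Code), I found the project DLR-RM/stable-baselines3.
The environment setup is as follows:
First, install PyTorch, version 1.8 or later.
Then install stable-baselines3: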
pip install stable-baselines3
Then install the game (system libraries first, then the Python package):
sudo apt install cmake libboost-all-dev libsdl2-dev libfreetype6-dev libgl1-mesa-dev libglu1-mesa-dev libpng-dev libjpeg-dev libbz2-dev libfluidsynth-dev libgme-dev libopenal-dev zlib1g-dev timidity tar nasm
pip install vizdoom
Of course, the installation pulls in quite a few dependencies; they are listed on the pages linked above.
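Before moving on, it can help to confirm that the three packages really landed in the current Python environment. The following is only a minimal sketch using the standard library's importlib.metadata; the helper names (parse_version, meets_minimum, report) are made up for this post, and the only version requirement it checks is the PyTorch 1.8 minimum mentioned above.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn '1.13.1+cu117' into (1, 13, 1); local tags and suffixes are ignored."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if the installed version string is at least the minimum tuple."""
    return parse_version(installed) >= minimum


def report(package, minimum=None):
    """Describe whether a package is installed and, optionally, recent enough."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed"
    if minimum is not None and not meets_minimum(version, minimum):
        return f"{package}: {version} (older than required)"
    return f"{package}: {version}"


if __name__ == "__main__":
    print(report("torch", minimum=(1, 8)))
    print(report("stable-baselines3"))
    print(report("vizdoom"))
```

If any line says "not installed", the pip step for that package probably ran against a different interpreter or virtual environment.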
Then start training:
import gym
from vizdoom import gym_wrapper  # importing this registers the Vizdoom* environments with gym
from stable_baselines3 import PPO

# env = gym.make("CartPole-v1")
env = gym.make("VizdoomHealthGatheringSupreme-v0")

# MultiInputPolicy is the SB3 policy for dict observations
model = PPO("MultiInputPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)

# Run the trained policy and render it
obs = env.reset()
for i in range(10000):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
    env.render()
    if done:
        obs = env.reset()
env.close()
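Well, so far there appear to be no syntax errors. Whether it can actually train to the expected result is still unknown (I am debugging this inside a virtual machine).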
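One rough way to find out whether training is getting anywhere is to look at the total reward per episode rather than just watching the rendering. Below is a minimal sketch, assuming the older Gym API used in the script above (reset() returns the observation, step() returns a 4-tuple). CountdownEnv and ConstantModel are hypothetical stand-ins so the snippet runs without ViZDoom; in practice the trained model and the real env would be passed in instead.

```python
import statistics


def evaluate(model, env, n_episodes=5):
    """Run full episodes and return (mean, std) of total reward per episode.

    Assumes the older Gym API: reset() returns obs,
    step() returns (obs, reward, done, info).
    """
    totals = []
    for _ in range(n_episodes):
        obs = env.reset()
        done = False
        total = 0.0
        while not done:
            action, _state = model.predict(obs, deterministic=True)
            obs, reward, done, _info = env.step(action)
            total += reward
        totals.append(total)
    return statistics.mean(totals), statistics.pstdev(totals)


class CountdownEnv:
    """Hypothetical stand-in environment: pays reward 1.0 per step for 3 steps."""

    def reset(self):
        self.steps_left = 3
        return self.steps_left

    def step(self, action):
        self.steps_left -= 1
        return self.steps_left, 1.0, self.steps_left == 0, {}


class ConstantModel:
    """Hypothetical stand-in with the same predict() shape as an SB3 model."""

    def predict(self, obs, deterministic=True):
        return 0, None


if __name__ == "__main__":
    mean_reward, std_reward = evaluate(ConstantModel(), CountdownEnv())
    print(mean_reward, std_reward)
```

Comparing the mean before and after model.learn(...) gives at least a coarse signal. stable-baselines3 also ships an evaluate_policy helper in stable_baselines3.common.evaluation that does roughly the same job; I have not tried it in this setup.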