Paper Notes: Interactive Gibson

Interactive Gibson: A Benchmark for Interactive Navigation in Cluttered Environments

This paper introduces the Interactive Gibson environment.

Abstract

The benchmark is built on two new components:

  • a new experimental setup
  • a set of Interactive Navigation metrics

Introduction

Contribution

  • a novel simulation environment that retains the photo-realism and scale of the original Gibson V1 but allows for interactions with objects.
  • a study of Interactive Navigation in a general form: navigating in a cluttered environment where interacting with objects is allowed and even needed in order to reach the goal. All of the segmented objects can be pushed, subject to their mass and friction.
  • the definition of a benchmark, called Interactive Gibson Benchmark, with a performance metric that unifies two criteria.

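Summary of contributions:

  • A new simulation environment with high-quality rendering in which objects can be interacted with.
  • A study of Interactive Navigation.
  • The definition of the benchmark and of its metric, which combines two criteria: one covers navigation success and path quality, the other the effort associated with the degree of disturbance to the surroundings.
  • Baselines for different RL algorithms on two robot platforms.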

Interactive Gibson Environment

    1. a new rendering engine
    1. a new set of assets: objects of classes relevant to Interactive Navigation (the dataset). 106 scenes contain 1984 interactable CAD model alignments from 5 object categories: chairs, desks, doors, sofas, and tables.
      The annotations are produced partly automatically and partly by hand. First, object region proposals are generated with a shape-based semantic segmentation approach. Next comes object alignment: a similar CAD model is selected from ShapeNet and aligned with the object in the scene. For each model, at least 6 keypoints are aligned manually, and the scale and pose alignment is obtained by minimizing the point-to-point distance.
      Finally, the texture from the images is transferred onto the aligned CAD model.
    1. Interactive Gibson Agents
      Eight models of real robot platforms (the list below also names two simulation-only agents, ten in total):
      two widely used simulation agents (the Mujoco [3] humanoid and ant), four wheeled navigation agents (Freight, JackRabbot v1, Husky and TurtleBot v2), a legged robot (Minitaur), two mobile manipulators with an arm (Fetch and JackRabbot v2), and a quadrocopter (Quadrotor)

INTERACTIVE GIBSON EVALUATION SETUP

Interactive Gibson Benchmark
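The task is to navigate from a random start location to a random goal location; both lie on the floor.
Objects come from 5 + 10 classes in total.
For each episode, an environment, the objects from the 10 classes, and the start and goal locations are sampled at random. The episode ends when a time limit is exceeded or when the agent gets close enough to the goal.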

Interactive Navigation Score

Path Efficiency: The most efficient path is the shortest path assuming no interactable obstacles are in the way.
Effort Efficiency: penalizes the effort spent disturbing the environment or interacting with objects.
Path and Effort Efficiency are measured by the scores $P_{eff}$ and $E_{eff}$, both in the interval [0, 1]. They are combined into the Interactive Navigation Score, where $\alpha \in [0, 1]$ weights the two criteria:
$INS_{\alpha} = \alpha P_{eff} + (1 - \alpha) E_{eff}$
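
To make the weighting concrete, here is a minimal sketch of the combination step. The function name and the default `alpha=0.5` are my own choices for illustration, not something the paper prescribes.

```python
def interactive_navigation_score(p_eff: float, e_eff: float, alpha: float = 0.5) -> float:
    """Combine path and effort efficiency: INS_alpha = alpha * P_eff + (1 - alpha) * E_eff.

    The default alpha is an assumption made for this sketch.
    """
    for name, value in (("p_eff", p_eff), ("e_eff", e_eff), ("alpha", alpha)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return alpha * p_eff + (1.0 - alpha) * e_eff


if __name__ == "__main__":
    # alpha = 1 scores only the path; alpha = 0 scores only the effort.
    print(interactive_navigation_score(0.8, 0.6, alpha=1.0))
    print(interactive_navigation_score(0.8, 0.6, alpha=0.0))
```

With `alpha` close to 1 an agent is rewarded mostly for short paths even if it shoves objects around, while `alpha` close to 0 favors careful agents that leave the scene undisturbed.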

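Assume the scene contains $K$ objects, indexed by $i \in \{1, ..., K\}$, with $i = 0$ for the robot. Each object is moved a distance $l_i$, and the indicator $1_{suc}$ is 1 if the robot finally reaches the goal and 0 otherwise.
$P_{eff} = 1_{suc} \frac{L^*}{L_0}$
$L^*$ is the length of the shortest path and $L_0$ the length of the path the robot actually travels. Because the shortest path ignores interactable obstacles, the ideal score $P^*_{eff} = 1$ is generally not attainable in practice.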
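
The path efficiency score can be sketched in a few lines. The function and argument names are hypothetical, and clamping the ratio to 1 is my own safeguard against numerical noise rather than part of the definition.

```python
def path_efficiency(success: bool, shortest_length: float, travelled_length: float) -> float:
    """P_eff = 1_suc * L* / L_0, where L_0 is the length of the robot's actual path."""
    if not success:
        return 0.0  # the success indicator zeroes the score for failed episodes
    if travelled_length <= 0.0:
        raise ValueError("travelled_length must be positive for a successful episode")
    # Clamp to 1.0 (an assumption of this sketch) in case of rounding errors.
    return min(1.0, shortest_length / travelled_length)


if __name__ == "__main__":
    # A robot that needs 10 m for a 5 m shortest path scores 0.5.
    print(path_efficiency(True, 5.0, 10.0))
```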
Effort Efficiency Score: let $m_i$ be the mass of object $i$ (with $m_0$ the mass of the robot), $G = m_0 g$ the robot's weight, and $F_t$ the force applied by the robot at time $t$.
$E_{eff} = 0.5 \left( \frac{m_0 l_0}{\sum_{i=0}^{K} m_i l_i} + \frac{TG}{TG + \sum_{t=0}^{T} F_t} \right)$
The ideal score is $E_{eff}^* = 1$, reached when no object is displaced ($l_i = 0$ for $i \geq 1$) and no force is applied ($F_t = 0$).
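
A rough sketch of the effort efficiency score follows. The argument layout (index 0 is the robot), the handling of empty inputs, and taking $T$ as the number of recorded force samples are all assumptions made here for illustration.

```python
from typing import Sequence


def effort_efficiency(masses: Sequence[float],
                      distances: Sequence[float],
                      forces: Sequence[float],
                      g: float = 9.81) -> float:
    """E_eff = 0.5 * (m_0 l_0 / sum_i m_i l_i + T G / (T G + sum_t F_t)).

    masses[0] and distances[0] belong to the robot; the remaining entries
    belong to the K objects. forces holds one force magnitude per time step.
    """
    if len(masses) != len(distances) or not masses:
        raise ValueError("masses and distances must be non-empty and equally long")
    robot_term = masses[0] * distances[0]
    total_term = sum(m * l for m, l in zip(masses, distances))
    # If nothing moved at all, treat the kinematic term as ideal (assumption).
    kinematic = robot_term / total_term if total_term > 0.0 else 1.0

    steps = len(forces)  # T, taken as the number of force samples (assumption)
    weight = masses[0] * g  # G = m_0 * g
    dynamic = steps * weight / (steps * weight + sum(forces)) if steps > 0 else 1.0
    return 0.5 * (kinematic + dynamic)


if __name__ == "__main__":
    # Robot (10 kg) drives 5 m and pushes an equally heavy object 5 m, no force logged.
    print(effort_efficiency([10.0, 10.0], [5.0, 5.0], [0.0, 0.0, 0.0]))
```

The first term drops as the robot moves heavier objects over longer distances, and the second drops as the accumulated force grows relative to the robot's own weight.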

EVALUATING BASELINES ON INTERACTIVE GIBSON

Three reinforcement learning algorithms are used.

Reward Function

$R = R_{suc} + R_{pot} + R_{int}$

$R_{suc}$ (suc from success) is a one-time sparse reward of value 10, given when the agent reaches the goal.
$R_{pot}$ is the difference in geodesic distance between the agent and the goal in the previous and current time steps, $R_{pot} = GD_{t-1} - GD_t$, so it is positive when the agent moves closer to the goal.
$R_{int} = -k_{int} 1_{int}$, where $1_{int}$ is an indicator of interaction with objects.
$k_{int}$ is a hyperparameter; the values $k_{int} \in \{0, 0.1, 1.0\}$ are used.
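
The three reward terms can be put together as below. This is only a sketch: the function signature is hypothetical, and how the interaction flag is detected in the simulator is not covered in these notes.

```python
SUCCESS_REWARD = 10.0            # one-time sparse reward R_suc
K_INT_VALUES = (0.0, 0.1, 1.0)   # interaction penalties listed in the notes


def step_reward(prev_geodesic: float,
                curr_geodesic: float,
                reached_goal: bool,
                interacted: bool,
                k_int: float) -> float:
    """R = R_suc + R_pot + R_int for a single time step."""
    r_suc = SUCCESS_REWARD if reached_goal else 0.0
    r_pot = prev_geodesic - curr_geodesic   # GD_{t-1} - GD_t
    r_int = -k_int if interacted else 0.0   # -k_int * 1_int
    return r_suc + r_pot + r_int


if __name__ == "__main__":
    # Moving 0.5 m closer while touching an object, for each penalty setting.
    for k_int in K_INT_VALUES:
        print(k_int, step_reward(3.0, 2.5, False, True, k_int))
```

With $k_{int} = 0$ the agent is free to push objects at no cost, while $k_{int} = 1.0$ makes a single interaction step cost more than the progress reward in this example.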
