Computer vision and long short-term memory: Learning to predict unsafe behaviour in construction

Contents

  • Preface
  • Abstract
  • 1.Introduction
  • 2.Computer vision and safety
    • 2.1. Vision-based object detection and segmentation
    • 2.2. Vision-based object tracking
  • 3.Proposed framework
    • 3.1. Tracking
    • 3.2. Trajectory prediction
    • 3.3. Prediction of unsafe behaviour
  • 4.Other


Preface

Notes on "Computer vision and long short-term memory: Learning to predict unsafe behaviour in construction"

  • Published in Advanced Engineering Informatics, Volume 50, October 2021
  • Journal impact factor (2022) / JCR quartile: 7.862 / Q1

Abstract

  • In this paper, we combine computer vision with Long Short-Term Memory (LSTM) to predict unsafe behaviours from videos automatically.
  • Our proposed approach for predicting unsafe behaviour is based on: (1) tracking people using SiamMask; (2) predicting the trajectory of people using an improved Social-LSTM; and (3) predicting unsafe behaviour using Franklin’s point inclusion in polygon (PNPoly) algorithm.

1.Introduction

  • Traditionally, root cause models juxtaposed with psychological theories have formed the basis for predicting people’s unsafe behaviour, as they help us understand how people’s behaviour changes under differing working conditions.
  • The models assume that people’s behaviour is planned; hence, they predict deliberate behaviour. However, they are “widely applied without sufficient attention paid to what makes [them] work in its contexts of origin, and without adequate customisation for the specifics”. As a result, root cause models provide a “flawed reductionist view” of safety issues.
  • The data used for predictive modelling is usually derived from studies of unsafe behaviour, which makes its predictive value and its relevance to actual practice questionable.

2.Computer vision and safety

2.1. Vision-based object detection and segmentation

Object Detection

  1. You Only Look Once (YOLO)
  2. RetinaNet
  3. YOLOv3
  4. Faster R-CNN
  5. Fast R-CNN

Object Segmentation

  1. Fully Convolutional Network
  2. Mask R-CNN

2.2. Vision-based object tracking

  • Visual object tracking (VOT) generates an object’s trajectory over time by locating its position in the frames of a video.
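
As a rough illustration of how a trajectory is assembled from per-frame positions, the sketch below reduces each tracked bounding box to a single point. The `(x, y, w, h)` box layout, the choice of the box centre as the person's position and the name `trajectory_from_boxes` are assumptions made for this example; they are not taken from the paper.

```python
from typing import List, Tuple

# Assumed box layout: (x, y, w, h) = top-left corner plus width and height.
Box = Tuple[float, float, float, float]
Point = Tuple[float, float]


def trajectory_from_boxes(boxes: List[Box]) -> List[Point]:
    """Turn one tracked box per frame into a list of centre points."""
    return [(x + w / 2.0, y + h / 2.0) for x, y, w, h in boxes]


# Three consecutive frames of one (hypothetical) tracked person.
boxes = [(10, 20, 4, 8), (12, 20, 4, 8), (14, 21, 4, 8)]
print(trajectory_from_boxes(boxes))
```

A mask-based tracker such as SiamMask could supply a more precise position (for example the mask centroid), but the resulting trajectory would have the same shape: one point per frame.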

VOT tracking approaches

  1. deep learning-based
  2. point tracking
  3. kernel tracking

Tracking objects using a segmentation mask requires more computational power than a simple bounding-box approach. SiamMask, which builds on a fully convolutional Siamese framework (SiamFC) and was developed to track objects in real time, can be used to overcome this requirement.

3.Proposed framework

This paper focuses on the use of SiamMask, Social-LSTM and the PNPoly algorithm.

3.1. Tracking

SiamMask learning resource: "Object tracking: SiamMask explained from start to finish"

3.2. Trajectory prediction

  • The Social-LSTM considers person-to-person interaction when predicting trajectories, which improves the robustness and accuracy of multi-person tracking. A Kalman filter is adopted to correct the Social-LSTM results and make the prediction method more robust. A Kalman filter is an optimal estimator that can infer parameters of interest from indirect, inaccurate and uncertain observations; here it estimates a person’s walking path from their historical trajectory.
  • Therefore, we correct the Social-LSTM results with the Kalman filter once a person’s predicted walking speed is inconsistent with the recorded speed.
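
The consistency check described above can be sketched as a comparison between the speed implied by the predicted points and the speed recorded in the observed history. The notes do not say how "inconsistent" is defined, so the relative tolerance `tol`, the sampling interval `dt` and both function names are assumptions made purely for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def mean_speed(points: List[Point], dt: float) -> float:
    """Average speed along a path sampled every `dt` seconds."""
    dist = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    return dist / (dt * (len(points) - 1))


def speed_is_inconsistent(history: List[Point], predicted: List[Point],
                          dt: float = 1.0, tol: float = 0.5) -> bool:
    """True if the predicted speed deviates from the recorded speed
    by more than `tol` (relative). The threshold is an assumed value."""
    recorded = mean_speed(history, dt)
    # Measure the forecast from the last observed position onwards.
    forecast = mean_speed([history[-1]] + predicted, dt)
    return abs(forecast - recorded) > tol * max(recorded, 1e-9)


history = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]   # walking at 1 unit per step
print(speed_is_inconsistent(history, [(3.0, 0.0), (4.0, 0.0)]))  # similar pace
print(speed_is_inconsistent(history, [(5.0, 0.0), (8.0, 0.0)]))  # three times faster
```

Only predictions flagged by such a check would be passed on for correction; the rest would be used as they are.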

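Note:
Kalman filter: final value = k * observed value + (1 - k) * predicted value, where the gain k has to be tuned.
The Kalman filter is in effect a better way of averaging: its weights are the optimal weights computed from the system state and the noise.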
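
The weighted-average form in the note can be sketched for a single scalar value. Here the gain `k` is computed from the variances of the prediction and of the observation, which is where the "optimal weights" come from; in practice it is these noise variances that have to be chosen. The function name and the variance arguments are illustrative assumptions, and this is a one-dimensional simplification, not the filter configuration used in the paper.

```python
from typing import Tuple


def kalman_update(predicted: float, predicted_var: float,
                  observed: float, observed_var: float) -> Tuple[float, float, float]:
    """One scalar Kalman measurement update.

    Returns (fused value, fused variance, gain k).
    """
    # Gain: trust the observation more when the prediction is more uncertain.
    k = predicted_var / (predicted_var + observed_var)
    fused = k * observed + (1.0 - k) * predicted
    fused_var = (1.0 - k) * predicted_var
    return fused, fused_var, k


# Equal uncertainty: the result is the plain average of the two values.
print(kalman_update(10.0, 4.0, 12.0, 4.0))
# A more certain prediction pulls the result towards the predicted value.
print(kalman_update(10.0, 1.0, 12.0, 3.0))
```

With equal variances the gain is 0.5 and the update reduces to an ordinary mean, which matches the note's description of the filter as a refined form of averaging.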

Social-LSTM learning resource: "Social LSTM: a deep learning model for predicting future path trajectories"
Original paper: Social LSTM: Human Trajectory Prediction in Crowded Spaces
Code (PyTorch): https://github.com/quancore/social-lstm

3.3. Prediction of unsafe behaviour

  • The PNPoly algorithm tests whether a point is inside a polygon (convex or concave) by counting how many times a ray cast from the test point crosses the polygon’s edges. If the count is odd, the point is inside the polygon; otherwise, it is outside.
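
The crossing-count test can be sketched in a few lines. The version below follows the usual ray-casting formulation (a horizontal ray towards +x, toggling a flag at each crossing, so the flag ends up True exactly when the count is odd). The helper `first_unsafe_step`, the L-shaped hazard zone and the sample trajectory are hypothetical and only show how predicted positions might be checked against a danger area.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if `point` lies inside `polygon`."""
    x, y = point
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Does edge (j -> i) straddle the horizontal line through the point?
        if (yi > y) != (yj > y):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = (xj - xi) * (y - yi) / (yj - yi) + xi
            if x < x_cross:
                inside = not inside  # one more crossing: flip parity
        j = i
    return inside


def first_unsafe_step(trajectory: List[Point], zone: List[Point]) -> Optional[int]:
    """Index of the first predicted point inside the zone, or None."""
    for step, p in enumerate(trajectory):
        if point_in_polygon(p, zone):
            return step
    return None


# Hypothetical concave (L-shaped) hazard zone.
zone = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
print(point_in_polygon((1, 1), zone))   # inside the L
print(point_in_polygon((3, 3), zone))   # in the notch, outside
print(first_unsafe_step([(6, 1), (5, 1), (3, 1)], zone))
```

Points that fall exactly on an edge or vertex are a boundary case whose result depends on the comparison conventions, so a safety margin around the zone may be preferable to relying on that behaviour.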

4.Other
