This column collects recent computer vision papers. Date: June 21, 2021. Source: Paper Digest.
1, TITLE: GEM: A General Evaluation Benchmark for Multimodal Tasks
AUTHORS: Lin Su et al.
CATEGORY: cs.CL [cs.CL, cs.CV, cs.MM]
HIGHLIGHT: In this paper, we present GEM as a General Evaluation benchmark for Multimodal tasks.
2, TITLE: Dual-Teacher Class-Incremental Learning With Data-Free Generative Replay
AUTHORS: Yoojin Choi ; Mostafa El-Khamy ; Jungwon Lee
CATEGORY: cs.CV [cs.CV, cs.LG, cs.NE]
HIGHLIGHT: This paper proposes two novel knowledge transfer techniques for class-incremental learning (CIL).
3, TITLE: Learning and Meshing from Deep Implicit Surface Networks Using An Efficient Implementation of Analytic Marching
AUTHORS: Jiabao Lei ; Kui Jia ; Yi Ma
CATEGORY: cs.CV [cs.CV, cs.GR]
HIGHLIGHT: In this paper, we study a fundamental problem in this context about recovering a surface mesh from an implicit field function whose zero-level set captures the underlying surface.
4, TITLE: How to Train Your ViT? Data, Augmentation, and Regularization in Vision Transformers
AUTHORS: Andreas Steiner et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: As one result of this study we find that the combination of increased compute and AugReg can yield models with the same performance as models trained on an order of magnitude more training data: we train ViT models of various sizes on the public ImageNet-21k dataset which either match or outperform their counterparts trained on the larger, but not publicly available JFT-300M dataset.
5, TITLE: End-to-end Temporal Action Detection with Transformer
AUTHORS: Xiaolong Liu et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Here, we construct an end-to-end framework for TAD upon Transformer, termed TadTR, which simultaneously predicts all action instances as a set of labels and temporal locations in parallel.
6, TITLE: VSAC: Efficient and Accurate Estimator for H and F
AUTHORS: Maksym Ivashechkin ; Daniel Barath ; Jiri Matas
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We present VSAC, a RANSAC-type robust estimator with a number of novelties.
7, TITLE: Light Pollution Reduction in Nighttime Photography
AUTHORS: Chang Liu ; Xiaolin Wu
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this paper we develop a physically-based light pollution reduction (LPR) algorithm that can substantially alleviate the aforementioned degradations of perceptual quality and restore the pristine state of night sky.
8, TITLE: Training or Architecture? How to Incorporate Invariance in Neural Networks
AUTHORS: Kanchana Vaishnavi Gandikota ; Jonas Geiping ; Zorah Lähner ; Adam Czapliński ; Michael Moeller
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose a method for provably invariant network architectures with respect to group actions by choosing one element from a (possibly continuous) orbit based on a fixed criterion.
9, TITLE: Medical Image Analysis on Left Atrial LGE MRI for Atrial Fibrillation Studies: A Review
AUTHORS: Lei Li ; Veronika A. Zimmer ; Julia A. Schnabel ; Xiahai Zhuang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper aims to provide a systematic review on computing methods for LA cavity, wall, scar and ablation gap segmentation and quantification from LGE MRI, and the related literature for AF studies.
10, TITLE: Effective Model Sparsification By Scheduled Grow-and-Prune Methods
AUTHORS: Xiaolong Ma et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG, cs.NE]
HIGHLIGHT: In this paper, we propose a novel scheduled grow-and-prune (GaP) methodology without pre-training the dense models.
11, TITLE: RSG: A Simple But Effective Module for Learning Imbalanced Datasets
AUTHORS: Jianfeng Wang ; Thomas Lukasiewicz ; Xiaolin Hu ; Jianfei Cai ; Zhenghua Xu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose a new rare-class sample generator (RSG) to solve this problem.
12, TITLE: Combined Person Classification with Airborne Optical Sectioning
AUTHORS: Indrajit Kurmi ; David C. Schedl ; Oliver Bimber
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We demonstrate that false detections can be significantly suppressed and true detections boosted by combining classifications from multiple AOS rather than single integral images.
13, TITLE: Quantized Neural Networks Via {-1, +1} Encoding Decomposition and Acceleration
AUTHORS: Qigong Sun et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address this issue, we propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks, which can be efficiently implemented by bitwise operations (i.e., xnor and bitcount) to achieve model compression, computational acceleration, and resource saving.
14, TITLE: Medical Matting: A New Perspective on Medical Segmentation with Uncertainty
AUTHORS: Lin Wang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Inspired by image matting, we introduce matting as a soft segmentation method and a new perspective to deal with and represent uncertain regions into medical scenes, namely medical matting.
15, TITLE: Light Lies: Optical Adversarial Attack
AUTHORS: Kyu-Lim Kim et al.
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: This paper, for the first time, introduces an optical adversarial attack, which physically alters the light field information arriving at the image sensor so that the classification model yields misclassification.
16, TITLE: Discerning Generic Event Boundaries in Long-Form Wild Videos
AUTHORS: Ayush K Rai et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper we present a technique for generic event boundary detection based on a two-stream inflated 3D convolutions architecture, which can learn spatio-temporal features from videos.
17, TITLE: Towards Clustering-friendly Representations: Subspace Clustering Via Graph Filtering
AUTHORS: Zhengrui Ma ; Zhao Kang ; Guangchun Luo ; Ling Tian
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: To recover the "clustering-friendly" representation and facilitate the subsequent clustering, we propose a graph filtering approach by which a smooth representation is achieved.
18, TITLE: Advanced Hough-based Method for On-device Document Localization
AUTHORS: D. V. Tropin ; A. M. Ershov ; D. P. Nikolaev ; V. V. Arlazarov
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In accordance with the published works, at least 5 systems offer solutions for on-device document location.
19, TITLE: Smoothed Multi-View Subspace Clustering
AUTHORS: Peng Chen ; Liang Liu ; Zhengrui Ma ; Zhao Kang
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: In this work, we propose a novel multi-view clustering method named smoothed multi-view subspace clustering (SMVSC) by employing a novel technique, i.e., graph filtering, to obtain a smooth representation for each view, in which similar data points have similar feature values.
20, TITLE: A Coarse-to-Fine Instance Segmentation Network with Learning Boundary Representation
AUTHORS: Feng Luo ; Bin-Bin Gao ; Jiangpeng Yan ; Xiu Li
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we propose a coarse-to-fine module to address the problem.
21, TITLE: Towards Interpreting Computer Vision Based on Transformation Invariant Optimization
AUTHORS: Chen Li et al.
CATEGORY: cs.CV [cs.CV, 68T45, I.2.10]
HIGHLIGHT: In this work, visualized images that can activate the neural network to the target classes are generated by back-propagation method.
22, TITLE: Analyzing Adversarial Robustness of Deep Neural Networks in Pixel Space: A Semantic Perspective
AUTHORS: Lina Wang et al.
CATEGORY: cs.CV [cs.CV, physics.soc-ph, 68T-06]
HIGHLIGHT: In this work, we fill this gap and explore the pixel space of the adversarial image by proposing an algorithm to look for possible perturbations pixel by pixel in different regions of the segmented image.
23, TITLE: A Framework for Real-time Traffic Trajectory Tracking, Speed Estimation, and Driver Behavior Calibration at Urban Intersections Using Virtual Traffic Lanes
AUTHORS: Awad Abdelhalim ; Montasir Abbas ; Bhavi Bharat Kotha ; Alfred Wicks
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In a previous study, we presented VT-Lane, a three-step framework for real-time vehicle detection, tracking, and turn movement classification at urban intersections.
24, TITLE: Bridging The Gap Between Object Detection and User Intent Via Query-Modulation
AUTHORS: Marco Fornoni et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: In this paper we investigate techniques to modulate standard object detectors to explicitly account for the user intent, expressed as an embedding of a simple query.
25, TITLE: All You Can Embed: Natural Language Based Vehicle Retrieval with Spatio-Temporal Transformers
AUTHORS: Carmelo Scribano ; Davide Sapienza ; Giorgia Franchini ; Micaela Verucchi ; Marko Bertogna
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present All You Can Embed (AYCE), a modular solution to correlate single-vehicle tracking sequences with natural language.
26, TITLE: Deep Reinforcement Learning with Automated Label Extraction from Clinical Reports Accurately Classifies 3D MRI Brain Volumes
AUTHORS: Joseph Stember ; Hrithwik Shalu
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: Methods: For Part 1, we trained SBERT with 90 radiology report impressions.
27, TITLE: Multi-Granularity Network with Modal Attention for Dense Affective Understanding
AUTHORS: Baoming Yan et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a multi-granularity network with modal attention (MGN-MA), which employs multi-granularity features for better description of the target frame.
28, TITLE: HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping
AUTHORS: Yuhan Wang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose a high fidelity face swapping method, called HifiFace, which can well preserve the face shape of the source face and generate photo-realistic results.
29, TITLE: Toward Fault Detection in Industrial Welding Processes with Deep Learning and Data Augmentation
AUTHORS: Jibinraj Antony ; Dr. Florian Schlather ; Georgij Safronov ; Markus Schmitz ; Prof. Dr. Kristof Van Laerhoven
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: We use object detection algorithms from the TensorFlow object detection API and adapt them to our use case using transfer learning.
30, TITLE: Shape Prior Non-Uniform Sampling Guided Real-time Stereo 3D Object Detection
AUTHORS: A. Gao ; J. Cao ; Y. Pang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To encode more information from the outer region, we propose a shape prior non-uniform sampling strategy that performs dense sampling in outer region and sparse sampling in inner region.
31, TITLE: Novelty Detection Via Contrastive Learning with Negative Data Augmentation
AUTHORS: Chengwei Chen et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: We overcome such problems by introducing a novel decoder-encoder framework.
32, TITLE: A Dynamic Spatial-temporal Attention Network for Early Anticipation of Traffic Accidents
AUTHORS: Muhammad Monjurul Karim ; Yu Li ; Ruwen Qin ; Zhaozheng Yin
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: To this end, the paper presents a dynamic spatial-temporal attention (DSTA) network for early anticipation of traffic accidents from dashcam videos.
33, TITLE: HSMAL: Detailed Horse Shape and Pose Reconstruction for Motion Pattern Recognition
AUTHORS: Ci Li et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper we present our preliminary work on model-based behavioral analysis of horse motion.
34, TITLE: Discovering Relationships Between Object Categories Via Universal Canonical Maps
AUTHORS: Natalia Neverova ; Artsiom Sanakoyeu ; Patrick Labatut ; David Novotny ; Andrea Vedaldi
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we show that improved correspondences can be learned automatically as a natural byproduct of learning category-specific dense pose predictors.
35, TITLE: EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2021: Team M3EM Technical Report
AUTHORS: Lijin Yang ; Yifei Huang ; Yusuke Sugano ; Yoichi Sato
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this report, we describe the technical details of our submission to the 2021 EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition.
36, TITLE: Equivariance-bridged SO(2)-Invariant Representation Learning Using Graph Convolutional Network
AUTHORS: Sungwon Hwang ; Hyungtae Lim ; Hyun Myung
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, another progressive vision of research direction is highlighted to encourage less dependence on data augmentation by achieving structural rotational invariance of a network.
37, TITLE: Self-supervised Video Representation Learning with Cross-Stream Prototypical Contrasting
AUTHORS: Martine Toering ; Ioannis Gatopoulos ; Maarten Stol ; Vincent Tao Hu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper we propose "Video Cross-Stream Prototypical Contrasting", a novel method which predicts consistent prototype assignments from both RGB and optical flow views, operating on sets of samples.
38, TITLE: Contrastive Learning of Generalized Game Representations
AUTHORS: Chintan Trivedi ; Antonios Liapis ; Georgios N. Yannakakis
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper we build on recent advances in contrastive learning and showcase its benefits for representation learning in games.
39, TITLE: DeepLab2: A TensorFlow Library for Deep Labeling
AUTHORS: Mark Weber et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: DeepLab2: A TensorFlow Library for Deep Labeling
40, TITLE: Guided Integrated Gradients: An Adaptive Path Method for Removing Noise
AUTHORS: Andrei Kapishnikov et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work, we show that one of the causes of the problem is the accumulation of noise along the IG path.
41, TITLE: Residual Contrastive Learning for Joint Demosaicking and Denoising
AUTHORS: Nanqing Dong et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: To bridge this methodological gap, we present a novel CL approach on RAW images, residual contrastive learning (RCL), which aims to learn meaningful representations for JDD.
42, TITLE: Efficient Self-supervised Vision Transformers for Representation Learning
AUTHORS: Chunyuan Li et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: This paper investigates two techniques for developing efficient self-supervised vision transformers (EsViT) for visual representation learning.
43, TITLE: Towards Distraction-Robust Active Visual Tracking
AUTHORS: Fangwei Zhong ; Peng Sun ; Wenhan Luo ; Tingyun Yan ; Yizhou Wang
CATEGORY: cs.CV [cs.CV, cs.AI, cs.MA, cs.RO]
HIGHLIGHT: To address this issue, we propose a mixed cooperative-competitive multi-agent game, where a target and multiple distractors form a collaborative team to play against a tracker and make it fail to follow.
44, TITLE: Virtual Temporal Samples for Recurrent Neural Networks: Applied to Semantic Segmentation in Agriculture
AUTHORS: Alireza Ahmadi ; Michael Halstead ; Chris McCool
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: This paper explores the potential for performing temporal semantic segmentation in the context of agricultural robotics without temporally labelled data.
45, TITLE: A Distance-based Separability Measure for Internal Cluster Validation
AUTHORS: Shuyue Guan ; Murray Loew
CATEGORY: cs.LG [cs.LG, cs.CV, cs.DM]
HIGHLIGHT: In this paper, we propose a novel internal CVI -- the Distance-based Separability Index (DSI), based on a data separability measure.
46, TITLE: A Unified Generative Adversarial Network Training Via Self-Labeling and Self-Attention
AUTHORS: Tomoki Watanabe ; Paolo Favaro
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: We propose a novel GAN training scheme that can handle any level of labeling in a unified manner.
47, TITLE: Residual Error: A New Performance Measure for Adversarial Robustness
AUTHORS: Hossein Aboutalebi ; Mohammad Javad Shafiee ; Michelle Karg ; Christian Scharfenberger ; Alexander Wong
CATEGORY: cs.LG [cs.LG, cs.CV, stat.ML]
HIGHLIGHT: Residual Error: A New Performance Measure for Adversarial Robustness
48, TITLE: World-GAN: A Generative Model for Minecraft Worlds
AUTHORS: Maren Awiszus ; Frederik Schubert ; Bodo Rosenhahn
CATEGORY: cs.LG [cs.LG, cs.CV, cs.NE]
HIGHLIGHT: This work introduces World-GAN, the first method to perform data-driven Procedural Content Generation via Machine Learning in Minecraft from a single example.
49, TITLE: Steerable Partial Differential Operators for Equivariant Neural Networks
AUTHORS: Erik Jenner ; Maurice Weiler
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this work, we derive a $G$-steerability constraint that completely characterizes when a PDO between feature vector fields is equivariant, for arbitrary symmetry groups $G$.
50, TITLE: Evolving GANs: When Contradictions Turn Into Compliance
AUTHORS: Sauptik Dhar ; Javad Heydari ; Samarth Tripathi ; Unmesh Kurup ; Mohak Shah
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, stat.ML]
HIGHLIGHT: In this paper, we propose a GAN game which provides improved discriminator accuracy under limited data settings, while generating realistic synthetic data.
51, TITLE: Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples
AUTHORS: Maura Pintor et al.
CATEGORY: cs.LG [cs.LG, cs.CR, cs.CV]
HIGHLIGHT: In this work, we overcome these limitations by (i) defining a set of quantitative indicators which unveil common failures in the optimization of gradient-based attacks, and (ii) proposing specific mitigation strategies within a systematic evaluation protocol.
52, TITLE: PyKale: Knowledge-Aware Machine Learning from Multiple Sources in Python
AUTHORS: Haiping Lu et al.
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, stat.ML]
HIGHLIGHT: We present PyKale, a Python library for knowledge-aware machine learning on graphs, images, texts, and videos to enable and accelerate interdisciplinary research.
53, TITLE: Accumulative Poisoning Attacks on Real-time Data
AUTHORS: Tianyu Pang ; Xiao Yang ; Yinpeng Dong ; Hang Su ; Jun Zhu
CATEGORY: cs.LG [cs.LG, cs.CR, cs.CV]
HIGHLIGHT: In this paper, we focus on the real-time settings and propose a new attacking strategy, which affiliates an accumulative phase with poisoning attacks to secretly (i.e., without affecting accuracy) magnify the destructive effect of a (poisoned) trigger batch.
54, TITLE: Improved Radar Localization on Lidar Maps Using Shared Embedding
AUTHORS: Huan Yin ; Yue Wang ; Rong Xiong
CATEGORY: cs.RO [cs.RO, cs.CV]
HIGHLIGHT: We present a heterogeneous localization framework for solving radar global localization and pose tracking on pre-built lidar maps.
55, TITLE: Development of A Conversing and Body Temperature Scanning Autonomously Navigating Robot to Help Screen for COVID-19
AUTHORS: Ryan Kim
CATEGORY: cs.RO [cs.RO, cs.CV, cs.HC]
HIGHLIGHT: An autonomously navigating mobile robot is used with a manipulator controlled using a face tracking algorithm, and an end effector consisting of a thermal camera, smartphone, and chatbot.
56, TITLE: AI-Enabled Ultra-Low-Dose CT Reconstruction
AUTHORS: Weiwen Wu et al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG, 68T07]
HIGHLIGHT: In this paper, we demonstrate that AI-powered CT reconstruction offers diagnostic image quality at an ultra-low-dose level comparable to that of radiography.
57, TITLE: Hybrid Graph Convolutional Neural Networks for Landmark-based Anatomical Segmentation
AUTHORS: Nicolás Gaggion ; Lucas Mansilla ; Diego Milone ; Enzo Ferrante
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this work we address the problem of landmark-based segmentation for anatomical structures.
58, TITLE: Debiased Subjective Assessment of Real-World Image Enhancement
AUTHORS: Cao Peibei ; Wang Zhangyang ; Ma Kede
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: We demonstrate our subjective assessment method using three popular and practically demanding image enhancement tasks: dehazing, super-resolution, and low-light enhancement.
59, TITLE: Non-Iterative Phase Retrieval With Cascaded Neural Networks
AUTHORS: Tobias Uelwer ; Tobias Hoffmann ; Stefan Harmeling
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we want to push the limits of these learned methods by means of a deep neural network cascade that reconstructs the image successively on different resolutions from its non-oversampled Fourier magnitude.
60, TITLE: Synthetic COVID-19 Chest X-ray Dataset for Computer-Aided Diagnosis
AUTHORS: Hasib Zunair ; A. Ben Hamza
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: We introduce a new dataset called Synthetic COVID-19 Chest X-ray Dataset for training machine learning models.
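NOTE: Entry 13 in this list mentions decomposing quantized networks into {-1, +1} branches that run on xnor and bitcount operations. The sketch below is only a generic illustration of that idea, not the paper's actual encoding scheme: it assumes 2-bit weights drawn from {-3, -1, 1, 3}, packs sign vectors into Python integers, and uses hypothetical helper names (pack_bits, binary_dot, decompose_2bit, quantized_dot) chosen for this example.

```python
def pack_bits(signs):
    """Pack a sequence of -1/+1 values into an int (+1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for i, s in enumerate(signs):
        if s not in (-1, 1):
            raise ValueError("expected only -1 or +1 values")
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1,+1} vectors given as packed bits.

    XNOR marks positions where the signs agree; each agreement adds +1 and
    each disagreement adds -1, so the result is 2 * matches - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask  # XNOR restricted to n bits
    matches = bin(agree).count("1")    # bitcount (popcount)
    return 2 * matches - n


def decompose_2bit(values):
    """Split values from {-3,-1,1,3} into two {-1,+1} branches: v = 2*hi + lo."""
    hi = [1 if v > 0 else -1 for v in values]
    lo = [v - 2 * h for v, h in zip(values, hi)]
    return hi, lo


def quantized_dot(weights, activations):
    """Dot product of 2-bit weights with {-1,+1} activations via two binary branches."""
    n = len(weights)
    hi, lo = decompose_2bit(weights)
    x = pack_bits(activations)
    return 2 * binary_dot(pack_bits(hi), x, n) + binary_dot(pack_bits(lo), x, n)


if __name__ == "__main__":
    w = [3, -1, 1, -3]
    x = [1, 1, -1, -1]
    print(quantized_dot(w, x), sum(wi * xi for wi, xi in zip(w, x)))
```

The two printed numbers should agree, since the bitwise path and the plain integer dot product compute the same quantity under the assumptions above.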