This column collects papers in computer vision. Date: June 2, 2021. Source: Paper Digest.
1, TITLE: Incorporating Visual Layout Structures for Scientific Text Classification
AUTHORS: Zejiang Shen et al.
CATEGORY: cs.CL [cs.CL, cs.CV]
HIGHLIGHT: We introduce new methods for incorporating VIsual LAyout structures (VILA), e.g., the grouping of page texts into text lines or text blocks, into language models to further improve performance.
2, TITLE: Volta at SemEval-2021 Task 6: Towards Detecting Persuasive Texts and Images Using Textual and Multimodal Ensemble
AUTHORS: Kshitij Gupta ; Devansh Gautam ; Radhika Mamidi
CATEGORY: cs.CL [cs.CL, cs.CV]
HIGHLIGHT: In this paper, we propose a transfer learning approach to fine-tune BERT-based models in different modalities.
3, TITLE: ViTA: Visual-Linguistic Translation By Aligning Object Tags
AUTHORS: Kshitij Gupta ; Devansh Gautam ; Radhika Mamidi
CATEGORY: cs.CL [cs.CL, cs.CV]
HIGHLIGHT: In this paper, we propose our system for the Multimodal Translation Task of WAT 2021 from English to Hindi.
4, TITLE: Rethinking Re-Sampling in Imbalanced Semi-Supervised Learning
AUTHORS: Ju He et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we consider the problem of SSL on class-imbalanced data, which better reflects real-world situations but has only received limited attention so far.
5, TITLE: PanoDR: Spherical Panorama Diminished Reality for Indoor Scenes
AUTHORS: V. Gkitsas ; V. Sterzentsenko ; N. Zioulis ; G. Albanis ; D. Zarpalas
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To ensure structure-aware counterfactual inpainting, we propose a model that initially predicts the structure of an indoor scene and then uses it to guide the reconstruction of an empty -- background only -- representation of the same scene.
6, TITLE: Fidelity Estimation Improves Noisy-Image Classification with Pretrained Networks
AUTHORS: Xiaoyu Lin ; Deblina Bhattacharjee ; Majed El Helou ; Sabine Süsstrunk
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: We instead propose a method that can be applied on a pretrained classifier.
7, TITLE: Closer Look at The Uncertainty Estimation in Semantic Segmentation Under Distributional Shift
AUTHORS: Sebastian Cygert ; Bartłomiej Wróblewski ; Karol Woźniak ; Radosław Słowiński ; Andrzej Czyżewski
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, uncertainty estimation for the task of semantic segmentation is evaluated under a varying level of domain shift: in a cross-dataset setting and when adapting a model trained on data from the simulation.
8, TITLE: Bootstrap Your Own Correspondences
AUTHORS: Mohamed El Banani ; Justin Johnson
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose BYOC: a self-supervised approach that learns visual and geometric features from RGB-D video without relying on ground-truth pose or correspondence.
9, TITLE: Detecting Anomalies in Semantic Segmentation with Prototypes
AUTHORS: Dario Fontanel ; Fabio Cermelli ; Massimiliano Mancini ; Barbara Caputo
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper we take a different route and we propose to address anomaly segmentation through prototype learning.
10, TITLE: VA-GCN: A Vector Attention Graph Convolution Network for Learning on Point Clouds
AUTHORS: Haotian Hu ; Fanyi Wang ; Huixiao Le
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To verify the efficiency of the VAConv, we connect the VAConvs with different receptive fields in parallel to obtain a Multi-scale graph convolutional network, VA-GCN.
11, TITLE: Semi-Supervised Domain Generalization with Stochastic StyleMatch
AUTHORS: Kaiyang Zhou ; Chen Change Loy ; Ziwei Liu
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: In this work, we investigate semi-supervised domain generalization (SSDG), a more realistic and practical setting.
12, TITLE: Hardness Sampling for Self-Training Based Transductive Zero-Shot Learning
AUTHORS: Liu Bo ; Qiulei Dong ; Zhanyi Hu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We introduce two typical ZSL methods into the STHS framework and extensive experiments demonstrate that the derived T-ZSL methods outperform many state-of-the-art methods on three public benchmarks.
13, TITLE: EV-VGCNN: A Voxel Graph CNN for Event-based Object Classification
AUTHORS: Yongjian Deng ; Hao Chen ; Huiying Chen ; Youfu Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Event-based learning methods have recently achieved massive success on object recognition by integrating events into dense frame-based representations to apply traditional 2D learning algorithms.
14, TITLE: Robust Mutual Learning for Semi-supervised Semantic Segmentation
AUTHORS: Pan Zhang ; Bo Zhang ; Ting Zhang ; Dong Chen ; Fang Wen
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose robust mutual learning that improves the prior approach in two aspects.
15, TITLE: TransVOS: Video Object Segmentation with Transformers
AUTHORS: Jianbiao Mei ; Mengmeng Wang ; Yeneng Lin ; Yong Liu
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we propose a new transformer-based framework, termed TransVOS, introducing a vision transformer to fully exploit and model both the temporal and spatial relationships.
16, TITLE: Dual Normalization Multitasking for Audio-Visual Sounding Object Localization
AUTHORS: Tokuhiro Nishikawa ; Daiki Shimada ; Jerry Jun Yokono
CATEGORY: cs.CV [cs.CV, cs.SD, eess.AS]
HIGHLIGHT: To tackle this new AVSOL problem, we propose a novel multitask training strategy and architecture called Dual Normalization Multitasking (DNM), which aggregates the Audio-Visual Correspondence (AVC) task and the classification task for video events into a single audio-visual similarity map. We also created the evaluation dataset (AVSOL-E dataset) by manually annotating the test set of well-known Audio-Visual Event (AVE) dataset.
17, TITLE: Quantification of Carbon Sequestration in Urban Forests
AUTHORS: Levente Klein ; Wang Zhou ; Conrad Albrecht
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: Here we present an approach to estimate the carbon storage in trees based on fusing multispectral aerial imagery and LiDAR data to identify tree coverage, geometric shape, and tree species, which are crucial attributes in carbon storage quantification.
18, TITLE: Reconciliation of Statistical and Spatial Sparsity For Robust Image and Image-Set Classification
AUTHORS: Hao Cheng ; Kim-Hui Yap ; Bihan Wen
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose a novel Joint Statistical and Spatial Sparse representation, dubbed J3S, to model the image or image-set data for classification, by reconciling both their local patch structures and global Gaussian distribution mapped into Riemannian manifold.
19, TITLE: Anti-aliasing Semantic Reconstruction for Few-Shot Semantic Segmentation
AUTHORS: Binghao Liu ; Yao Ding ; Jianbin Jiao ; Xiangyang Ji ; Qixiang Ye
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we reformulate few-shot segmentation as a semantic reconstruction problem, and convert base class features into a series of basis vectors which span a class-level semantic space for novel class reconstruction.
20, TITLE: Natural Statistics of Network Activations and Implications for Knowledge Distillation
AUTHORS: Michael Rotman ; Lior Wolf
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: As a direct implication of our discoveries, we present a method for performing Knowledge Distillation (KD).
21, TITLE: Prior-Enhanced Few-Shot Segmentation with Meta-Prototypes
AUTHORS: Jian-Wei Zhang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose the Prior-Enhanced network with Meta-Prototypes to tackle these limitations.
22, TITLE: Towards Real-time and Light-weight Line Segment Detection
AUTHORS: Geonmo Gu et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we propose a real-time and light-weight line segment detector for resource-constrained environments named Mobile LSD (M-LSD).
23, TITLE: You Only Look at One Sequence: Rethinking Transformer in Vision Through Object Detection
AUTHORS: Yuxin Fang et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: To answer this question, we present You Only Look at One Sequence (YOLOS), a series of object detection models based on the naïve Vision Transformer with the fewest possible modifications as well as inductive biases.
24, TITLE: Language-Driven Image Style Transfer
AUTHORS: Tsu-Jui Fu ; Xin Eric Wang ; William Yang Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We introduce a new task -- language-driven image style transfer (LDIST) -- to manipulate the style of a content image, guided by a text.
25, TITLE: Continual 3D Convolutional Neural Networks for Real-time Processing of Videos
AUTHORS: Lukas Hedegaard ; Alexandros Iosifidis
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: This paper introduces Continual 3D Convolutional Neural Networks (Co3D CNNs), a new computational formulation of spatio-temporal 3D CNNs, in which videos are processed frame-by-frame rather than by clip.
26, TITLE: A Novel Graph-Theoretic Deep Representation Learning Method for Multi-Label Remote Sensing Image Retrieval
AUTHORS: Gencer Sumbul ; Begüm Demir
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper presents a novel graph-theoretic deep representation learning method in the framework of multi-label remote sensing (RS) image retrieval problems.
27, TITLE: Comprehensive Validation of Automated Whole Body Skeletal Muscle, Adipose Tissue, and Bone Segmentation from 3D CT Images for Body Composition Analysis: Towards Extended Body Composition
AUTHORS: Da Ma ; Vincent Chow ; Karteek Popuri ; Mirza Faisal Beg
CATEGORY: cs.CV [cs.CV, q-bio.TO]
HIGHLIGHT: Comprehensive Validation of Automated Whole Body Skeletal Muscle, Adipose Tissue, and Bone Segmentation from 3D CT Images for Body Composition Analysis: Towards Extended Body Composition
28, TITLE: Look Wide and Interpret Twice: Improving Performance on Interactive Instruction-following Tasks
AUTHORS: Van-Quang Nguyen ; Masanori Suganuma ; Takayuki Okatani
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper proposes a new method, which outperforms the previous methods by a large margin.
29, TITLE: Towards Efficient Cross-Modal Visual Textual Retrieval Using Transformer-Encoder Deep Features
AUTHORS: Nicola Messina ; Giuseppe Amato ; Fabrizio Falchi ; Claudio Gennaro ; Stéphane Marchand-Maillet
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we focus on the image-sentence retrieval task, where the objective is to efficiently find relevant images for a given sentence (image-retrieval) or the relevant sentences for a given image (sentence-retrieval).
30, TITLE: Independent Prototype Propagation for Zero-Shot Compositionality
AUTHORS: Frank Ruis ; Gertjan Burghours ; Doina Bucur
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: To be able to deal with underspecified datasets while still leveraging contextual clues during classification, we propose ProtoProp, a novel prototype propagation graph method.
31, TITLE: Deep Learning for Prediction of Hepatocellular Carcinoma Recurrence After Resection or Liver Transplantation: A Discovery and Validation Study
AUTHORS: Zhikun Liu et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Pathological review showed that the tumoral areas most predictive of recurrence were characterized by presence of stroma, high degree of cytological atypia, nuclear hyperchromasia, and a lack of immune infiltration.
32, TITLE: Analysis of Vision-based Abnormal Red Blood Cell Classification
AUTHORS: Annika Wong et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper presents an automated process utilising the advantages of machine learning to increase capacity and standardisation of cell abnormality detection, and its performance is analysed.
33, TITLE: Exploring The Diversity and Invariance in Yourself for Visual Pre-Training Task
AUTHORS: Longhui Wei ; Lingxi Xie ; Wengang Zhou ; Houqiang Li ; Qi Tian
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To alleviate this issue, this paper introduces a simple but effective mechanism, called Exploring the Diversity and Invariance in Yourself (E-DIY).
34, TITLE: Consistent Two-Flow Network for Tele-Registration of Point Clouds
AUTHORS: Zihao Yan et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a learning-based technique that alleviates this problem, and allows registration between point clouds, presented in arbitrary poses, and having little or even no overlap, a setting that has been referred to as tele-registration.
35, TITLE: Dense Nested Attention Network for Infrared Small Target Detection
AUTHORS: Boyang Li et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To handle this problem, we propose a dense nested attention network (DNANet) in this paper. Moreover, we develop an infrared small target dataset (namely, NUDT-SIRST) and propose a set of evaluation metrics to conduct comprehensive performance evaluation.
36, TITLE: Adversarial VQA: A New Benchmark for Evaluating The Robustness of VQA Models
AUTHORS: Linjie Li ; Jie Lei ; Zhe Gan ; Jingjing Liu
CATEGORY: cs.CV [cs.CV, cs.CL]
HIGHLIGHT: To study this, we introduce Adversarial VQA, a new large-scale VQA benchmark, collected iteratively via an adversarial human-and-model-in-the-loop procedure.
37, TITLE: DLA-Net: Learning Dual Local Attention Features for Semantic Segmentation of Large-Scale Building Facade Point Clouds
AUTHORS: Yanfei Su et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Addressing this problem, we propose a learnable attention module that learns Dual Local Attention features, called DLA in this paper. As there is a lack of 3D point clouds datasets related to the fine-grained building facade, we construct the first large-scale building facade point clouds benchmark dataset for semantic segmentation.
38, TITLE: Full-Resolution Encoder-Decoder Networks with Multi-Scale Feature Fusion for Human Pose Estimation
AUTHORS: Jie Ou ; Mingjian Chen ; Hong Wu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To achieve more accurate 2D human pose estimation, we extend the successful encoder-decoder network, simple baseline network (SBN), in three ways.
39, TITLE: Rethinking Pseudo Labels for Semi-Supervised Object Detection
AUTHORS: Hengduo Li ; Zuxuan Wu ; Abhinav Shrivastava ; Larry S. Davis
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we introduce certainty-aware pseudo labels tailored for object detection, which can effectively estimate the classification and localization quality of derived pseudo labels.
40, TITLE: Predicting Vehicles Trajectories in Urban Scenarios with Transformer Networks and Augmented Information
AUTHORS: A. Quintanar ; D. Fernández-Llorca ; I. Parra ; R. Izquierdo ; M. A. Sotelo
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Our model exploits these simple structures by adding augmented data (position and heading), and adapting their use to the problem of vehicle trajectory prediction in urban scenarios in prediction horizons up to 5 seconds.
41, TITLE: Semi-Supervised Disparity Estimation with Deep Feature Reconstruction
AUTHORS: Julia Guerrero-Viu ; Sergio Izquierdo ; Philipp Schröppel ; Thomas Brox
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a semi-supervised pipeline that successfully adapts DispNet to a real-world domain by joint supervised training on labeled synthetic data and self-supervised training on unlabeled real data.
42, TITLE: Integrative Use of Computer Vision and Unmanned Aircraft Technologies in Public Inspection: Foreign Object Debris Image Collection
AUTHORS: Travis J. E. Munyer ; Daniel Brinkman ; Chenyu Huang ; Xin Zhong
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: The purpose of this study is to expand this research area by integrating computer vision and UAS technology to automate public inspection.
43, TITLE: Clustering-friendly Representation Learning Via Instance Discrimination and Feature Decorrelation
AUTHORS: Yaling Tao ; Kentaro Takagi ; Kouta Nakata
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this paper, we propose a clustering-friendly representation learning method using instance discrimination and feature decorrelation.
44, TITLE: Analysis of Classifiers Robust to Noisy Labels
AUTHORS: Alex Díaz ; Damian Steele
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, cs.DS]
HIGHLIGHT: We explore contemporary robust classification algorithms for overcoming class-dependent labelling noise: Forward, Importance Re-weighting and T-revision.
45, TITLE: Quantifying Predictive Uncertainty in Medical Image Analysis with Deep Kernel Learning
AUTHORS: Zhiliang Wu ; Yinchong Yang ; Jindong Gu ; Volker Tresp
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: We propose an uncertainty-aware deep kernel learning model which permits the estimation of the uncertainty in the prediction by a pipeline of a Convolutional Neural Network and a sparse Gaussian Process.
46, TITLE: Markpainting: Adversarial Machine Learning Meets Inpainting
AUTHORS: David Khachaturov ; Ilia Shumailov ; Yiren Zhao ; Nicolas Papernot ; Ross Anderson
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CR, cs.CV, cs.CY]
HIGHLIGHT: In this paper we study how to manipulate it using our markpainting technique.
47, TITLE: Exposing Previously Undetectable Faults in Deep Neural Networks
AUTHORS: Isaac Dunn ; Hadrien Pouget ; Daniel Kroening ; Tom Melham
CATEGORY: cs.LG [cs.LG, cs.CV, cs.SE, I.2.6; D.2.5]
HIGHLIGHT: In this paper, we introduce a novel DNN testing method that is able to find faults in DNNs that other methods cannot.
48, TITLE: Effect of Large-scale Pre-training on Full and Few-shot Transfer Learning for Natural and Medical Images
AUTHORS: Mehdi Cherti ; Jenia Jitsev
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: In this work we conduct large-scale pre-training on large source datasets of either natural (ImageNet-21k/1k) or medical chest X-Ray images and compare full and few-shot transfer using different target datasets from both natural and medical imaging domains.
49, TITLE: Learning Football Body-Orientation As A Matter of Classification
AUTHORS: Adrià Arbués-Sangüesa ; Adrián Martín ; Paulino Granero ; Coloma Ballester ; Gloria Haro
CATEGORY: cs.LG [cs.LG, cs.CV, eess.IV]
HIGHLIGHT: To the best of our knowledge, this article presents the first deep learning model for estimating orientation directly from video footage.
50, TITLE: GANs Can Play Lottery Tickets Too
AUTHORS: Xuxi Chen ; Zhenyu Zhang ; Yongduo Sui ; Tianlong Chen
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: In this work, we for the first time study the existence of such trainable matching subnetworks in deep GANs.
51, TITLE: Markov Localisation Using Heatmap Regression and Deep Convolutional Odometry
AUTHORS: Oscar Mendez ; Simon Hadfield ; Richard Bowden
CATEGORY: cs.RO [cs.RO, cs.CV, cs.LG]
HIGHLIGHT: In this work, we present a novel CNN-based localisation approach that can leverage modern deep learning hardware.
52, TITLE: What Can I Do Here? Learning New Skills By Imagining Visual Affordances
AUTHORS: Alexander Khazatsky ; Ashvin Nair ; Daniel Jing ; Sergey Levine
CATEGORY: cs.RO [cs.RO, cs.CV, cs.LG]
HIGHLIGHT: In this paper, we aim to study how generative models of possible outcomes can allow a robot to learn visual representations of affordances, so that the robot can sample potentially possible outcomes in new situations, and then further train its policy to achieve those outcomes.
53, TITLE: 3D Map Creation Using Crowdsourced GNSS Data
AUTHORS: Terence Lines ; Ana Basiri
CATEGORY: cs.RO [cs.RO, cs.CV, eess.SP]
HIGHLIGHT: This paper proposes and implements a novel approach to generate 2.5D (otherwise known as 3D level-of-detail (LOD) 1) maps for free using Global Navigation Satellite Systems (GNSS) signals, which are globally available and are blocked only by obstacles between the satellites and the receivers.
54, TITLE: Hybrid Deep Neural Network for Brachial Plexus Nerve Segmentation in Ultrasound Images
AUTHORS: Juul P. A. van Boxtel ; Vincent R. J. Vousten ; Josien Pluim ; Nastaran Mohammadian Rad
CATEGORY: eess.IV [eess.IV, cs.CV, I.4.6]
HIGHLIGHT: In this paper, we propose a hybrid model consisting of a classification model followed by a segmentation model to segment BP nerve regions in ultrasound images.
55, TITLE: COV-ECGNET: COVID-19 Detection Using ECG Trace Images with Deep Convolutional Neural Network
AUTHORS: Tawsifur Rahman et al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this work, COVID-19 and other cardiovascular diseases (CVDs) were detected using deep-learning techniques.
56, TITLE: Hyperspectral Band Selection for Multispectral Image Classification with Convolutional Networks
AUTHORS: Giorgio Morales ; John Sheppard ; Riley Logan ; Joseph Shaw
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: We propose a novel band selection method to select a reduced set of wavelengths, obtained from an HSI system in the context of image classification.
57, TITLE: Decoupling Shape and Density for Liver Lesion Synthesis Using Conditional Generative Adversarial Networks
AUTHORS: Dario Augusto Borges Oliveira
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: This paper presents a method for decoupling shape and density for liver lesion synthesis, creating a framework that allows straight-forwardly driving the synthesis.
58, TITLE: RAI-Net: Range-Adaptive LiDAR Point Cloud Frame Interpolation Network
AUTHORS: Lili Zhao et al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we propose a novel LiDAR point cloud frame interpolation method, which exploits range images (RIs) as an intermediate representation with CNNs to conduct the frame interpolation process.
59, TITLE: 3D WaveUNet: 3D Wavelet Integrated Encoder-Decoder Network for Neuron Segmentation
AUTHORS: Qiufu Li ; Linlin Shen
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we propose a 3D wavelet and deep learning based 3D neuron segmentation method.