This column collects papers in computer vision. Date: June 10, 2021. Source: Paper Digest.
1, TITLE: A Machine Learning Pipeline for Aiding School Identification from Child Trafficking Images
AUTHORS: SUMIT MUKHERJEE et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we focus on images that contain children wearing school uniforms to identify the school of origin.
2, TITLE: Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation
AUTHORS: Ho Kei Cheng ; Yu-Wing Tai ; Chi-Keung Tang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper presents a simple yet effective approach to modeling space-time correspondences in the context of video object segmentation.
3, TITLE: Distilling Image Classifiers in Object Detectors
AUTHORS: Shuxuan Guo ; Jose M. Alvarez ; Mathieu Salzmann
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: To this end, we study the case of object detection and, instead of following the standard detector-to-detector distillation approach, introduce a classifier-to-detector knowledge transfer framework.
4, TITLE: Dual-Modality Vehicle Anomaly Detection Via Bilateral Trajectory Tracing
AUTHORS: JINGYUAN CHEN et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we proposed a dual-modality modularized methodology for the robust detection of abnormal vehicles.
5, TITLE: Knowledge Distillation: A Good Teacher Is Patient and Consistent
AUTHORS: LUCAS BEYER et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: In this paper we address this issue and significantly bridge the gap between these two types of models.
6, TITLE: Salient Object Ranking with Position-Preserved Attention
AUTHORS: HAO FANG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we study the Salient Object Ranking (SOR) task, which manages to assign a ranking order of each detected object according to its visual saliency.
7, TITLE: VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation
AUTHORS: LINJIE LI et al.
CATEGORY: cs.CV [cs.CV, cs.CL]
HIGHLIGHT: To facilitate the evaluation of such systems, we introduce Video-And-Language Understanding Evaluation (VALUE) benchmark, an assemblage of 11 VidL datasets over 3 popular tasks: (i) text-to-video retrieval; (ii) video question answering; and (iii) video captioning.
8, TITLE: Towards Defending Against Adversarial Examples Via Attack-Invariant Features
AUTHORS: DAWEI ZHOU et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To solve this problem, in this paper, we propose to remove adversarial noise by learning generalizable invariant features across attacks which maintain semantic classification information.
9, TITLE: Tracking By Joint Local and Global Search: A Target-aware Attention Based Approach
AUTHORS: XIAO WANG et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we propose a novel and general target-aware attention mechanism (termed TANet) and integrate it with tracking-by-detection framework to conduct joint local and global search for robust tracking.
10, TITLE: Check It Again: Progressive Visual Question Answering Via Visual Entailment
AUTHORS: Qingyi Si ; Zheng Lin ; Mingyu Zheng ; Peng Fu ; Weiping Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a select-and-rerank (SAR) progressive framework based on Visual Entailment.
11, TITLE: Towards Training Stronger Video Vision Transformers for EPIC-KITCHENS-100 Action Recognition
AUTHORS: ZIYUAN HUANG et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present empirical results for training a stronger video vision transformer on the EPIC-KITCHENS-100 Action Recognition dataset.
12, TITLE: Semi-supervised Lane Detection with Deep Hough Transform
AUTHORS: Yancong Lin ; Silvia-Laura Pintea ; Jan van Gemert
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a novel loss function exploiting geometric knowledge of lanes in Hough space, where a lane can be identified as a local maximum.
13, TITLE: ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation
AUTHORS: Lihe Yang ; Wei Zhuo ; Lei Qi ; Yinghuan Shi ; Yang Gao
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we investigate if we could make the self-training -- a simple but popular framework -- work better for semi-supervised segmentation.
14, TITLE: PAM: Understanding Product Images in Cross Product Category Attribute Extraction
AUTHORS: RONGMEI LIN et al.
CATEGORY: cs.CV [cs.CV, cs.CL, cs.LG]
HIGHLIGHT: This work proposes a more inclusive framework that fully utilizes these different modalities for attribute extraction.
15, TITLE: Agile Wide-field Imaging with Selective High Resolution
AUTHORS: Lintao Peng ; Liheng Bian ; Tiexin Liu ; Jun Zhang
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this work, we report an agile wide-field imaging framework with selective high resolution that requires only two detectors.
16, TITLE: CoAtNet: Marrying Convolution and Attention for All Data Sizes
AUTHORS: Zihang Dai ; Hanxiao Liu ; Quoc V. Le ; Mingxing Tan
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work, we show that while Transformers tend to have larger model capacity, their generalization can be worse than convolutional networks due to the lack of the right inductive bias.
17, TITLE: Salient Positions Based Attention Network for Image Classification
AUTHORS: Sheng Fang ; Kaiyu Li ; Zhe Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Aimed at both questions this paper proposes the salient positions-based attention scheme SPANet, which is inspired by some interesting observations on the attention maps and affinity matrices generated in self-attention scheme.
18, TITLE: Geometry-Consistent Neural Shape Representation with Implicit Displacement Fields
AUTHORS: Wang Yifan ; Lukas Rahmann ; Olga Sorkine-Hornung
CATEGORY: cs.CV [cs.CV, cs.GR, cs.LG, I.3.6; I.2; I.4]
HIGHLIGHT: We present implicit displacement fields, a novel representation for detailed 3D geometry.
19, TITLE: NeRF in Detail: Learning to Sample for View Synthesis
AUTHORS: Relja Arandjelović ; Andrew Zisserman
CATEGORY: cs.CV [cs.CV, cs.GR, cs.LG]
HIGHLIGHT: In this work we address a clear limitation of the vanilla coarse-to-fine approach -- that it is based on a heuristic and not trained end-to-end for the task at hand.
20, TITLE: We Can Always Catch You: Detecting Adversarial Patched Objects WITH or WITHOUT Signature
AUTHORS: Bin Liang ; Jiachun Li ; Jianjun Huang
CATEGORY: cs.CV [cs.CV, cs.CR]
HIGHLIGHT: In this paper, we deeply explore the detection problems about the adversarial patch attacks to the object detection.
21, TITLE: Deep Tiny Network for Recognition-Oriented Face Image Quality Assessment
AUTHORS: Baoyun Peng ; Min Liu ; Heng Yang ; Zhaoning Zhang ; Dongsheng Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we present an efficient non-reference image quality assessment for FR that directly links image quality assessment (IQA) and FR.
22, TITLE: Semi-Supervised 3D Hand-Object Poses Estimation with Interactions in Time
AUTHORS: Shaowei Liu ; Hanwen Jiang ; Jiarui Xu ; Sifei Liu ; Xiaolong Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To tackle these challenges, we propose a unified framework for estimating the 3D hand and object poses with semi-supervised learning.
23, TITLE: Towards Explainable Abnormal Infant Movements Identification: A Body-part Based Prediction and Visualisation Framework
AUTHORS: KEVIN D. MCCAY et al.
CATEGORY: cs.CV [cs.CV, cs.LG, I.4.9; I.5.0; J.3; I.2.1]
HIGHLIGHT: We quantitatively compare the proposed framework's classification performance with several other methods from the literature and qualitatively evaluate the visualization's veracity.
24, TITLE: Generative Models As A Data Source for Multiview Representation Learning
AUTHORS: Ali Jahanian ; Xavier Puig ; Yonglong Tian ; Phillip Isola
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper suggests several techniques for dealing with visual representation learning in such a future.
25, TITLE: SHARP: Shape-Aware Reconstruction of People In Loose Clothing
AUTHORS: Sai Sagar Jinka ; Rohan Chacko ; Astitva Srivastava ; Avinash Sharma ; P. J. Narayanan
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose SHARP, a novel end-to-end trainable network that accurately recovers the detailed geometry and appearance of 3D people in loose clothing from a monocular image.
26, TITLE: Point Cloud Upsampling Via Disentangled Refinement
AUTHORS: Ruihui Li ; Xianzhi Li ; Pheng-Ann Heng ; Chi-Wing Fu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: After revisiting the task, we propose to disentangle the task based on its multi-objective nature and formulate two cascaded sub-networks, a dense generator and a spatial refiner.
27, TITLE: Real Time Egocentric Object Segmentation: THU-READ Labeling and Benchmarking Results
AUTHORS: E. Gonzalez-Sosa ; G. Robledo ; D. Gonzalez-Morin ; P. Perez-Garcia ; A. Villegas
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Due to the lack of datasets of pixel-wise annotations of egocentric objects, in this paper we contribute with a semantic-wise labeling of a subset of 2124 images from the RGB-D THU-READ Dataset.
28, TITLE: Self-supervised Feature Enhancement: Applying Internal Pretext Task to Supervised Learning
AUTHORS: Yuhang Yang ; Zilin Ding ; Xuan Cheng ; Xiaomin Wang ; Ming Liu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we show that feature transformations within CNNs can also be regarded as supervisory signals to construct the self-supervised task, called "internal pretext task".
29, TITLE: Grounding Inductive Biases in Natural Images: Invariance Stems from Variations in Data
AUTHORS: Diane Bouchacourt ; Mark Ibrahim ; Ari S. Morcos
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We show standard augmentation relies on a precise combination of translation and scale, with translation recapturing most of the performance improvement -- despite the (approximate) translation invariance built in to convolutional architectures, such as residual networks.
30, TITLE: Self-supervision of Feature Transformation for Further Improving Supervised Learning
AUTHORS: Zilin Ding ; Yuhang Yang ; Xuan Cheng ; Xiaomin Wang ; Ming Liu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper we find that features in CNNs can be also used for self-supervision.
31, TITLE: PCNet: A Structure Similarity Enhancement Method for Multispectral and Multimodal Image Registration
AUTHORS: Si-Yuan Cao ; Hui-Liang Shen ; Lun Luo ; Shu-Jie Chen ; Chunguang Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To cope with this challenge, we propose the phase congruency network (PCNet), which is able to enhance the structure similarity and alleviate the non-linear intensity and gradient variation.
32, TITLE: Exploiting Learned Symmetries in Group Equivariant Convolutions
AUTHORS: Attila Lengyel ; Jan C. van Gemert
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We investigate the filter parameters learned by GConvs and find certain conditions under which they become highly redundant.
33, TITLE: Cervical Cytology Classification Using PCA & GWO Enhanced Deep Features Selection
AUTHORS: Hritam Basak ; Rohit Kundu ; Sukanta Chakraborty ; Nibaran Das
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: Thus, to augment the effort of the clinicians, in this paper, we propose a fully automated framework that utilizes Deep Learning and feature selection using evolutionary optimization for cytology image classification.
34, TITLE: CLCC: Contrastive Learning for Color Constancy
AUTHORS: YI-CHEN LO et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present CLCC, a novel contrastive learning framework for color constancy.
35, TITLE: More Than Meets The Eye: Self-supervised Depth Reconstruction from Brain Activity
AUTHORS: Guy Gaziv ; Michal Irani
CATEGORY: cs.CV [cs.CV, cs.LG, q-bio.NC]
HIGHLIGHT: We propose two main approaches: Depth-only recovery and joint image-depth RGBD recovery.
36, TITLE: An Efficient Point of Gaze Estimator for Low-Resolution Imaging Systems Using Extracted Ocular Features Based Neural Architecture
AUTHORS: Atul Sahay ; Imon Mukherjee ; Kavi Arya
CATEGORY: cs.CV [cs.CV, cs.HC, cs.LG]
HIGHLIGHT: The threefold objective of this paper is - 1.
37, TITLE: Learning to Rank Words: Optimizing Ranking Metrics for Word Spotting
AUTHORS: Pau Riba ; Adrià Molina ; Lluis Gomez ; Oriol Ramos-Terrades ; Josep Lladós
CATEGORY: cs.CV [cs.CV, cs.IR]
HIGHLIGHT: In this paper, we explore and evaluate the use of ranking-based objective functions for learning simultaneously a word string and a word image encoder.
38, TITLE: No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data
AUTHORS: MI LUO et al.
CATEGORY: cs.LG [cs.LG, cs.CV, cs.DC, stat.ML]
HIGHLIGHT: Motivated by the above findings, we propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated gaussian mixture model.
39, TITLE: I Don't Need $\mathbf{u}$: Identifiable Non-Linear ICA Without Side Information
AUTHORS: Matthew Willetts ; Brooks Paige
CATEGORY: cs.LG [cs.LG, cs.CV, stat.ML]
HIGHLIGHT: In this work we introduce a new approach for identifiable non-linear ICA models.
40, TITLE: Densely Connected Normalizing Flows
AUTHORS: Matej Grcić ; Ivan Grubišić ; Siniša Šegvić
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: We precondition the noise in accordance with previous invertible units, which we describe as cross-unit coupling.
41, TITLE: OODIn: An Optimised On-Device Inference Framework for Heterogeneous Mobile Devices
AUTHORS: Stylianos I. Venieris ; Ioannis Panopoulos ; Iakovos S. Venieris
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: This paper proposes OODIn, a framework for the optimised deployment of DL apps across heterogeneous mobile devices.
42, TITLE: Accelerating Neural Architecture Search Via Proxy Data
AUTHORS: Byunggook Na ; Jisoo Mok ; Hyeokjun Choe ; Sungroh Yoon
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: By analyzing proxy data constructed using various selection methods through data entropy, we propose a novel proxy data selection method tailored for NAS.
43, TITLE: Uncovering Closed-form Governing Equations of Nonlinear Dynamics from Videos
AUTHORS: Lele Luan ; Yang Liu ; Hao Sun
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: To this end, we introduce a novel end-to-end unsupervised deep learning framework to uncover the mathematical structure of equations that governs the dynamics of moving objects in videos.
44, TITLE: It Takes Two to Tango: Mixup for Deep Metric Learning
AUTHORS: SHASHANKA VENKATARAMANAN et al.
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this work, we aim to bridge this gap and improve representations using mixup, which is a powerful data augmentation approach interpolating two or more examples and corresponding target labels at a time.
45, TITLE: AdaMatch: A Unified Approach to Semi-Supervised Learning and Domain Adaptation
AUTHORS: David Berthelot ; Rebecca Roelofs ; Kihyuk Sohn ; Nicholas Carlini ; Alex Kurakin
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: With the goal of generality, we introduce AdaMatch, a method that unifies the tasks of unsupervised domain adaptation (UDA), semi-supervised learning (SSL), and semi-supervised domain adaptation (SSDA).
46, TITLE: Ex Uno Plures: Splitting One Model Into An Ensemble of Subnetworks
AUTHORS: Zhilu Zhang ; Vianne R. Gao ; Mert R. Sabuncu
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: Motivated by this perspective, we propose a strategy to compute an ensemble of subnetworks, each corresponding to a non-overlapping dropout mask computed via a pruning strategy and trained independently.
47, TITLE: Tiplines to Combat Misinformation on Encrypted Platforms: A Case Study of The 2019 Indian Election on WhatsApp
AUTHORS: Ashkan Kazemi ; Kiran Garimella ; Gautam Kishore Shahi ; Devin Gaffney ; Scott A. Hale
CATEGORY: cs.SI [cs.SI, cs.CL, cs.CV]
HIGHLIGHT: In this paper, we analyze the usefulness of a crowd-sourced system on WhatsApp through which users can submit "tips" containing messages they want fact-checked.
48, TITLE: Multi-Facet Clustering Variational Autoencoders
AUTHORS: FABIAN FALCK et al.
CATEGORY: stat.ML [stat.ML, cs.CV, cs.LG, stat.ME]
HIGHLIGHT: In this paper, we introduce Multi-Facet Clustering Variational Autoencoders (MFCVAE), a novel class of variational autoencoders with a hierarchy of latent variables, each with a Mixture-of-Gaussians prior, that learns multiple clusterings simultaneously, and is trained fully unsupervised and end-to-end.
49, TITLE: Self-Supervised Learning with Data Augmentations Provably Isolates Content from Style
AUTHORS: JULIUS VON KÜGELGEN et al.
CATEGORY: stat.ML [stat.ML, cs.AI, cs.CV, cs.LG]
HIGHLIGHT: We formulate the augmentation process as a latent variable model by postulating a partition of the latent representation into a content component, which is assumed invariant to augmentation, and a style component, which is allowed to change.
50, TITLE: Gaussian Mixture Estimation from Weighted Samples
AUTHORS: Daniel Frisch ; Uwe D. Hanebeck
CATEGORY: stat.ML [stat.ML, cs.CV, cs.SY, eess.SY, 62G07]
HIGHLIGHT: In order to speed up computation, an expectation-maximization method is proposed that properly considers not only the sample locations, but also the corresponding weights.
51, TITLE: Implicit Field Learning for Unsupervised Anomaly Detection in Medical Images
AUTHORS: Sergio Naval Marimont ; Giacomo Tarroni
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: We propose a novel unsupervised out-of-distribution detection method for medical images based on implicit fields image representations.
52, TITLE: A Multi-stage GAN for Multi-organ Chest X-ray Image Generation and Segmentation
AUTHORS: Giorgio Ciano ; Paolo Andreini ; Tommaso Mazzierli ; Monica Bianchini ; Franco Scarselli
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we present a novel multi-stage generation algorithm based on Generative Adversarial Networks (GANs) that can produce synthetic images along with their semantic labels and can be used for data augmentation.
53, TITLE: Fast Computational Ghost Imaging Using Unpaired Deep Learning and A Constrained Generative Adversarial Network
AUTHORS: Fatemeh Alishahi ; Amirhossein Mohajerin-Ariaei
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: This paper explores the capabilities of deep learning to leverage computational ghost imaging when there is a lack of paired training images.
54, TITLE: Rethink Transfer Learning in Medical Image Classification
AUTHORS: Le Peng ; Hengyue Liang ; Taihui Li ; Ju Sun
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this paper, we perform careful experimental comparisons between shallow and deep networks for classification on two chest x-ray datasets, using different TL strategies.
55, TITLE: Spatio-Temporal Dual-Stream Neural Network for Sequential Whole-Body PET Segmentation
AUTHORS: Kai-Chieh Liang ; Lei Bi ; Ashnil Kumar ; Michael Fulham ; Jinman Kim
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this study, we propose a spatio-temporal 'dual-stream' neural network (ST-DSNN) to segment sequential whole-body PET scans.
56, TITLE: TED-net: Convolution-free T2T Vision Transformer-based Encoder-decoder Dilation Network for Low-dose CT Denoising
AUTHORS: Dayang Wang ; Zhan Wu ; Hengyong Yu
CATEGORY: eess.IV [eess.IV, cs.CV, physics.med-ph]
HIGHLIGHT: Here, we propose a convolution-free T2T vision transformer-based Encoder-decoder Dilation network (TED-net) to enrich the family of LDCT denoising algorithms.