This column collects computer vision papers. Date: April 1, 2021. Source: Paper Digest.
1, TITLE: Fast and Accurate Emulation of The SDO/HMI Stokes Inversion with Uncertainty Quantification
AUTHORS: Richard E. L. Higgins et al.
CATEGORY: astro-ph.SR [astro-ph.SR, astro-ph.IM, cs.CV]
HIGHLIGHT: In this paper, we introduce a deep learning-based approach that can emulate the existing HMI pipeline results two orders of magnitude faster than the current pipeline algorithms.
2, TITLE: A Study of Latent Monotonic Attention Variants
AUTHORS: Albert Zeyer ; Ralf Schlüter ; Hermann Ney
CATEGORY: cs.CL [cs.CL, cs.AI, cs.CV]
HIGHLIGHT: In this paper, we present a mathematically clean solution to introduce monotonicity, by introducing a new latent variable which represents the audio position or segment boundaries.
3, TITLE: Semi-supervised Synthesis of High-Resolution Editable Textures for 3D Humans
AUTHORS: Bindita Chaudhuri ; Nikolaos Sarafianos ; Linda Shapiro ; Tony Tung
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We introduce a novel approach to generate diverse high fidelity texture maps for 3D human meshes in a semi-supervised setup.
4, TITLE: Efficient Large-Scale Face Clustering Using An Online Mixture of Gaussians
AUTHORS: David Montero ; Naiara Aginako ; Basilio Sierra ; Marcos Nieto
CATEGORY: cs.CV [cs.CV, cs.LG, I.5.3]
HIGHLIGHT: In this work, we address the problem of large-scale online face clustering: given a continuous stream of unknown faces, create a database grouping the incoming faces by their identity.
5, TITLE: DCVNet: Dilated Cost Volume Networks for Fast Optical Flow
AUTHORS: Huaizu Jiang ; Erik Learned-Miller
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose dilated cost volumes to capture small and large displacements simultaneously, allowing optical flow estimation without the need for the sequential estimation strategy.
6, TITLE: SRA-LSTM: Social Relationship Attention LSTM for Human Trajectory Prediction
AUTHORS: Yusheng Peng ; Gaofeng Zhang ; Jun Shi ; Benzhu Xu ; Liping Zheng
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: Motivated by this idea, we propose a Social Relationship Attention LSTM (SRA-LSTM) model to predict future trajectories.
7, TITLE: An Effective and Friendly Tool for Seed Image Analysis
AUTHORS: Andrea Loddo et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This work presents software that performs image analysis, through feature extraction and classification, on images containing seeds, using a brand new and unique framework.
8, TITLE: Scale-aware Automatic Augmentation for Object Detection
AUTHORS: Yukang Chen et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose Scale-aware AutoAug to learn data augmentation policies for object detection.
9, TITLE: Long-Term Temporally Consistent Unpaired Video Translation from Simulated Surgical 3D Data
AUTHORS: Dominik Rivoir et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a novel approach which combines unpaired image translation with neural rendering to transfer simulated to photorealistic surgical abdominal scenes.
10, TITLE: GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection
AUTHORS: Abhinav Kumar ; Garrick Brazil ; Xiaoming Liu
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we present and integrate GrooMeD-NMS -- a novel Grouped Mathematically Differentiable NMS for monocular 3D object detection, such that the network is trained end-to-end with a loss on the boxes after NMS.
11, TITLE: Rectification-based Knowledge Retention for Continual Learning
AUTHORS: Pravendra Singh ; Pratik Mazumder ; Piyush Rai ; Vinay P. Namboodiri
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose a novel approach to address the task incremental learning problem, which involves training a model on new tasks that arrive in an incremental manner.
12, TITLE: Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation
AUTHORS: Xiangyu Yue et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose an end-to-end Prototypical Cross-domain Self-Supervised Learning (PCS) framework for Few-shot Unsupervised Domain Adaptation (FUDA).
13, TITLE: Rainbow Memory: Continual Learning with A Memory of Diverse Samples
AUTHORS: Jihwan Bang ; Heesu Kim ; YoungJoon Yoo ; Jung-Woo Ha ; Jonghyun Choi
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: To enhance the sample diversity in the memory, we propose a novel memory management strategy based on per-sample classification uncertainty and data augmentation, named Rainbow Memory (RM).
14, TITLE: Going Deeper with Image Transformers
AUTHORS: Hugo Touvron ; Matthieu Cord ; Alexandre Sablayrolles ; Gabriel Synnaeve ; Hervé Jégou
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we build and optimize deeper transformer networks for image classification.
15, TITLE: Weakly-Supervised Image Semantic Segmentation Using Graph Convolutional Networks
AUTHORS: Shun-Yi Pan ; Cheng-You Lu ; Shih-Po Lee ; Wen-Hsiao Peng
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To overcome this issue, we propose a Graph Convolutional Network (GCN)-based feature propagation framework.
16, TITLE: Facial Masks and Soft-Biometrics: Leveraging Face Recognition CNNs for Age and Gender Prediction on Mobile Ocular Images
AUTHORS: Fernando Alonso-Fernandez ; Kevin Hernandez Diaz ; Silvia Ramis ; Francisco J. Perales ; Josef Bigun
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We address the use of selfie ocular images captured with smartphones to estimate age and gender.
17, TITLE: Evaluation of Multimodal Semantic Segmentation Using RGB-D Data
AUTHORS: Jiesi Hu ; Ganning Zhao ; Suya You ; C. C. Jay Kuo
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: Our goal is to develop stable, accurate, and robust semantic scene understanding methods for wide-area scene perception and understanding, especially in challenging outdoor environments.
18, TITLE: FANet: A Feedback Attention Network for Improved Biomedical Image Segmentation
AUTHORS: Nikhil Kumar Tomar et al.
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this work, we leverage the information of each training epoch to prune the prediction maps of the subsequent epochs.
19, TITLE: Joint Deep Multi-Graph Matching and 3D Geometry Learning from Inhomogeneous 2D Image Collections
AUTHORS: Zhenzhang Ye ; Tarun Yenamandra ; Florian Bernard ; Daniel Cremers
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: We fill this gap by proposing a trainable framework that takes advantage of graph neural networks for learning a deformable 3D geometry model from inhomogeneous image collections, i.e. a set of images that depict different instances of objects from the same category.
20, TITLE: Dogfight: Detecting Drones from Drones Videos
AUTHORS: Muhammad Waseem Ashraf ; Waqas Sultani ; Mubarak Shah
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To handle this, instead of using region-proposal based methods, we propose to use a two-stage segmentation-based approach employing spatio-temporal attention cues.
21, TITLE: StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
AUTHORS: Or Patashnik ; Zongze Wu ; Eli Shechtman ; Daniel Cohen-Or ; Dani Lischinski
CATEGORY: cs.CV [cs.CV, cs.CL, cs.GR, cs.LG]
HIGHLIGHT: In this work, we explore leveraging the power of recently introduced Contrastive Language-Image Pre-training (CLIP) models in order to develop a text-based interface for StyleGAN image manipulation that does not require such manual effort.
22, TITLE: PAUL: Procrustean Autoencoder for Unsupervised Lifting
AUTHORS: Chaoyang Wang ; Simon Lucey
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper we advocate for a 3D deep auto-encoder framework to be used explicitly as the NRSfM prior.
23, TITLE: Camouflaged Instance Segmentation: Dataset and Benchmark Suite
AUTHORS: Trung-Nghia Le et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To promote the new task of camouflaged instance segmentation, we introduce a new large-scale dataset, namely CAMO++, by extending our preliminary CAMO dataset (camouflaged object segmentation) in terms of quantity and diversity.
24, TITLE: Dual Contrastive Loss and Attention for GANs
AUTHORS: Ning Yu et al.
CATEGORY: cs.CV [cs.CV, cs.GR]
HIGHLIGHT: In this paper, we propose various improvements to further push the boundaries in image generation.
25, TITLE: Topology-Preserving 3D Image Segmentation Based On Hyperelastic Regularization
AUTHORS: Daoping Zhang ; Lok Ming Lui
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel 3D topology-preserving registration-based segmentation model with the hyperelastic regularization, which can handle both 2D and 3D images.
26, TITLE: Topo-boundary: A Benchmark Dataset on Topological Road-boundary Detection Using Aerial Images for Autonomous Driving
AUTHORS: Zhenhua Xu ; Yuxiang Sun ; Ming Liu
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: So in this paper, we propose a new benchmark dataset, named Topo-boundary, for off-line topological road-boundary detection.
27, TITLE: Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark
AUTHORS: Xiao Wang et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this work, we propose a new benchmark specifically dedicated to the tracking-by-language, including a large scale dataset, strong and diverse baseline methods. We also introduce two new challenges into TNL2K for the object tracking task, i.e., adversarial samples and modality switch.
28, TITLE: DER: Dynamically Expandable Representation for Class Incremental Learning
AUTHORS: Shipeng Yan ; Jiangwei Xie ; Xuming He
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: To this end, we propose a novel two-stage learning approach that utilizes a dynamically expandable representation for more effective incremental concept modeling.
29, TITLE: Near-field Sensing Architecture for Low-Speed Vehicle Automation Using A Surround-view Fisheye Camera System
AUTHORS: Ciarán Eising ; Jonathan Horgan ; Senthil Yogamani
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: In this work, we describe our visual perception architecture on surround view cameras designed for a system deployed in commercial vehicles, provide a functional review of the different stages of such a computer vision system, and discuss some of the current technological challenges.
30, TITLE: Generating Multi-scale Maps from Remote Sensing Images Via Series Generative Adversarial Networks
AUTHORS: Xu Chen ; Bangguo Yin ; Songqiang Chen ; Haifeng Li ; Tian Xu
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: By extending their method, multi-scale RSIs can be trivially translated to multi-scale maps (multi-scale rs2map translation) through scale-wise rs2map models trained for certain scales (parallel strategy).
31, TITLE: Few-Data Guided Learning Upon End-to-End Point Cloud Network for 3D Face Recognition
AUTHORS: Yi Yu ; Feipeng Da ; Ziyu Zhang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, an end-to-end deep learning network entitled Sur3dNet-Face for point-cloud-based 3D face recognition is proposed.
32, TITLE: Multi-Class Multi-Instance Count Conditioned Adversarial Image Generation
AUTHORS: Amrutha Saseendran ; Kathrin Skubch ; Margret Keuper
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we take one further step in this direction and propose a conditional generative adversarial network (GAN) that generates images with a defined number of objects from given classes. In particular, we propose a new dataset, CityCount, which is derived from the Cityscapes street scenes dataset, to evaluate our approach in a challenging and practically relevant scenario.
33, TITLE: Learning Camera Localization Via Dense Scene Matching
AUTHORS: Shitao Tang ; Chengzhou Tang ; Rui Huang ; Siyu Zhu ; Ping Tan
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We present a new method for scene agnostic camera localization using dense scene matching (DSM), where a cost volume is constructed between a query image and a scene.
34, TITLE: Using Depth Information and Colour Space Variations for Improving Outdoor Robustness for Instance Segmentation of Cabbage
AUTHORS: Nils Lüling ; David Reiser ; Alexander Stana ; H. W. Griepentrog
CATEGORY: cs.CV [cs.CV, cs.LG, cs.RO]
HIGHLIGHT: Following this goal, this research focuses on improving instance segmentation of field crops under varying environmental conditions.
35, TITLE: DAP: Detection-Aware Pre-training with Weak Supervision
AUTHORS: Yuanyi Zhong et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper presents a detection-aware pre-training (DAP) approach, which leverages only weakly-labeled classification-style datasets (e.g., ImageNet) for pre-training, but is specifically tailored to benefit object detection tasks.
36, TITLE: Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes
AUTHORS: Dmytro Kotovenko ; Matthias Wright ; Arthur Heimbrecht ; Björn Ommer
CATEGORY: cs.CV [cs.CV, cs.AI, cs.GR]
HIGHLIGHT: We propose a method to stylize images by optimizing parameterized brushstrokes instead of pixels and further introduce a simple differentiable rendering mechanism.
37, TITLE: Neural Surface Maps
AUTHORS: Luca Morreale ; Noam Aigerman ; Vladimir Kim ; Niloy J. Mitra
CATEGORY: cs.CV [cs.CV, cs.GR]
HIGHLIGHT: In this paper, we advocate considering neural networks as encoding surface maps.
38, TITLE: Learning with Memory-based Virtual Classes for Deep Metric Learning
AUTHORS: Byungsoo Ko ; Geonmo Gu ; Han-Gyu Kim
CATEGORY: cs.CV [cs.CV, cs.IR, cs.LG]
HIGHLIGHT: In this work, we present a novel training strategy for DML called MemVir.
39, TITLE: Unpaired Single-Image Depth Synthesis with Cycle-consistent Wasserstein GANs
AUTHORS: Christoph Angermann ; Adéla Moravová ; Markus Haltmeier ; Steinbjörn Jónsson ; Christian Laubichler
CATEGORY: cs.CV [cs.CV, cs.LG, eess.IV]
HIGHLIGHT: Therefore, in this study, the latest advancements in the field of generative neural networks are leveraged for fully unsupervised single-image depth synthesis.
40, TITLE: A Closer Look at Fourier Spectrum Discrepancies for CNN-generated Images Detection
AUTHORS: Keshigeyan Chandrasegaran ; Ngoc-Trung Tran ; Ngai-Man Cheung
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this work, we investigate the validity of assertions claiming that CNN-generated images are unable to achieve high frequency spectral decay consistency.
41, TITLE: Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition
AUTHORS: Guangrun Wang ; Liang Lin ; Rongcong Chen ; Guangcong Wang ; Jiqi Zhang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we prove that dynamically adapting network architectures tailored for each domain task along with weight finetuning benefits in both efficiency and effectiveness, compared to the existing image recognition pipeline that only tunes the weights regardless of the architecture.
42, TITLE: Knowledge Distillation By Sparse Representation Matching
AUTHORS: Dat Thanh Tran ; Moncef Gabbouj ; Alexandros Iosifidis
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose Sparse Representation Matching (SRM), a method to transfer intermediate knowledge obtained from one Convolutional Neural Network (CNN) to another by utilizing sparse representation learning.
43, TITLE: Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective
AUTHORS: Jiarui Xu ; Xiaolong Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Instead of following the previous literature, we propose to learn correspondence using Video Frame-level Similarity (VFS) learning, i.e., simply learning from comparing video frames.
44, TITLE: Layout-Guided Novel View Synthesis from A Single Indoor Panorama
AUTHORS: Jiale Xu ; Jia Zheng ; Yanyu Xu ; Rui Tang ; Shenghua Gao
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we make the first attempt to generate novel views from a single indoor panorama and take the large camera translations into consideration. To validate the effectiveness of our method, we further build a large-scale photo-realistic dataset containing both small and large camera translations.
45, TITLE: VITON-HD: High-Resolution Virtual Try-On Via Misalignment-Aware Normalization
AUTHORS: Seunghwan Choi ; Sunghyun Park ; Minsoo Lee ; Jaegul Choo
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address the challenges, we propose a novel virtual try-on method called VITON-HD that successfully synthesizes 1024x768 virtual try-on images.
46, TITLE: Video Exploration Via Video-Specific Autoencoders
AUTHORS: Kevin Wang ; Deva Ramanan ; Aayush Bansal
CATEGORY: cs.CV [cs.CV, cs.GR, cs.HC, cs.LG]
HIGHLIGHT: In this work, we observe that a simple autoencoder trained (from scratch) on multiple frames of a specific video enables one to perform a large variety of video processing and editing tasks.
47, TITLE: Robust Registration of Multimodal Remote Sensing Images Based on Structural Similarity
AUTHORS: Yuanxin Ye ; Jie Shan ; Lorenzo Bruzzone ; Li Shen
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address this problem, this paper proposes a novel feature descriptor named the Histogram of Orientated Phase Congruency (HOPC), which is based on the structural properties of images.
48, TITLE: Human Perception Modeling for Automatic Natural Image Matting
AUTHORS: Yuhongze Zhou ; Liguang Zhou ; Tin Lun Lam ; Yangsheng Xu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we argue that handling the trade-off of additional input information is a major issue in automatic matting, which we decompose into two subtasks: trimap and alpha estimation.
49, TITLE: Learning By Aligning Videos in Time
AUTHORS: Sanjay Haresh et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We present a self-supervised approach for learning video representations using temporal video alignment as a pretext task, while exploiting both frame-level and video-level information.
50, TITLE: Contrastive Learning of Single-Cell Phenotypic Representations for Treatment Classification
AUTHORS: Alexis Perakis et al.
CATEGORY: cs.CV [cs.CV, cs.LG, eess.IV]
HIGHLIGHT: Therefore, subsequent works propose unsupervised approaches based on generative models to learn these representations.
51, TITLE: CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields
AUTHORS: Michael Niemeyer ; Andreas Geiger
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: Several recent works therefore propose generative models which are 3D-aware, i.e., scenes are modeled in 3D and then rendered differentiably to the image plane.
52, TITLE: DynOcc: Learning Single-View Depth from Dynamic Occlusion Cues
AUTHORS: Yifan Wang ; Linjie Luo ; Xiaohui Shen ; Xing Mei
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we introduce the first depth dataset DynOcc consisting of dynamic in-the-wild scenes.
53, TITLE: Denoise and Contrast for Category Agnostic Shape Completion
AUTHORS: Antonio Alliegro ; Diego Valsesia ; Giulia Fracastoro ; Enrico Magli ; Tatiana Tommasi
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a deep learning model that exploits the power of self-supervision to perform 3D point cloud completion, estimating the missing part and a context region around it.
54, TITLE: Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors
AUTHORS: Vladimir Guzov ; Aymen Mir ; Torsten Sattler ; Gerard Pons-Moll
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We introduce the Human POSEitioning System (HPS), a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment using wearable sensors.
55, TITLE: Self-Regression Learning for Blind Hyperspectral Image Fusion Without Label
AUTHORS: Wu Wang ; Yue Huang ; Xinhao Ding
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: To address these issues, we propose a self-regression learning method that alternately reconstructs the hyperspectral image (HSI) and estimates the observation model.
56, TITLE: SOON: Scenario Oriented Object Navigation with Graph-based Exploration
AUTHORS: Fengda Zhu ; Xiwen Liang ; Yi Zhu ; Xiaojun Chang ; Xiaodan Liang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Accordingly, in this paper, we introduce a Scenario Oriented Object Navigation (SOON) task. We also propose a new large-scale benchmark named From Anywhere to Object (FAO) dataset.
57, TITLE: Smart Scribbles for Image Matting
AUTHORS: Xin Yang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this article, we explore the intrinsic relationship between user input and alpha mattes and strike a balance between user effort and the quality of alpha mattes.
58, TITLE: SimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification
AUTHORS: Zijian Hu ; Zhengyu Yang ; Xuefeng Hu ; Ram Nevatia
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: Following this path, we propose a novel unsupervised objective that focuses on the less studied relationship between the high confidence unlabeled data that are similar to each other.
59, TITLE: Learning Spatio-Temporal Transformer for Visual Tracking
AUTHORS: Bin Yan ; Houwen Peng ; Jianlong Fu ; Dong Wang ; Huchuan Lu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a new tracking architecture with an encoder-decoder transformer as the key component.
60, TITLE: Deep Adaptive Fuzzy Clustering for Evolutionary Unsupervised Representation Learning
AUTHORS: Dayu Tan ; Zheng Huang ; Xin Peng ; Weimin Zhong ; Vladimir Mahalec
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this study, we explore the possibility of employing fuzzy clustering in a deep neural network framework.
61, TITLE: DA-DETR: Domain Adaptive Detection Transformer By Hybrid Attention
AUTHORS: Jingyi Zhang ; Jiaxing Huang ; Zhipeng Luo ; Gongjie Zhang ; Shijian Lu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we adopt a one-stage detector and design DA-DETR, a simple yet effective domain adaptive object detection network that performs inter-domain alignment with a single discriminator.
62, TITLE: Convolutional Hough Matching Networks
AUTHORS: Juhong Min ; Minsu Cho
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work we introduce a Hough transform perspective on convolutional matching and propose an effective geometric matching algorithm, dubbed Convolutional Hough Matching (CHM).
63, TITLE: Online Learning of A Probabilistic and Adaptive Scene Representation
AUTHORS: Zike Yan ; Xin Wang ; Hongbin Zha
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we represent the scene with a Bayesian nonparametric mixture model, seamlessly describing per-point occupancy status with a continuous probability density function.
64, TITLE: Spatial Content Alignment For Pose Transfer
AUTHORS: Wing-Yin Yu ; Lai-Man Po ; Yuzhi Zhao ; Jingjing Xiong ; Kin-Wai Lau
CATEGORY: cs.CV [cs.CV, cs.AI, cs.MM]
HIGHLIGHT: In this paper, we propose a novel framework Spatial Content Alignment GAN (SCAGAN) which aims to enhance the content consistency of garment textures and the details of human characteristics.
65, TITLE: Deep Simultaneous Optimisation of Sampling and Reconstruction for Multi-contrast MRI
AUTHORS: Xinwen Liu et al.
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: We propose an algorithm that generates the optimised sampling pattern and reconstruction scheme of one contrast (e.g. T2-weighted image) when images with different contrast (e.g. T1-weighted image) have been acquired.
66, TITLE: ReMix: Towards Image-to-Image Translation with Limited Data
AUTHORS: Jie Cao ; Luanxuan Hou ; Ming-Hsuan Yang ; Ran He ; Zhenan Sun
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose a data augmentation method (ReMix) to tackle this issue.
67, TITLE: Facial Expression and Attributes Recognition Based on Multi-task Learning of Lightweight Neural Networks
AUTHORS: Andrey V. Savchenko
CATEGORY: cs.CV [cs.CV, 68T10]
HIGHLIGHT: In this paper, we examine the multi-task training of lightweight convolutional neural networks for face identification and classification of facial attributes (age, gender, ethnicity) trained on cropped faces without margins.
68, TITLE: Deep Image Harmonization By Bridging The Reality Gap
AUTHORS: Wenyan Cong et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To leverage both real-world images and rendered images, we propose a cross-domain harmonization network CharmNet to bridge the domain gap between two domains.
69, TITLE: The GIST and RIST of Iterative Self-Training for Semi-Supervised Segmentation
AUTHORS: Eu Wern Teh et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We consider the task of semi-supervised semantic segmentation, where we aim to produce pixel-wise semantic object masks given only a small number of human-labeled training examples.
70, TITLE: Channel-Based Attention for LCC Using Sentinel-2 Time Series
AUTHORS: Hermann Courteille ; A. Benoît ; N Méger ; A Atto ; D. Ienco
CATEGORY: cs.CV [cs.CV, cs.LG, cs.NE, eess.IV]
HIGHLIGHT: An architecture expressing predictions with respect to input channels is thus proposed in this paper.
71, TITLE: Robust Facial Expression Recognition with Convolutional Visual Transformers
AUTHORS: Fuyan Ma ; Bin Sun ; Shutao Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Therefore, we propose Convolutional Visual Transformers to tackle FER in the wild by two main steps.
72, TITLE: Attention Map-guided Two-stage Anomaly Detection Using Hard Augmentation
AUTHORS: Jou Won Song ; Kyeongbo Kong ; Ye In Park ; Suk-Ju Kang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To alleviate this problem, this paper proposes a novel two-stage network consisting of an attention network and an anomaly detection GAN (ADGAN).
73, TITLE: Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion
AUTHORS: Vitor Guizilini ; Rares Ambrus ; Wolfram Burgard ; Adrien Gaidon
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we study the problem of predicting dense depth from a single RGB image (monodepth) with optional sparse measurements from low-cost active depth sensors.
74, TITLE: ICurb: Imitation Learning-based Detection of Road Curbs Using Aerial Images for Autonomous Driving
AUTHORS: Zhenhua Xu ; Yuxiang Sun ; Ming Liu
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: We find that the visual appearances between road areas and off-road areas are usually different in aerial images, so we propose a novel solution to detect road curbs off-line using aerial images.
75, TITLE: Embracing Uncertainty: Decoupling and De-bias for Robust Temporal Grounding
AUTHORS: Hao Zhou ; Chongyang Zhang ; Yan Luo ; Yanjun Chen ; Chuanping Hu
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this work, we propose a novel DeNet (Decoupling and De-bias) to embrace human uncertainty: Decoupling - We explicitly disentangle each query into a relation feature and a modified feature.
76, TITLE: Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data
AUTHORS: Oscar Mañas ; Alexandre Lacoste ; Xavier Giro-i-Nieto ; David Vazquez ; Pau Rodriguez
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose Seasonal Contrast (SeCo), an effective pipeline to leverage unlabeled data for in-domain pre-training of remote sensing representations.
77, TITLE: Unsupervised Disentanglement of Linear-Encoded Facial Semantics
AUTHORS: Yutong Zheng ; Yu-Kai Huang ; Ran Tao ; Zhiqiang Shen ; Marios Savvides
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a method to disentangle linear-encoded facial semantics from StyleGAN without external supervision.
78, TITLE: Geometric Unsupervised Domain Adaptation for Semantic Segmentation
AUTHORS: Vitor Guizilini ; Jie Li ; Rares Ambrus ; Adrien Gaidon
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose to use self-supervised monocular depth estimation as a proxy task to bridge this gap and improve sim-to-real unsupervised domain adaptation (UDA).
79, TITLE: Neural Response Interpretation Through The Lens of Critical Pathways
AUTHORS: Ashkan Khakzar et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work, we discuss the problem of identifying these critical pathways and subsequently leverage them for interpreting the network's response to an input.
80, TITLE: Rank-One Prior: Toward Real-Time Scene Recovery
AUTHORS: Jun Liu ; Ryan Wen Liu ; Jianing Sun ; Tieyong Zeng
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To improve visual quality under different weather/imaging conditions, we propose a real-time light correction method to recover the degraded scenes in the cases of sandstorms, underwater, and haze.
81, TITLE: Fixing The Teacher-Student Knowledge Discrepancy in Distillation
AUTHORS: Jiangfan Han et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To solve this problem, in this paper, we propose a novel student-dependent distillation method, knowledge consistent distillation, which makes teacher's knowledge more consistent with the student and provides the best suitable knowledge to different student networks for distillation.
82, TITLE: ArtFlow: Unbiased Image Style Transfer Via Reversible Neural Flows
AUTHORS: Jie An et al.
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this paper, we propose ArtFlow to prevent content leak during universal style transfer.
83, TITLE: Exploiting Invariance in Training Deep Neural Networks
AUTHORS: Chengxi Ye et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: Inspired by two basic mechanisms in animal visual systems, we introduce a feature transform technique that imposes invariance properties in the training of deep neural networks.
84, TITLE: Attention, Please! A Survey of Neural Attention Models in Deep Learning
AUTHORS: Alana de Santana Correia ; Esther Luna Colombini
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, cs.RO]
HIGHLIGHT: By critically analyzing 650 works, we describe the primary uses of attention in convolutional, recurrent networks and generative models, identifying common subgroups of uses and applications.
85, TITLE: Robustness Certification for Point Cloud Models
AUTHORS: Tobias Lorenz ; Anian Ruoss ; Mislav Balunović ; Gagandeep Singh ; Martin Vechev
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: In this work, we address this challenge and introduce 3DCertify, the first verifier able to certify robustness of point cloud models.
86, TITLE: Bit-Mixer: Mixed-precision Networks with Runtime Bit-width Selection
AUTHORS: Adrian Bulat ; Georgios Tzimiropoulos
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: In this work, we propose Bit-Mixer, the very first method to train a meta-quantized network where, at test time, any layer can change its bit-width without affecting the overall network's ability to perform highly accurate inference.
87, TITLE: Learning Generalizable Robotic Reward Functions from "In-The-Wild" Human Videos
AUTHORS: Annie S. Chen ; Suraj Nair ; Chelsea Finn
CATEGORY: cs.RO [cs.RO, cs.AI, cs.CV, cs.LG]
HIGHLIGHT: In this work, we propose a simple approach, Domain-agnostic Video Discriminator (DVD), that learns multitask reward functions by training a discriminator to classify whether two videos are performing the same task, and can generalize by virtue of learning from a small amount of robot data with a broad dataset of human videos.
88, TITLE: Training Robust Deep Learning Models for Medical Imaging Tasks with Spectral Decoupling
AUTHORS: Joona Pohjonen ; Carolin Stürenberg ; Antti Rannikko ; Tuomas Mirtti ; Esa Pitkänen
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: To address these challenges, we evaluate the utility of spectral decoupling in the context of medical image analysis.
89, TITLE: Classification of Hematoma: Joint Learning of Semantic Segmentation and Classification
AUTHORS: Hokuto Hirano ; Tsuyoshi Okita
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: This paper proposes the joint learning of semantic segmentation and classification and evaluates its performance.
90, TITLE: Learning Scalable $\ell_\infty$-constrained Near-lossless Image Compression Via Joint Lossy Image and Residual Compression
AUTHORS: Yuanchao Bai ; Xianming Liu ; Wangmeng Zuo ; Yaowei Wang ; Xiangyang Ji
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: We propose a novel joint lossy image and residual compression framework for learning $\ell_\infty$-constrained near-lossless image compression.
91, TITLE: CNN-based Cardiac Motion Extraction to Generate Deformable Geometric Left Ventricle Myocardial Models from Cine MRI
AUTHORS: Roshan Reddy Upendra ; Brian Jamison Wentz ; Richard Simon ; Suzanne M. Shontz ; Cristian A. Linte
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: Here, we propose a deep learning-based framework for the development of patient-specific geometric models of LV myocardium from cine cardiac MR images, using the Automated Cardiac Diagnosis Challenge (ACDC) dataset.
92, TITLE: Differentiable Deconvolution for Improved Stroke Perfusion Analysis
AUTHORS: Ezequiel de la Rosa ; David Robben ; Diana M. Sima ; Jan S. Kirschke ; Bjoern Menze
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this work we propose an AIF selection approach that is optimized for maximal core lesion segmentation performance.
93, TITLE: Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging
AUTHORS: Ilya Chugunov ; Seung-Hwan Baek ; Qiang Fu ; Wolfgang Heidrich ; Felix Heide
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: We introduce Mask-ToF, a method to reduce flying pixels (FP) in time-of-flight (ToF) depth captures. We develop a differentiable ToF simulator to jointly train a convolutional neural network to decode this information and produce high-fidelity, low-FP depth reconstructions.
94, TITLE: A Novel Deep ML Architecture By Integrating Visual Simultaneous Localization and Mapping (vSLAM) Into Mask R-CNN for Real-time Surgical Video Analysis
AUTHORS: Ella Selina Lan
CATEGORY: eess.IV [eess.IV, cs.CV, I.4.0]
HIGHLIGHT: In this research, a novel machine learning architecture, RPM-CNN, is created to perform real-time surgical video analysis.
95, TITLE: HAD-Net: A Hierarchical Adversarial Knowledge Distillation Network for Improved Enhanced Tumour Segmentation Without Post-Contrast Images
AUTHORS: Saverio Vadacchino et al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this work, we present HAD-Net, a novel offline adversarial knowledge distillation (KD) technique, whereby a pre-trained teacher segmentation network, with access to all MRI sequences, teaches a student network, via hierarchical adversarial training, to better overcome the large domain shift presented when crucial images are absent during inference.