This column collects recent computer vision papers. Date: June 16, 2021. Source: Paper Digest.
1, TITLE: A Value-Function-based Interior-point Method for Non-convex Bi-level Optimization
AUTHORS: Risheng Liu ; Xuan Liu ; Xiaoming Yuan ; Shangzhi Zeng ; Jin Zhang
CATEGORY: math.OC [math.OC, cs.CV, cs.LG]
HIGHLIGHT: In this work, we propose a new gradient-based solution scheme, namely, the Bi-level Value-Function-based Interior-point Method (BVFIM).
2, TITLE: SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients
AUTHORS: Feihu Huang ; Junyi Li ; Heng Huang
CATEGORY: math.OC [math.OC, cs.CV, cs.LG]
HIGHLIGHT: To fill this gap, we propose a faster and universal framework of adaptive gradients (i.e., SUPER-ADAM) by introducing a universal adaptive matrix that includes most existing adaptive gradient forms.
3, TITLE: Physion: Evaluating Physical Prediction from Vision in Humans and Machines
AUTHORS: Daniel M. Bear et al.
CATEGORY: cs.AI [cs.AI, cs.CV, I.2.10; I.4.8; I.5]
HIGHLIGHT: Here, we present a visual and physical prediction benchmark that precisely measures this capability.
4, TITLE: Generating Thermal Human Faces for Physiological Assessment Using Thermal Sensor Auxiliary Labels
AUTHORS: Catherine Ordun ; Edward Raff ; Sanjay Purushotham
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: As a result, we introduce favtGAN, a VT GAN which uses the pix2pix image translation model with an auxiliary sensor label prediction network for generating thermal faces from visible images.
5, TITLE: Face Age Progression With Attribute Manipulation
AUTHORS: Sinzith Tatikonda ; Athira Nambiar ; Anurag Mittal
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we propose a novel holistic model in this regard, viz., "Face Age progression With Attribute Manipulation (FAWAM)", i.e., generating face images at different ages while simultaneously varying attributes and other subject-specific characteristics.
6, TITLE: Is This Harmful? Learning to Predict Harmfulness Ratings from Video
AUTHORS: Johan Edstedt et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we identify and tackle the two main obstacles. First, we create a dataset of approximately 4000 video clips, annotated by professionals in the field.
7, TITLE: Dynamic Head: Unifying Object Detection Heads with Attentions
AUTHORS: Xiyang Dai et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a novel dynamic head framework to unify object detection heads with attentions.
8, TITLE: Gradient Forward-Propagation for Large-Scale Temporal Video Modelling
AUTHORS: Mateusz Malinowski ; Dimitrios Vytiniotis ; Grzegorz Swirszcz ; Viorica Patraucean ; Joao Carreira
CATEGORY: cs.CV [cs.CV, cs.DC, cs.LG, eess.IV]
HIGHLIGHT: In this paper, we build upon Sideways, which avoids blocking by propagating approximate gradients forward in time, and we propose mechanisms for temporal integration of information based on different variants of skip connections.
9, TITLE: Canonical Face Embeddings
AUTHORS: David McNeely-White ; Ben Sattelberg ; Nathaniel Blanchard ; Ross Beveridge
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: We present evidence that many common convolutional neural networks (CNNs) trained for face verification learn functions that are nearly equivalent under rotation.
10, TITLE: Domain Adaptive SiamRPN++ for Object Tracking in The Wild
AUTHORS: Zhongzhou Zhang ; Lei Zhang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To this end, based on SiamRPN++, we introduce a Domain Adaptive SiamRPN++, namely DASiamRPN++, to improve the cross-domain transferability and robustness of a tracker.
11, TITLE: Keep CALM and Improve Visual Feature Attribution
AUTHORS: Jae Myung Kim ; Junsuk Choe ; Zeynep Akata ; Seong Joon Oh
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we improve CAM by explicitly incorporating a latent variable encoding the location of the cue for recognition in the formulation, thereby subsuming the attribution map into the training computational graph.
12, TITLE: A Hybrid MmWave and Camera System for Long-Range Depth Imaging
AUTHORS: Diana Zhang ; Akarsh Prabhakara ; Sirajum Munir ; Aswin Sankaranarayanan ; Swarun Kumar
CATEGORY: cs.CV [cs.CV, cs.NI, cs.RO, eess.SP]
HIGHLIGHT: We propose Metamoran, a system that combines the complementary strengths of radar and camera systems to obtain depth images at high azimuthal resolutions at distances of several tens of meters with high accuracy, all from a single fixed vantage point.
13, TITLE: G$^2$DA: Geometry-Guided Dual-Alignment Learning for RGB-Infrared Person Re-Identification
AUTHORS: Lin Wan et al.
CATEGORY: cs.CV [cs.CV, 68T07 (Primary), I.4.9]
HIGHLIGHT: This paper attempts to find RGB-IR ReID solutions from tackling sample-level modality difference, and presents a Geometry-Guided Dual-Alignment learning framework (G$^2$DA), which jointly enhances modality-invariance and reinforces discriminability with human topological structure in features to boost the overall matching performance.
14, TITLE: Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection
AUTHORS: Zhenyu Zhang et al.
CATEGORY: cs.CV [cs.CV, cs.GR]
HIGHLIGHT: To address such problem, this paper presents a novel Learning to Aggregate and Personalize (LAP) framework for unsupervised robust 3D face modeling.
15, TITLE: Cluster-guided Asymmetric Contrastive Learning for Unsupervised Person Re-Identification
AUTHORS: Mingkun Li ; Chun-Guang Li ; Jun Guo
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a Cluster-guided Asymmetric Contrastive Learning (CACL) approach for unsupervised person Re-ID, in which cluster structure is leveraged to guide the feature learning in a properly designed asymmetric contrastive learning framework.
16, TITLE: A Spacecraft Dataset for Detection, Segmentation and Parts Recognition
AUTHORS: Dung Anh Hoang ; Bo Chen ; Tat-Jun Chin
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we aim to fill this gap by releasing a dataset for spacecraft detection, instance segmentation and part recognition.
17, TITLE: Mixed Model OCR Training on Historical Latin Script for Out-of-the-Box Recognition and Finetuning
AUTHORS: Christian Reul et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To optimize the results we combined established techniques of OCR training like pretraining, data augmentation, and voting.
18, TITLE: Vision-Language Navigation with Random Environmental Mixup
AUTHORS: Chong Liu ; Fengda Zhu ; Xiaojun Chang ; Xiaodan Liang ; Yi-Dong Shen
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To tackle this problem, we propose the Random Environmental Mixup (REM) method, which generates cross-connected house scenes as augmented data by mixing up environments.
19, TITLE: Towards Total Recall in Industrial Anomaly Detection
AUTHORS: Karsten Roth et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we extend this line of work and propose PatchCore, which uses a maximally representative memory bank of nominal patch-features.
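As a rough illustration of the memory-bank idea in entry 19, the sketch below scores a test image by the distance from each of its patch features to the nearest stored nominal feature. The function names, the toy 2-D features, and the plain Euclidean nearest-neighbour scoring are assumptions made for illustration; the highlight does not specify how the bank is built or how scores are computed.

```python
import math

def nearest_distance(feature, memory_bank):
    """Euclidean distance from `feature` to its closest entry in the bank."""
    return min(math.dist(feature, stored) for stored in memory_bank)

def image_anomaly_score(patch_features, memory_bank):
    """Score an image by its most anomalous patch (max of nearest distances)."""
    return max(nearest_distance(f, memory_bank) for f in patch_features)

# Toy memory bank of nominal (defect-free) patch features; values are made up.
memory_bank = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

normal_image = [(0.0, 0.0), (1.0, 0.0)]   # every patch is already in the bank
defect_image = [(0.0, 0.0), (4.0, 4.0)]   # one patch lies far from the bank

normal_score = image_anomaly_score(normal_image, memory_bank)
defect_score = image_anomaly_score(defect_image, memory_bank)
```

An image whose patches all resemble stored nominal features gets a low score, while a single outlying patch is enough to raise it.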
20, TITLE: Multi-script Handwritten Digit Recognition Using Multi-task Learning
AUTHORS: Mesay Samuel Gondere ; Lars Schmidt-Thieme ; Durga Prasad Sharma ; Randolf Scholz
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: Therefore, in this study multi-script handwritten digit recognition using multi-task learning will be investigated.
21, TITLE: Reverse Engineering of Generative Models: Inferring Model Hyperparameters from Generated Images
AUTHORS: Vishal Asnani ; Xi Yin ; Tal Hassner ; Xiaoming Liu
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: To tackle this problem, we propose a framework with two components: a Fingerprint Estimation Network (FEN), which estimates a GM fingerprint from a generated image by training with four constraints to encourage the fingerprint to have desired properties, and a Parsing Network (PN), which predicts network architecture and loss functions from the estimated fingerprints. To evaluate our approach, we collect a fake image dataset with 100K images generated by 100 GMs.
22, TITLE: Flow Guided Transformable Bottleneck Networks for Motion Retargeting
AUTHORS: Jian Ren ; Menglei Chai ; Oliver J. Woodford ; Kyle Olszewski ; Sergey Tulyakov
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Inspired by the Transformable Bottleneck Network, which renders novel views and manipulations of rigid objects, we propose an approach based on an implicit volumetric representation of the image content, which can then be spatially manipulated using volumetric flow fields.
23, TITLE: Learning Deep Morphological Networks with Neural Architecture Search
AUTHORS: Yufei Hu ; Nacim Belkhir ; Jesus Angulo ; Angela Yao ; Gianni Franchi
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: We propose a method based on meta-learning to incorporate morphological operators into DNNs.
24, TITLE: Potato Crop Stress Identification in Aerial Images Using Deep Learning-based Object Detection
AUTHORS: Sujata Butte ; Aleksandar Vakanski ; Kasia Duellman ; Haotian Wang ; Amin Mirkouei
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: The paper presents an approach for analyzing aerial images of a potato crop using deep neural networks. The paper also introduces a dataset of field images acquired with a Parrot Sequoia camera carried by a Solo unmanned aerial vehicle.
25, TITLE: Cascading Convolutional Temporal Colour Constancy
AUTHORS: Matteo Rizzo ; Cristina Conati ; Daesik Jang ; Hui Hu
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: We extend this architecture with different models obtained by (i) substituting the TCCNet submodules with C4, the state-of-the-art method for CCC targeting images; (ii) adding a cascading strategy to perform an iterative improvement of the estimate of the illuminant.
26, TITLE: A Clinically Inspired Approach for Melanoma Classification
AUTHORS: Prathyusha Akundi ; Soumyasis Gun ; Jayanthi Sivaswamy
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: We present a method for identifying and quantifying ugly ducklings by performing Intra-Patient Comparative Analysis (IPCA) of neighboring nevi.
27, TITLE: Zero-sample Surface Defect Detection and Classification Based on Semantic Feedback Neural Network
AUTHORS: Yibo Guo et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: At the same time, for the common domain shift problem in zero-shot learning, based on the idea of co-training algorithm using the difference information between different views of data to learn from each other, we propose an Ensemble Co-training algorithm, which adaptively reduces the prediction error in image tag embedding from multiple angles.
28, TITLE: Encouraging Intra-Class Diversity Through A Reverse Contrastive Loss for Better Single-Source Domain Generalization
AUTHORS: Thomas Duboudin ; Emmanuel Dellandréa ; Corentin Abgrall ; Gilles Hénaff ; Liming Chen
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: Because data distributions can change dynamically in real-life applications once a learned model is deployed, in this paper we are interested in single-source domain generalization (SDG) which aims to develop deep learning algorithms able to generalize from a single training domain where no information about the test domain is available at training time.
29, TITLE: Real-time Pose and Shape Reconstruction of Two Interacting Hands With A Single Depth Camera
AUTHORS: Franziska Mueller et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We present a novel method for real-time pose and shape reconstruction of two strongly interacting hands.
30, TITLE: Efficient Facial Expression Analysis For Dimensional Affect Recognition Using Geometric Features
AUTHORS: Vassilios Vonikakis ; Stefan Winkler
CATEGORY: cs.CV [cs.CV, cs.HC]
HIGHLIGHT: We introduce a simple but effective facial expression analysis (FEA) system for dimensional affect, solely based on geometric features and Partial Least Squares (PLS) regression.
31, TITLE: Direction-aware Feature-level Frequency Decomposition for Single Image Deraining
AUTHORS: Sen Deng et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We present a novel direction-aware feature-level frequency decomposition network for single image deraining.
32, TITLE: Object Detection and Autoencoder-based 6D Pose Estimation for Highly Cluttered Bin Picking
AUTHORS: Timon Höfer ; Faranak Shamsafar ; Nuri Benbarka ; Andreas Zell
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Therefore, we propose a framework for pose estimation in highly cluttered scenes with small objects, which mainly relies on RGB data and makes use of depth information only for pose refinement.
33, TITLE: Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data
AUTHORS: Ashraful Islam et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a simple dynamic distillation-based approach to facilitate unlabeled images from the novel/base dataset.
34, TITLE: Demographic Fairness in Face Identification: The Watchlist Imbalance Effect
AUTHORS: Pawel Drozdowski ; Christian Rathgeb ; Christoph Busch
CATEGORY: cs.CV [cs.CV, cs.CY]
HIGHLIGHT: In this work, we present a method to theoretically estimate said effect for a biometric identification system given its verification performance across demographic groups and the composition of the used gallery.
35, TITLE: BEiT: BERT Pre-Training of Image Transformers
AUTHORS: Hangbo Bao ; Li Dong ; Furu Wei
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: We introduce a self-supervised vision representation model BEiT, which stands for Bidirectional Encoder representation from Image Transformers.
36, TITLE: Color2Style: Real-Time Exemplar-Based Image Colorization with Self-Reference Learning and Deep Feature Modulation
AUTHORS: Hengyuan Zhao ; Wenhao Wu ; Yihao Liu ; Dongliang He
CATEGORY: cs.CV [cs.CV, cs.MM]
HIGHLIGHT: To better relive the elapsed frozen moments, in this paper, we present a deep exemplar-based image colorization approach named Color2Style to resurrect these grayscale image media by filling them with vibrant colors.
37, TITLE: Image Feature Information Extraction for Interest Point Detection: A Comprehensive Review
AUTHORS: Junfeng Jing ; Tian Gao ; Weichuan Zhang ; Yongsheng Gao ; Changming Sun
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we carry out a comprehensive review on image feature information (IFI) extraction techniques for interest point detection.
38, TITLE: Multi-StyleGAN: Towards Image-Based Simulation of Time-Lapse Live-Cell Microscopy
AUTHORS: Tim Prangemeier ; Christoph Reich ; Christian Wildner ; Heinz Koeppl
CATEGORY: cs.CV [cs.CV, cs.LG, eess.IV, q-bio.QM, stat.ML]
HIGHLIGHT: Here, we propose Multi-StyleGAN as a descriptive approach to simulate time-lapse fluorescence microscopy imagery of living cells, based on a past experiment.
39, TITLE: Hotel Recognition Via Latent Image Embedding
AUTHORS: Boris Tseytlin ; Ilya Makarov
CATEGORY: cs.CV [cs.CV, cs.IR, cs.LG]
HIGHLIGHT: We overview the existing approaches and propose a modification to Contrastive loss called Contrastive-Triplet loss.
40, TITLE: ReS2tAC -- UAV-Borne Real-Time SGM Stereo Optimized for Embedded ARM and CUDA Devices
AUTHORS: Boitumelo Ruf ; Jonas Mohrs ; Martin Weinmann ; Stefan Hinz ; Jürgen Beyerer
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: With this in mind, we propose an approach for real-time embedded stereo processing on ARM and CUDA-enabled devices, which is based on the popular and widely used Semi-Global Matching algorithm.
41, TITLE: Computer-aided Interpretable Features for Leaf Image Classification
AUTHORS: Jayani P. G. Lakshika ; Thiyanga S. Talagala
CATEGORY: cs.CV [cs.CV, stat.AP, E.4; I.5.4]
HIGHLIGHT: We introduced 52 computationally efficient features to classify plant species.
42, TITLE: Compositional Sketch Search
AUTHORS: Alexander Black ; Tu Bui ; Long Mai ; Hailin Jin ; John Collomosse
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We present an algorithm for searching image collections using free-hand sketches that describe the appearance and relative positions of multiple objects.
43, TITLE: Mutation Sensitive Correlation Filter for Real-Time UAV Tracking with Adaptive Hybrid Label
AUTHORS: Guangze Zheng ; Changhong Fu ; Junjie Ye ; Fuling Lin ; Fangqiang Ding
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: To cope with appearance mutations, this paper proposes a novel DCF-based method to enhance the sensitivity and resistance to mutations with an adaptive hybrid label, i.e., MSCF.
44, TITLE: Relation Modeling in Spatio-Temporal Action Localization
AUTHORS: Yutong Feng et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper presents our solution to the AVA-Kinetics Crossover Challenge of ActivityNet workshop at CVPR 2021.
45, TITLE: SAR Image Classification Based on Spiking Neural Network Through Spike-Time Dependent Plasticity and Gradient Descent
AUTHORS: Jiankun Chen ; Xiaolan Qiu ; Chibiao Ding ; Yirong Wu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This article constructs a complete SAR image classifier based on unsupervised and supervised learning of SNN by using spike sequences with complex spatio-temporal information.
46, TITLE: DFM: A Performance Baseline for Deep Feature Matching
AUTHORS: Ufuk Efe ; Kutalmis Gokalp Ince ; A. Aydin Alatan
CATEGORY: cs.CV [cs.CV, I.5.0; I.4.7]
HIGHLIGHT: A novel image matching method is proposed that utilizes learned features extracted by an off-the-shelf deep neural network to obtain a promising performance.
47, TITLE: Generating Data Augmentation Samples for Semantic Segmentation of Salt Bodies in A Synthetic Seismic Image Dataset
AUTHORS: Luis Felipe Henriques ; Sérgio Colcher ; Ruy Luiz Milidiú ; André Bulcão ; Pablo Barros
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: This work proposes a Data Augmentation method based on training two generative models to augment the number of samples in a seismic image dataset for the semantic segmentation of salt bodies.
48, TITLE: Temporal Consistency Checks to Detect LiDAR Spoofing Attacks on Autonomous Vehicle Perception
AUTHORS: Chengzeng You ; Zhongyuan Hau ; Soteris Demetriou
CATEGORY: cs.CR [cs.CR, cs.CV]
HIGHLIGHT: In this work, we explore the use of motion as a physical invariant of genuine objects for detecting such attacks.
49, TITLE: Defending Touch-based Continuous Authentication Systems from Active Adversaries Using Generative Adversarial Networks
AUTHORS: Mohit Agrawal ; Pragyan Mehrotra ; Rajesh Kumar ; Rajiv Ratn Shah
CATEGORY: cs.CR [cs.CR, cs.CV, cs.HC, K.6.5]
HIGHLIGHT: This paper proposes a novel Generative Adversarial Network assisted TCAS (G-TCAS) framework, which showed more resilience to the population attack.
50, TITLE: Contextualizing Multiple Tasks Via Learning to Decompose
AUTHORS: Han-Jia Ye et al.
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: We propose a general approach, Learning to Decompose Network (LeadNet), for both cases, which contextualizes a model through meta-learning multiple maps for concept discovery: the representations of instances are decomposed and adapted conditioned on the contexts.
51, TITLE: How Modular Should Neural Module Networks Be for Systematic Generalization?
AUTHORS: Vanessa D'Amario ; Tomotake Sasaki ; Xavier Boix
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this paper, we demonstrate that the stage and the degree at which modularity is defined have a large influence on systematic generalization.
52, TITLE: Revisiting The Calibration of Modern Neural Networks
AUTHORS: Matthias Minderer et al.
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: We systematically relate model calibration and accuracy, and find that the most recent models, notably those not using convolutions, are among the best calibrated.
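To make "calibration" in entry 52 concrete, here is a minimal sketch of the expected calibration error (ECE), a commonly used measure of the gap between a model's confidence and its accuracy. The highlight does not name the metric used in the paper, so treating ECE as the measure, along with the equal-width binning and the toy inputs, is an assumption.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Equal-width bins over [0, 1]; confidence 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / n * abs(avg_conf - accuracy)
    return ece

# Toy predictions: 90% confident on four samples, but only three are correct.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
# Fully confident and always right: no calibration gap.
calibrated = expected_calibration_error([1.0, 1.0], [1, 1])
```

A lower value means the reported confidences track the observed accuracy more closely.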
53, TITLE: Optimal Latent Vector Alignment for Unsupervised Domain Adaptation in Medical Image Segmentation
AUTHORS: Dawood Al Chanti ; Diana Mateus
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: As a solution, we propose OLVA, a novel and lightweight unsupervised domain adaptation method based on a Variational Auto-Encoder (VAE) and Optimal Transport (OT) theory.
54, TITLE: End-to-End Learning of Keypoint Representations for Continuous Control from Images
AUTHORS: Rinu Boney ; Alexander Ilin ; Juho Kannala
CATEGORY: cs.LG [cs.LG, cs.CV, cs.RO]
HIGHLIGHT: In this paper, we show that it is possible to learn efficient keypoint representations end-to-end, without the need for unsupervised pre-training, decoders, or additional losses.
55, TITLE: Simon Says: Evaluating and Mitigating Bias in Pruned Neural Networks with Knowledge Distillation
AUTHORS: Cody Blakeney ; Nathaniel Huish ; Yan Yan ; Ziliang Zong
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: In this work, we strive to tackle the challenging issues of evaluating, mitigating, and explaining induced bias in pruned neural networks.
56, TITLE: Learning Stable Classifiers By Transferring Unstable Features
AUTHORS: Yujia Bao ; Shiyu Chang ; Regina Barzilay
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CL, cs.CV, stat.ML]
HIGHLIGHT: We evaluate our method on both text and image classifications.
57, TITLE: Scaling Neural Tangent Kernels Via Sketching and Random Features
AUTHORS: Amir Zandieh et al.
CATEGORY: cs.LG [cs.LG, cs.CV, cs.DS]
HIGHLIGHT: To accelerate learning with NTK, we design a near input-sparsity time approximation algorithm for NTK, by sketching the polynomial expansions of arc-cosine kernels: our sketch for the convolutional counterpart of NTK (CNTK) can transform any image using a linear runtime in the number of pixels.
58, TITLE: Efficient Micro-Structured Weight Unification for Neural Network Compression
AUTHORS: Sheng Lin et al.
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: Aiming at reducing both storage and computation, as well as preserving the original task performance, we propose a generalized weight unification framework at a hardware-compatible micro-structured level to achieve a high degree of compression and acceleration.
59, TITLE: CathAI: Fully Automated Interpretation of Coronary Angiograms Using Neural Networks
AUTHORS: Robert Avram et al.
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, eess.IV, I.4.9; I.2.10; J.3]
HIGHLIGHT: The algorithmic pipeline we developed, called CathAI, achieves state-of-the-art performance across the sequence of tasks required to accomplish automated interpretation of unselected, real-world angiograms.
60, TITLE: A White Paper on Neural Network Quantization
AUTHORS: Markus Nagel et al.
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: In this white paper, we introduce state-of-the-art algorithms for mitigating the impact of quantization noise on the network's performance while maintaining low-bit weights and activations.
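For readers new to the topic of entry 60, the sketch below shows a basic uniform affine quantizer that maps floats to low-bit integers and back, which is where quantization noise comes from. It is a generic textbook-style illustration, not the white paper's algorithms; the function names, the min/max range estimate, and the sample weights are assumptions.

```python
def quantize(values, num_bits=8):
    """Map floats to unsigned integers with a uniform affine scheme."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Extend the range to include 0.0 so that zero is represented exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats from the integer codes."""
    return [scale * (x - zero_point) for x in q]

weights = [-1.0, 0.0, 0.5, 1.0]  # made-up example weights
codes, scale, zero_point = quantize(weights)
restored = dequantize(codes, scale, zero_point)
errors = [abs(w - r) for w, r in zip(weights, restored)]
```

The values in `errors` are the quantization noise for this toy tensor; with fewer bits, `scale` grows and so does the error.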
61, TITLE: Non-Gradient Manifold Neural Network
AUTHORS: Rui Zhang ; Ziheng Jiao ; Hongyuan Zhang ; Xuelong Li
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: Aiming to tackle the referred problems, we propose a novel manifold neural network based on non-gradient optimization, i.e., the closed-form solutions.
62, TITLE: Robust Out-of-Distribution Detection on Deep Probabilistic Generative Models
AUTHORS: Jaemoo Choi ; Changyeon Yoon ; Jeongwoo Bae ; Myungjoo Kang
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this paper, we discover that these approaches fail for certain OOD datasets.
63, TITLE: Mean Embeddings with Test-Time Data Augmentation for Ensembling of Representations
AUTHORS: Arsenii Ashukha ; Andrei Atanov ; Dmitry Vetrov
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this work, we look at the ensembling of representations and propose mean embeddings with test-time augmentation (MeTTA), a simple yet well-performing recipe for ensembling representations.
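The recipe in entry 63 can be sketched as averaging the embeddings of several augmented views of the same input. Everything concrete below is a hypothetical stand-in: `toy_embed` plays the role of a trained encoder, the "image" is a short list of pixel values, and the three augmentations are chosen only to keep the example small.

```python
def mean_embedding(image, embed, augmentations):
    """Average the embeddings of several augmented views of one image."""
    views = [augment(image) for augment in augmentations]
    embeddings = [embed(view) for view in views]
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]

def toy_embed(image):
    """Hypothetical encoder: returns (mean intensity, intensity range)."""
    return [sum(image) / len(image), max(image) - min(image)]

def identity(image):
    return image

def flip(image):
    return image[::-1]

def brighten(image):
    return [pixel + 3.0 for pixel in image]

image = [0.0, 2.0, 4.0]
embedding = mean_embedding(image, toy_embed, [identity, flip, brighten])
```

The averaged vector can then be passed to a downstream classifier or retrieval step in place of the single-view embedding.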
64, TITLE: Learning Audio-Visual Dereverberation
AUTHORS: Changan Chen ; Wei Sun ; David Harwath ; Kristen Grauman
CATEGORY: cs.SD [cs.SD, cs.CV, cs.LG, eess.AS]
HIGHLIGHT: We introduce Visually-Informed Dereverberation of Audio (VIDA), an end-to-end approach that learns to remove reverberation based on both the observed sounds and visual scene. In support of this new task, we develop a large-scale dataset that uses realistic acoustic renderings of speech in real-world 3D scans of homes offering a variety of room acoustics.
65, TITLE: Self-Supervised Learning with Kernel Dependence Maximization
AUTHORS: Yazhe Li ; Roman Pogodin ; Danica J. Sutherland ; Arthur Gretton
CATEGORY: stat.ML [stat.ML, cs.AI, cs.CV, cs.LG]
HIGHLIGHT: We approach self-supervised learning of image representations from a statistical dependence perspective, proposing Self-Supervised Learning with the Hilbert-Schmidt Independence Criterion (SSL-HSIC).
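As background for entry 65, the sketch below computes the standard biased empirical estimate of the Hilbert-Schmidt Independence Criterion, trace(KHLH) / (n - 1)^2, on scalar toy data. It only illustrates the dependence measure itself; the Gaussian kernel, the bandwidth, and the sample values are assumptions, and the way the paper turns HSIC into a training objective is not shown.

```python
import math

def gaussian_gram(xs, sigma=1.0):
    """Gram matrix of a Gaussian kernel over scalar samples."""
    return [[math.exp(-((a - b) ** 2) / (2 * sigma ** 2)) for b in xs] for a in xs]

def center(gram):
    """Double-center a symmetric Gram matrix, i.e. compute H K H."""
    n = len(gram)
    row_mean = [sum(row) / n for row in gram]
    total_mean = sum(row_mean) / n
    return [[gram[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(xs)
    k_centered = center(gaussian_gram(xs, sigma))
    l_gram = gaussian_gram(ys, sigma)
    trace = sum(k_centered[i][j] * l_gram[i][j] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2

xs = [0.0, 1.0, 2.0, 3.0]
dependent = hsic(xs, xs)                       # ys fully determined by xs
independent = hsic(xs, [0.0, 0.0, 0.0, 0.0])   # ys carries no information
```

A constant second variable gives a value of zero up to rounding, while a variable that depends on the first gives a strictly positive value.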
66, TITLE: Cine-MRI Detection of Abdominal Adhesions with Spatio-temporal Deep Learning
AUTHORS: Bram de Wilde ; Richard P. G. ten Broek ; Henkjan Huisman
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: We focus on classifying presence or absence of adhesions in sagittal abdominal cine-MRI series.
67, TITLE: ResDepth: A Deep Prior For 3D Reconstruction From High-resolution Satellite Images
AUTHORS: Corinne Stucker ; Konrad Schindler
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: To that end, we introduce ResDepth, a convolutional neural network that learns such an expressive geometric prior from example data.
68, TITLE: Perceptually-inspired Super-resolution of Compressed Videos
AUTHORS: Di Ma ; Mariana Afonso ; Fan Zhang ; David R. Bull
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this paper, a perceptually-inspired super-resolution approach (M-SRGAN) is proposed for spatial up-sampling of compressed video using a modified CNN model, which has been trained using a generative adversarial network (GAN) on compressed content with perceptual loss functions.
69, TITLE: Automated Triaging of Head MRI Examinations Using Convolutional Neural Networks
AUTHORS: David A. Wood et al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this work, we present a convolutional neural network for detecting clinically-relevant abnormalities in T2-weighted head MRI scans. Using a validated neuroradiology report classifier, we generated a labelled dataset of 43,754 scans from two large UK hospitals for model training, and demonstrate accurate classification (area under the receiver operating curve (AUC) = 0.943) on a test set of 800 scans labelled by a team of neuroradiologists.
70, TITLE: Automatic Linear Measurements of The Fetal Brain on MRI with Deep Neural Networks
AUTHORS: Netanell Avisdris et al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: The aim of this study was to develop a fully automatic method computing the CBD, BBD and TCD measurements from fetal brain MRI.
71, TITLE: A Lightweight ReLU-Based Feature Fusion for Aerial Scene Classification
AUTHORS: Md Adnan Arefeen ; Sumaiya Tabassum Nimi ; Md Yusuf Sarwar Uddin ; Zhu Li
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we propose a transfer-learning based model construction technique for the aerial scene classification problem.
72, TITLE: Wavelength-based Attributed Deep Neural Network for Underwater Image Restoration
AUTHORS: Prasen Kumar Sharma ; Ira Bisht ; Arijit Sur
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: More importantly, we have demonstrated a comprehensive validation of enhanced images across various high-level vision tasks, e.g., underwater image semantic segmentation, and diver's 2D pose estimation.
73, TITLE: Highdicom: A Python Library for Standardized Encoding of Image Annotations and Machine Learning Model Outputs in Pathology and Radiology
AUTHORS: Christopher P. Bridge et al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: Here we present the highdicom library, which provides a high-level application programming interface for the Python programming language that abstracts low-level details of the standard and enables encoding and decoding of image-derived information in DICOM format in a few lines of Python code.